TU Berlin

Fachgebiet Datenbanksysteme und Informationsmanagement

Fast failure recovery for iterative algorithms in distributed dataflow systems


Basic Information

Student: Markus Holzemer

Advisor: Chen Xu

Degree: Master


When clusters are scaled out to compute complex insights in long-running iterative jobs, failures become quite frequent.
Therefore, the goal of this thesis was to find a recovery mechanism for distributed dataflow systems that minimizes the recovery time of iterative jobs while keeping the runtime overhead during normal execution as low as possible.
To achieve this, we propose a non-blocking way of taking checkpoints and analyse three recovery methods (simple checkpointing, confined recovery, and replication-based recovery), both theoretically and through extensive experiments.
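
As a rough illustration of the simple checkpointing idea, the following minimal Python sketch runs a toy iterative job, snapshots its state every few iterations, and writes the snapshot on a background thread so the main loop does not wait for the write. It is not the mechanism developed in the thesis: the names `CheckpointStore`, `step`, and `run`, the in-memory "storage", and the injected failure are all assumptions made for this example.

```python
import copy
import threading


class CheckpointStore:
    """In-memory stand-in for stable storage (hypothetical helper)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._latest = None   # (iteration, state) of the newest stored checkpoint
        self._writers = []    # background writer threads still in flight

    def save_async(self, iteration, state):
        # Copy the state right away, then hand the write to a background
        # thread so the iteration loop can continue immediately.
        snapshot = copy.deepcopy(state)
        writer = threading.Thread(target=self._write, args=(iteration, snapshot))
        writer.start()
        self._writers.append(writer)

    def _write(self, iteration, snapshot):
        with self._lock:
            if self._latest is None or iteration > self._latest[0]:
                self._latest = (iteration, snapshot)

    def latest(self):
        # For a deterministic example we wait for in-flight writes here;
        # this is a simplification chosen for the sketch.
        for writer in self._writers:
            writer.join()
        self._writers.clear()
        with self._lock:
            return copy.deepcopy(self._latest)


def step(state):
    # One iteration of a toy job: move every value halfway toward the mean.
    mean = sum(state) / len(state)
    return [(v + mean) / 2 for v in state]


def run(initial, iterations, interval, fail_at=None):
    """Run the job; optionally inject one failure before iteration `fail_at`."""
    store = CheckpointStore()
    state, i = list(initial), 0
    failed = False
    while i < iterations:
        if fail_at is not None and i == fail_at and not failed:
            failed = True
            checkpoint = store.latest()
            # Roll back to the newest checkpoint, or restart if none exists.
            i, state = checkpoint if checkpoint else (0, list(initial))
            continue
        state = step(state)
        i += 1
        if i % interval == 0:
            store.save_async(i, state)
    return state
```

Because the toy job is deterministic, a run with an injected failure, such as `run([1.0, 2.0, 6.0], 10, 3, fail_at=7)`, recomputes only the iterations after the last checkpoint and ends in the same state as a failure-free run. Confined recovery and replication-based recovery are not covered by this sketch.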


