TU Berlin

Database Systems and Information Management Group (DIMA)

Publications

Distributed Graph Analytics with Datalog Queries in Flink
Citation key ImranGM20
Author Muhammad Imran, Gábor E. Gévay, Volker Markl
Year 2020
Journal LSGDA 2020 - International Workshop on Large Scale Graph Data Analytics
Note A recording of the presentation is available here: https://www.youtube.com/watch?v=Ozvr1wrQcy4

Presentation slides are available here: https://www.redaktion.tu-berlin.de/fileadmin/fg131/Conferences/Presentations/Imran-LSGDA-2020.pdf
Abstract Large-scale, parallel graph processing has been in demand over the past decade. Succinct program structure and efficient execution are among the essential requirements of graph processing frameworks. In this paper, we present Cog, which executes Datalog programs on the Apache Flink distributed dataflow system. We chose Datalog for its compact program structure and Flink for its efficiency. We implemented a parallel semi-naive evaluation algorithm exploiting Flink's delta iteration to propagate only the tuples that need to be further processed to the subsequent iterations. Flink's delta iteration feature reduces the overhead present in acyclic dataflow systems, such as Spark, when evaluating recursive queries, hence making it more efficient. We demonstrated in our experiments that Cog outperformed BigDatalog, the state-of-the-art distributed Datalog evaluation system, in most of the tests.
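
The semi-naive evaluation mentioned in the abstract can be illustrated with a small single-machine sketch. This is not Cog's implementation and does not use Flink; it only shows the general idea of joining each round's newly derived tuples (the delta) rather than the full result, which is the role the abstract attributes to Flink's delta iteration. The function name `transitive_closure`, the helper `succ`, and the example graph are hypothetical and chosen for illustration.

```python
def transitive_closure(edges):
    """Semi-naive evaluation of the Datalog program
         tc(X, Y) :- edge(X, Y).
         tc(X, Y) :- tc(X, Z), edge(Z, Y).
    Illustrative only: a single-machine sketch, not Cog's algorithm.
    """
    # Index edges by source node so the join on Z is a dictionary lookup.
    succ = {}
    for src, dst in edges:
        succ.setdefault(src, set()).add(dst)

    tc = set(edges)     # all tuples derived so far (base rule)
    delta = set(edges)  # tuples that were new in the previous round

    while delta:
        # Apply the recursive rule to the delta only, not to all of tc.
        derived = {(x, y) for (x, z) in delta for y in succ.get(z, ())}
        # Keep only tuples not seen before; they form the next delta.
        delta = derived - tc
        tc |= delta
    return tc


# Hypothetical example graph: a simple path 1 -> 2 -> 3 -> 4.
edges = {(1, 2), (2, 3), (3, 4)}
closure = transitive_closure(edges)
```

The loop stops once a round derives no new tuples, so it also terminates on cyclic graphs. A naive evaluation would instead re-join the entire `tc` relation in every round and re-derive tuples it already has.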