TU Berlin

Database Systems and Information Management Group, Publications



Distributed Graph Analytics with Datalog Queries in Flink
Citation key: ImranGM20
Author: Muhammad Imran, Gábor E. Gévay, and Volker Markl
Year: 2020
Journal: LSGDA 2020 - International Workshop on Large Scale Graph Data Analytics
Note: A recording of the presentation is available here: https://www.youtube.com/watch?v=Ozvr1wrQcy4

Presentation slides are available here: https://www.redaktion.tu-berlin.de/fileadmin/fg131/Conferences/Presentations/Imran-LSGDA-2020.pdf
Abstract: Large-scale, parallel graph processing has been in demand over the past decade. Succinct program structure and efficient execution are among the essential requirements of graph processing frameworks. In this paper, we present Cog, which executes Datalog programs on the Apache Flink distributed dataflow system. We chose Datalog for its compact program structure and Flink for its efficiency. We implemented a parallel semi-naive evaluation algorithm that exploits Flink's delta iteration to propagate to subsequent iterations only the tuples that need further processing. Flink's delta iteration reduces the overhead that acyclic dataflow systems, such as Spark, incur when evaluating recursive queries, which makes the evaluation more efficient. In our experiments, Cog outperformed BigDatalog, the state-of-the-art distributed Datalog evaluation system, in most of the tests.
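
To illustrate the semi-naive evaluation idea mentioned in the abstract, here is a minimal single-machine sketch in Python. It is not Cog's implementation and does not use Flink; the function name `transitive_closure` and the example Datalog program are assumptions chosen for illustration. The `delta` set plays a role loosely analogous to the workset of a Flink delta iteration, and `total` to the solution set: each round joins only the newly derived tuples with the edge relation.

```python
def transitive_closure(edges):
    """Semi-naive evaluation of the (illustrative) Datalog program:
         tc(X, Y) :- edge(X, Y).
         tc(X, Y) :- tc(X, Z), edge(Z, Y).
    """
    # Index the edge relation by source vertex so the join is a lookup.
    successors = {}
    for src, dst in edges:
        successors.setdefault(src, set()).add(dst)

    total = set(edges)  # every tc fact derived so far
    delta = set(edges)  # tc facts that were new in the previous round

    while delta:
        # Join only the new facts with edge, not the whole tc relation.
        derived = {
            (x, y)
            for (x, z) in delta
            for y in successors.get(z, ())
        }
        # Keep only facts not seen before; these drive the next round.
        delta = derived - total
        total |= delta

    return total


closure = transitive_closure([(1, 2), (2, 3), (3, 4)])
cyclic = transitive_closure([(1, 2), (2, 1)])
```

The loop stops once a round derives no new tuples, so it also terminates on cyclic graphs. A naive evaluation would instead re-join the entire `total` relation with the edges in every round and re-derive facts it already has.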