Publications

Rhino: Efficient Management of Very Large Distributed State for Stream Processing Engines
Citation key: MonteZRM20
Author: Bonaventura Del Monte, Steffen Zeuch, Tilmann Rabl, Volker Markl
Year: 2020
Journal: SIGMOD
Note: A recording of the presentation is available here: https://www.youtube.com/watch?v=PBwq8xxIfZ0

Presentation slides are available here: https://www.redaktion.tu-berlin.de/fileadmin/fg131/Conferences/Presentations/DelMonte_SIGMOD-2020_Rhino.pdf
Abstract: Scale-out stream processing engines (SPEs) are powering large big data applications on high velocity data streams. Industrial setups require SPEs to sustain outages, varying data rates, and low-latency processing. SPEs need to transparently reconfigure stateful queries during runtime. However, state-of-the-art SPEs are not ready yet to handle on-the-fly reconfigurations of queries with terabytes of state due to three problems. These are network overhead for state migration, consistency, and overhead on data processing. In this paper, we propose Rhino, a library for efficient reconfigurations of running queries in the presence of very large distributed state. Rhino provides a handover protocol and a state migration protocol to consistently and efficiently migrate stream processing among servers. Overall, our evaluation shows that Rhino scales with state sizes of up to TBs, reconfigures a running query 15 times faster than the state-of-the-art, and reduces latency by three orders of magnitude upon a reconfiguration.