
Managing Very Large Distributed State for Scalable Stream Processing

Problem

Data stream processing systems cannot fully cope with the massive amounts of complex data generated at high rates in Big Data, Industry 4.0, and IoT applications.


Challenge

Guaranteeing fault tolerance, resource elasticity, and dynamic load balancing requires transferring state, which in turn introduces latency proportional to the state's size. Exactly-once stream processing engines (SPEs) require consistent state: results must remain accurate regardless of system failures and of rescaling and rebalancing operations on the state. Moreover, SPEs must continue to process stream tuples while any of these operations is in progress. To the best of our knowledge, no existing stream processing system offers fully robust state management that handles very large, distributed state efficiently.
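To illustrate why state transfer dominates rescaling latency, the following Java sketch (our own simplification, not Rhino's design) shows the common key-group scheme: keys hash into a fixed number of key groups, key groups map to operator instances, and rescaling must move every key group whose owner changes, so the bytes transferred grow with the state size. All class and method names here are illustrative.

import java.util.HashMap;
import java.util.Map;

public class KeyGroupRescaling {

    static final int MAX_KEY_GROUPS = 128; // fixed upper bound on parallelism

    // A key is first mapped to a key group, then the key group to an instance.
    static int keyGroup(Object key) {
        return Math.floorMod(key.hashCode(), MAX_KEY_GROUPS);
    }

    static int instanceFor(int keyGroup, int parallelism) {
        return keyGroup * parallelism / MAX_KEY_GROUPS; // contiguous ranges
    }

    // On rescaling, every key group whose owner changes must be transferred;
    // the cost is proportional to the bytes of state stored in those groups.
    static Map<Integer, Integer> migrationPlan(int oldParallelism, int newParallelism) {
        Map<Integer, Integer> moves = new HashMap<>(); // key group -> new owner
        for (int kg = 0; kg < MAX_KEY_GROUPS; kg++) {
            if (instanceFor(kg, oldParallelism) != instanceFor(kg, newParallelism)) {
                moves.put(kg, instanceFor(kg, newParallelism));
            }
        }
        return moves;
    }

    public static void main(String[] args) {
        // Scaling from 4 to 6 instances: report how many key groups must move.
        System.out.println(migrationPlan(4, 6).size() + " of " + MAX_KEY_GROUPS
                + " key groups must be transferred");
    }
}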


Existing SPEs (e.g., Apache Flink, Apache Spark, Apache Samza, and Timely Dataflow) offer fast stateful processing of data streams with low latency and high throughput, despite fluctuations in the data rate. However, stateful processing would further benefit from on-demand resource elasticity. To date, academic and industrial research addresses resource elasticity for stateful processing, while ensuring fault tolerance, only for partitioned or partially distributed large state. Many streaming applications (e.g., multimedia services, online marketplaces) require stateful processing and generate large state that pushes SPEs to their limits; in these applications, the state can swell to many terabytes. Current SPEs fail to use computing resources efficiently at such state sizes.
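As a concrete example of stateful processing in an existing SPE, the snippet below uses Apache Flink's keyed-state API to maintain a running per-key count; the state lives in Flink's configured state backend and is snapshotted at checkpoints for fault tolerance. The class RunningCount is our own minimal example, not code from the project, and it grows with the number of distinct keys, which is exactly how application state becomes large.

import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

public class RunningCount extends KeyedProcessFunction<String, String, Long> {

    private transient ValueState<Long> count; // per-key state, managed by the backend

    @Override
    public void open(Configuration parameters) {
        count = getRuntimeContext().getState(
                new ValueStateDescriptor<>("count", Long.class));
    }

    @Override
    public void processElement(String value, Context ctx, Collector<Long> out)
            throws Exception {
        Long current = count.value();
        long next = (current == null ? 0L : current) + 1;
        count.update(next); // total state grows with the number of distinct keys
        out.collect(next);
    }
}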


Objectives

1. To address scalable data stream processing and analytics challenges arising in Big Data, Cloud Computing, Industry 4.0, and IoT (Internet of Things) applications.

2. To develop a novel state management solution for scalable (i.e., low-latency, high-throughput) stream processing that enables fine-grained fault tolerance, on-demand resource scaling, and load balancing in the presence of very large (e.g., hundreds of GBs) distributed state; a minimal sketch of one such mechanism follows this list.

3. To devise a technological framework that seamlessly provides fault tolerance and resource scaling with zero downtime, and that offers high resource efficiency, lower operational costs, and reduced time-to-knowledge to end users working on large-scale data applications.
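To make the fine-grained fault tolerance of objective 2 concrete, here is a hypothetical Java sketch of incremental checkpointing: only key groups modified since the last checkpoint are persisted, so neither recovery nor rescaling has to ship the full multi-GB state at once. The class IncrementalCheckpointer and all of its methods are illustrative names, not part of Rhino or any existing SPE.

import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

public class IncrementalCheckpointer {

    private final Map<Integer, byte[]> keyGroupState = new HashMap<>();
    private final BitSet dirty = new BitSet(); // key groups changed since last checkpoint

    public void update(int keyGroup, byte[] serializedState) {
        keyGroupState.put(keyGroup, serializedState);
        dirty.set(keyGroup);
    }

    // Persist only the delta; a full snapshot is needed only at the first checkpoint.
    public Map<Integer, byte[]> snapshotDelta() {
        Map<Integer, byte[]> delta = new HashMap<>();
        for (int kg = dirty.nextSetBit(0); kg >= 0; kg = dirty.nextSetBit(kg + 1)) {
            delta.put(kg, keyGroupState.get(kg));
        }
        dirty.clear();
        return delta;
    }

    public static void main(String[] args) {
        IncrementalCheckpointer cp = new IncrementalCheckpointer();
        cp.update(7, new byte[]{1});
        cp.update(42, new byte[]{2});
        System.out.println("first delta: " + cp.snapshotDelta().size() + " key groups");
        cp.update(42, new byte[]{3});
        System.out.println("second delta: " + cp.snapshotDelta().size() + " key group");
    }
}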


The Rhino project is funded by the Federal Ministry of Education and Research (BMBF) as part of the Software Campus program and is supported by Huawei Technologies.

[1] Software Campus

Funded by the Federal Ministry of Education and Research (BMBF), Software Campus (SC) is a leadership development program that trains the IT leaders of tomorrow. SC combines cutting-edge research and management practice in a novel, innovative way. It is aimed at outstanding doctoral students in computer science who aspire to future leadership roles in industry. Over the course of one to two years, participants carry out their own research project in cooperation with industry partners.

Kickoff of the 2017 cohort

Project duration: 03/2019 - 02/2021

Supervisor: Prof. Dr. Volker Markl

Industry partner

