TU Berlin

Database Systems and Information Management GroupPublications

Logo FG DIMA-new  65px

Page Content

to Navigation


AStream: Ad-hoc Shared Stream Processing
Citation key KarimovRM19
Author Jeyhun Karimov, Tilmann Rabl, Volker Markl
Title of Book Proceedings of the 2019 International Conference on Management of Data (SIGMOD 2019)
Pages 607–622
Year 2019
DOI https://doi.org/10.1145/3299869.3319884
Journal ACM SIGMOD/PODS, Amsterdam, NL. 2019
Volume 2019
Organization ACM
Abstract In the last decade, many distributed stream processing engines (SPEs) were developed to perform continuous queries on massive online data. The central design principle behind these engines is to handle queries that potentially run forever on data streams with a query-at-a-time model, i.e., each query is optimized and executed separately. In many real applications, streams are not only processed with long-running queries, but also thousands of short-running ad-hoc queries. To support this efficiently, it is essential to share resources and computation for stream ad-hoc queries in a multi-user environment. The goal of this paper is to bridge the gap between stream processing and ad-hoc queries in SPEs by sharing computation and resources. We define three main requirements for ad-hoc shared stream processing: (1) Integration: Ad-hoc query processing should be a composable layer which can extend stream operators, such as join, aggregation, and window operators; (2) Consistency: Ad-hoc query creation and deletion must be performed in a consistent manner and ensure exactly-once semantics and correctness; (3) Performance: In contrast to state-of-the-art SPEs, ad-hoc SPE should not only maximize data throughput but also query throughout via incremental computation and resource sharing. Based on the requirements above, we have developed AStream, an ad-hoc, shared computation stream processing framework. To the best of our knowledge, AStream is the first system that supports distributed ad-hoc stream processing. We develop AStream on top of Apache Flink. Our experiments show that AStream shows comparable results to Flink for single query deployments and outperforms it in orders of magnitude with multiple queries.
Link to publication Link to original publication Download Bibtex entry


Quick Access

Schnellnavigation zur Seite über Nummerneingabe