Abstract
Window aggregation is a core operation in data stream processing. Existing aggregation techniques focus on reducing latency, eliminating redundant computations, and minimizing memory usage. However, each technique operates under different assumptions with respect to workload characteristics such as properties of aggregation functions (e.g., invertible, associative), window types (e.g., sliding, sessions), windowing measures (e.g., time- or count-based), and stream (dis)order. Violating the assumptions of a technique can render it unusable or drastically reduce its performance.
In this paper, we present the first general stream slicing technique for window aggregation. General stream slicing automatically adapts to workload characteristics to improve performance without sacrificing its general applicability. As a prerequisite, we identify workload characteristics which affect the performance and applicability of aggregation techniques. Our experiments show that general stream slicing outperforms alternative concepts by up to one order of magnitude.