Scotty: Efficient Window Aggregation for out-of-order Stream Processing
Alejandro Rodríguez Cuéllar
International Conference on Data Engineering (ICDE 2018)
Computing aggregates over windows is at the core of virtually every stream
processing job. Typical stream processing applications involve
overlapping windows and, therefore, cause redundant computations.
Several techniques prevent this redundancy by sharing partial
aggregates among windows. However, these techniques do not support
out-of-order processing and session windows. Out-of-order processing
is a key requirement to deal with delayed tuples in case of source
failures such as temporary sensor outages. Session windows are widely
used to separate different periods of user activity from each other.
In this paper, we present Scotty, a high throughput operator for
window discretization and aggregation. Scotty splits streams into
non-overlapping slices and computes partial aggregates per slice.
These partial aggregates are shared among all concurrent queries with
arbitrary combinations of tumbling, sliding, and session windows.
Scotty introduces the first slicing technique which (1) enables stream
slicing for session windows in addition to tumbling and sliding windows
and (2) processes out-of-order tuples efficiently. Our technique is
generally applicable to a broad group of dataflow systems which use a
unified batch and stream processing model. Our experiments show that
we achieve a throughput an order of magnitude higher than alternative
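
The slicing idea described in the abstract can be illustrated with a minimal sketch. This is not Scotty's implementation: it assumes in-order tuples with non-negative integer timestamps, a sum aggregate, and only tumbling and sliding windows aligned at time 0. It does not cover session windows or out-of-order tuples. The function name `slice_and_aggregate` and the fixed slice width (the greatest common divisor of all window lengths and slides) are assumptions made for illustration.

```python
from collections import defaultdict
from functools import reduce
from math import gcd


def slice_and_aggregate(tuples, windows):
    """Illustrative stream slicing with shared partial aggregates.

    tuples:  iterable of (timestamp, value), integer timestamps >= 0.
    windows: list of (length, slide); a tumbling window has slide == length.
    Returns {(length, slide, window_start): sum of values in the window}.
    """
    # Assumption: use one fixed slice width that divides every window
    # length and slide, so each window covers a whole number of slices.
    step = reduce(gcd, [x for window in windows for x in window])

    # One partial aggregate per non-overlapping slice; each tuple is
    # aggregated exactly once, no matter how many windows contain it.
    partials = defaultdict(int)
    for timestamp, value in tuples:
        partials[timestamp // step * step] += value
    if not partials:
        return {}

    last_slice = max(partials)
    results = {}
    for length, slide in windows:
        start = 0
        while start <= last_slice:
            # Combine the shared slice aggregates covered by [start, start + length).
            results[(length, slide, start)] = sum(
                partials.get(s, 0) for s in range(start, start + length, step)
            )
            start += slide
    return results


stream = [(1, 10), (4, 20), (6, 5), (11, 1)]
queries = [(10, 10), (10, 5)]  # one tumbling and one sliding window
print(slice_and_aggregate(stream, queries))
```

Here both queries read the same three slice aggregates (for slices starting at 0, 5, and 10), so the per-tuple work does not grow with the number of concurrent windows.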