Time Series Similarity Search for Streaming Data in
Ziehn, Marcela Charfuelan, Holmer Hemsen, Volker Markl
Erratum is available below under "link to publication".
Workshop, co-located with EDBT/ICDT 2019
In this paper we propose a practical study and demonstration of time series
similarity search in modern distributed data processing platforms for
stream data. After an intensive literature review, we implement a
flexible similarity search application in Apache Flink, which includes
the most commonly used distance measurements: Euclidean distance and
Dynamic Time Warping. For efficient and accurate similarity search we
evaluate normalization and pruning techniques developed for single
machine processing and demonstrate that they can be adapted and
leveraged for those distributed platforms. Our final implementation is
capable of monitoring many time series in real time and in parallel.
Further, we demonstrate that the number of required parameters can be
reduced and optimally derived from data properties. We evaluate our
application by comparing its performance with electrocardiogram data
on a cluster with several nodes. We reach average response times of
less than 0.1 ms for windows of 2 s of data, which allows fast
reactions to matching sequences.
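
To make the building blocks named in the abstract concrete, the following is a minimal single-machine sketch in Python of z-normalization, Euclidean distance, Dynamic Time Warping with a warping band, and a simple lower-bound pruning step inside a sliding-window search. It is not the authors' Apache Flink implementation. All function names, the band parameter, and the first/last-point lower bound (a simplified bound in the spirit of LB_Kim) are assumptions chosen for illustration.

```python
import math


def z_normalize(seq):
    """Scale a sequence to zero mean and unit standard deviation."""
    n = len(seq)
    mean = sum(seq) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in seq) / n)
    if std == 0.0:
        # A constant window carries no shape information.
        return [0.0] * n
    return [(x - mean) / std for x in seq]


def euclidean(a, b):
    """Euclidean distance between two sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a, b, band):
    """DTW distance with squared point costs, restricted to |i - j| <= band."""
    n, m = len(a), len(b)
    band = max(band, abs(n - m))  # the end cell must stay reachable
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - band), min(m, i + band) + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return math.sqrt(cost[n][m])


def lb_first_last(a, b):
    """Cheap DTW lower bound: first and last points are always aligned."""
    total = (a[0] - b[0]) ** 2
    if len(a) > 1 and len(b) > 1:
        total += (a[-1] - b[-1]) ** 2
    return math.sqrt(total)


def best_match(stream, query, band):
    """Return (distance, start index) of the window most similar to query."""
    q = z_normalize(query)
    w = len(q)
    best_dist, best_start = math.inf, -1
    for start in range(len(stream) - w + 1):
        window = z_normalize(stream[start:start + w])
        # Pruning: skip the full DTW if the lower bound cannot beat the best.
        if lb_first_last(window, q) >= best_dist:
            continue
        d = dtw(window, q, band)
        if d < best_dist:
            best_dist, best_start = d, start
    return best_dist, best_start
```

For example, `best_match([0, 0, 1, 2, 1, 0, 0], [1, 2, 1], band=1)` locates the query pattern at start index 2. In a streaming setting, each arriving window would be processed as in the loop body rather than by scanning a stored list.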