TU Berlin

Database Systems and Information Management Group

Publications

Continuous Deployment of Machine Learning Pipelines
Citation key DerakhshanMRM2019
Authors Behrouz Derakhshan, Alireza Rezaei Mahdiraji, Tilmann Rabl, Volker Markl
Pages 397-408
Year 2019
Journal Proceedings of the 22nd International Conference on Extending Database Technology (EDBT 2019)
Volume 2019
Publisher OpenProceedings.org, ISBN: 978-3-89318-081-3
Edition Electronic Edition Series ISSN: 2367-2005
Abstract Today, machine learning is entering many business and scientific applications. The life cycle of machine learning applications consists of data preprocessing for transforming the raw data into features, training a model using the features, and deploying the model for answering prediction queries. In order to guarantee accurate predictions, one has to continuously monitor and update the deployed model and pipeline. Current deployment platforms update the model using online learning methods. When online learning alone is not adequate to guarantee the prediction accuracy, some deployment platforms provide a mechanism for automatic or manual retraining of the model. While the online training is fast, the retraining of the model is time-consuming and adds extra overhead and complexity to the process of deployment. We propose a novel continuous deployment approach for updating the deployed model using a combination of the incoming real-time data and the historical data. We utilize sampling techniques to include the historical data in the training process, thus eliminating the need for retraining the deployed model. We also offer online statistics computation and dynamic materialization of the preprocessed features, which further reduces the total training and data preprocessing time. In our experiments, we design and deploy two pipelines and models to process two real-world datasets. The experiments show that continuous deployment reduces the total training cost by up to 15 times while providing the same level of quality when compared to the state-of-the-art deployment approaches.
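The core idea in the abstract, updating a deployed model with incoming data plus a sample of historical data rather than retraining it from scratch, can be illustrated with a minimal sketch. This is not the authors' implementation: the linear least-squares model, the uniform sampling strategy, the synthetic data stream, and the names `sgd_step` and `continuous_update` are all assumptions chosen only for illustration.

```python
import random


def sgd_step(weights, batch, lr):
    """One mini-batch gradient step for least-squares linear regression."""
    grad = [0.0] * len(weights)
    for x, y in batch:
        err = sum(w * xi for w, xi in zip(weights, x)) - y
        for j, xi in enumerate(x):
            grad[j] += err * xi
    n = len(batch)
    return [w - lr * g / n for w, g in zip(weights, grad)]


def continuous_update(weights, history, new_chunk, sample_size, rng, lr):
    """Hypothetical update step: train on new data plus a history sample."""
    # Assumption: uniform sampling without replacement from the history.
    sampled = rng.sample(history, min(sample_size, len(history)))
    weights = sgd_step(weights, new_chunk + sampled, lr)
    history.extend(new_chunk)  # the new chunk becomes historical data
    return weights


rng = random.Random(0)
history = []            # stored (features, label) pairs
weights = [0.0, 0.0]    # [bias weight, slope weight]

# Synthetic stream (an assumption): y = 3 + 2x, arriving in chunks of 8.
for _ in range(2000):
    chunk = []
    for _ in range(8):
        x = rng.uniform(-1.0, 1.0)
        chunk.append(([1.0, x], 3.0 + 2.0 * x))
    weights = continuous_update(weights, history, chunk,
                                sample_size=8, rng=rng, lr=0.1)

print(weights)
```

Each update costs one gradient step over a small mixed batch, so the model is never retrained over the full history. The sketch keeps the whole history in memory and omits the online statistics computation and feature materialization that the abstract also mentions.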