Publications
Citation key | KieferHBM17 |
---|---|
Author | Martin Kiefer, Max Heimel, Sebastian Breß, Volker Markl |
Year | 2017 |
DOI | 10.14778/3151106.3151112 |
Journal | Proceedings of the VLDB Endowment, Volume 10, No. 13. 2017 (to be presented in VLDB 2018) |
Volume | 2016-2017 |
Abstract | Accurately predicting the cardinality of intermediate plan operations is an essential part of any modern relational query optimizer. The accuracy of said estimates has a strong and direct impact on the quality of the generated plans, and incorrect estimates can have a negative impact on query performance. One of the biggest challenges in this field is to predict the result size of join operations. Kernel Density Estimation (KDE) is a statistical method to estimate multivariate probability distributions from a data sample. Previously, we introduced a modern, self-tuning selectivity estimator for range scans based on KDE that outperforms state-of-the-art multidimensional histograms and is efficient to evaluate on graphics cards. In this paper, we extend these bandwidth-optimized KDE models to estimate the result size of single and multiple joins. In particular, we propose two approaches: (1) Building a KDE model from a sample drawn from the join result. (2) Efficiently combining the information from base table KDE models. We evaluated our KDE-based join estimators on a variety of synthetic and real-world datasets, demonstrating that they are superior to state-of-the-art join estimators based on sketching or sampling. |
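
To illustrate the general idea behind KDE-based selectivity estimation mentioned in the abstract, here is a minimal sketch, not the authors' implementation. It assumes a Gaussian product kernel with one bandwidth per attribute and uses Scott's rule of thumb in place of the paper's bandwidth optimization; the function names `normal_cdf`, `scott_bandwidths`, and `kde_selectivity` are hypothetical. The estimate for a range predicate is the average probability mass that each sample point's kernel places inside the query box.

```python
import math
import statistics


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def scott_bandwidths(sample):
    """Per-attribute bandwidths from Scott's rule of thumb.

    Assumption: this stands in for the optimized bandwidths the
    abstract refers to; it is only a simple starting point.
    """
    n, d = len(sample), len(sample[0])
    factor = n ** (-1.0 / (d + 4))
    return [statistics.stdev([row[j] for row in sample]) * factor
            for j in range(d)]


def kde_selectivity(sample, bandwidths, lower, upper):
    """Estimate the fraction of tuples inside the box [lower, upper].

    Each sample point contributes the mass of its Gaussian product
    kernel that falls inside the box; the result is the average.
    """
    total = 0.0
    for point in sample:
        mass = 1.0
        for x, h, lo, hi in zip(point, bandwidths, lower, upper):
            mass *= normal_cdf((hi - x) / h) - normal_cdf((lo - x) / h)
        total += mass
    return total / len(sample)
```

Under these assumptions, the same function could be applied to a sample drawn from a join result, in the spirit of approach (1): the returned fraction would then be scaled by an estimate of the join size to obtain a cardinality.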