TU Berlin

Database Systems and Information Management Group: VLDB 2022 acceptances

Two DIMA Research Papers Accepted for Presentation at VLDB 2022

Two research papers by researchers from the Database Systems and Information Management (DIMA) group at TU Berlin and the DFKI research department Intelligent Analytics for Massive Data (IAM) have been accepted for presentation at the 48th International Conference on Very Large Data Bases (VLDB), which will take place from September 5 to 9, 2022, in Sydney, Australia.

The publications in detail:

Kajetan Maliszewski, Jorge Arnulfo Quiane Ruiz, Jonas Traub, Volker Markl: What Is the Price for Joining Securely? Benchmarking Equi-Joins in Trusted Execution Environments. Proc. VLDB Endow. 15(2) (2022)
Preprint [PDF]

Abstract:
Protection of personal data has been raised to be among the top requirements of modern systems. At the same time, it is now frequent that the owner of the data and the owner of the computing infrastructure are two entities with limited trust between them (e.g., volunteer computing or the hybrid cloud). Recently, trusted execution environments (TEEs) became a viable solution to ensure the security of systems in such environments. However, the performance of relational operators in TEEs remains an open problem. We conduct a comprehensive experimental study to identify the main bottlenecks and challenges when executing relational equi-joins in TEEs. For this, we introduce TEEbench, a framework for unified benchmarking of relational operators in TEEs, and use it for conducting our experimental evaluation. In a nutshell, we perform the following experimental analysis for eight core join algorithms: off-the-shelf performance; the performance implications of data sealing and obliviousness; sensitivity and scalability. The results show that all eight join algorithms significantly suffer from different performance bottlenecks in TEEs. They can be up to three orders of magnitude slower in TEEs than on plain CPUs. Our study also indicates that existing join algorithms need a complete, hardware-aware redesign to be efficient in TEEs, and that, in secure query plans, managing TEE features is equally important to join selection.

 

Philipp M. Grulich, Steffen Zeuch, Volker Markl: Babelfish: Efficient Execution of Polyglot Queries. Proc. VLDB Endow. 15(2) (2022)
[PDF]

Abstract:
Today’s users of data processing systems come from different domains, have different levels of expertise, and prefer different programming languages. As a result, analytical workload requirements shifted from relational to polyglot queries involving user-defined functions (UDFs). Although some data processing systems support polyglot queries, they often embed third-party language runtimes. This embedding induces a high performance overhead, as it causes additional data materialization between execution engines. In this paper, we present Babelfish, a novel data processing engine designed for polyglot queries. Babelfish introduces an intermediate representation that unifies queries from different implementation languages. This enables new, holistic optimizations across operator and language boundaries, e.g., operator fusion and workload specialization. As a result, Babelfish avoids data transfers and enables efficient utilization of hardware resources. Our evaluation shows that Babelfish outperforms state-of-the-art data processing systems by up to one order of magnitude and reaches the performance of handwritten code. With Babelfish, we bridge the performance gap between relational and multi-language UDFs and lay the foundation for the efficient execution of future polyglot workloads.
