Two DIMA Research Papers Accepted for Presentation at VLDB 2022

Two research papers, authored by researchers of the Database Systems and Information Management (DIMA) Group and the DFKI research department Intelligent Analytics for Massive Data (IAM), were accepted for presentation at the 48th International Conference on Very Large Data Bases (VLDB), held in Sydney, Australia, from September 5 to 9, 2022.

The publications in detail

Kajetan Maliszewski, Jorge-Arnulfo Quiané-Ruiz, Jonas Traub, Volker Markl: What Is the Price for Joining Securely? Benchmarking Equi-Joins in Trusted Execution Environments. Proc. VLDB Endow. 15(2) (2022)
Preprint [PDF]

Protection of personal data has become one of the top requirements of modern systems. At the same time, the owner of the data and the owner of the computing infrastructure are now frequently two entities with limited trust between them (e.g., in volunteer computing or the hybrid cloud). Recently, trusted execution environments (TEEs) became a viable solution to ensure the security of systems in such environments. However, the performance of relational operators in TEEs remains an open problem. We conduct a comprehensive experimental study to identify the main bottlenecks and challenges when executing relational equi-joins in TEEs. For this, we introduce TEEbench, a framework for unified benchmarking of relational operators in TEEs, and use it for conducting our experimental evaluation. In a nutshell, we perform the following experimental analysis for eight core join algorithms: off-the-shelf performance; the performance implications of data sealing and obliviousness; sensitivity and scalability. The results show that all eight join algorithms suffer significantly from different performance bottlenecks in TEEs. They can be up to three orders of magnitude slower in TEEs than on plain CPUs. Our study also indicates that existing join algorithms need a complete, hardware-aware redesign to be efficient in TEEs, and that, in secure query plans, managing TEE features is as important as join selection.
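
For readers unfamiliar with the operator being benchmarked, the following is a minimal sketch of a textbook in-memory hash equi-join in plain Python. It is not taken from TEEbench and does not model enclaves, data sealing, or obliviousness; the function name `hash_equi_join` and the `(key, payload)` tuple layout are assumptions made purely for illustration.

```python
from collections import defaultdict


def hash_equi_join(build, probe):
    """Join two lists of (key, payload) tuples on equal keys.

    Illustrative only: a plain hash join, with no TEE-specific handling.
    """
    # Build phase: index the (ideally smaller) relation by join key.
    table = defaultdict(list)
    for key, payload in build:
        table[key].append(payload)

    # Probe phase: look up each probe tuple and emit all matches.
    result = []
    for key, payload in probe:
        for match in table.get(key, ()):
            result.append((key, match, payload))
    return result
```

The build and probe phases both access memory in a data-dependent, random pattern. The abstract does not say which access patterns are costly inside a TEE, so this sketch should be read only as a description of what an equi-join computes.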


Philipp M. Grulich, Steffen Zeuch, Volker Markl: Babelfish: Efficient Execution of Polyglot Queries. Proc. VLDB Endow. 15(2) (2022)

Today’s users of data processing systems come from different domains, have different levels of expertise, and prefer different programming languages. As a result, analytical workload requirements shifted from relational to polyglot queries involving user-defined functions (UDFs). Although some data processing systems support polyglot queries, they often embed third-party language runtimes. This embedding induces a high performance overhead, as it causes additional data materialization between execution engines. In this paper, we present Babelfish, a novel data processing engine designed for polyglot queries. Babelfish introduces an intermediate representation that unifies queries from different implementation languages. This enables new, holistic optimizations across operator and language boundaries, e.g., operator fusion and workload specialization. As a result, Babelfish avoids data transfers and enables efficient utilization of hardware resources. Our evaluation shows that Babelfish outperforms state-of-the-art data processing systems by up to one order of magnitude and reaches the performance of handwritten code. With Babelfish, we bridge the performance gap between relational and multi-language UDFs and lay the foundation for the efficient execution of future polyglot workloads.
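
The idea of operator fusion mentioned in the abstract can be illustrated with a small sketch. It does not use Babelfish or its intermediate representation; the functions `unfused` and `fused` and the example query (filter positive values, apply a UDF, sum the results) are hypothetical and written in plain Python only to show the difference between materializing intermediates and running one fused loop.

```python
def unfused(rows, udf):
    """Operator-at-a-time: each step materializes a full intermediate list."""
    filtered = [r for r in rows if r > 0]   # intermediate result 1
    mapped = [udf(r) for r in filtered]     # intermediate result 2
    return sum(mapped)


def fused(rows, udf):
    """Fused pipeline: filter, UDF, and aggregation share a single loop."""
    total = 0
    for r in rows:
        if r > 0:
            total += udf(r)                 # no intermediate list is built
    return total
```

Both functions return the same result for the same input; the fused variant simply avoids the intermediate lists, which is the kind of materialization the abstract identifies as a source of overhead.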
