
Three DIMA Papers Accepted for Presentation at SIGMOD 2022

Three research papers by researchers from the Database Systems and Information Management (DIMA) group at TU Berlin and the Intelligent Analytics for Massive Data (IAM) research group at DFKI have been accepted for presentation at the 2022 International Conference on Management of Data (SIGMOD), which will take place June 12-17 in Philadelphia, USA.

The publications in detail:

Bonaventura Del Monte, Steffen Zeuch, Tilmann Rabl, Volker Markl: Rethinking Stateful Stream Processing with RDMA. SIGMOD 2022, to appear
Preprint [PDF]

Remote Direct Memory Access (RDMA) hardware has bridged the gap between network and main memory speed and thus invalidated the common assumption that network is a bottleneck in distributed data processing systems. However, high-speed networks do not provide "plug-and-play" performance (e.g., using IP-over-InfiniBand) and require a careful co-design of system and application logic.
As a result, system designers need to rethink the architecture of their data management systems to benefit from RDMA acceleration.
In this paper, we focus on the acceleration of stream processing engines, which is challenged by real-time constraints and state consistency guarantees. To this end, we propose Slash, a novel stream processing engine that uses high-speed networks and RDMA to efficiently execute distributed streaming computations. Slash embraces a processing model suited for RDMA acceleration and omits expensive data pre-partitioning. Overall, Slash achieves a throughput improvement of up to two orders of magnitude over existing systems deployed on an InfiniBand network. Furthermore, it is up to a factor of 22 faster than a self-developed solution that relies on RDMA-based data pre-partitioning to scale out query processing.
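To make the pre-partitioning trade-off concrete, the following is a minimal conceptual sketch, not the Slash system or its API: it contrasts a classic exchange-based shuffle, where every record crosses the network before it is processed, with local aggregation whose partial states are merged lazily in bulk, a pattern that suits one-sided RDMA transfers. All function names and the word-count workload are illustrative assumptions.

```python
# Illustrative only: neither function is taken from the Slash codebase.
from collections import Counter

def exchange_based(stream, n_workers):
    """Pre-partition every record by key, then aggregate per partition.
    In a distributed setting, every record incurs a network hop here."""
    partitions = [[] for _ in range(n_workers)]
    for key, value in stream:
        partitions[hash(key) % n_workers].append((key, value))  # per-record transfer
    result = Counter()
    for part in partitions:
        for key, value in part:
            result[key] += value
    return result

def late_merge(stream_shards):
    """Each worker aggregates its local shard without pre-partitioning;
    partial states are merged once at the end, i.e. fewer but larger
    transfers instead of per-record shuffling."""
    partials = []
    for shard in stream_shards:
        local = Counter()
        for key, value in shard:
            local[key] += value
        partials.append(local)
    merged = Counter()
    for local in partials:
        merged.update(local)  # one bulk merge per worker
    return merged
```

Both strategies compute the same aggregate; the difference lies in when and how often state moves, which is exactly where RDMA-aware co-design pays off.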


Clemens Lutz, Sebastian Breß, Steffen Zeuch, Tilmann Rabl, Volker Markl: Triton Join: Efficiently Scaling the Operator State on GPUs with Fast Interconnects. SIGMOD 2022, to appear

Database management systems are facing growing data volumes. Previous research suggests that GPUs are well-equipped to quickly process joins and similar stateful operators, as GPUs feature high-bandwidth on-board memory. However, GPUs cannot scale joins and similar stateful operators to large data volumes due to two limiting factors: (1) large state does not fit into the on-board memory, and (2) spilling state to main memory is constrained by the interconnect bandwidth. Thus, CPUs are often still the better choice for scalable data processing.
In this paper, we propose a new join algorithm that scales to large data volumes by taking advantage of fast interconnects. Fast interconnects such as NVLink 2.0 are a new technology that connect the GPU to main memory at a high bandwidth, and thus enable us to design our join to efficiently spill its operator state. Our evaluation shows that our Triton join outperforms a no-partitioning hash join by more than 100× on the same GPU, and a radix-partitioned join on the CPU by up to 2.5×. As a result, GPU-enabled DBMSs are able to scale beyond the GPU memory capacity.
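The general idea of scaling join state beyond on-board memory can be sketched as follows. This is not the Triton join itself, merely a simplified partitioned hash join in which both inputs are radix-partitioned by key so that only one partition's hash table needs to be resident at a time; in a GPU setting, the remaining partitions would be spilled over the interconnect. Function names and the modulo partitioning scheme are assumptions for illustration.

```python
# Illustrative sketch, not the paper's algorithm or code.
def partitioned_join(build, probe, n_partitions):
    """Radix-partition both inputs by key, then join partition by
    partition so only one partition's hash table must fit in the
    (simulated) device memory budget at any moment."""
    def partition(rows):
        parts = [[] for _ in range(n_partitions)]
        for key, payload in rows:
            parts[key % n_partitions].append((key, payload))
        return parts

    build_parts, probe_parts = partition(build), partition(probe)
    results = []
    for bp, pp in zip(build_parts, probe_parts):
        table = {}
        for key, payload in bp:          # build phase: one partition only
            table.setdefault(key, []).append(payload)
        for key, payload in pp:          # probe phase against that partition
            for match in table.get(key, []):
                results.append((key, match, payload))
    return results
```

Because matching keys always land in the same partition, joining partitions independently yields the full join result while bounding the resident state.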


Alexander Renz-Wieland, Rainer Gemulla, Zoi Kaoudi, Volker Markl: NuPS: A Parameter Server for Machine Learning with Non-Uniform Parameter Access. SIGMOD 2022, to appear
Preprint [PDF]

Parameter servers (PSs) facilitate the implementation of distributed training for large machine learning tasks. In this paper, we argue that existing PSs are inefficient for tasks that exhibit non-uniform parameter access; their performance may even fall behind that of single node baselines. We identify two major sources of such non-uniform access: skew and sampling. Existing PSs are ill-suited for managing skew because they uniformly apply the same parameter management technique to all parameters. They are inefficient for sampling because the PS is oblivious to the associated randomized accesses and cannot exploit locality. To overcome these performance limitations, we introduce NuPS, a novel PS architecture that (i) integrates multiple management techniques and employs a suitable technique for each parameter and (ii) supports sampling directly via suitable sampling primitives and sampling schemes that allow for a controlled quality--efficiency trade-off. In our experimental study, NuPS outperformed existing PSs by up to one order of magnitude and provided up to linear scalability across multiple machine learning tasks. 
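The per-parameter management idea can be illustrated with a toy sketch. This is not the NuPS implementation: the class, the threshold, and the technique names below are invented for illustration. It shows the principle of picking a technique per parameter from observed access frequency (replicating "hot" parameters everywhere, leaving "cold" ones at a single home node) alongside a trivial sampling primitive.

```python
# Hedged toy sketch; all names and thresholds are assumptions.
import random

class TinyPS:
    def __init__(self, n_nodes, hot_threshold):
        self.n_nodes = n_nodes
        self.hot_threshold = hot_threshold
        self.counts = {}   # per-parameter access counts
        self.store = {}    # parameter values

    def technique(self, key):
        """Pick a management technique per parameter: replicate the
        frequently accessed ones, keep the rest at one home node."""
        if self.counts.get(key, 0) >= self.hot_threshold:
            return "replicate"
        return "home-node"

    def access(self, key):
        """Record an access and return the current value."""
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.store.setdefault(key, 0.0)

    def sample(self, keys, rng):
        """Sampling primitive: the PS mediates the random draw itself,
        so it could exploit locality rather than serve blind accesses."""
        return rng.choice(keys)
```

Under skewed workloads, a uniform policy wastes either replication bandwidth (if everything is replicated) or incurs remote accesses for hot keys (if nothing is); choosing per parameter avoids both extremes.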
