TU Berlin

Short info

Candidate: Moritz Ringler

Advisor: Fabian Hüske

Desired degree: Diploma

Abstract

Data-parallel systems such as MapReduce or Stratosphere usually expose low-level language interfaces, which require the user both to know a complex programming language and to have some experience with the underlying data flows. In addition, obtaining well-performing plans requires extensive knowledge of the data. Compared to classical relational database systems, this makes such systems considerably harder to use.

We propose an additional layer on top of Stratosphere that processes queries in an SQL-like language. It applies optimizations known from classical database systems to determine join order and select operators. The implemented prototype allows convenient generation of complex PACT plans with dozens of operators in less than a second. Even when few statistics are available, the optimizer finds fairly good plans.
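
To illustrate what cost-based join ordering means in this context, the following is a minimal sketch of a textbook dynamic-programming search over left-deep join orders. It is not taken from the thesis and does not reflect the prototype's actual optimizer or any Stratosphere API. The function name, the table names, the cardinalities, and the selectivities are all hypothetical, and the cost model (the sum of estimated intermediate result sizes) is an assumption chosen for brevity.

```python
from itertools import combinations


def best_join_order(cards, sel):
    """Return (cost, estimated_rows, order) for the cheapest left-deep join.

    cards: dict mapping table name -> estimated row count (hypothetical stats)
    sel:   dict mapping frozenset({a, b}) -> join selectivity; table pairs
           without an entry are treated as a cross product (selectivity 1.0)
    """
    tables = sorted(cards)
    # best[subset] holds the cheapest plan found for joining exactly that subset.
    best = {frozenset([t]): (0.0, float(cards[t]), (t,)) for t in tables}

    for size in range(2, len(tables) + 1):
        for combo in combinations(tables, size):
            subset = frozenset(combo)
            for t in combo:
                # Extend the best plan for the remaining tables by joining t last.
                cost, rows, order = best[subset - {t}]
                s = 1.0
                for u in subset - {t}:
                    s *= sel.get(frozenset((t, u)), 1.0)
                out_rows = rows * cards[t] * s
                candidate = (cost + out_rows, out_rows, order + (t,))
                if subset not in best or candidate[0] < best[subset][0]:
                    best[subset] = candidate

    return best[frozenset(tables)]


# Hypothetical statistics for a three-table query.
cards = {"customer": 1_000, "orders": 10_000, "lineitem": 100_000}
sel = {
    frozenset(("customer", "orders")): 1 / 1_000,
    frozenset(("orders", "lineitem")): 1 / 10_000,
}
cost, rows, order = best_join_order(cards, sel)
print(order, round(cost), round(rows))
```

With these made-up numbers the search joins customer and orders first and adds lineitem last, since that keeps the intermediate result smallest. When statistics are scarce, the cardinalities and selectivities would have to be replaced by default guesses, which is the situation the abstract refers to.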
