TU Berlin

Database Systems and Information Management Group

StraQL - An SQL Optimizer for the Parallel Data Processor Stratosphere

Short info

Candidate: Moritz Ringler

Advisor: Fabian Hüske

Desired degree: Diploma

Abstract

Data-parallel systems such as MapReduce or Stratosphere usually expose low-level language interfaces, which require the user both to know a complex programming language and to have some experience with the underlying data flows. In addition, obtaining well-performing plans requires extensive knowledge of the data. Compared to classical relational database systems, this makes such systems considerably harder to use.

We propose an additional layer on top of Stratosphere that processes queries in an SQL-like language. It applies optimizations known from classical database systems to determine join order and select operators. The implemented prototype allows convenient generation of complex PACT plans with dozens of operators in less than a second. Even with few available statistics, the optimizer finds fairly good plans.
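
To illustrate what cost-based join ordering can look like, the following is a minimal sketch of dynamic programming over table subsets, in the spirit of classical relational optimizers. It is not taken from StraQL: the abstract does not describe the actual algorithm or cost model, so the function name `best_join_order`, the cost model (sum of estimated intermediate result sizes), and the example tables and statistics are all assumptions made for illustration.

```python
from itertools import combinations


def best_join_order(cardinalities, selectivities):
    """Return (cost, plan) for the cheapest join tree over all tables.

    cardinalities: dict mapping table name -> estimated row count
    selectivities: dict mapping frozenset({a, b}) -> estimated join selectivity;
                   a missing pair is treated as a cross product (selectivity 1.0)
    Assumed cost model: sum of the estimated sizes of all intermediate results.
    """
    tables = sorted(cardinalities)
    # best[subset] = (cost, estimated rows, plan); single tables cost nothing.
    best = {frozenset([t]): (0.0, float(cardinalities[t]), t) for t in tables}

    for size in range(2, len(tables) + 1):
        for combo in combinations(tables, size):
            subset = frozenset(combo)
            # Try every split of the subset into two non-empty parts.
            for left_size in range(1, size):
                for left_combo in combinations(combo, left_size):
                    left = frozenset(left_combo)
                    right = subset - left
                    lcost, lrows, lplan = best[left]
                    rcost, rrows, rplan = best[right]
                    # Combine the selectivities of all predicates across the split.
                    sel = 1.0
                    for a in left:
                        for b in right:
                            sel *= selectivities.get(frozenset((a, b)), 1.0)
                    rows = lrows * rrows * sel
                    cost = lcost + rcost + rows
                    if subset not in best or cost < best[subset][0]:
                        best[subset] = (cost, rows, (lplan, rplan))

    cost, _, plan = best[frozenset(tables)]
    return cost, plan


# Hypothetical statistics, invented for this example only.
example_cardinalities = {"customer": 1000, "orders": 10000, "lineitem": 40000}
example_selectivities = {
    frozenset(("customer", "orders")): 1 / 1000,
    frozenset(("orders", "lineitem")): 1 / 10000,
}
example_cost, example_plan = best_join_order(example_cardinalities, example_selectivities)
print(example_cost, example_plan)
```

With these assumed statistics, the sketch joins customer and orders first, because that intermediate result is the smallest. A real optimizer would additionally choose a physical operator and, in a parallel setting, a data shipping strategy for each join, which this sketch leaves out.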
