Abstract
Within the past decade, industrial and academic organizations have built many
software stacks for data-intensive analysis. More recently, there have been efforts
to create new data programming languages on top of these stacks to better suit
various target audiences. The current trend is to ease this process by employing
compiler frameworks that provide language building blocks and a common base
for optimizations known from different fields, such as relational database systems,
information extraction, machine learning and programming language compilers.
In this thesis, we present extensions to RelL, which is one such compiler framework.
On the one hand, we extend the imperative RelL programming interface (API)
for the specification of a more complete subset of the extended relational algebra.
On the other hand, we add to the RelL compiler an extensible rule-based
optimization phase with rules that reorder relational as well as user-defined operations.
As shown in case studies, the extended version of the RelL API can express
analysis programs common in relational OLAP (Online Analytical Processing).
As shown through experimental evaluations, our approach to optimization is able
to reorder relational operations in this imperative setting, thereby improving
performance by orders of magnitude for an increasingly important program class.