Open Source Projects

The following open-source projects have come out of the DIMA group at TU Berlin.



Emma


Emma is a quotation-based Scala DSL that enables holistic optimizations of data flow programs for scalable data analysis on Apache Flink and Spark.




Flink / Stratosphere


"Apache Flink" [1] is a stream-processing framework for distributed, high-performance, always-available, and accurate data streaming applications. It originated from the joint research project "Stratosphere" [2], funded by the Deutsche Forschungsgemeinschaft (DFG). After a successful incubation phase, Flink graduated to a top-level project of the Apache Software Foundation [3] and became one of the most important and promising projects within the Apache big data stack. Flink has a large and lively community and numerous well-known users, such as Zalando, Alibaba, and Netflix, and it has its own annual conference, "Flink Forward" [4], which takes place in Berlin and San Francisco.

[1] https://flink.apache.org

[2] http://stratosphere.eu/

[3] https://www.apache.org/

[4] https://flink-forward.org/

Hawk - A Hardware Adaptive Query Compiler


The performance of modern processors is primarily bound by a fixed energy budget. This power wall forces processor vendors to specialize their processors to certain applications to provide the speedups users expect.






Myriad Toolkit


The Myriad Toolkit facilitates the specification of scalable data generation programs with complex statistical constraints via a special XML data generator prototyping language.

The Myriad Toolkit uses advanced PRNG algorithms to implement offset-based access to the elements of the generated domain type sequences within a bounded time. This feature enables an efficient data-parallel execution mode. Data generation programs created with the Myriad Toolkit can therefore be scaled out in a massively parallel manner to quickly generate large synthetic datasets with complex statistical dependencies.
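
The idea of offset-based access can be illustrated with a minimal Python sketch. It is not the Myriad Toolkit's implementation: it assumes a SplitMix64-style generator, whose i-th output can be computed directly from the seed and the index, and the names `splitmix64_at`, `generate_record`, and `generate_partition`, as well as the two-field record layout, are hypothetical. Because no generator state is carried from one row to the next, each worker can produce its own row range independently of the others.

```python
MASK64 = (1 << 64) - 1
GOLDEN_GAMMA = 0x9E3779B97F4A7C15  # increment used by SplitMix64


def splitmix64_at(seed: int, index: int) -> int:
    """Return the index-th output (0-based) of a SplitMix64 stream.

    The internal state after k steps is seed + k * GOLDEN_GAMMA, so any
    element can be computed in constant time without generating the
    elements before it.
    """
    z = (seed + (index + 1) * GOLDEN_GAMMA) & MASK64
    z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
    z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
    return z ^ (z >> 31)


def generate_record(seed: int, row: int) -> tuple:
    """Build a hypothetical record (row id, age in [18, 90)) for one row."""
    r = splitmix64_at(seed, row)
    return (row, 18 + r % 72)


def generate_partition(seed: int, start: int, stop: int) -> list:
    """Generate rows [start, stop); needs no state from other partitions."""
    return [generate_record(seed, row) for row in range(start, stop)]
```

Under these assumptions, splitting the row range across workers, for example `generate_partition(42, 0, 250)` on one node and `generate_partition(42, 250, 1000)` on another, yields the same records as a single sequential pass over rows 0 to 999.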





Peel


Peel is a framework that helps you define, execute, analyze, and share experiments for distributed systems and algorithms. A Peel bundle packages the configuration data, datasets, and workload applications required to execute a particular collection of experiments. Peel bundles can be largely decoupled from the underlying operational environment, so they can easily be migrated to and reproduced in new environments.

