TU Berlin

Database Systems and Information Management Group, SS 2010


Research Colloquium Talks

Talks SS2010
Mon, 19.07.2010, 4 pm
Claudia Ermel
TU Berlin, Fak. IV, TFS
"Tool Support for Graph Modelling and Transformation based on the Eclipse Modeling Framework"

Mon, 05.07.2010, 4 pm
Wolf-Tilo Balke
TU Braunschweig
"Intuitive Querying with Preferences: Exploiting Conceptual Knowledge"

Mon, 28.06.2010, 4 pm
Andreas Hoheisel
Fraunhofer FIRST
"Using High-Level Petri Nets for the Automation and Execution of Processes in SOA and Grid Environments" (talk in German)

Mon, 14.06.2010, 4 pm
Geerd-Dietger Hoffmann
UCL, London
"Making a thunderstorm with little clouds"

Mon, 31.05.2010
Fabian Hüske
TU Berlin, DIMA
"Massively Parallel Analytics beyond Map/Reduce"

Mon, 17.05.2010, 4 pm
Andranik Khachatryan
Karlsruhe Institute of Technology
"Quantifying Uncertainty in Multi-Dimensional Cardinality Estimations"

Claudia Ermel, TU Berlin, TFS


"Tool Support for Graph Modelling and Transformation based on the Eclipse Modeling Framework"


Throughout the history of software engineering, graph models have been used for software system design, such as the diagram types offered by the UML to model different static and dynamic system aspects. Yet, when it comes to programming, often enough a gap opens between what the modellers mean when designing their graph models and what the programmers encode using standard textual programming languages.

Hence, the objective of model-driven development (MDD) is to generate code from a higher-level visual system model. For software developers, this raises the level of abstraction: they no longer need to worry about technical details and features of programming languages, but can concentrate on the more creative parts of software engineering, namely analysis, design and validation, all based on models. The MDD perspective raises the importance of graph models and calls for rigorous methods to capture the semantics of models and their evolution over time.

In this talk we will discuss graph transformation as a formal basis for model transformation, a key technique of MDD.

We present the EMF model transformation tool Henshin (an Eclipse EMF Technology subproject), where the concepts of graph transformation are applied to model, visualize and analyze complex transformations of EMF models.
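The rule-based rewriting idea behind such tools can be illustrated with a minimal sketch. This is purely illustrative: `apply_rule` and the edge triples below are our own names, not Henshin's EMF-based API.

```python
# Minimal sketch of a graph transformation step (illustrative, not Henshin's API).
# A graph is a set of labelled edges (source, label, target); a rule matches a
# left-hand-side (LHS) edge and replaces it by a right-hand-side (RHS) edge.

def apply_rule(edges, lhs, rhs):
    """Replace one occurrence of the edge pattern `lhs` by `rhs`."""
    if lhs not in edges:
        return edges          # no match: the rule is not applicable
    rewritten = set(edges)
    rewritten.remove(lhs)     # delete what is in LHS but not in RHS
    rewritten.add(rhs)        # create what is in RHS but not in LHS
    return rewritten

# Example: turn a "draft" relation between two model elements into "reviewed".
graph = {("doc1", "draft", "alice"), ("doc2", "draft", "bob")}
graph = apply_rule(graph, ("doc1", "draft", "alice"), ("doc1", "reviewed", "alice"))
```

Real graph transformation generalizes this from a single edge to arbitrary subgraph patterns with a matching morphism, which is what makes formal analysis of transformations possible.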


Claudia Ermel received her diploma and PhD in computer science at TU Berlin.

She is currently working in the group "Theoretical Computer Science - Formal Specification (TFS)" of Prof. Ehrig at TU Berlin.

Her research interests are visual modeling, model driven development, graph and model transformation, and Petri nets. She is involved in two current DFG projects, forMAlNET (Formal Modeling and Analysis of Flexible Processes in Mobile Ad-hoc Networks) and BehaviorGT (Behaviour Simulation and Equivalence of Systems Modelled by Graph Transformation).

In her visual language student projects she teaches the development of visual modeling tools based on Eclipse EMF.

Wolf-Tilo Balke, TU Braunschweig

"Intuitive Querying with Preferences: Exploiting Conceptual Knowledge"


Information gathering by accessing structured data in Web databases has changed our daily life. Typical examples, such as reading digital product review pages or using price comparison pages for online shopping, show that the available information can indeed be used effectively. Unfortunately, users face a flood of information on the Web and are bound to experience problems when accessing heterogeneous sources in what is known as the Hidden Web. Thus, information gathering cannot always be performed efficiently. This is especially true for preference-based retrieval, where users only have a vague information need to be satisfied. Many real-world concepts are rather hard to describe: it is intuitively clear what a good solution to the need would be, but putting all important attributes and desired values into a specific database query is difficult. We will discuss how results from cognitive psychology can be applied to ease the load of querying databases and information systems.
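One common formalization of such vague preference queries in the literature (not necessarily the approach of this talk) is Pareto-preference or "skyline" retrieval: return every tuple that no other tuple beats in all attributes at once. A minimal sketch, with hypothetical hotel data:

```python
# Skyline (Pareto-preference) retrieval sketch. Lower values are preferred in
# every attribute; here tuples are (price, distance). Illustrative data only.

def dominates(a, b):
    """a dominates b if a is no worse in every attribute and better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def skyline(tuples):
    """Return all tuples not dominated by any other tuple."""
    return [t for t in tuples if not any(dominates(o, t) for o in tuples if o != t)]

hotels = [(100, 2.0), (80, 5.0), (120, 1.0), (110, 3.0)]  # (price, km to beach)
best = skyline(hotels)  # (110, 3.0) is dominated by (100, 2.0)
```

The appeal for vague information needs is that the user never has to weight price against distance; the query returns all optimal trade-offs and defers the final choice to the user.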


Wolf-Tilo Balke currently holds the chair of information systems at TU Braunschweig, Germany, and is a director of the L3S Research Center at the University of Hannover, Germany. Before that he was a research fellow at the University of California at Berkeley. His research is in the area of information systems and service provisioning, including preference-based retrieval algorithms and ontology-based discovery and selection of services. Wolf-Tilo Balke is the recipient of two Emmy Noether grants of excellence from the German Research Foundation (DFG) and of the Scientific Award of the University Foundation Augsburg. He received his BA and MS degrees in mathematics and his PhD in computer science from the University of Augsburg, Germany.

Andreas Hoheisel, Fraunhofer FIRST

"Using High-Level Petri Nets for the Automation and Execution of Processes in SOA and Grid Environments" (talk held in German)


The modelling, automation and execution of processes are the core tasks of business process management (BPM) systems. In this context, the talk describes our own approaches to extending and concretizing high-level Petri nets for modelling IT processes and executing them in distributed systems such as SOA or Grid environments. The focus of this work lies on mapping domain-level process models onto executable processes and on the efficient scheduling of concurrent activities onto distributed resources. For this purpose, a Petri-net-based process description language (GWorkflowDL) and an accompanying process engine (Generic Workflow Execution Service, GWES) were developed, which delegate the assignment and scheduling of resources to external components. The technology is currently being used in particular in several projects of the German Grid Initiative (D-Grid).
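The token-game semantics of Petri nets that such workflow engines build on can be sketched in a few lines. The names below are ours, not the GWorkflowDL or GWES interfaces.

```python
# Basic Petri-net firing rule: a transition is enabled when every input place
# holds enough tokens; firing consumes the input tokens and produces the output
# tokens. Markings are multisets of tokens on places. Illustrative sketch only.
from collections import Counter

def fire(marking, transition):
    """transition = (inputs, outputs); return the new marking, or None if disabled."""
    inputs, outputs = transition
    if any(marking[place] < n for place, n in inputs.items()):
        return None           # not enabled: an input place lacks tokens
    new = Counter(marking)
    new.subtract(inputs)      # consume input tokens
    new.update(outputs)       # produce output tokens
    return new

# Workflow step "stage data": consumes a token from 'ready', puts one on 'staged'.
m0 = Counter({"ready": 1})
stage = ({"ready": 1}, {"staged": 1})
m1 = fire(m0, stage)
```

In a workflow engine, each transition is additionally bound to an activity (e.g. a service call), and concurrency falls out naturally: any set of transitions whose input tokens are disjoint may fire in parallel.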


Andreas Hoheisel studied geophysics at the University of Cologne and received his diploma with distinction in 2000. Since 2000 he has been working as a research scientist at the Fraunhofer Institute for Computer Architecture and Software Technology (FIRST) in Berlin-Adlershof. His research focus evolved from coupling mechanisms for physical simulation models towards general methods for process management and resource planning in distributed IT systems (SOA, Grid, Cloud). His research results have fed into numerous national and international projects, including the EU projects K-Wf Grid and CoreGRID, in which he led the research and development activities on workflow management as a work-package leader. In addition, Andreas Hoheisel has been, and continues to be, involved in several joint projects of the German Grid Initiative "D-Grid".

Geerd-Dietger Hoffmann, UCL, London

Title: "Making a thunderstorm with little clouds"


Cloud computing has moved past the stage of a mere buzzword, and very
interesting trends are emerging from the hype. More and more companies
are starting to think about, and actively use, this novel technology
for daily operations. In this talk, Geerd-Dietger Hoffmann will present
some of the problems the future might pose and show how they could be
solved at the programming-language level. This includes scalability and
vendor lock-in concerns, which are amplified by the growing number of
client devices currently flooding the consumer market.


Geerd-Dietger Hoffmann is currently completing his MSc in Software
Engineering at UCL in London. Alternating between industry, including
CERN and IBM, and his studies, he completed his BSc in Bournemouth with
first-class honours and has engaged with a wide range of topics. He is
the creator of the award-winning Objic programming language and is
involved in various open-source projects. His recent presentations and
papers show a growing interest in cloud computing.

Fabian Hüske, TU Berlin, DIMA

Title: "Massively Parallel Analytics beyond Map/Reduce"


"The map/reduce programming model and its open-source execution framework
Hadoop have gained a lot of attention over the last years. Today, they
are widely used to implement and execute data analysis tasks in
parallel. Two important reasons for the popularity of the map/reduce
programming model are its simplicity and its abstraction of parallelism:
to define a data processing task, a developer must implement only two
functions, without providing any code for parallelization. When executing
the task, the Hadoop framework takes care of parallelization and ensures
fault tolerance.
While some data processing tasks, such as filter-group-aggregate, fit
perfectly into the map/reduce model, more complex tasks cannot be
expressed as cleanly. Some of these tasks can be forced into map/reduce
using workarounds, but this often requires breaking the programming
model and comes at the cost of losing the abstraction of parallelism:
the developer must write custom parallelization code that hard-wires the
routing of data. Consequently, Hadoop is not able to reason about,
optimize, or adapt the parallelization strategy.
In this talk we propose the PACT programming model, which is based on a
generalization of map/reduce. The PACT model provides the same level of
parallel abstraction as map/reduce but is more expressive, which
significantly eases the definition of complex data processing tasks. We
will present a selection of data processing tasks and their
implementations using the PACT programming model. Furthermore, we
briefly discuss how PACT programs can be optimized and executed.
The talk concludes with a video demo of our prototype."
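The two-function contract the abstract refers to can be illustrated with the classic word count. The sequential driver below only imitates what Hadoop does in parallel; all names are illustrative, not the Hadoop API.

```python
# Word count in the map/reduce model: the developer writes map_fn and reduce_fn;
# the framework (here imitated by a sequential driver) handles partitioning,
# shuffling and grouping by key.
from itertools import groupby

def map_fn(line):
    """User code: emit one (key, value) pair per word."""
    return [(word, 1) for word in line.split()]

def reduce_fn(key, values):
    """User code: aggregate all values observed for one key."""
    return (key, sum(values))

def run_mapreduce(records, map_fn, reduce_fn):
    # map phase, then an in-memory stand-in for the shuffle/sort phase
    pairs = sorted(kv for record in records for kv in map_fn(record))
    # reduce phase: one call per distinct key
    return [reduce_fn(key, [v for _, v in group])
            for key, group in groupby(pairs, key=lambda kv: kv[0])]

counts = dict(run_mapreduce(["to be or not to be"], map_fn, reduce_fn))
```

Note that neither user function mentions threads, machines, or data routing; that is the "abstraction of parallelism" that workarounds for more complex tasks tend to break.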

Bio: Fabian Hüske

Andranik Khachatryan, Karlsruhe Institute of Technology (KIT)

"Quantifying Uncertainty in Multi-Dimensional Cardinality Estimations"

"The database query optimizer needs precise cardinality estimates to
come up with a good execution plan. For this, the system often uses
concise data summaries, e.g. histograms. Much research has been
conducted to make these histograms as precise as possible. However,
because histograms usually have to be small (several kilobytes), they
cannot capture the underlying data distribution very precisely;
regardless of how they are constructed, they will produce certain
estimation errors. This is particularly true when the cardinality space
is multi-dimensional.
To avoid bad plan choices resulting from imprecise cardinality
estimates, the optimizer needs to take into account the uncertainty
that comes with the estimates. This can be done by generalizing
cardinality estimates: instead of a conventional point estimate, the
optimizer can use a probability distribution over possible cardinality
values. We show how to derive such distributions in a multi-dimensional
space using only a conventional histogram. We propose two approaches
that produce cardinality distributions; both are backward compatible
with conventional estimates, require no memory beyond the histogram,
are computationally inexpensive, and have some important theoretical
properties."
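For context, the conventional point estimate that the talk generalizes can be sketched for a one-dimensional equi-width histogram. The uniformity assumption inside each bucket is precisely the source of the estimation errors discussed above; the code and data are illustrative.

```python
# Cardinality estimation from an equi-width histogram: estimate how many rows
# satisfy a range predicate lo <= x < hi by assuming values are uniformly
# distributed within each bucket. Illustrative sketch, not a real optimizer.

def estimate_cardinality(buckets, lo, hi):
    """buckets: list of (bucket_low, bucket_high, row_count)."""
    estimate = 0.0
    for b_lo, b_hi, count in buckets:
        overlap = max(0.0, min(hi, b_hi) - max(lo, b_lo))
        estimate += count * overlap / (b_hi - b_lo)  # uniformity assumption
    return estimate

# 300 rows spread over [0, 30), summarized as three buckets of width 10.
hist = [(0, 10, 100), (10, 20, 100), (20, 30, 100)]
approx = estimate_cardinality(hist, 5, 15)  # covers half of each of two buckets
```

A point estimate like `approx` hides how wrong it might be; replacing it with a distribution over possible cardinalities, as the abstract proposes, lets the optimizer weigh plans by their risk as well as their expected cost.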

"Andranik Khachatryan is a PhD student at the Karlsruhe Institute of
Technology (KIT). His research topic is query optimization and
processing in relational systems."

