TU Berlin

Talks Research Colloquium

Talks WS09/10

Date/Time/Location | Lecturer/Subject
Fri. 12.02.2010, 12:00, DIMA EN 719 | Isabel Drost: "Open source at DIMA"
Mon. 08.02.2010, 4 p.m., DIMA EN 719 | Jörg Bienert, Norbert Heußer, Empulse GmbH: "Fast search and analysis of large data volumes in business applications with a novel DBMS architecture"
Tue. 19.01.2010, t.b.a., DIMA | Dr. Stefan Manegold, CWI, Amsterdam, NL: "Performance Evaluation in Database Research: Principles and Experience" (PDF, 1.8 MB)
Mon. 14.12.2009, 4 p.m., DIMA EN 719 | Prof. Dr. Robert Tolksdorf, FU Berlin, AG NBI: "Swarming and Semantic Storage"
Mon. 07.12.2009, 4 p.m., DIMA EN 719 | Daniel Bößwetter, HU Berlin, DBIS: "From Rows to Columns and Back Again"
Mon. 23.11.2009, 11:00 a.m., DIMA EN 053 | Dr. Melanie Herschel, Univ. Tübingen: "Query Analysis with Nautilus"
Mon. 26.10.2009, 4 p.m., DIMA EN 719 | Frank Huber, HU Berlin, DBIS: "Query Processing on Multi-Core Architectures"
Mon. 19.10.2009, 4 p.m., DIMA EN 719 | Prof. Dr. Christian Bizer, FU Berlin, School of Business and Economics: "The Emerging Web of Linked Data"

Isabel Drost, co-founder of and committer at Apache Mahout

Abstract:

"Open source at DIMA"

Last semester a project was started at DIMA in collaboration with Apache Mahout committer Isabel Drost to raise the understanding of working with and on open source projects among students. The goal was to build an integrated system to crawl blog postings, group them by similarity and make them searchable.
The students worked as a group to get the system up and running; they collaborated with the open source community to solve problems, ask questions, and get feedback on the system.
As requested by the project participants, the presentation will give a very brief overview of the course: its goals, its general setup, and the results. Students will share their view of the course, what they liked in particular, and what was new to them. In addition, it will include insights into how students can benefit from working on open source projects. Finally, the talk will give an overview of the Apache Software Foundation, its structure, and its values.


Bio:

Isabel Drost is co-founder of and committer at Apache Mahout. She is a member of the PMC of the community development project at Apache. In her spare time she organises the Apache Hadoop Get Together in Berlin. Isabel is a frequent speaker on Mahout and Hadoop at international FOSS conferences.

When: Friday, Feb 12, 12:00 noon till 13:00
Where: computer science lab DIMA, EN 719

Jörg Bienert, Norbert Heußer, Empulse GmbH

Abstract:

"Fast search and analysis of large data volumes in business applications with a novel DBMS architecture"

This talk presents how to use modern graphics hardware and CUDA for fast search and analysis of large data volumes in business applications.

Empulse GmbH has built a special-purpose database system for the travel industry with a unique architecture, where indexing takes place on the graphics chip. We will learn about this architecture, its design, and its challenges.

Dr. Stefan Manegold, CWI, Amsterdam, NL

"Performance Evaluation in Database Research: Principles and Experience" (PDF, 1.8 MB)

Abstract:

A significant part of today's database research focuses on improving
performance of a specific system. Quantitative experiments are the best way
to validate such results. However, performing experiments is not always
easy. Besides the complexity of the system under test, designing an
experiment, choosing the right environment and parameter values, analyzing
the data which is gathered, and reporting it to a third party in an
expressive and intelligible way is hard.
In this talk, we present a general road-map to the above steps, including
tips and tricks on how to organize and present code that performs
experiments, so that an outsider can repeat them.


Bio:

Stefan Manegold is a tenured researcher in the database architecture
research group at CWI in Amsterdam, The Netherlands. He received his PhD
from the University of Amsterdam, The Netherlands, in 2002 and his Master's
degree (Diplom) in computer science from the Technical University of Clausthal,
Germany, in 1994.

Manegold's research work comprises database architectures, query processing
algorithms and data management on modern hardware, as well as leveraging
column-store database technology for efficient and scalable XML/XQuery
processing, with a particular focus on optimization, performance,
benchmarking and testing. Manegold has co-authored more than 40 scientific
publications, and recently received the VLDB 2009 10-year Best Paper Award
together with his co-authors Peter Boncz and Martin Kersten.

Stefan Manegold is a core member of the developer team of the open-source
column-oriented database system MonetDB, co-founder of the DaMoN workshop
series (co-located with SIGMOD since 2005), and co-chair of the
Repeatability and Workability Evaluation for SIGMOD 2009 & 2010.

URL: http://homepages.cwi.nl/~manegold/

Prof. Dr. Robert Tolksdorf, FU Berlin, Computer Science, AG NBI

Abstract:

"Swarming and Semantic Storage"

For the distributed storage of large volumes of information in the
Semantic Web, there are still no convincingly scalable solutions. In
RDFSpaces we pursue a nature-inspired approach in which individual RDF
triples are distributed, clustered, and queried by virtual ants, taking
semantic information into account. In this talk we report on these
methods and on first simulation and measurement results.

Bio:

Prof. Dr.-Ing. Robert Tolksdorf has headed the working group
Netzbasierte Informationssysteme (NBI, Networked Information Systems) at
the Institute of Computer Science at Freie Universität Berlin since 2002.
His research focuses include XML technologies, the Semantic Web, and
self-organizing systems. Further information is available at www.ag-nbi.de.

Daniel Bößwetter, HU Berlin, DBIS

Abstract:

"From Rows to Columns and Back Again"

In recent years, we have seen the emergence of a vast number of new
database systems in research and academia, most of which are specialized
for distinct domains of data management, e.g. transaction processing
or analytical workloads. This specialization contradicts the idea of a
single database architecture for a wide range of tasks, but most people
would agree that traditional general-purpose databases are not the
optimal solution for many problems at hand. This talk deals with the
question of whether a trade-off between read- and write-optimization is still
possible and introduces a physical data model that can be adapted to
varying workloads. As we go, existing column-stores are surveyed and
classified into a unified coordinate system.

Bio:

Daniel Bößwetter received his Diploma in Computer Science from the
Munich University of Technology in 2004. His work experience includes
positions as lead developer at PEPPERMIND in Munich (1998-2003)
and as team leader of a data center at Jamba! in Berlin (2004-2005).
He is currently a member of the Database and Information Systems Group at
Freie Universität Berlin, where he is writing his dissertation on
column-oriented databases.

Dr. Melanie Herschel, Univ. Tübingen

"Query Analysis with Nautilus"

Abstract:

When developing database applications, defining SQL queries, or more
generally data transformations, is an important part of the work. With
complex queries, it is common for a query not to be semantically correct
on the first attempt. In such cases, developers go through several
manual analyze-fix-test cycles: first, the query is analyzed to identify
the possible error. The query is then modified and re-executed, and the
result is checked. If it again fails to match expectations, another
analyze-fix-test cycle is needed.

So far, this development approach is entirely manual and therefore
very time-consuming and error-prone. The goal of the Nautilus project is
to develop algorithms and tools that support the developer in all phases
of the analyze-fix-test cycle.

In this talk we examine algorithms for solving one specific problem in
this context: the analysis of missing data in a query result. This is
helpful, for example, when a query with many joins unexpectedly returns
an empty result. We first present several solution approaches before
looking in detail at an algorithm we developed. The talk ends with an
outlook on further research topics within the Nautilus project. If
there is interest and time, a demonstration of the current system is
also possible.

Bio:

Melanie Herschel completed her studies in information technology at the
Berufsakademie Stuttgart in 2003 and received her doctorate in 2007 from
Humboldt-Universität zu Berlin on the topic of XML duplicate detection
and data cleaning. From Sept. 2006 to May 2008 she worked at the
Hasso-Plattner-Institut in Potsdam, where, among other things, she
developed efficient algorithms for duplicate detection in the Schufa
data in cooperation with Schufa Holding AG. From May 2008 to May 2009
she worked as a researcher at the IBM Almaden Research Center, USA. Out
of her research focus there, the computation of the provenance of
missing data, she developed, after returning to Germany, the idea for
Nautilus, a system for analyzing and debugging complex data
transformations. Melanie Herschel currently works at the Chair of
Database Systems at the University of Tübingen.

Frank Huber, HU Berlin, DBIS

Abstract:

"Query Processing on Multi-Core Architectures"


The upcoming generation of computer hardware poses several new
challenges for database developers and engineers. Database management
systems will no longer benefit from performance gains of future hardware
due to increased clock speed, as was the case for the last 35 years;
instead, the number of cores per CPU will increase steadily. This
observation raises several important research questions on how to
use the new multi-core CPU architecture to improve the performance of
DBMSs. In this presentation I give an overview of my ideas for query
processing on multi-core CPU architectures. I will present an abstract
architecture model view for multi-core CPUs, a meta language to control
and interact with the hardware, and a query operator model which makes
use of a meta language to control the parallel execution of a query on
different cores.

Bio:

Frank Huber received the Diplom-Informatiker degree in computer science
from the Humboldt-Universität zu Berlin in 2004. He is currently a
research staff member of the Database and Information Systems Group at
Humboldt-Universität zu Berlin. Under the supervision of Prof. Freytag he
took part in a research project called DirXQuE3, which was funded by the
Siemens Corporation and lasted for more than three years.
In 2008 he joined the Microsoft SQL Server Group, Redmond, WA, USA, for
a 3-month internship. The internship focused on studying future hardware
trends and their impact on database management systems. The internship
gave him the opportunity to gain hands-on experience with the query
execution and optimization parts of Microsoft's SQL Server.

Prof. Dr. Christian Bizer

Abstract:

"The Emerging Web of Linked Data"

The World Wide Web is a global information space based on the idea of setting
hyperlinks between documents. In a similar fashion, Linked Data technologies
provide for setting data links between records in distinct databases and
thus connect these databases into a global data space. Linked Data thus
provides a means to overcome the fragmentation of the Data Web into separate
data islands accessible through proprietary Web APIs. Linked Data
technologies have been adopted by an increasing number of data providers
over the last three years, leading to the creation of a global data space
containing billions of assertions, the Web of Linked Data. In his talk,
Professor Christian Bizer will introduce the principal ideas behind Linked
Data, give an overview of the emerging Web of Linked Data, explain the
state-of-the-art in applications that consume Linked Data from the Web, and
outline current research topics that arise from the evolution of the Web
into a global data space.

Bio:

Professor Christian Bizer is the head of the Web-based Systems Group at
Freie Universität Berlin. The group explores technical and economic
questions concerning the development of global, decentralized information
environments. The results of his work include the Named Graphs data model,
the Fresnel display vocabulary, and the D2RQ mapping language which is
widely used for mapping relational databases to the Web of Linked Data. He
initiated the Linking Open Data community project and the DBpedia project.
