TU Berlin

Talks DIMA Research Seminar

Talks WS17/18
Talk/Location
Lecturer/Subject
11.04.2018
11.30 am
EN 719
Prof. Themis Palpanas, Senior Member of the French University Institute (IUF) France
"End-to-End Entity Resolution for Structured and Semi-Structured Data"
06.02.2018
5 pm
Smart Data Forum
Salzufer 6, Entrance
Otto-Dibelius-Strasse,
10587 Berlin

Prof. Rudolf Bayer, Ph.D., TU München
"C-chain: the simple, scalable, transparent blockchain"
12.12.2017
4 pm
EN 719
Dominik Moritz, University of Washington
"Vega-Lite: A Grammar of Interactive Graphics"
05.12.2017
2.15 pm
Volkswagen-Universitätsbibliothek,
Room BIB 014, Fasanenstr. 88, 10623 Berlin

Prof. Dr. Francesca Bugiotti, CentraleSupelec, Paris-Saclay University & Moditha Hewasinghage, UPC, Barcelona
"Modeling Methodology for a uniform access to NoSQL systems (short intro for the students)"
04.12.2017
4 pm
EN 719
Prof. Dr. Francesca Bugiotti, CentraleSupelec, Paris-Saclay University & Moditha Hewasinghage, UPC, Barcelona
"Database Design for NoSQL Systems" (long research presentation) & "Modeling Strategies for Storing Data in Distributed Heterogeneous NoSQL databases" (short presentation about his IT4BI Master's thesis)
27.11.2017
4 pm
EN 719
Dr. Kaiwen Zhang, TU München
"Deconstructing Blockchains: Concepts, Applications, and Systems"
22.11.2017
4 pm
EN 719
Dr. Britta Meixner, CWI Amsterdam
"Enhance, Enjoy, Engage: Improving the Video Playback Experience"
02.11.2017
3 pm
EN 719
Prof. Guillaume Pierre, University of Rennes 1
"From data centers to fog computing: the evaporating cloud"

Prof. Themis Palpanas, Senior Member of the French University Institute (IUF) France

Title: End-to-End Entity Resolution for Structured and Semi-Structured Data

Abstract:

Entity Resolution (ER) lies at the core of data integration, with the bulk of research focusing on both its effectiveness and its time efficiency. Initially, most relevant works were crafted for structured (relational) data that are described by a schema of well-known quality and meaning. With the advent of Big Data, though, these early schema-based approaches became inapplicable, as the scope of ER moved to semi-structured data collections, which abound in noisy, voluminous and highly heterogeneous information.
In this talk, we take a close look at the entire ER workflow (from schema matching to entity clustering), covering both the schema-based and schema-agnostic cases. We will highlight recent works that significantly boost the efficiency of the overall workflow, especially meta-blocking, which cuts down on the computational cost by discarding comparisons that are repeated or lack sufficient evidence for producing duplicates. We will conclude with a brief demonstration of JedAI, our open-source reference toolbox for ER, which incorporates most of the state-of-the-art techniques in the area.
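
As a rough illustration of the meta-blocking idea mentioned in the abstract, the following minimal Python sketch first builds schema-agnostic token blocks and then prunes candidate comparisons. It assumes one simple weighting scheme (the number of blocks a pair shares) and one simple pruning rule (keep pairs at or above the average weight); the function names and sample profiles are hypothetical, and this is not the JedAI API.

```python
from collections import defaultdict
from itertools import combinations


def token_blocking(profiles):
    """Schema-agnostic blocking: one block per token found in any attribute value."""
    blocks = defaultdict(set)
    for pid, profile in profiles.items():
        for value in profile.values():
            for token in str(value).lower().split():
                blocks[token].add(pid)
    # A block with a single profile produces no comparisons, so drop it.
    return {token: ids for token, ids in blocks.items() if len(ids) > 1}


def meta_blocking(blocks):
    """Weight each distinct pair by its number of shared blocks and
    keep only pairs whose weight reaches the average weight."""
    weights = defaultdict(int)
    for ids in blocks.values():
        for a, b in combinations(sorted(ids), 2):
            weights[(a, b)] += 1  # repeated comparisons collapse into one weighted pair
    if not weights:
        return set()
    threshold = sum(weights.values()) / len(weights)
    return {pair for pair, weight in weights.items() if weight >= threshold}


# Hypothetical profiles with heterogeneous attribute names (no shared schema).
profiles = {
    1: {"name": "Jane Smith", "city": "Berlin"},
    2: {"fullName": "Smith Jane", "location": "Berlin"},
    3: {"name": "John Doe", "city": "Berlin"},
    4: {"title": "J. Doe", "town": "Paris"},
}

blocks = token_blocking(profiles)
candidates = meta_blocking(blocks)
print(sorted(blocks))
print(candidates)
```

In this toy input the blocks imply six comparisons, four of them distinct; the pruning step keeps only the pair that shares several blocks. A real workflow would follow this with an actual similarity comparison of the surviving pairs.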


Short bio:

Themis Palpanas is Senior Member of the Institut Universitaire de France (IUF), a distinction that recognizes excellence across all academic disciplines, and professor of computer science at the Paris Descartes University (France), where he is director of diNo, the data management group. He received the BS degree from the National Technical University of Athens, Greece, and the MSc and PhD degrees from the University of Toronto, Canada. He has previously held positions at the University of Trento, and at IBM T.J. Watson Research Center, and visited Microsoft Research, and the IBM Almaden Research Center.
His interests include problems related to data science (big data analytics and machine learning applications). He is the author of nine US patents, three of which have been implemented in world-leading commercial data management products. He is the recipient of three Best Paper awards, and the IBM Shared University Research (SUR) Award.
He is currently serving on the VLDB Endowment Board of Trustees, as an Editor in Chief for the BDR Journal, Associate Editor for VLDB 2019, Associate Editor for the TKDE and IDA journals, as well as on the Editorial Advisory Board of the IS journal and the Editorial Board of the TLDKS Journal. He has served as General Chair for VLDB 2013, Associate Editor for VLDB 2017, Workshop Chair for EDBT 2016, ADBIS 2013, and ADBIS 2014, General Chair for the PDA@IOT International Workshop (in conjunction with VLDB 2014), and General Chair for the Event Processing Symposium 2009.
 
Prof. Themis Palpanas, Senior Member of the French University Institute (IUF) France
http://www.mi.parisdescartes.fr/~themisp/

 

 

Prof. Rudolf Bayer, Ph.D., TU München

Title: C-chain: the simple, scalable, transparent blockchain

Abstract:

The talk questions some of the fundamental design decisions of blockchain, namely:

1. Proof of work
2. Blockchain data structure
3. Miners
4. Consensus
and argues that they must be replaced by other design decisions, resulting in the C-chain method, which avoids all disadvantages of blockchain. In particular:

1. C-chain is easy to understand and to use
2. C-chain scales perfectly
3. C-chain guarantees immediate final settlement
4. C-chain has very low transaction costs

At the end of the talk there will be a hands-on experiment with the C-chain demonstrator; please bring an Android device.

Short bio:

Rudolf Bayer is professor emeritus of computer science at TU München. He studied mathematics in Munich and received his doctorate from the University of Illinois in 1966. After stays at the Boeing Research Lab in Seattle and as an Associate Professor at Purdue University, he founded the Chair of Database Systems at TU München in 1972. He is best known for the invention and further development of the B-tree and the UB-tree. He has led several research groups and many projects.
He received the Bundesverdienstkreuz (Federal Cross of Merit) in 1999 and the ACM SIGMOD Innovations Award in 2001.

 

 

 

Dominik Moritz, University of Washington

Title: "Vega-Lite: A Grammar of Interactive Graphics"

 

Abstract:
Vega-Lite is a declarative format for rapidly creating interactive visualizations. The simplest form of a Vega-Lite specification describes a single view: a mapping between data values and the visual properties for a single mark type. These single views can be composed into more complex layered and multi-view displays, or made interactive through a novel grammar of interaction. With Vega-Lite, a diverse range of interactive visualizations, from brushing & linking a scatterplot matrix to cross-filtering and interactive index charts, can be built with only a few dozen lines of JSON. In these concise specifications, users can omit low-level details such as scale, axis, and legend properties as well as event handling logic, letting the Vega-Lite compiler infer sensible defaults. Under the hood, Vega-Lite leverages Vega's high-performance dataflow architecture and cross-platform renderers for both SVG and Canvas.
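
To make the notion of a single-view specification concrete, here is a minimal sketch that assembles one as a Python dictionary and serializes it to JSON. The data values and the field names `horsepower`, `mpg`, and `origin` are made up for illustration, and the `$schema` URL assumes the v5 release of Vega-Lite, which is newer than the version available at the time of the talk.

```python
import json

# A single view: one mark type plus encodings that map data fields to
# visual channels. Scales, axes, and legends are deliberately omitted so
# that the Vega-Lite compiler can infer defaults for them.
spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",  # assumed version
    "data": {
        "values": [  # hypothetical inline sample data
            {"horsepower": 130, "mpg": 18, "origin": "USA"},
            {"horsepower": 95, "mpg": 24, "origin": "Japan"},
            {"horsepower": 88, "mpg": 27, "origin": "Europe"},
        ]
    },
    "mark": "point",
    "encoding": {
        "x": {"field": "horsepower", "type": "quantitative"},
        "y": {"field": "mpg", "type": "quantitative"},
        "color": {"field": "origin", "type": "nominal"},
    },
}

spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

The printed JSON can be pasted into any Vega-Lite renderer, for example the online Vega editor, to obtain a scatterplot.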

Bio:
Dominik is a PhD student in Computer Science at the University of Washington. He is advised by Bill Howe from the eScience Institute and the Database Group and by Jeffrey Heer from the Interactive Data Lab. Before coming to the US, Dominik completed his undergraduate studies at the Hasso-Plattner-Institute in Germany. In his research, he combines large-scale systems for data analysis with interactive data visualization to enable novel insights into large multi-dimensional data.
 
Dominik is a co-author of various libraries and tools in the Vega stack, including Vega-Lite, Voyager, and Polestar. He has worked for the Open Knowledge Foundation, Google, and Microsoft Research and has been awarded fellowships by the German National Academic Foundation and the Fulbright Committee.
 
When he is not working on research or coding, Dominik likes to travel, sail, hike in the mountains around Seattle, or bake bread.

 

 

Prof. Dr. Francesca Bugiotti, CentraleSupelec, Paris-Saclay University & Moditha Hewasinghage, UPC, Barcelona

Location:
Volkswagen-Universitätsbibliothek, Room BIB 014, Fasanenstr. 88, 10623 Berlin

Title:
Modeling Methodology for a uniform access to NoSQL systems (short intro for the students)

Abstract:
The absence of a schema in NoSQL databases can disorient traditional database specialists and can make the design activity in this context a leap of faith.
Traditional notions related to data modeling are nevertheless still useful here: first, because data models provide a basis for the definition of generic approaches to logical and physical design; second, because a unified data modeling technique is the first step toward providing uniform access to multiple heterogeneous NoSQL systems.

Bio:
Dr. Ing. Francesca Bugiotti holds a position as assistant professor at CentraleSupélec in Paris. She received her "Dr. Ing." degree in Computer Engineering from Università "Roma Tre" (under the supervision of Prof. Paolo Atzeni) in 2012, with a thesis on heterogeneity in databases. She worked as an intern and as a post-doc at Inria Saclay, studying the problem of indexing RDF datasets in a cloud infrastructure and efficient data storage mechanisms for heterogeneous data in the cloud, supported by Inria in connection with the KIC EIT ICT Labs Europa activity on scalable cloud-based data management.
Her research activity focuses on heterogeneous data integration, conceptual models, NoSQL storage systems integration, NoSQL data model characteristics and query expressive power.

Title:
Modeling Strategies for Storing Data in Distributed Heterogeneous NoSQL databases (long presentation on the details of his Master's thesis; same abstract as on Dec 04)

Abstract:
Data management has become an essential functionality of modern information systems.
With the birth of digital environments, the volume of data generated and available has grown, ushering in the Big Data era. NoSQL systems have been introduced to handle this large volume of data while providing availability, scalability, and efficiency. There is considerable heterogeneity among the various NoSQL systems: different data models, different APIs, different implementations. Moreover, data modeling for NoSQL systems is not formalized, mainly due to the flexible, semi-structured nature of their models. Recent research results have shown how modeling decisions impact quality requirements such as scalability and performance.
In this work we propose HerM (Heterogeneous Distributed Model), a NoSQL data modeling approach which supports the usage of multiple heterogeneous NoSQL systems in a distributed environment. We define the conceptual elements necessary for data modeling and we identify optimized data distribution patterns. We also map HerM into a physical model that improves the performance of distributed joins.
We implemented a flexible framework in which we deployed our proposed modeling strategies. The framework provides a transparent interface to access the underlying heterogeneous systems in an efficient manner and makes it easy to configure different use cases. We provide a detailed evaluation of our framework, comparing it with a native MongoDB implementation in different scenarios on a large dataset with respect to performance and stability.

Bio:
MSc Moditha Hewasinghage is a PhD student at Universitat Politècnica de Catalunya (UPC) Barcelona and Université libre de Bruxelles (ULB) in the IT4BI-DC program, under the supervision of Prof. Alberto Abelló and Prof. Esteban Zimányi. Moditha received his Bachelor's degree in Computer Science from the University of Colombo, School of Computing. He worked as a Senior Software Engineer for 99X Technology, Sri Lanka. He was part of the IT4BI program and successfully completed his Master's at CentraleSupelec in Paris in 2017. His Master's thesis was "Modelling strategies for storing data in distributed heterogeneous NoSQL databases", under the supervision of Asst. Prof. Francesca Bugiotti and Prof. Nacéra Bennacer.
His research activity involves conceptual modelling and heterogeneous data integration.

Prof. Dr. Francesca Bugiotti, CentraleSupelec, Paris-Saclay University & Moditha Hewasinghage, UPC, Barcelona

Title:
Database Design for NoSQL Systems (long research presentation)

Abstract:
The heterogeneity of NoSQL data models has led to little use of traditional modeling techniques, in contrast to what has been common practice with databases for decades. Although NoSQL databases are claimed to be flexible and free of a static schema, the design of the data organization requires important decisions in order to map data to the modeling elements (collections, documents, tables, columns, keys, key-value pairs) available in the target datastore. These decisions are significant because of their impact on major quality requirements.
An effective design methodology for NoSQL systems that supports the quality requirements critical for next-generation Web applications can indeed be devised. The presented approach is based on NoAM (NoSQL Abstract Model), a novel abstract data model for NoSQL databases, which is used to specify a system-independent representation of the application data and which exploits the commonalities of the various NoSQL datastores.

Bio:
Dr. Ing. Francesca Bugiotti holds a position as assistant professor at CentraleSupélec in Paris. She received her "Dr. Ing." degree in Computer Engineering from Università "Roma Tre" (under the supervision of Prof. Paolo Atzeni) in 2012, with a thesis on heterogeneity in databases. She worked as an intern and as a post-doc at Inria Saclay, studying the problem of indexing RDF datasets in a cloud infrastructure and efficient data storage mechanisms for heterogeneous data in the cloud, supported by Inria in connection with the KIC EIT ICT Labs Europa activity on scalable cloud-based data management.
Her research activity focuses on heterogeneous data integration, conceptual models, NoSQL storage systems integration, NoSQL data model characteristics and query expressive power.

Title:
Modeling Strategies for Storing Data in Distributed Heterogeneous NoSQL databases (short presentation about his IT4BI Master's thesis)

Abstract:
Data management has become an essential functionality of modern information systems.
With the birth of digital environments, the volume of data generated and available has grown, ushering in the Big Data era. NoSQL systems have been introduced to handle this large volume of data while providing availability, scalability, and efficiency. There is considerable heterogeneity among the various NoSQL systems: different data models, different APIs, different implementations. Moreover, data modeling for NoSQL systems is not formalized, mainly due to the flexible, semi-structured nature of their models. Recent research results have shown how modeling decisions impact quality requirements such as scalability and performance.
In this work we propose HerM (Heterogeneous Distributed Model), a NoSQL data modeling approach which supports the usage of multiple heterogeneous NoSQL systems in a distributed environment. We define the conceptual elements necessary for data modeling and we identify optimized data distribution patterns. We also map HerM into a physical model that improves the performance of distributed joins.
We implemented a flexible framework in which we deployed our proposed modeling strategies. The framework provides a transparent interface to access the underlying heterogeneous systems in an efficient manner and makes it easy to configure different use cases. We provide a detailed evaluation of our framework, comparing it with a native MongoDB implementation in different scenarios on a large dataset with respect to performance and stability.

Bio:
MSc Moditha Hewasinghage is a PhD student at Universitat Politècnica de Catalunya (UPC) Barcelona and Université libre de Bruxelles (ULB) in the IT4BI-DC program, under the supervision of Prof. Alberto Abelló and Prof. Esteban Zimányi. Moditha received his Bachelor's degree in Computer Science from the University of Colombo, School of Computing. He worked as a Senior Software Engineer for 99X Technology, Sri Lanka. He was part of the IT4BI program and successfully completed his Master's at CentraleSupelec in Paris in 2017. His Master's thesis was "Modelling strategies for storing data in distributed heterogeneous NoSQL databases", under the supervision of Asst. Prof. Francesca Bugiotti and Prof. Nacéra Bennacer.
His research activity involves conceptual modelling and heterogeneous data integration.

Dr. Kaiwen Zhang, TU München

Title:
Deconstructing Blockchains: Concepts, Applications, and Systems

Abstract:
Popularly known for powering cryptocurrencies such as Bitcoin and Ethereum, blockchains are seen as a disruptive technology capable of impacting a wide variety of domains, ranging from finance to governance, by offering superior security, reliability, and transparency in a decentralized manner. In this tutorial presentation, we first study the original Bitcoin design from an academic perspective. We then take a comprehensive look at all aspects related to blockchains by deconstructing the system into 6 layers: Application, Modeling, Contract, System, Data, and Network. We will review potential applications which can benefit from blockchains, and describe the associated research challenges. Finally, we will conclude with a report on ongoing research on providing a decentralized messaging service using blockchains.

Dr. Britta Meixner, CWI Amsterdam

Title:
Enhance, Enjoy, Engage: Improving the Video Playback Experience


Abstract:
Current web technologies make it simpler than ever to both stream videos and create complex constructs of interlinked videos with additional information or parallel presentations of contents. We show typical use cases for these types of videos. While a highly enjoyable presentation is offered, additional data may lead to excessive waiting times interrupting the playback. In this presentation, we show solutions for both traditional linear videos and hypervideos, which reduce startup delays, stalling events, and quality switches during playback. We show how the HTML5 <video> tag, Media Source Extensions (MSE), or DASH can be used and improved to accomplish this goal and satisfy the user's expectations.

Short bio:
I am a Researcher at Centrum Wiskunde & Informatica (CWI) in Amsterdam.
I received my Ph.D. (German Dr. rer. nat.) degree (magna cum laude) from the University of Passau, Germany, in 2014. The title of my thesis is “Annotated Interactive Non-linear Video - Software Suite, Download and Cache Management.” I won the 2015 “Women + Media Technology” award, granted by Germany’s public broadcasters ARD and ZDF (ARD/ZDF Förderpreis “Frauen + Medientechnologie” 2015). My Ph.D. thesis also received an Honorable Mention for the SIGMM Outstanding Ph.D. Thesis Award in 2015. The paper “Download and Cache Management for HTML5 Hypervideo Players” received the Hypertext Ted Nelson Newcomer Award in 2016. My research interests are hypermedia and video streaming.
I am a reviewer for the Springer Multimedia Tools and Applications (MTAP) Journal, the ACM TOMM Journal, and other journals. I am/was an Associate Chair at ACM TVX (2015-2017), an Area Chair at ACM Multimedia 2017, a member of the organization committee of TVX 2017-2019 and MMSys 2018, and I served as a PC member for several other conferences. From 2014 to 2016 I was a co-organizer of the “International Workshop on Interactive Content Consumption (WSICC)” at ACM TVX.

 

 

Prof. Guillaume Pierre, University of Rennes 1

Title: "From data centers to fog computing: the evaporating cloud"

Abstract:
Cloud computing data centers are composed of very powerful computing nodes connected by reliable backbone networks. However, these resources are concentrated in a small number of data centers. The latency between an end user and the closest available cloud data center lies in the range of 20-150 ms. A number of latency-sensitive applications (e.g., augmented reality) require extremely low end-to-end latencies and therefore cannot make use of traditional cloud platforms. Fog computing therefore aims to complement traditional cloud infrastructures with additional resources located extremely close to the user, within a couple of network hops. This requires one to distribute machines across a very large number of geographical locations so that computation capacity is always available in the immediate proximity of any end user. In this presentation I will discuss the application scenarios where fog computing is or isn't useful, and the architectural challenges one needs to face when designing the next-generation fog computing architectures.

