
TU Berlin


DIMA Colloquium Schedule

Dates, winter semester 2019/20

- 11:00, MA 004: BBDC Talk: "Agency + Automation", Jeffrey Heer (University of Washington/Trifacta)
- 10:00, DFKI Projektbuero Berlin, 4th floor, room Weizenbaum, Alt-Moabit 91c, 10559 Berlin: Nantia Makrynioti (Athens University of Economics and Business), "Declarative specification and automatic compilation of machine learning algorithms"
- 16:00, EN 719: Fabio Porto (The National Laboratory of Scientific Computing (LNCC), Rio de Janeiro, Brazil), "Managing and Analysing Simulation Data"

Nantia Makrynioti, Athens University of Economics and Business

Title: Declarative specification and automatic compilation of machine learning algorithms

Declarative programming is usually summarised in the phrase "describing what needs to be done, instead of telling the program how to do it". As the adoption of data science grows rapidly, a need has emerged for democratising data analysis tasks by making their development more approachable and less tedious through high-level languages. Inspired by the success of the declarative paradigm in relational database systems, researchers have recently started exploring whether the use of declarative languages in the machine learning (ML) domain can provide a productivity leap for developers.

In this talk I will give a brief overview of efforts in the area of declarative data analytics and machine learning and describe the design of sql4ml, a system that aims at democratising ML tasks for database users. It allows the user to express ML models in SQL following the "model + solver" approach, where there is a description of the objective function (a.k.a. loss or cost function) of an ML model and a solver that provides the optimal solution for it. sql4ml translates the SQL code defining the model into a representation suitable for training inside an ML framework. After training, the computed solution is stored back in the database, which allows for more robust model management and generation of future predictions inside the database.
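To make the "model + solver" idea concrete, here is a minimal sketch, not sql4ml's actual interface: the objective function of a linear regression is written as a declarative SQL query over database tables, while a simple gradient-descent solver outside the database repeatedly evaluates the SQL gradient and writes the updated parameter back. All table and column names (`observations`, `weights`) are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Training data and model parameter live in ordinary relations.
cur.execute("CREATE TABLE observations (x REAL, y REAL)")
cur.executemany("INSERT INTO observations VALUES (?, ?)",
                [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9)])
cur.execute("CREATE TABLE weights (w REAL)")
cur.execute("INSERT INTO weights VALUES (0.0)")

# The objective (mean squared error of y ~ w*x) is a declarative SQL query ...
LOSS_SQL = """
SELECT AVG((o.y - w.w * o.x) * (o.y - w.w * o.x))
FROM observations o, weights w
"""
# ... and so is its gradient with respect to w.
GRAD_SQL = """
SELECT AVG(-2 * o.x * (o.y - w.w * o.x))
FROM observations o, weights w
"""

def train(steps=200, lr=0.05):
    """Gradient-descent 'solver': evaluate the SQL gradient, then store
    the updated weight back in the database."""
    for _ in range(steps):
        (g,) = cur.execute(GRAD_SQL).fetchone()
        (w,) = cur.execute("SELECT w FROM weights").fetchone()
        cur.execute("UPDATE weights SET w = ?", (w - lr * g,))

train()
(w,) = cur.execute("SELECT w FROM weights").fetchone()
(loss,) = cur.execute(LOSS_SQL).fetchone()
```

Because the trained weight ends up in a table, later predictions can be issued as plain SQL joins against `weights`, which is the model-management benefit the abstract alludes to.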


Short Bio:

Nantia Makrynioti is a PhD student in Computer Science at the Athens University of Economics and Business, supervised by Professor Vasilis Vassalos. Her research focuses on integrating machine learning functionality with relational databases, in line with her interest in declarative machine learning, i.e. applying the declarative paradigm well known from databases to machine learning. In this context, she has also worked with the LogicBlox team on expressing and optimising machine learning problems on the company's relational platform.

In the past, she did research on sentiment analysis, which resulted in the development of a related component for a commercial platform in Greece.

She holds a BSc in Computer Science from the University of Ioannina and an MSc in Information Systems from her current university.


Jeffrey Heer (University of Washington/Trifacta)

Location: lecture hall MA 004, Straße des 17. Juni 136, 10623 Berlin

Agency + Automation

Much contemporary rhetoric regards the prospects and pitfalls of using artificial intelligence techniques to automate an increasing range of tasks, especially those once considered the purview of people alone. These accounts are often wildly optimistic, understating outstanding challenges while turning a blind eye to the human labor that undergirds and sustains ostensibly “automated” services. This long-standing focus on purely automated methods unnecessarily cedes a promising design space: one in which computational assistance augments and enriches, rather than replaces, people’s intellectual work. This tension between agency and automation poses vital challenges for design and engineering. In this talk we will consider the design of interactive systems that enable rich, adaptive collaboration among people and computational agents. We seek to balance the often complementary strengths and weaknesses of each, while promoting human control and skillful action. We will review case studies in three arenas—data wrangling, exploratory visualization, and natural language translation—that integrate proactive computational support into interactive systems. To improve outcomes and support learning by both people and machines, I will describe the use of shared representations of tasks augmented with predictive models of human capabilities and actions.

Jeffrey Heer is the Jerre D. Noe Endowed Professor of Computer Science & Engineering at the University of Washington, where he directs the Interactive Data Lab and conducts research on data visualization, human-computer interaction, and social computing. The visualization tools developed by Jeff and his collaborators (Vega, D3.js, Protovis, Prefuse) are used by researchers, companies, and thousands of data enthusiasts around the world. Jeff's research papers have received awards at the premier venues in Human-Computer Interaction and Visualization (ACM CHI, ACM UIST, IEEE InfoVis, IEEE VAST, EuroVis). Other honors include MIT Technology Review's TR35 (2009), a Sloan Fellowship (2012), an Allen Distinguished Investigator Award (2014), a Moore Foundation Data-Driven Discovery Investigator Award (2014), and the ACM Grace Murray Hopper Award (2016). Jeff holds B.S., M.S., and Ph.D. degrees in Computer Science from UC Berkeley, which he then "betrayed" to join the Stanford faculty (2009–2013). He is also a co-founder of Trifacta, a provider of interactive tools for scalable data transformation.



Fabio Porto, The National Laboratory of Scientific Computing (LNCC) Rio de Janeiro, Brazil

TU Berlin, EN building, seminar room EN 719 (7th floor), Einsteinufer 17, 10587 Berlin

Managing and Analysing Simulation Data

The increasing processing power of HPC systems has enabled the development of realistic simulations of phenomena in different areas, such as oil and gas, engineering, medicine, and meteorology. As simulation quality improves and HPC systems approach exaflop performance, scientists' use of simulation output evolves into complex data analytics tasks. Unfortunately, data management systems have completely neglected the domain of numerical simulations, leading scientists to express complex analyses in ad-hoc programs on top of proprietary file formats or libraries such as NetCDF and HDF5. In this talk, we present the work we have been developing on data management to support numerical simulations. We will first discuss a technique for answering spatial queries about the uncertainty in simulation results. Next, we will present SAVIME (Simulation & Visualization In-Memory), a multidimensional array DBMS designed with the following principles: to incur minimal data ingestion overhead; to support complex data structures, such as meshes, data geometry, and simulation metadata; to support data visualization; and to offer users a declarative query interface and query optimization.
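As an illustration of the kind of spatial uncertainty query the abstract mentions (and not SAVIME's actual query language), the sketch below treats an ensemble of simulation runs as a multidimensional array and answers a window query with the per-cell ensemble mean and standard deviation; the toy field and its noise model are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# 50 ensemble runs of a toy field on a 100x100 grid: a smooth signal plus
# run-to-run perturbations standing in for simulation uncertainty.
ys, xs = np.mgrid[0:100, 0:100]
signal = np.sin(xs / 15.0) * np.cos(ys / 15.0)
ensemble = signal + 0.1 * rng.standard_normal((50, 100, 100))

def spatial_uncertainty(runs, y0, y1, x0, x1):
    """Answer a spatial window query: per-cell ensemble mean and standard
    deviation (an uncertainty measure) over the region [y0:y1, x0:x1]."""
    window = runs[:, y0:y1, x0:x1]
    return window.mean(axis=0), window.std(axis=0)

mean, std = spatial_uncertainty(ensemble, 10, 20, 30, 40)
```

An array DBMS evaluates such queries without materialising the whole ensemble in the client, but the array-slicing formulation is the same.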
Fabio Porto is a Senior Researcher at the Brazilian National Laboratory of Scientific Computing (LNCC). He is the founder of the Data Extreme Lab (DEXL) and co-director of the National Institute of Science and Technology on Data Science. He conducted his doctoral studies at PUC-Rio, including a research stay at INRIA, and earned his PhD in Informatics from PUC-Rio in 2001. Between 2004 and 2007 he was a postdoc at EPFL. His main research interests include Big Data analytical algorithms, dataflow optimization, and the confluence of machine learning and databases. He has published more than 80 research papers at international conferences and in scientific journals, including PVLDB, SIGMOD, SSDBM, and ICDE. He was the General Chair of both VLDB 2018 and SBBD 2015, the Brazilian Symposium on Databases. Since 2018 he has been a member of the SBBD Steering Committee and a member of both SBC (the Brazilian Computer Society) and ACM.
