A Track Record of Successful International Collaboration
Researchers in the Chair of Database Systems and Information Management (DIMA) at TU Berlin (TUB) and the Intelligent Analytics for Massive Data Research Department at the German Research Center for Artificial Intelligence (DFKI) remain actively engaged in international collaboration on data science and engineering problems. Among the more recent successes are the following three publications.
In joint work with Zihao Chen, Chen Xu, Weining Qian, and Aoying Zhou (all from ECNU), Juan Soto and Volker Markl (both from TUB) propose HyMAC, a system that enables iterative machine learning algorithms to run more efficiently on distributed dataflow systems. By reducing the communication cost in dataflow systems such as Apache Flink, their approach has the potential to speed up machine learning on datasets with billions of data points. Their paper, Hybrid Evaluation for Distributed Iterative Matrix Computation, will be presented at the upcoming 2021 ACM SIGMOD conference.
In joint work with Yancan Mao, Bingsheng He, and Richard Ma (all from NUS), Jihong He (with ByteDance), and Shuhao Zhang (from SUTD, formerly TUB), Philipp Grulich, Steffen Zeuch, and Volker Markl (all from TUB) benchmark different intra-window join algorithms on multicore architectures. The intra-window join is an important operation widely used in modern stream processing applications. The results of this work yield a decision tree that guides users in selecting a suitable intra-window join algorithm for a given application workload, performance metric, and hardware architecture. Their paper, Parallelizing Intra-Window Join on Multicores: An Experimental Study, will be presented at the upcoming 2021 ACM SIGMOD conference.
To analyze publicly available geospatial data, city planners often use visual analytics systems, which rely on computation-heavy spatial queries. This makes interactive response times hard to achieve. In joint work with Andreas Kipf and Ibrahim Sabek (both from MIT), Varun Pandey (from TUM), and Harish Doraiswamy (formerly with NYU), Eleni Tzirita-Zacharatou and Volker Markl (both from TUB) introduce a new geospatial data processing paradigm that enables fast response times for spatial queries running on commodity hardware. Because spatial data is ubiquitous, this work has multiple applications beyond urban planning. Their paper, The Case for Distance-Bounded Spatial Approximations, was presented at the 2021 Conference on Innovative Data Systems Research (CIDR) in January.