Alessandro Margara
Politecnico di Milano, Italy
Tutorial Title: A Unifying Model for Distributed Data-Intensive Systems
Abstract
Modern applications handle increasingly large volumes of data, generated at an unprecedented and constantly growing pace. They introduce challenges that are transforming all research fields that gravitate around data management and processing, resulting in a blooming of distributed data-intensive systems. Each data-intensive system comes with its specific assumptions, data and processing model, design choices, implementation strategies, and guarantees. Yet, the problems data-intensive systems face and the solutions they propose frequently overlap. This tutorial presents a unifying model for data-intensive systems that dissects them into core building blocks, enabling a precise and unambiguous description and a detailed comparison. We present a list of classification criteria that derive from the model and use them to build a taxonomy of state-of-the-art systems. The tutorial aims to offer a global view of the vast research field of data-intensive systems, highlight interesting observations on the current state of things, and suggest promising research directions.
Biography
Alessandro Margara is an associate professor at Politecnico di Milano. He obtained his PhD from Politecnico di Milano and worked as a postdoctoral researcher at the Vrije Universiteit (VU) Amsterdam and Università della Svizzera italiana (USI). Alessandro’s research interests are in the area of software engineering and distributed systems. His research focuses on defining abstractions and building systems to simplify the design, development, and operation of complex distributed applications. Alessandro is a long-term member of the DEBS community and a regular member of the DEBS Program Committee. His DEBS 2010 paper received the DEBS 2020 Test of Time award. His DEBS 2014 paper received the Best Paper award. Alessandro was DEBS 2021 General Co-Chair. Alessandro has previously presented two tutorials, at DEBS 2011 and DEBS 2016. Both had a goal and format similar to this one: they presented a model and classification of heterogeneous software systems.
Alexander Artikis
University of Piraeus
Tutorial Title: Formal Models of Complex Event Recognition
Abstract
Complex Event Recognition (CER) refers to the activity of detecting patterns in streams of continuously arriving “event” data over (geographically) distributed sources. CER is a key ingredient of many contemporary Big Data applications that require the processing of such event streams in order to obtain timely insights and implement reactive and proactive measures. Examples of such applications include the recognition of attacks in computer network nodes, human activities on video content, emerging stories and trends on the Social Web, traffic and transport incidents in smart cities, error conditions in smart energy grids, violations of maritime regulations, cardiac arrhythmias and epidemic spread. In each application, CER makes it possible to make sense of streaming data, react accordingly, and prepare counter-measures. In this tutorial, we will present the formal methods for CER, as they have been developed in the artificial intelligence community. To illustrate the reviewed approaches, we will use the domain of maritime situational awareness.
Biography
Alexander Artikis is an Associate Professor at the University of Piraeus (GR), and a Research Associate at NCSR Demokritos, leading the complex event recognition group. He holds a PhD from Imperial College London on Multi-Agent Systems, while his research interests lie in the area of Artificial Intelligence. He has published over 100 papers in related journals and conferences. Alexander has been developing complex event processing techniques in the context of several EU-funded Big Data projects, and he was the scientific coordinator in some of them. He has given tutorials on complex event processing in IJCAI, KR, VLDB and ECAI. In 2020, he co-organised the Dagstuhl seminar on the “Foundations of Composite Event Recognition”.
Jim Dowling
KTH Royal Institute of Technology
Tutorial Title: Hopsworks Feature Store
Abstract
Feature stores for machine learning are a new category of systems software that centralize the management of data for AI - for both training and serving data to models. They solve problems related to ensuring consistent transformations of data between training and serving, the reuse of pre-engineered features across different models, ensuring no future leakage in training data through point-in-time correct joins, and enabling collaboration between different personas putting AI in production, including data engineers, data scientists, and ML engineers. In this tutorial, we will present the historical evolution of feature stores and the needs that drove their development. We will take a deep dive into the first open-source feature store, Hopsworks, and show how you can use Hopsworks to build both an analytical ML application and an operational ML application. This will involve showing an end-to-end system that includes feature engineering, the feature store, model training, and pipeline orchestration. You will need experience in programming in Python, and some knowledge of Pandas will be helpful, but no prior experience of machine learning is needed - just enthusiasm for building real-world machine learning systems.
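To make the idea of a point-in-time correct join concrete, here is a minimal sketch in plain Python. It does not use the Hopsworks API; the data, the names `feature_history` and `label_events`, and the function `point_in_time_join` are all hypothetical and chosen only for illustration. Each label is paired with the latest feature value recorded at or before the label's timestamp, so no value from the future leaks into the training row.

```python
from bisect import bisect_right

# Hypothetical feature history for one entity: (timestamp, value), sorted by timestamp.
feature_history = [(1, 10.0), (5, 12.5), (9, 20.0)]

# Hypothetical label events for the same entity: (timestamp, label).
label_events = [(0, 0), (4, 1), (5, 0), (10, 1)]


def point_in_time_join(labels, history):
    """Pair each label with the latest feature value known at or before its timestamp."""
    feature_times = [ts for ts, _ in history]
    rows = []
    for label_ts, label in labels:
        # bisect_right gives the index of the first feature strictly after label_ts,
        # so the entry just before it is the most recent one that was already known.
        idx = bisect_right(feature_times, label_ts)
        value = history[idx - 1][1] if idx > 0 else None  # None: no feature existed yet
        rows.append((label_ts, value, label))
    return rows


training_rows = point_in_time_join(label_events, feature_history)
```

With the sample data above, the label at timestamp 4 is joined with the feature value written at timestamp 1, not the one written at timestamp 5. A plain join on the latest value would instead attach 20.0 to every label, which is the kind of future leakage the abstract refers to.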
Biography
Jim Dowling is CEO of Hopsworks and an Associate Professor at KTH Royal Institute of Technology. He is one of the main developers of the open-source Hopsworks platform, a horizontally scalable data platform for machine learning that includes the industry’s first Feature Store. His research interests are in the areas of distributed file systems, decentralized systems, and systems support for real-time machine learning. Jim is a former Marie Curie Scholar, and has won awards for his research including the IEEE Scale Prize, awarded by CCGrid, for his work on the HopsFS file system. Jim is a regular speaker at industry conferences on data and AI and is currently writing a book on feature stores for Manning.
Jorge Arnulfo Quiane Ruiz
TU Berlin and DFKI
Tutorial Title: Apache Wayang (Incubating): Performing AIoT Seamlessly
Abstract
We are living in a data deluge era, where data is being generated by a large number of sources. This has been exacerbated by the emergence of the Internet of Things (IoT). Nowadays, a large number of different devices are generating data at an unprecedented scale: smartphones, smartwatches, embedded sensors in cars, smart homes, and wearable technology, just to mention a few. We are simply surrounded by data without even noticing it. This represents a great opportunity to improve our everyday lives by applying new advances in AI to IoT data, a combination also called AIoT. Connecting IoT with data storage and AI technology is gaining more and more attention. Yet, performing AIoT in an efficient and scalable manner is a cumbersome task. Today, users have to implement different ad hoc solutions to move data from the IoT to “stable” storage on which they can perform AI (typically on the Cloud). In this tutorial, we will discuss and learn how Apache Wayang (Incubating) frees users from this burden. In particular, we explain how Wayang enables users to seamlessly run their AI tasks on the Fog and Cloud via its cross-platform optimizer.
Biography
Jorge Quiané is the head of the Big Data Systems research group at the Berlin Institute for the Foundations of Learning and Data (BIFOLD) and a Principal Researcher at DIMA (TU Berlin). He also acts as the Scientific Coordinator of the IAM group at the German Research Center for Artificial Intelligence (DFKI). His current research is in the broad area of big data: mainly in federated data analytics, scalable data infrastructures, and distributed query processing. He has published numerous research papers on data management and novel system architectures. He has recently been honoured with the 2022 ACM SIGMOD Research Highlight Award, the Best Demo Award at ICDE 2022, and the Best Paper Award at ICDE 2021 for his work on “Efficient Control Flow in Dataflow Systems”. He holds five patents in core database areas and on machine learning. Earlier in his career, he was a Senior Scientist at the Qatar Computing Research Institute (QCRI) and a Postdoctoral Researcher at Saarland University. He obtained his PhD in computer science from INRIA (Nantes University).
Events | Dates (AoE) |
---|---|
Abstract Submission for Research Track | |
Submission Dates | |
Research Paper Submission | |
Industry and Application Paper Submission | |
Tutorial Proposal Submission | |
Grand Challenge Solution Submission | |
Doctoral Symposium Submission | |
Poster and Demo Paper Submission | |
Notification Dates | |
Author Notification Research Track | |
Author Notification Industry and Application Track | |
Author Notification Tutorials | |
Author Notification Grand Challenge | |
Author Notification Doctoral Symposium | |
Author Notification Poster & Demo | |
Conference | |
Camera Ready for All Tracks | |
Conference | 27th June – 30th June 2022 |