Tutorials

Alessandro Margara

Politecnico di Milano, Italy

Tutorial Title: A Unifying Model for Distributed Data-Intensive Systems

Abstract

Modern applications handle increasingly large volumes of data, generated at an unprecedented and constantly growing pace. They introduce challenges that are transforming all research fields that revolve around data management and processing, resulting in a proliferation of distributed data-intensive systems. Each data-intensive system comes with its own assumptions, data and processing model, design choices, implementation strategies, and guarantees. Yet the problems data-intensive systems face and the solutions they propose frequently overlap. This tutorial presents a unifying model for data-intensive systems that dissects them into core building blocks, enabling a precise and unambiguous description and a detailed comparison. We present a list of classification criteria derived from the model and use them to build a taxonomy of state-of-the-art systems. The tutorial aims to offer a global view of the vast research field of data-intensive systems, highlight interesting observations on the current state of the field, and suggest promising research directions.

Biography

Alessandro Margara is an Associate Professor at Politecnico di Milano. He obtained his PhD from Politecnico di Milano and worked as a postdoctoral researcher at the Vrije Universiteit (VU) Amsterdam and Università della Svizzera italiana (USI). Alessandro’s research interests are in the area of software engineering and distributed systems. His research focuses on defining abstractions and building systems to simplify the design, development, and operation of complex distributed applications. Alessandro is a long-term member of the DEBS community and a regular member of the DEBS Program Committee. His DEBS 2010 paper received the DEBS 2020 Test of Time award. His DEBS 2014 paper received the Best Paper award. Alessandro was DEBS 2021 General Co-Chair. Alessandro has already presented two tutorials, at DEBS 2011 and at DEBS 2016. Both had a goal and format similar to the proposed one: they presented a model and classification of heterogeneous software systems.

Alexander Artikis

University of Piraeus

Tutorial Title: Formal Models of Complex Event Recognition

Abstract

Complex Event Recognition (CER) refers to the activity of detecting patterns in streams of continuously arriving “event” data over (geographically) distributed sources. CER is a key ingredient of many contemporary Big Data applications that require the processing of such event streams in order to obtain timely insights and implement reactive and proactive measures. Examples of such applications include the recognition of attacks in computer network nodes, human activities on video content, emerging stories and trends on the Social Web, traffic and transport incidents in smart cities, error conditions in smart energy grids, violations of maritime regulations, cardiac arrhythmias and epidemic spread. In each application, CER makes it possible to make sense of streaming data, react accordingly, and prepare countermeasures. In this tutorial, we will present the formal methods for CER, as they have been developed in the artificial intelligence community. To illustrate the reviewed approaches, we will use the domain of maritime situational awareness.

Biography

Alexander Artikis is an Associate Professor at the University of Piraeus (GR), and a Research Associate at NCSR Demokritos, leading the complex event recognition group. He holds a PhD from Imperial College London on Multi-Agent Systems, while his research interests lie in the area of Artificial Intelligence. He has published over 100 papers in related journals and conferences. Alexander has been developing complex event processing techniques in the context of several EU-funded Big Data projects, and he was the scientific coordinator of some of them. He has given tutorials on complex event processing at IJCAI, KR, VLDB and ECAI. In 2020, he co-organised the Dagstuhl seminar on the “Foundations of Composite Event Recognition”.

Jim Dowling

KTH Royal Institute of Technology

Tutorial Title: Hopsworks Feature Store

Abstract

Feature stores for machine learning are a new category of systems software that centralize the management of data for AI, both for training and for serving data to models. They solve problems related to ensuring consistent transformations of data between training and serving, the reuse of pre-engineered features across different models, ensuring no future leakage in training data through point-in-time correct joins, and enabling collaboration between the different personas who put AI in production, including data engineers, data scientists, and ML engineers. In this tutorial, we will present the historical evolution of feature stores and the needs that drove their development. We will take a deep dive into the first open-source feature store, Hopsworks, and show how you can use Hopsworks to build both an analytical ML application and an operational ML application. This will involve showing an end-to-end system that includes feature engineering, the feature store, model training, and pipeline orchestration. You will need experience in programming in Python, and some knowledge of Pandas will be helpful, but no prior experience of machine learning is needed, just enthusiasm for building real-world machine learning systems.

Biography

Jim Dowling is CEO of Hopsworks and an Associate Professor at KTH Royal Institute of Technology. He is one of the main developers of the open-source Hopsworks platform, a horizontally scalable data platform for machine learning that includes the industry’s first Feature Store. His research interests are in the areas of distributed file systems, decentralized systems, and systems support for real-time machine learning. Jim is a former Marie Curie Scholar, and has won awards for his research including the IEEE Scale Prize, awarded by CCGrid, for his work on the HopsFS file system. Jim is a regular speaker at industry conferences on data and AI and is currently writing a book on feature stores for Manning.

Jorge Arnulfo Quiane Ruiz

TU Berlin and DFKI

Tutorial Title: Apache Wayang (Incubating): Performing AIoT Seamlessly

Abstract

We are living in a data deluge era, where data is being generated by a large number of sources. This has only been exacerbated by the emergence of the Internet of Things (IoT). Nowadays, a large number of different devices are generating data at an unprecedented scale: smartphones, smartwatches, embedded sensors in cars, smart homes, and wearable technology, just to mention a few. We are simply surrounded by data without even noticing it. This represents a great opportunity to improve our everyday lives by applying the new advances in AI to IoT data, a combination also called AIoT. Connecting IoT with data storage and AI technology is gaining more and more attention. Yet, performing AIoT in an efficient and scalable manner is a cumbersome task. Today, users have to implement different ad hoc solutions to move data from the IoT to “stable” storage on which they can perform AI (typically on the Cloud). In this tutorial, we will discuss and learn how Apache Wayang (Incubating) frees users from this burden. In particular, we explain how Wayang enables users to seamlessly run their AI tasks on the Fog and Cloud via its cross-platform optimizer.

Biography

Jorge Quiané is the head of the Big Data Systems research group at the Berlin Institute for the Foundations of Learning and Data (BIFOLD) and a Principal Researcher at DIMA (TU Berlin). He also acts as the Scientific Coordinator of the IAM group at the German Research Center for Artificial Intelligence (DFKI). His current research is in the broad area of big data, mainly in federated data analytics, scalable data infrastructures, and distributed query processing. He has published numerous research papers on data management and novel system architectures. He has recently been honoured with the 2022 ACM SIGMOD Research Highlight Award, the Best Demo Award at ICDE 2022, and the Best Paper Award at ICDE 2021 for his work on “Efficient Control Flow in Dataflow Systems”. He holds five patents in core database areas and on machine learning. Earlier in his career, he was a Senior Scientist at the Qatar Computing Research Institute (QCRI) and a Postdoctoral Researcher at Saarland University. He obtained his PhD in computer science from INRIA (Nantes University).

Important Dates

Events	Dates (AoE)
Submission Dates
Abstract Submission for Research Track	~~March 4th, 2022~~ March 21st, 2022
Research Paper Submission	~~March 11th, 2022~~ March 28th, 2022
Industry and Application Paper Submission	~~March 25th, 2022~~ April 22nd, 2022
Tutorial Proposal Submission	April 15th, 2022
Grand Challenge Solution Submission	April 22nd, 2022
Doctoral Symposium Submission	~~May 13th, 2022~~ May 27th, 2022
Poster and Demo Paper Submission	~~May 12th, 2022~~ May 27th, 2022
Notification Dates
Author Notification Research Track	May 6th, 2022
Author Notification Industry and Application Track	~~April 22nd, 2022~~ May 18th, 2022
Author Notification Tutorials	April 29th, 2022
Author Notification Grand Challenge	~~May 3rd, 2022~~ May 9th, 2022
Author Notification Doctoral Symposium	~~May 20th, 2022~~ June 3rd, 2022
Author Notification Poster & Demo	~~May 26th, 2022~~ June 3rd, 2022
Conference
Camera Ready for All Tracks	~~May 31st, 2022~~ June 10th, 2022
Conference	27th June – 30th June 2022
