Discussion:
[Dbworld] Call for Papers: SIGMOD'19 Workshop on Data Management for End-to-End Machine Learning (DEEM 2019)
Stephan Seufert
2018-11-26 16:41:02 UTC
Permalink
DEEM'19
The 3rd Workshop on Data Management for End-to-End Machine Learning, June 30, 2019.
http://deem-workshop.org
https://twitter.com/deem_workshop

Held in conjunction with ACM SIGMOD 2019
Amsterdam, Netherlands, June 30-July 5, 2019
http://sigmod2019.org/


---------------------------
WORKSHOP
---------------------------

Applying Machine Learning (ML) in real-world scenarios is a challenging task. In recent years, the
main focus of the database community has been on creating systems and abstractions for the efficient
training of ML models on large datasets. However, model training is only one of many steps in an
end-to-end ML application, and a number of orthogonal data management problems arise from the
large-scale use of ML, which require the attention of the data management community.

For example, data preprocessing and feature extraction workloads result in complex pipelines that
often require the simultaneous execution of relational and linear algebraic operations. Next, the
class of the ML model to use needs to be chosen, for that often a set of popular approaches such as
linear models, decision trees and deep neural networks have to be tried out on the problem at hand.
The prediction quality of such ML models heavily depends on the choice of features and
hyperparameters, which are typically selected in a costly offline evaluation process, that poses
huge opportunities for parallelization and optimization. Afterwards, the resulting models must be
deployed and integrated into existing business workflows in a way that enables fast and efficient
predictions, while still allowing for the lifecycle of models (that become stale over time) to be
managed. Managing this lifecycle requires careful bookkeeping of metadata and lineage ("which data
was used to train this model?", "which models are affected by changes in this feature?") and
involves methods for continuous analysis, validation, and monitoring of data and models in
production. As a further complication, the resulting systems need to take the target audience of ML
applications into account; this audience is very heterogeneous, ranging from analysts without
programming skills that possibly prefer an easy-to-use cloud-based solution on the one hand, to
teams of data processing experts and statisticians developing and deploying custom-tailored
algorithms​ on​ the​ other​ hand.

DEEM aims to bring together researchers and practitioners at the intersection of applied machine
learning, data management and systems research, with the goal to discuss the arising data management
issues in ML application scenarios. The workshop solicits regular research papers describing
preliminary and ongoing research results. In addition, the workshop encourages the submission of
industrial experience reports of end-to-end ML deployments.

Areas of particular interest for the workshop include (but are not limited to):

- Data Management in Machine Learning Applications
- Definition, Execution and Optimization of Complex Machine Learning Pipelines
- Systems for Managing the Lifecycle of Machine Learning Models
- Systems for Efficient Hyperparameter Search and Feature Selection
- Machine Learning Services in the Cloud
- Modeling, Storage and Provenance of Machine Learning experimentation data
- Integration of Machine Learning and Dataflow Systems
- Integration of Machine Learning and ETL Processing
- Definition and Execution of Complex Ensemble Predictors
- Sourcing, Labeling, Integrating, and Cleaning Data for Machine Learning
- Benchmarking of Machine Learning Applications
- Interpretability and Reproducibility in Machine Learning Applications
- Responsible Data Management


---------------------------
IMPORTANT DATES
---------------------------

Papers submission deadline: March 11, 2019
Authors notification: April 15, 2019
Deadline for camera-ready copy: April 29, 2019
Workshop: Sunday, June 30, 2019


---------------------------
SUBMISSION GUIDELINES
---------------------------

The workshop will have two tracks for regular research papers and industrial papers. Submissions can
be short papers (4 pages) or long papers (up to 10 pages). Authors are requested to prepare
submissions following the ACM proceedings format. DEEM is a single-blind workshop, authors must
include their names and affiliations on the manuscript cover page.


---------------------------
WORKSHOP CHAIRS
---------------------------

- Sebastian Schelter (New York University)
- Neoklis Polyzotis (Google AI)
- Stephan Seufert (Amazon Research)
- Manasi Vartak (MIT)


---------------------------
STEERING COMMITTEE
---------------------------

- Markus Weimer (Microsoft AI)
- Juliana Freire (New York University)
- Volker Markl (TU Berlin)


---------------------------
PROGRAM COMMITTEE
---------------------------

*to be announced*
_______________________________________________
Please do not post msgs that are not relevant to the database community at large. Go to www.cs.wisc.edu/dbworld for guidelines and posting forms.
To unsubscribe, go to https://lists.cs.wisc.edu/mailman/listinfo/dbworld
Loading...