Machine learning solutions for data integration, cleaning, and data generation are beginning to emerge.
Ihab Ilyas is a professor in the Cheriton School of Computer Science and the Thomson Reuters-NSERC Research Chair on data quality at the University of Waterloo. His main research focuses on big data and database systems, with special interest in data quality and integration, managing uncertain data, machine learning for data curation, and information extraction. Ihab is also a co-founder of Tamr, a startup focusing on large-scale data integration. He is a recipient of the Ontario Early Researcher Award, a Cheriton Faculty Fellowship, an NSERC Discovery Accelerator Award, and a Google Faculty Award, and he is an ACM Distinguished Scientist. Ihab is an elected member of the VLDB Endowment board of trustees, elected SIGMOD vice chair, and an associate editor of the ACM Transactions on Database Systems (TODS). He holds a PhD in computer science from Purdue University, West Lafayette.
Human-guided ML pipelines for data unification and cleaning might be the only way to provide complete and trustworthy data sets for effective analytics.