Data Quality in Intelligent Systems

Lightning Talk: Data Quality in Intelligent Systems

Presenter: Sreyashi Das, Senior Data Engineer at Netflix

Abstract:

Data quality is a critical component in the development and deployment of intelligent systems, as it directly influences the accuracy, reliability, and trustworthiness of the insights and decisions derived from them. In this paper, we explore the dimensions of data quality (accuracy, completeness, consistency, timeliness, and relevance) and their impact on the performance of intelligent systems. We discuss the challenges and methodologies associated with ensuring high data quality in rapidly evolving datasets and complex data environments. We also examine the role of data engineering techniques, such as data cleaning, validation, and enrichment, in maintaining and enhancing data quality. Through empirical studies and case analyses, we illustrate the effect that data quality has on machine learning models, predictive analytics, and real-time decision-making processes. Our findings underscore the need for robust data quality frameworks and best practices to harness the full potential of intelligent systems. We aim to provide actionable insights and strategies for researchers, practitioners, and organizations seeking to optimize their intelligent systems for accuracy, reliability, and overall effectiveness.

Presenter Bio:

Sreyashi Das, Netflix

Sreyashi Das is a Senior Data Engineer at Netflix with a demonstrated history of working in the media entertainment and consumer electronics industries. She excels in the design and implementation of both streaming and batch data movement, along with analytical solutions. At Netflix, she has developed new data products that provide the foundation for metric development and analytical insights for the Studio and Creative Production team. These products drive data health, launch timeliness, resource availability, and cost optimization. She has also designed a data extraction framework specifically for the animation team. Before her role at Netflix, her expertise was primarily in data warehousing and self-serve business intelligence. Sreyashi loves building high-quality data models.