Data Modeling Workshop

Workshop Title: Designing scalable and high-quality data models for feature engineering

Presenter: Sreyashi Das, Senior Data Engineer at Netflix

Abstract:  

This data modeling workshop explores how structured and unstructured data is cleansed, processed, and integrated with the product experience in real time using dimensional modeling concepts and data transformation tools (dbt). It is an interactive session with Q&A that highlights data modeling approaches for different real-world scenarios. The first part of the workshop focuses on technical concepts and the state of data modeling for Generative AI applications; the second part covers best practices with illustrative examples and data integration with user-facing products.

Content:

  1. Introduction to Data Engineering and Feature Engineering
    • Role of data engineers in data analytics
    • Key challenges in scaling ML workflows
    • State-of-the-art data modeling in the age of Generative AI
  2. Data Lifecycle: from heterogeneous data integration to consumer-friendly data formats
    • Overview of data modeling
    • Building high-quality data models
    • Real-time vs. batch data processing with Iceberg as the storage format
  3. Data Transformations
    • Unified data layer
    • Introduction to dbt
    • Applying data classification strategies to develop key performance indicators (KPIs)
  4. Case Studies
    • Real-time recommendation system
    • Dynamic content generation
    • Fraud detection

Format:

  • Interactive lecture with live demonstrations
  • Scenario-based data modeling examples
  • Q&A sessions to address participant questions

Prerequisites:

  • Knowledge of SQL (beginner level) is required.
  • Dimensional modeling (beginner level)

What attendees are expected to learn:

  • Cleansing and processing structured/unstructured data
  • Real-time data integration using dimensional modeling and dbt
  • Organizing information retrieval for machine learning models
  • Best practices and illustrative examples

Presenter Bio:

Sreyashi Das, Netflix

Sreyashi Das is a Senior Data Engineer at Netflix with a demonstrated history of working in the media entertainment and consumer electronics industries. She excels at designing and implementing both streaming and batch data movement, along with analytical solutions. At Netflix, she has developed new data products that provide the foundation for metric development and analytical insights for the Studio and Creative Production team. These products drive data health, launch timeliness, resource availability, and cost optimization. She has also designed a data extraction framework specifically for the animation team. Before her role at Netflix, her expertise was primarily in data warehousing and self-serve business intelligence. Sreyashi loves building high-quality data models. She is an IEEE Senior Member.