This session is designed for data architects, database administrators, data engineers, and analytics professionals who are interested in the future of database systems and the practical application of LLMs in enterprise data management.
Abstract:
Modern data systems increasingly expose semantic operators: powerful extensions to the relational algebra, such as sem_filter and sem_join, that evaluate natural-language predicates over large tables using Large Language Models (LLMs). While these operators unlock unprecedented analytical capabilities, they come at a significant cost: every LLM call costs money ($0.001–$0.10) and adds latency (0.5–5 seconds), making naive execution prohibitively expensive at scale. This talk introduces the emerging field of semantic query optimization, which aims to answer a critical question for data practitioners: how can we execute these powerful semantic queries while staying within a strict budget and meeting a specific quality target? We will explore the core architectural pattern for cost-based semantic optimization, using real-world examples and findings from the research paper.
Key takeaways for attendees will include:
Speaker Bio:
Arun is a seasoned data engineering professional with over 15 years of experience building real-time data platforms and large-scale machine learning systems at companies including Amazon and Perion. He led foundational initiatives such as the supply funnel for Prime Video Ads and a conversational analytics platform for Ads Monetization. Passionate about self-serve analytics, he has focused on designing production-grade architectures that democratize access to complex data. His work bridges massive-scale engineering and intelligent system design, driving multimillion-dollar business impact while enabling autonomous, data-informed workflows. Today, he focuses on evolving modern data platforms to support the next generation of AI agents.
© 2025 Data Management Association of Puget Sound (DAMA-PS) | Affiliated Chapter of DAMA International