The MLOps Live Webinar Series

Session #38

Save My Seat

Tuesday, May 27th, 9am PST / 12 noon EST / 6pm CET

LLM Evaluation and Testing for Reliable AI Apps

LLM evaluation is essential. Building with LLMs means working with complex, non-deterministic systems, so testing is critical for catching failures and risks early and for shipping quickly and with confidence.

In this webinar, we'll hear firsthand about the challenges and opportunities presented by LLM observability. We'll dive into:

  • Real-world risks: Explore actual examples of LLM failures in production environments, including hallucinations and vulnerabilities.
  • Practical evaluation techniques: Discover tips for synthetic data generation, building representative test datasets, and leveraging LLM-as-a-judge methods (see the sketch after this list).
  • Evaluation-driven workflows: Learn how to integrate evaluation into your LLM product development and monitoring processes.
  • Production monitoring strategies: Gain insights on adding model monitoring capabilities to deployed LLMs, both in the cloud and on-premises.
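
For readers new to the idea, here is a minimal sketch of the LLM-as-a-judge pattern named above: a second "judge" model grades each answer against a rubric, and the verdicts are aggregated into a pass rate over a test dataset. This is an illustrative assumption of how such a check might look, not material from the session; the rubric, model name, and test cases are all hypothetical.

```python
# Illustrative LLM-as-a-judge sketch using the OpenAI Python SDK.
# The rubric, model name, and test cases are assumptions for demonstration.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

JUDGE_PROMPT = """You are grading an AI assistant's answer.
Question: {question}
Answer: {answer}
Reply with a single word: PASS if the answer is factually grounded and on-topic,
FAIL otherwise."""

def judge(question: str, answer: str, model: str = "gpt-4o-mini") -> bool:
    """Ask a judge model for a PASS/FAIL verdict on a single answer."""
    response = client.chat.completions.create(
        model=model,
        messages=[{
            "role": "user",
            "content": JUDGE_PROMPT.format(question=question, answer=answer),
        }],
        temperature=0,  # keep the grading as deterministic as possible
    )
    return response.choices[0].message.content.strip().upper().startswith("PASS")

# Run the judge over a small test dataset and report the pass rate.
test_cases = [
    ("What is the capital of France?", "Paris is the capital of France."),
    ("What is the capital of France?", "The capital of France is Berlin."),
]
passed = sum(judge(q, a) for q, a in test_cases)
print(f"{passed}/{len(test_cases)} answers passed the judge")
```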


Can’t attend the live webinar? Register to receive the recording and watch it at your convenience.

Presented By
Elena Samuylova

CEO and Co-founder, Evidently AI

Jonathan Daniel

Senior Software Engineer, Iguazio (acquired by McKinsey)