Sun 4 May 2025 16:30 - 17:15 at FSS2005 - AIware Bootcamp Sun 16:30

This session details the challenges of evaluating AIware and the techniques that can be used to address them. Covered topics include:

  • Overview of evaluating AIware - Importance
  • Evaluation primitives - Evaluation with ad hoc vibe checks, benchmarks, manually curated datasets, trace data, data splits, and repetitions
  • Evaluation metrics overview
  • Evaluating individual components of AIware - Agents, RAG, etc.
  • Testing AIware - Unit tests, summary evaluations, response evaluations, regression testing, backtesting
  • AI as judge - Overview, benefits, and costs

Sun 4 May

Displayed time zone: Eastern Time (US & Canada)

16:30 - 17:15
AIware Bootcamp Sun 16:30
Tutorials and Technical Briefings at FSS2005

16:30 (45m) Talk
AIware: Evaluating AIware
SE for AI
Tutorials and Technical Briefings
Jiahuei (Justina) Lin Centre for Software Excellence, Huawei Canada