Evaluation and Alignment: The Seminal Papers

Manning Publications

A practical guide to designing and implementing AI evaluation systems, grounded in seminal research papers.

Get Early Access — 50% Off

About the Book

Building AI systems is only half the battle — knowing whether they actually work is the other half. Evaluation and Alignment bridges the gap between foundational research and practical engineering, walking you through the seminal papers that shaped how we measure and align AI systems today.

Whether you’re building LLM-powered applications, fine-tuning models, or designing evaluation pipelines, this book gives you the conceptual and practical tools to do it rigorously.

What You’ll Learn

  • Evaluation fundamentals — From classical metrics to modern LLM-as-judge approaches, understand what “good” means for your system and how to measure it.
  • Alignment techniques — Reinforcement learning from human feedback (RLHF), Constitutional AI, and the research lineage behind today’s alignment methods.
  • Reading research effectively — Each chapter is anchored to seminal papers, teaching you how to extract practical insights from academic work.
  • Building evaluation pipelines — Design end-to-end evaluation systems that catch regressions, measure progress, and build confidence in your deployments.

Table of Contents

  1. Introduction to AI Evaluation and Alignment
  2. Classical Evaluation Metrics and Their Limitations
  3. Human Evaluation: Methods and Scaling Challenges
  4. Reward Modeling and RLHF
  5. Constitutional AI and Self-Alignment
  6. LLM-as-Judge: Automated Evaluation at Scale
  7. Benchmark Design and Contamination
  8. Evaluation for Retrieval-Augmented Systems
  9. Safety Evaluation and Red Teaming
  10. Building Production Evaluation Pipelines

The table of contents is preliminary and subject to change during Early Access.

Why This Book?

The AI evaluation landscape is scattered across hundreds of papers and blog posts, and much of it survives only as tribal knowledge. This book distills the work that matters most into a single, coherent narrative, so you can spend less time reading papers and more time building systems that work.