Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Phase 17 — Exercises: Hardware Backends & Plugins

Work these after the labs. They escalate from "explain it" to "design it" — staff-level means you can do the last ones cold.

  1. List 3 decisions the Platform abstraction centralizes and why hardcoding them would hurt.
  2. Why is FP8 / CUDA-graph support platform-gated?
  3. How would a new accelerator vendor add support without forking vLLM?

Self-grading

For each: could you (a) explain it to a teammate in 2 minutes, and (b) point to the exact upstream/ file that proves your answer? If not, re-read the matching anchor in 01-deep-dive.md.