Playbooks

How to implement — paired with Blueprints for reference designs.

Eval engineering

The eval engineering playbook series (golden datasets, plane playbooks, judge calibration, and more) is in draft — visible in local dev, not yet published to production.

Start with G.A.I.N Evaluation for principles; the full blueprint and playbooks link from Eval Blueprint.

PGAR runtime

Four playbook groups for Policy-Governed Agent Runtime. Start at the PGAR Runtime overview.

Group | Start here | Covers
Foundation | Overview | SARAC, token custody, PEP/PDP, step-up, audit
Assurance | Policy test scenarios | CI authorization cases, adversarial bypass
Boundary | Overview | Five trust boundaries in request order
Domain | Tool registry | Tools, manifests, RAG retrieval

Also see PGAR Blueprint for the reference design. Bridge reading: PGAR with RAG. Eval overlap: Action plane and Tool plane.