34 docs tagged with "arch"

Adversarial Testing

Prompt injection, PEP bypass, manifest violations, and entitlement escalation tests for PGAR runtimes.

Audit & Replay

Immutable verdict logs, examiner questions, and replaying authorization without chat transcripts.

Domain: RAG Retrieval

Retrieval as a PEP-gated tool, context pack logging, validation handoff, and PGAR applied to RAG.

Domain: Tool Registry

Tool manifests, schema compliance, PEP gating per tool, and blocking proposals outside the registry.

Eval Plane ①: Input

How to evaluate the Input plane — parsing, intent, injection resistance, and PII handling before inference begins.

Eval Plane ②: Data

How to evaluate the Data plane — source freshness, lineage, access boundaries, and factual correctness of underlying knowledge.

Eval Plane ③: Context

How to evaluate the Context plane — retrieval precision, ranking, scope, packing, and abstention when evidence is thin.

Eval Plane ④: Reasoning

How to evaluate the Reasoning plane — faithfulness to context, conclusion quality, tool selection, and multi-step logic.

Eval Plane ⑤: Tool

How to evaluate the Tool plane — selection, arguments, idempotency, error handling, and schema compliance for agent tool calls.

Eval Plane ⑥: Memory

How to evaluate the Memory plane — session scope, TTL, consistency, and cross-session leakage in agent and copilot systems.

Eval Plane ⑦: Action

How to evaluate the Action plane — policy enforcement, authorization, side effects, and auditability before irreversible operations execute.

Eval Plane ⑧: Outcome

How to evaluate the Outcome plane — end-user task success, clarity, usefulness, and trust in the final delivered response.

PDP Policy Surfaces

The only three PDP verdicts (ALLOW, DENY, and STEP_UP), plus policy versioning, rule authoring, and deterministic authorization.

PEP Enforcement

The four steps every Policy Enforcement Point runs on every proposal: receive, ask PDP, audit, act.

PGAR Boundary Playbooks

The five PGAR trust boundaries in request order (ingress, agentic app, LLM proposal, PEP + PDP, downstream), including multi-agent workflows, with links to each implementation playbook.

PGAR Foundation Playbooks

Core PGAR building blocks in implementation order — SARAC contracts, token custody, PEP/PDP enforcement, step-up, and audit replay.

PGAR Runtime Playbooks

Hub for Policy-Governed Agent Runtime playbooks (foundation, assurance, boundary, and domain groups in recommended implementation order).

Policy Test Scenarios

Golden scenario libraries for PDP/PEP regression: representative, edge, adversarial, and incident replay cases.

Step-Up & Attestation

STEP_UP verdict handling, four-eyes approval, re-evaluation with context.approval, and UX ownership in the agentic app.