Library AgentGuard.MCP | AgentGuard.MCP.html | Test MCP servers (stdio / SSE / streamable-HTTP / in-memory) |
Library AgentGuard.Skill | AgentGuard.Skill.html | Discover, parse, validate, grade Agent Skills |
Library AgentGuard.Tool | AgentGuard.Tool.html | BFCL-style tool-call AST + trajectory matching |
Library AgentGuard.Stats | AgentGuard.Stats.html | Mann-Whitney U, Cliff's δ, Vargha-Delaney A, bootstrap CIs, pass@k, TARr@N |
Library AgentGuard.Judge | AgentGuard.Judge.html | Classification-based LLM-as-Judge with Cohen's κ calibration |
Library AgentGuard.Security | AgentGuard.Security.html | Default-deny skill scanner, redactor, sandbox, AIDefence |
Library AgentGuard.Hook | AgentGuard.Hook.html | Claude Code hook lifecycle (12 events × 4 handler types) |
Library AgentGuard.SubAgent | AgentGuard.SubAgent.html | A2A 1.0 task lifecycle, framework bridges |
Library AgentGuard.Coding | AgentGuard.Coding.html | Drive Claude Code / Codex / Aider / OpenCode + #42796 metric pack |
Library AgentGuard.Benchmark | AgentGuard.Benchmark.html | SWE-bench Verified, Aider, HumanEval, MBPP, LiveCodeBench |
Library AgentGuard.Scenario | AgentGuard.Scenario.html | Unified scenario harness — drop-in for manykarim/rf-mcp e2e |
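
Three of the statistics named in the AgentGuard.Stats row (pass@k, Cliff's δ, Vargha-Delaney A) can be illustrated with a minimal, standard-library-only sketch. The function names `pass_at_k`, `cliffs_delta`, and `vargha_delaney_a` are hypothetical and chosen for illustration; they are not AgentGuard.Stats keywords, and the library's actual interface may differ. The formulas are the textbook definitions: the combinatorial pass@k estimator over n samples with c passes, and pairwise-comparison effect sizes.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Combinatorial pass@k estimate: 1 - C(n - c, k) / C(n, k).

    n = samples drawn, c = samples that passed, k = evaluation budget.
    Hypothetical helper, not the AgentGuard.Stats API.
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failures exist, so any k-subset contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def cliffs_delta(xs, ys) -> float:
    """Cliff's delta: (#pairs with x > y minus #pairs with x < y) / (m * n)."""
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


def vargha_delaney_a(xs, ys) -> float:
    """Vargha-Delaney A: P(x > y) + 0.5 * P(x == y), estimated over all pairs."""
    greater = sum(1 for x in xs for y in ys if x > y)
    equal = sum(1 for x in xs for y in ys if x == y)
    return (greater + 0.5 * equal) / (len(xs) * len(ys))


if __name__ == "__main__":
    # Made-up scores for two agent configurations, for illustration only.
    baseline = [0.61, 0.58, 0.64, 0.60]
    candidate = [0.70, 0.66, 0.64, 0.72]
    print("pass@1:", pass_at_k(n=10, c=3, k=1))
    print("Cliff's delta:", cliffs_delta(candidate, baseline))
    print("Vargha-Delaney A:", vargha_delaney_a(candidate, baseline))
```

The two effect sizes carry the same information: A = (δ + 1) / 2, so A = 0.5 and δ = 0 both indicate no stochastic difference between the groups. The pairwise loops are quadratic in sample size, which is acceptable for the small run counts typical of agent evaluations.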