Plan validation report

Automated audit of PraisonAIBio against the gap-closure plan (Phase 0–1).

Run checks:

bash scripts/check_no_submission.sh
python scripts/validate_repo.py
python -m pytest tests/unit -q
python benchmarks/t2b_parity/eval_suite_runner.py
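
The four commands above can also be chained so that the first failure stops the run. The following is a minimal sketch, not part of the repository: `run_checks` and `_exit_code` are hypothetical helper names, and it assumes each script signals failure with a non-zero exit code and is run from the repository root.

```python
import shlex
import subprocess
from typing import Callable, Optional, Sequence

# The four commands listed above, in the same order.
CHECKS = [
    "bash scripts/check_no_submission.sh",
    "python scripts/validate_repo.py",
    "python -m pytest tests/unit -q",
    "python benchmarks/t2b_parity/eval_suite_runner.py",
]


def _exit_code(command: str) -> int:
    """Run one command without a shell and return its exit code."""
    return subprocess.run(shlex.split(command)).returncode


def run_checks(
    commands: Sequence[str],
    run: Callable[[str], int] = _exit_code,
) -> Optional[str]:
    """Run commands in order; return the first failing one, or None if all pass.

    Assumption: a non-zero exit code means the check failed.
    """
    for command in commands:
        if run(command) != 0:
            return command
    return None
```

Calling `run_checks(CHECKS)` returns `None` when every check passes, or the command string of the first check that failed. The `run` parameter is injectable so the ordering and stop-on-failure logic can be exercised without executing the real scripts.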

Summary

Area	Status	Notes
Package + tools	PASS	28 tools, 10 toolsets, entry points OK
MCP	PASS	`sysbio-full` exposes all 28 tools
Workflows	PASS	Discovery, lifecycle, platform, cookbooks, eight-pillar pipeline
Skills	PARTIAL	16 skills in catalog; not all 28 tools covered
Docs + examples	PASS	MkDocs, captured outputs, interactive guide
Hooks + policy	PASS	`wire_bio_hooks()`, SDK policy packs, policy gate
Benchmarks	PASS	10-case T2B parity scored via prompt router (no self-scoring)
Session / repro	PASS	`repro_export` writes manifests under run dir
Knowledge / RAG	PARTIAL	Bridge code present; full RAG available when optional deps are installed
Phase 2 backlog	DEFERRED	OLS adapter, MCP Docker, 312-Q suite, PyPI publish

Benchmark integrity

T2B parity cases are scored with infer_tool_from_prompt(), which works from the prompt text, not by echoing expected_tool back as the prediction. CI runs eval_suite_runner.py and requires a mean score of at least 0.9.

python benchmarks/t2b_parity/eval_suite_runner.py
python benchmarks/run_all.py
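
The difference between routed scoring and self-scoring can be sketched as follows. This is an illustration under stated assumptions, not the code in eval_suite_runner.py: the keyword table, the tool names, the case dictionaries, and the body of `infer_tool_from_prompt` shown here are all hypothetical stand-ins, and only the function name, the `expected_tool` field, and the 0.9 threshold come from this report.

```python
from statistics import mean

THRESHOLD = 0.9  # pass mark quoted in this report

# Hypothetical keyword table; the real router's rules are not shown in this report.
KEYWORD_TO_TOOL = {
    "pathway": "pathway_enrichment",
    "variant": "variant_annotation",
    "docking": "molecular_docking",
}

# Made-up cases for illustration only; these are not the 10 benchmark cases.
CASES = [
    {"prompt": "Run pathway enrichment on these genes", "expected_tool": "pathway_enrichment"},
    {"prompt": "Annotate this variant list", "expected_tool": "variant_annotation"},
    {"prompt": "Summarize the literature on BRCA1", "expected_tool": "literature_search"},
]


def infer_tool_from_prompt(prompt: str) -> str:
    """Stand-in router: return the tool for the first keyword found in the prompt."""
    text = prompt.lower()
    for keyword, tool in KEYWORD_TO_TOOL.items():
        if keyword in text:
            return tool
    return "unknown"


def score_case(case: dict) -> float:
    """Score 1.0 only if the tool inferred from the prompt matches expected_tool."""
    predicted = infer_tool_from_prompt(case["prompt"])
    return 1.0 if predicted == case["expected_tool"] else 0.0


def self_score(case: dict) -> float:
    """The shortcut being ruled out: the prediction is copied from the answer key."""
    predicted = case["expected_tool"]
    return 1.0 if predicted == case["expected_tool"] else 0.0


def suite_passes(cases: list) -> bool:
    """Apply the mean-score threshold to routed (not self-scored) results."""
    return mean(score_case(c) for c in cases) >= THRESHOLD
```

In this sketch `self_score` returns 1.0 for every case regardless of the prompt, so it can never fail the threshold. `score_case` can fail: the third illustrative case has no matching keyword, so the stand-in router misses it and the three-case suite falls below 0.9.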

Phase 2 (intentionally deferred)

ols_adapter.py (stub)
mcp/sysbio-server/Dockerfile
Full ClawBio bridge
312-question T2B benchmark import
PyPI publish (release workflow ready; needs tag)

Last validated

Re-run python scripts/validate_repo.py after any structural change.