Plan validation report

Automated audit of PraisonAIBio against the gap-closure plan (Phase 0–1).

Run checks:

bash scripts/check_no_submission.sh
python scripts/validate_repo.py
python -m pytest tests/unit -q
python benchmarks/t2b_parity/eval_suite_runner.py
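
The four checks can also be chained from one script that stops at the first failure. The following is a minimal sketch, assuming the commands are run from the repository root. The `run_checks` helper and its `runner` parameter are hypothetical and not part of the repository:

```python
import subprocess
from typing import Callable, List, Sequence

# Commands copied from the list above; run from the repository root.
CHECKS: List[List[str]] = [
    ["bash", "scripts/check_no_submission.sh"],
    ["python", "scripts/validate_repo.py"],
    ["python", "-m", "pytest", "tests/unit", "-q"],
    ["python", "benchmarks/t2b_parity/eval_suite_runner.py"],
]


def _default_runner(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def run_checks(
    checks: Sequence[Sequence[str]] = CHECKS,
    runner: Callable[[Sequence[str]], int] = _default_runner,
) -> int:
    """Run checks in order; return the first non-zero exit code, else 0."""
    for cmd in checks:
        code = runner(cmd)
        if code != 0:
            print(f"FAILED ({code}): {' '.join(cmd)}")
            return code
    print("All checks passed")
    return 0
```

Calling `sys.exit(run_checks())` from a script would propagate the failing exit code to the shell. The injectable `runner` is only there so the loop can be exercised without executing the real commands.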

Summary

| Area | Status | Notes |
| --- | --- | --- |
| Package + tools | PASS | 28 tools, 10 toolsets, entry points OK |
| MCP | PASS | sysbio-full exposes all 28 tools |
| Workflows | PASS | Discovery, lifecycle, platform, cookbooks, eight-pillar pipeline |
| Skills | PARTIAL | 16 skills in catalog; not all 28 tools covered |
| Docs + examples | PASS | MkDocs, captured outputs, interactive guide |
| Hooks + policy | PASS | wire_bio_hooks(), SDK policy packs, policy gate |
| Benchmarks | PASS | 10-case T2B parity via prompt router (no self-score cheat) |
| Session / repro | PASS | repro_export writes manifests under run dir |
| Knowledge / RAG | PARTIAL | Bridge code; full RAG when optional deps installed |
| Phase 2 backlog | DEFERRED | OLS adapter, MCP Docker, 312-Q suite, PyPI publish |
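
The Skills row reports a gap between the 16 cataloged skills and the 28 tools. One way to list the uncovered tools is a set difference. The following is a minimal sketch, assuming each skill declares the tools it uses; the `uncovered_tools` helper, the mapping shape, and every name below are hypothetical placeholders:

```python
from typing import Dict, Iterable, List, Set


def uncovered_tools(tools: Iterable[str], skills: Dict[str, List[str]]) -> Set[str]:
    """Return the tools that no skill in the catalog references."""
    covered = {tool for used in skills.values() for tool in used}
    return set(tools) - covered


# Hypothetical names for illustration only; not the real tool or skill catalog.
tools = ["tool_a", "tool_b", "tool_c"]
skills = {"skill_one": ["tool_a"], "skill_two": ["tool_a", "tool_c"]}

missing = uncovered_tools(tools, skills)
print(sorted(missing))  # ['tool_b']
```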

Benchmark integrity

T2B parity cases are scored with infer_tool_from_prompt(), not by echoing expected_tool. CI runs eval_suite_runner.py and fails unless the mean score is ≥ 0.9.

python benchmarks/t2b_parity/eval_suite_runner.py
python benchmarks/run_all.py
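
The integrity rule in this section can be illustrated with a small sketch: the router sees only the prompt, and its prediction is then compared with `expected_tool`. This assumes exact-match scoring per case, which the report does not confirm. `toy_router` is a hypothetical stand-in for `infer_tool_from_prompt()`, and the tool names and cases are invented for illustration:

```python
from statistics import mean
from typing import Callable, Dict, List

PASS_THRESHOLD = 0.9  # mean score required by CI, per the text above


def toy_router(prompt: str) -> str:
    """Hypothetical stand-in for infer_tool_from_prompt(); keyword match only."""
    text = prompt.lower()
    if "pathway" in text:
        return "pathway_lookup"
    if "variant" in text:
        return "variant_annotate"
    return "unknown"


def score_cases(cases: List[Dict[str, str]], router: Callable[[str], str]) -> float:
    """Score 1.0 per case when the prediction matches expected_tool, else 0.0.

    The router receives only the prompt, so it cannot echo expected_tool.
    """
    scores = [
        1.0 if router(case["prompt"]) == case["expected_tool"] else 0.0
        for case in cases
    ]
    return mean(scores)


# Invented cases for illustration; not taken from the real benchmark.
cases = [
    {"prompt": "Find the pathway for TP53", "expected_tool": "pathway_lookup"},
    {"prompt": "Annotate this variant", "expected_tool": "variant_annotate"},
]

result = score_cases(cases, toy_router)
print(result >= PASS_THRESHOLD)
```

A scorer that simply returned `expected_tool` would always reach 1.0, which is why the prediction has to come from the prompt alone.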

Phase 2 (intentionally deferred)

  • ols_adapter.py (stub)
  • mcp/sysbio-server/Dockerfile
  • Full ClawBio bridge
  • 312-question T2B benchmark import
  • PyPI publish (release workflow ready; needs tag)

Last validated

Re-run python scripts/validate_repo.py after any structural change.