fih-fact-check-pipeline.md
Source type: memory · Harvested: 2026-05-04 · Original date: 2026-05-03T17:08:12.768Z
Metadata: {"original_path":"fih-fact-check-pipeline.md"}
name: FIH Scout Fact-Check Pipeline
description: Report verification pipeline (PDF → claim extraction → DB verify → external verify → audit). Commit 89bb999.
type: project
originSessionId: 08ba45d7-a603-42f8-acdf-4d36a001d3dd
FIH Scout Report Fact-Check Pipeline
Repo: ~/Projects/oncology-fih-research/pipeline/
Commit: 89bb999 on main (2026-05-04)
GPT-5.5 reviewed: architecture approved; 3 suggestions adopted
Usage
cd ~/Projects/oncology-fih-research
PYTHONPATH=. python -m pipeline --pdf output/fih-scout-ip-memo.pdf
PYTHONPATH=. python -m pipeline --pdf output/fih-scout-ip-memo.pdf --skip-external
5-stage architecture
- Extract: pdftotext → sentence/table splitting → regex classification (db_numeric / external_fact / judgment)
- DB Verify: 15 SQL templates vs fih-scout.db (integers must match exactly / percentages within ±0.5 pp / averages within ±5%)
- External Verify: CT.gov API v2 (per-drug trial count + sponsor) + PubMed E-utilities
- Audit: cross-check → AGREE / SOURCE_DRIFT / DISAGREE / UNCERTAIN
- Report: Markdown + JSON output to output/verification/
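The DB Verify tolerances can be illustrated with a minimal sketch. The thresholds come from the stage description above; the function name, the `kind` labels, and the choice to measure the ±5% relative to the DB value are assumptions made for illustration, not the pipeline's actual code.

```python
# Hypothetical helper: names and "kind" labels are illustrative assumptions.
PCT_TOLERANCE_PP = 0.5    # percentages: absolute tolerance in percentage points
AVG_TOLERANCE_REL = 0.05  # averages: relative tolerance (assumed relative to the DB value)


def within_tolerance(kind: str, memo_value: float, db_value: float) -> bool:
    """Return True if the memo value matches the DB value under the rule for `kind`."""
    if kind == "integer":
        # Counts must match exactly.
        return memo_value == db_value
    if kind == "percent":
        # Absolute difference measured in percentage points.
        return abs(memo_value - db_value) <= PCT_TOLERANCE_PP
    if kind == "average":
        if db_value == 0:
            # A relative tolerance is undefined at zero; fall back to exact match.
            return memo_value == 0
        return abs(memo_value - db_value) / abs(db_value) <= AVG_TOLERANCE_REL
    raise ValueError(f"unknown claim kind: {kind!r}")
```

Under these assumptions a memo count of 20 against a DB count of 25 fails the integer rule, while a memo percentage of 41.3 against a DB value of 41.0 passes.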
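3 design improvements adopted from the GPT-5.5 review
- SOURCE_DRIFT verdict: when the live CT.gov count ≠ the DB snapshot count, the verdict is SOURCE_DRIFT rather than DISAGREE
- denominator + cohort_size: every VerificationResult carries its denominator and cohort size
- DEFINITION_MISMATCH: differences caused by differing scope definitions (e.g. CT.gov full registry 206 vs memo FIH subset 20)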
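One way these three changes might fit together is sketched below. Only the verdict names, `VerificationResult`, `denominator`, and `cohort_size` come from the notes; the other fields, the `same_scope` flag, and the order of the checks are assumptions, so this should be read as a sketch rather than the pipeline's real audit logic.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Verdict(Enum):
    AGREE = "AGREE"
    SOURCE_DRIFT = "SOURCE_DRIFT"
    DEFINITION_MISMATCH = "DEFINITION_MISMATCH"
    DISAGREE = "DISAGREE"
    UNCERTAIN = "UNCERTAIN"


@dataclass
class VerificationResult:
    claim_value: int                    # value stated in the memo (assumed field)
    db_value: Optional[int]             # value from the DB snapshot (assumed field)
    external_value: Optional[int]       # live external count, if checked (assumed field)
    denominator: Optional[int] = None   # from the notes
    cohort_size: Optional[int] = None   # from the notes
    same_scope: bool = True             # hypothetical: do DB and external use the same scope?


def audit(r: VerificationResult) -> Verdict:
    """Assumed decision order: DB check first, then the external comparison."""
    if r.db_value is None:
        return Verdict.UNCERTAIN
    if r.claim_value != r.db_value:
        return Verdict.DISAGREE
    if r.external_value is None or r.external_value == r.db_value:
        return Verdict.AGREE
    if not r.same_scope:
        # e.g. full-registry count vs FIH subset: a definitional difference, not an error
        return Verdict.DEFINITION_MISMATCH
    # Same scope but the live source has moved since the snapshot was taken.
    return Verdict.SOURCE_DRIFT
```

With this ordering, a memo value that matches the DB but differs from a live count under a different scope (such as 20 vs 206) is reported as DEFINITION_MISMATCH instead of being counted against the memo.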
First-run results on the IP memo
- 175 claims extracted / 42 DB checks / 7 external checks
- 36 ✅ VERIFIED / 3 ⚠️ PARTIAL / 1 ❌ DISAGREE / 1 🔄 SOURCE_DRIFT
- The only DISAGREE: Ivonescimab, memo=20 vs DB=25 (the supplement added 5 trials)
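Known limitations
- HER2/DXd target counts cannot be verified against the DB (no canonical target taxonomy) → removed from the DB checks
- The external CT.gov count covers the full registry (all phases + indications), so it is not directly comparable to the memo's FIH subset
- 108/175 claims are classified as judgment (strategic judgments) and are not fact-checked
- The output/verification/ directory is not git tracked (binary/large)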
Next steps
- Add a target synonym registry to enable HER2/DXd verification
- Agent team mode: spawn three parallel agents (DB / External / Audit)
- Also run the pipeline on the Chinese version of the memo (fih-scout-ip-memo-zh.pdf)
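As a rough idea of what the target synonym registry could look like, here is a minimal sketch. The registry contents, the function names, and the normalization rule are all hypothetical; the single HER2 entry is illustrative, and a real registry would need curated entries for every target of interest.

```python
import re
from typing import Optional

# Illustrative entries only; a real registry would be curated and much larger.
TARGET_SYNONYMS = {
    "HER2": {"HER2", "HER-2", "ERBB2", "HER2/neu"},
}


def _key(name: str) -> str:
    """Normalize a target name: lowercase and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


# Reverse lookup from normalized synonym to canonical target name.
_LOOKUP = {
    _key(synonym): canonical
    for canonical, synonyms in TARGET_SYNONYMS.items()
    for synonym in synonyms
}


def canonical_target(name: str) -> Optional[str]:
    """Return the canonical target for a free-text name, or None if it is not registered."""
    return _LOOKUP.get(_key(name))
```

Returning `None` for unregistered names, instead of guessing, would let the DB check skip or flag a claim rather than count it under the wrong target.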