fih-fact-check-pipeline.md

Source type: memory · Harvested: 2026-05-04 · Original date: 2026-05-03T17:08:12.768Z Metadata: {"original_path":"fih-fact-check-pipeline.md"}



name: FIH Scout Fact-Check Pipeline
description: Report verification pipeline (PDF → claim extraction → DB verify → external verify → audit). Commit 89bb999.
type: project
originSessionId: 08ba45d7-a603-42f8-acdf-4d36a001d3dd

FIH Scout Report Fact-Check Pipeline

Repo: ~/Projects/oncology-fih-research/pipeline/
Commit: 89bb999 on main (2026-05-04)
GPT-5.5 reviewed: architecture approved; 3 suggestions adopted

Usage

cd ~/Projects/oncology-fih-research
PYTHONPATH=. python -m pipeline --pdf output/fih-scout-ip-memo.pdf
PYTHONPATH=. python -m pipeline --pdf output/fih-scout-ip-memo.pdf --skip-external

5-stage architecture

  1. Extract: pdftotext → sentence/table splitting → regex classification (db_numeric / external_fact / judgment)
  2. DB Verify: 15 SQL templates run against fih-scout.db (integers must match exactly; percentages within ±0.5 pp; averages within ±5%)
  3. External Verify: CT.gov API v2 (per-drug trial count + sponsor) + PubMed E-utilities
  4. Audit: cross-check the results → AGREE / SOURCE_DRIFT / DISAGREE / UNCERTAIN
  5. Report: Markdown + JSON written to output/verification/
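
The stage 2 tolerance rules can be sketched as a single comparison helper. This is a minimal illustration, not the pipeline's actual code: the function name `within_tolerance` and the `kind` labels are hypothetical, and reading "±5%" as a relative tolerance against the DB value is an assumption.

```python
from typing import Literal

# Hypothetical labels for the three numeric claim kinds described in stage 2.
Kind = Literal["integer", "percent", "average"]


def within_tolerance(claimed: float, actual: float, kind: Kind) -> bool:
    """Return True if the memo value matches the DB value under the stage 2 rules."""
    if kind == "integer":
        # Counts must match exactly.
        return claimed == actual
    if kind == "percent":
        # Percentages may differ by at most 0.5 percentage points.
        return abs(claimed - actual) <= 0.5
    if kind == "average":
        # Assumption: +/-5% is relative to the DB value.
        if actual == 0:
            return claimed == 0
        return abs(claimed - actual) / abs(actual) <= 0.05
    raise ValueError(f"unknown claim kind: {kind!r}")
```

For example, `within_tolerance(20, 25, "integer")` is false, which is the kind of mismatch that leads to a DISAGREE verdict.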

Three design improvements adopted from the GPT-5.5 review

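  1. SOURCE_DRIFT verdict: when the live CT.gov count differs from the DB snapshot count, the verdict is SOURCE_DRIFT rather than DISAGREE
  2. denominator + cohort_size: every VerificationResult carries its denominator and cohort size
  3. DEFINITION_MISMATCH: differences caused by differing scope definitions (e.g. 206 in the full CT.gov registry vs 20 in the memo's FIH subset)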
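
A rough sketch of how these three improvements could fit together in the audit stage. Only the verdict names, `VerificationResult`, `denominator`, and `cohort_size` come from the notes above; the other field names (`claim_value`, `db_value`, `live_value`, `same_scope`), the `audit` function, and the order of the checks are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Verdict(Enum):
    AGREE = "AGREE"
    SOURCE_DRIFT = "SOURCE_DRIFT"
    DEFINITION_MISMATCH = "DEFINITION_MISMATCH"
    DISAGREE = "DISAGREE"
    UNCERTAIN = "UNCERTAIN"


@dataclass
class VerificationResult:
    claim_value: float                  # value stated in the memo
    db_value: Optional[float]           # value from the DB snapshot, if any
    live_value: Optional[float] = None  # live external count, if checked
    denominator: Optional[int] = None   # what the value is measured against
    cohort_size: Optional[int] = None   # size of the cohort behind the claim
    same_scope: bool = True             # False if the external source counts a different population


def audit(result: VerificationResult) -> Verdict:
    """Assumed decision order: DB check first, then the external comparison."""
    if result.db_value is None:
        return Verdict.UNCERTAIN
    if result.claim_value != result.db_value:
        return Verdict.DISAGREE
    if result.live_value is None or result.live_value == result.db_value:
        return Verdict.AGREE
    if not result.same_scope:
        # e.g. full-registry count vs FIH-subset count
        return Verdict.DEFINITION_MISMATCH
    # Same scope, but the live source has moved since the snapshot.
    return Verdict.SOURCE_DRIFT
```

Under these assumptions, a memo value of 20 against a DB value of 25 yields DISAGREE, while a matching DB value with a differently scoped live count of 206 yields DEFINITION_MISMATCH.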

First run on the IP memo

  • 175 claims extracted / 42 DB checks / 7 external checks
  • 36 ✅ VERIFIED / 3 ⚠️ PARTIAL / 1 ❌ DISAGREE / 1 🔄 SOURCE_DRIFT
  • The only DISAGREE: Ivonescimab, memo=20 vs DB=25 (5 more trials were added after the supplement)

Known limitations

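  • The HER2/DXd target count cannot be verified against the DB (no canonical target taxonomy) → removed from the DB checks
  • The external CT.gov count covers the whole registry (all phases and indications), so it is not directly comparable with the memo's FIH subset
  • 108 of 175 claims are classified as judgment (strategic judgment) and are not fact-checked
  • The output/verification/ directory is not tracked in git (binary/large files)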

Next steps

  • Add a target synonym registry to enable HER2/DXd verification
  • Agent team mode: spawn three parallel agents (DB / External / Audit)
  • Run the pipeline on the Chinese version of the memo (fih-scout-ip-memo-zh.pdf) as well
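
One possible shape for the target synonym registry mentioned in the first item, as a sketch only. The registry contents, the name `TARGET_SYNONYMS`, and the helper `canonical_target` are all hypothetical; the aliases shown are the standard gene names for these targets, not entries taken from fih-scout.db.

```python
from typing import Dict, Optional, Set

# Hypothetical registry: canonical target name -> lowercase aliases.
TARGET_SYNONYMS: Dict[str, Set[str]] = {
    "HER2": {"her2", "erbb2", "her2/neu"},
    "TROP2": {"trop2", "trop-2", "tacstd2"},
}


def canonical_target(label: str,
                     registry: Dict[str, Set[str]] = TARGET_SYNONYMS) -> Optional[str]:
    """Map a free-text target label to its canonical name, or None if unknown."""
    key = label.strip().lower()
    for canonical, aliases in registry.items():
        if key == canonical.lower() or key in aliases:
            return canonical
    return None
```

With labels normalized this way, a DB count grouped by canonical target would no longer split HER2 and ERBB2 into separate buckets.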
