Patient-Centered Tolerability: Making ‘Can the Patient Live on This Dose?’ a Measurable Endpoint
以病人為中心的耐受性:把「病人能否長期活在這個劑量上」變成可量測的 endpoint
English
The most persistent blind spot in oncology dose-finding is the gap between what investigators record and what patients experience. A DLT (dose-limiting toxicity) is a physician-defined threshold, typically a grade 3 or higher adverse event within a specified observation window. It was designed for cytotoxic chemotherapy, where the dominant safety concern is acute severe toxicity occurring within the first cycle. For modern targeted therapies, immunotherapies, ADCs, and T-cell engagers that patients may take for months or years, this threshold misses the clinical question that matters most: can this patient continue to live a functional life at this dose?
The structural problem with CTCAE alone. The CTCAE (Common Terminology Criteria for Adverse Events) grades toxicity severity from a clinician’s perspective. Grade 2 diarrhea is defined as “increase of 4–6 stools per day over baseline.” What CTCAE does not capture is whether that diarrhea lasted three days or three months, whether it required multiple emergency clinic visits, whether the patient stopped working, or whether it was already affecting adherence before the next dose was due. The AACR 2025 Industry Roundtable on dose optimization made this explicit: oncologist Mark Kris noted that CTCAE grade 2 events are frequently treated as “manageable” by investigators, but for the patient, persistent vomiting, four to six episodes of diarrhea per day requiring IV fluid replacement, or months of fatigue that prevent working, sleeping, and leaving the house are not manageable in any meaningful sense.
Three layers of tolerability. The international OPTIMISE-ROR consensus (published in JCO, 2026) provides the most useful framework: tolerability has three measurable layers. The first is an overall side-effect impact summary: how much do adverse events collectively affect the patient’s life? The second is patient-reported symptomatic adverse events: the frequency, severity, and interference of specific symptoms as rated by the patient, not the investigator. The third is health-related quality of life: overall functioning in physical, emotional, social, and role domains. All three can be collected prospectively in early-phase trials using validated instruments such as PRO-CTCAE (the patient-reported outcomes version of the CTCAE), EORTC-QLQ, or FACT. The consensus makes six recommendations, among them that PRO objectives be pre-specified, that analysis methods be stated in advance, that comparisons across dose levels and over time be planned, and that PRO results be considered in the final dose recommendation.
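A planned comparison "across dose levels and over time" can be as simple as two pre-specified summaries. The sketch below is illustrative only: the record layout, the cutoff of 3 on a 0 to 4 PRO-CTCAE item score, and the function names are assumptions made for this example, not part of the OPTIMISE-ROR recommendations.

```python
from collections import defaultdict

# Hypothetical records: (dose_mg, patient_id, cycle, item_score),
# where item_score is one PRO-CTCAE item on its 0-4 scale.
records = [
    (100, "P01", 1, 1), (100, "P01", 2, 1),
    (100, "P02", 1, 0), (100, "P02", 2, 2),
    (200, "P03", 1, 2), (200, "P03", 2, 3),
    (200, "P04", 1, 1), (200, "P04", 2, 4),
]


def proportion_high_burden(records, cutoff=3):
    """Per dose level: share of patients whose worst score reached the cutoff.

    The cutoff is an assumed, pre-specified value for illustration.
    """
    worst = {}
    for dose, pid, _cycle, score in records:
        key = (dose, pid)
        worst[key] = max(worst.get(key, 0), score)

    patients = defaultdict(int)
    hits = defaultdict(int)
    for (dose, _pid), score in worst.items():
        patients[dose] += 1
        if score >= cutoff:
            hits[dose] += 1
    return {dose: hits[dose] / patients[dose] for dose in sorted(patients)}


def mean_score_by_dose_and_cycle(records):
    """Mean item score for each (dose, cycle) cell, to show change over time."""
    totals = defaultdict(lambda: [0, 0])  # (dose, cycle) -> [sum, count]
    for dose, _pid, cycle, score in records:
        totals[(dose, cycle)][0] += score
        totals[(dose, cycle)][1] += 1
    return {key: s / n for key, (s, n) in sorted(totals.items())}


by_dose = proportion_high_burden(records)
over_time = mean_score_by_dose_and_cycle(records)
print(by_dose)
print(over_time)
```

The point of fixing these summaries in advance is that the dose-level comparison cannot be chosen after the data are seen. A real analysis would also need to handle missing questionnaires and small cohort sizes, which this sketch ignores.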
PRO as dose-selection evidence, not narrative color. The distinction that every clinician teaching FIH trials needs to internalize is this: PRO data that are collected but not pre-specified, not analyzed by dose level, and not incorporated into the dose recommendation are decorative. They show that the team thought about patients without giving patients a voice in the decision. The 2024 eClinicalMedicine expert roundtable led by Yap and colleagues distinguished three potential roles for PRO in early-phase trials: (1) describing tolerability, a qualitative understanding of what the patient experiences; (2) guiding dose decisions, through structured comparison of PRO scores across dose levels to inform which dose to recommend; and (3) acting as real-time safety alerts, flagging when patient-reported symptoms cross pre-specified thresholds that trigger clinical review. Trials at the forefront of dose optimization are moving toward all three simultaneously.
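The third role, real-time safety alerts, amounts to a pre-specified rule applied to each incoming patient report. The following is a minimal sketch under stated assumptions: the thresholds (`ALERT_SCORE`, `ALERT_WORSENING`), the `SymptomReport` fields, and the function names are hypothetical and would in practice be set in the protocol, not taken from the roundtable paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SymptomReport:
    patient_id: str
    symptom: str
    severity: int           # patient-rated, 0-4
    interference: int       # patient-rated, 0-4
    baseline_severity: int  # severity reported before first dose, 0-4


# Assumed thresholds for illustration only.
ALERT_SCORE = 3       # absolute score that triggers review
ALERT_WORSENING = 2   # increase from baseline that triggers review


def review_reasons(report):
    """Return the list of pre-specified rules this report trips (may be empty)."""
    reasons = []
    if report.severity >= ALERT_SCORE:
        reasons.append("severity")
    if report.interference >= ALERT_SCORE:
        reasons.append("interference")
    if report.severity - report.baseline_severity >= ALERT_WORSENING:
        reasons.append("worsening")
    return reasons


def triage(reports):
    """Map (patient, symptom) to the reasons for clinical review, alerts only."""
    alerts = {}
    for report in reports:
        reasons = review_reasons(report)
        if reasons:
            alerts[(report.patient_id, report.symptom)] = reasons
    return alerts


reports = [
    SymptomReport("P01", "diarrhea", severity=3, interference=2, baseline_severity=1),
    SymptomReport("P02", "fatigue", severity=1, interference=1, baseline_severity=1),
    SymptomReport("P03", "nausea", severity=2, interference=3, baseline_severity=0),
]
print(triage(reports))
```

Note that the nausea report would never register as a grade 3 event, yet it trips two rules here because the patient rates its interference as high and it has worsened from baseline. An alert triggers clinical review; it does not replace clinical judgment.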
The WIN-DOSE framework for comparative dose selection. A 2026 Clinical Cancer Research paper introduced WIN-DOSE, a method using generalised pairwise comparisons (GPC) and win ratios (WR) to formally integrate safety, efficacy, dose intensity, and patient-reported tolerability into a single comparative dose-selection framework. The clinical logic is this: when two doses produce similar response rates, the dose comparison cannot be made on efficacy alone. WIN-DOSE pre-specifies a hierarchy — for example: first compare DLT/severe toxicity; then compare objective response; then compare dose intensity achieved (as a proxy for how often dose reductions or interruptions occurred); then compare patient-reported symptom burden. The dose that “wins” more pairwise comparisons across this hierarchy is selected. The educational value for clinicians is not the mathematics but the forced explicitness: “higher dose has a slightly better response rate but substantially worse diarrhea and fatigue” can no longer be resolved by gut feeling; it must be resolved by a pre-declared priority ordering.
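To make the pairwise logic concrete, here is a minimal sketch of a hierarchical generalized pairwise comparison and the resulting win ratio. It is not the WIN-DOSE implementation from the paper: the `Patient` fields, the hierarchy order, and the minimum differences that count as a win are all assumptions chosen to mirror the example hierarchy described above.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Patient:
    severe_tox: bool       # DLT or severe toxicity observed
    response: bool         # objective response achieved
    dose_intensity: float  # relative dose intensity, 0.0-1.0
    symptom_burden: float  # patient-reported burden score, lower is better


# Assumed hierarchy: (attribute, higher_is_better, smallest difference that counts).
HIERARCHY = [
    ("severe_tox", False, 0.5),
    ("response", True, 0.5),
    ("dose_intensity", True, 0.10),
    ("symptom_burden", False, 1.0),
]


def compare(a, b):
    """Return +1 if a wins, -1 if b wins, 0 if tied at every level.

    Levels are checked in priority order; the first level with a
    difference at or above its margin decides the pair.
    """
    for attr, higher_is_better, margin in HIERARCHY:
        diff = float(getattr(a, attr)) - float(getattr(b, attr))
        if abs(diff) >= margin:
            a_is_better = diff > 0 if higher_is_better else diff < 0
            return 1 if a_is_better else -1
    return 0


def win_ratio(arm_a, arm_b):
    """Compare every patient in arm_a with every patient in arm_b.

    Returns (wins, losses, ties, wins / losses) from arm_a's point of view.
    """
    wins = losses = ties = 0
    for a, b in product(arm_a, arm_b):
        result = compare(a, b)
        if result > 0:
            wins += 1
        elif result < 0:
            losses += 1
        else:
            ties += 1
    ratio = wins / losses if losses else float("inf")
    return wins, losses, ties, ratio


# Made-up illustrative patients, not trial data.
low_dose = [
    Patient(False, True, 1.0, 2.0),
    Patient(False, False, 0.9, 3.0),
]
high_dose = [
    Patient(True, True, 0.6, 6.0),
    Patient(False, True, 0.7, 5.0),
]
print(win_ratio(low_dose, high_dose))
```

In this toy data the one pair the higher dose wins is decided at the response level, while the lower dose wins on toxicity and dose intensity. Reordering `HIERARCHY` changes which dose wins, which is exactly why the ordering has to be declared before the data are unblinded. This sketch gives a point estimate only; a real analysis would add confidence intervals and handle censoring and missing data.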
What happens when tolerability is missed in early trials. A 2025 analysis by Kitagaki and colleagues in Clinical Pharmacology & Therapeutics examined FDA-approved oncology drugs that subsequently required post-marketing dose optimization (either as a requirement or commitment). Three early signals were significantly associated with post-marketing dose optimization requirements: (1) the approved labeled dose was the MTD; (2) a higher proportion of patients discontinued due to adverse events; and (3) an established exposure-safety relationship existed at approval. This means that the cost of inadequate tolerability assessment in early-phase trials is not just academic — it is paid by the post-approval patient population, in unnecessary toxicity, until a corrected dose can be studied and relabeled.
For clinicians reading a FIH paper. Three questions replace “was it tolerable?”: First, what did the investigators record? (Grade-based toxicity assessment, CTCAE.) Second, what did patients report? (PRO instruments, if any were used.) Third, did the reported tolerability actually influence the dose recommendation? If the answer to questions two and three is “not reported” or “not pre-specified,” the RP2D may be tolerable in the short-term, first-cycle, grade-3-threshold sense — but its long-term clinical usability remains unknown.
中文
腫瘤劑量探索中最持久的盲點,是研究者記錄的內容與病人實際經歷之間的落差。DLT(dose-limiting toxicity,劑量限制毒性)是由醫師定義的門檻——在特定觀察窗口內發生的第 3 級或以上不良事件。它是為細胞毒性化療設計的,那個時代的主要安全性顧慮是第一療程內發生的急性嚴重毒性。對現代可能持續使用數月或數年的標靶治療、免疫治療、ADC 和 T-cell engager,這個門檻錯過了最重要的臨床問題:病人能否在這個劑量上繼續過有功能的生活?
單獨使用 CTCAE 的結構性問題。 CTCAE 從臨床醫師的角度評分毒性嚴重度。第 2 級腹瀉定義為「比基準線每天多排 4–6 次大便」。CTCAE 未能捕捉的是:這個腹瀉持續了三天還是三個月?是否需要多次緊急門診?病人是否停止工作?是否在下次給藥前就已影響用藥依從性?2025 年 AACR 劑量最佳化產業圓桌會議明確指出這一點:腫瘤科醫師 Mark Kris 指出,第 2 級不良事件常被研究者視為「可處理」,但對病人而言,持續嘔吐、每天 4–6 次需要靜脈補液的腹瀉,或影響工作、睡眠和出門的數月疲倦,在任何有意義的定義下都不是「可處理」的。
耐受性的三個層次。 國際 OPTIMISE-ROR 共識(JCO 2026 發表)提供了最有用的框架:耐受性有三個可測量的層次。第一層是整體副作用影響摘要——不良事件整體上對病人生活的影響有多大?第二層是病人自述的症狀性不良事件——特定症狀的頻率、嚴重度和干擾程度,由病人(而非研究者)評分。第三層是健康相關生活品質——身體、情緒、社交和角色功能領域的整體功能。這三層都可以在早期試驗中用已驗證的工具前瞻性收集,例如 PRO-CTCAE(CTCAE 的病人自述版)、EORTC-QLQ 或 FACT。OPTIMISE-ROR 共識提出六項建議:PRO 目標應預先說明、分析方法預先陳述、跨劑量層和隨時間的比較預先規劃,結果應被納入最終劑量建議。
PRO 作為劑量選擇證據,而非敘事色彩。 每位教授 FIH 試驗的臨床醫師需要內化的區分是:被收集但未預先說明、未依劑量層分析、且未被納入劑量建議的 PRO 資料是裝飾性的——它告訴你研究團隊考慮過病人,而實際上沒有給病人在決定中發聲的機會。Yap 等人主持的 2024 eClinicalMedicine 專家圓桌,為早期試驗中 PRO 的潛在角色區分了三種:(1) 描述耐受性——對病人體驗的定性理解;(2) 引導劑量決策——跨劑量層 PRO 分數的結構化比較,告知哪個劑量應被推薦;以及 (3) 作為即時安全警示——當病人自述症狀超過預設門檻時觸發臨床審查。劑量最佳化前沿的試驗正朝向三者同時並行。
WIN-DOSE 框架用於比較劑量選擇。 2026 年 Clinical Cancer Research 論文介紹 WIN-DOSE,一種使用廣義成對比較(GPC)和勝率比(WR)的方法,將安全性、療效、劑量強度和病人自述耐受性正式整合進單一的比較劑量選擇框架。臨床邏輯是:當兩個劑量產生類似的反應率時,劑量比較不能只依據療效。WIN-DOSE 預先說明一個層次順序——例如:首先比較 DLT/嚴重毒性;然後比較客觀反應;然後比較達到的劑量強度(作為劑量減少或中斷頻率的代理指標);然後比較病人自述的症狀負擔。在這個層次順序中贏得更多成對比較的劑量被選出。對臨床醫師而言,教育價值不在數學,而在強制明確性:「高劑量反應率稍好但腹瀉和疲倦明顯更差」,不再能依靠直覺解決;必須依據預先宣告的優先順序解決。
早期試驗忽視耐受性的後果。 Kitagaki 等人 2025 年在 Clinical Pharmacology & Therapeutics 的分析,檢視了後來需要上市後劑量最佳化(要求或承諾)的 FDA 核准腫瘤藥物。三個早期訊號與上市後劑量最佳化要求顯著相關:(1) 核准標示劑量就是 MTD;(2) 更高比例的病人因不良事件停藥;(3) 核准時已建立暴露-安全性關係。這意味著早期試驗中耐受性評估不足的代價,不只是學術性的——它由上市後的病人族群承受,以不必要的毒性形式支付,直到更正的劑量能被研究並重新標示。
臨床醫師讀 FIH 論文時。 三個問題取代「耐受性好不好?」:第一,研究者記錄了什麼?(基於分級的毒性評估,CTCAE。)第二,病人自述了什麼?(PRO 工具,如有使用。)第三,報告的耐受性是否真的影響了劑量建議?如果第二和第三個問題的答案是「未報告」或「未預先說明」,RP2D 可能在短期、第一療程、第 3 級門檻的意義上是可耐受的——但其長期臨床可用性仍然未知。
Key Concepts | 核心概念
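| Concept | Description |
|---|---|
| CTCAE blind spot | Captures only clinician-graded acute severe toxicity, not persistent low-grade symptoms or long-term impact on daily life |
| PRO-CTCAE | The patient-reported version of the CTCAE; adds symptom frequency, severity, and interference |
| Three layers of tolerability | Overall side-effect impact + patient-reported symptoms + health-related quality of life |
| OPTIMISE-ROR consensus | Early-phase trials should pre-specify PRO objectives, analysis methods, and comparisons across dose levels |
| WIN-DOSE | Uses win ratios to integrate safety, efficacy, dose intensity, and PRO into a single dose-comparison framework |
| Post-marketing catch-up risk | If early trials do not assess tolerability, the risk of later being required to redo dose optimization increases |
| Patient-Centered Dosing Initiative | Challenges the "more is better" assumption and calls for the patient voice to be included in dose decisions early |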
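Suggested teaching activity: Give learners two candidate RP2Ds with similar efficacy, where the higher dose produces more persistent grade 2 symptoms and more dose reductions. Ask them to apply the three-layer tolerability framework (clinician-recorded / patient-reported / daily functioning) and to justify their choice as if explaining to a patient why this dose was selected.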
Related Pages | 相關頁面
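- clinician-teaching-module-three-lines: The three-lines reading method; tolerability is a core component of the dose-logic line
- fih-paper-reading-checklist: FIH paper reading checklist; a detailed expansion of the safety column
- dose-optimization-teaching-synthesis: Dose-optimization teaching synthesis; the gap between dose-finding and dose-optimization
- modern-phase-i-outcomes: Modern phase I; the new tolerability challenges posed by ADCs and TCEs
- project-optimus-dose-optimization: Project Optimus main page (Batch 1)
- fih-oncology-wiki-index: Master index for the whole wiki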