Backfill Strategies and Adaptive Randomization for Dose Optimization

回填策略與適應性隨機化:讓劑量最佳化從加量終點延伸為比較候選劑量

English

The fundamental premise of traditional phase I oncology dose escalation is that each cohort is a stepping stone toward a higher dose. The trial moves upward, one level at a time, until the maximum tolerated dose (MTD) is identified or a stopping rule triggers. This mental model, dose escalation as a ladder, has a logical consequence that became problematic only in the era of Project Optimus: by the time the trial finishes escalating, the lower and middle dose levels have often been studied in only three to six patients, each followed for just one dose-limiting toxicity (DLT) observation window. That is barely enough to detect gross toxicity, and entirely inadequate to estimate whether a lower dose might be just as effective, substantially more tolerable during extended therapy, and better suited to serve as the foundation for combination regimens.
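To put a number on "barely enough", one can compute the exact one-sided upper confidence bound on the DLT rate when a cohort shows no DLTs at all. The sketch below is illustrative only: the function name and the 95% confidence level are assumptions made for this example, not part of any design cited here.

```python
def upper_bound_zero_events(n_patients: int, confidence: float = 0.95) -> float:
    """Exact one-sided upper confidence bound on an event rate when
    0 of n_patients experienced the event.

    Solves (1 - p) ** n_patients = 1 - confidence for p.
    """
    if n_patients <= 0:
        raise ValueError("n_patients must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    alpha = 1.0 - confidence
    return 1.0 - alpha ** (1.0 / n_patients)


if __name__ == "__main__":
    # Hypothetical cohort sizes, chosen only to show how slowly the bound shrinks.
    for n in (3, 6, 12, 20):
        print(f"0 DLTs in {n:2d} patients: true DLT rate could be as high as "
              f"{upper_bound_zero_events(n):.1%}")
```

Under these assumptions, a clean cohort of three patients is still compatible with a true DLT rate above 60%, and a clean cohort of six with a rate near 39%. That is the quantitative sense in which a lightly populated dose level says little about tolerability, and nothing about efficacy.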

Backfill, the practice of enrolling additional patients at dose levels that have already cleared the initial safety threshold, is the structural solution to this data gap. The idea sounds simple, but backfill generates useful information only if it is designed with a specific scientific question in mind. A 2025 simulation study in Statistics in Biopharmaceutical Research compared multiple backfill strategies and arrived at a conclusion that should discipline how clinical teams think about this tool: backfill should be triggered by pre-specified criteria, not by the opportunistic availability of patients. The meaningful triggers are clinical: a dose level showing preliminary efficacy signals, a pharmacokinetic or pharmacodynamic readout that departs from predictions, or a need to characterize the exposure-response relationship before committing to a recommended phase 2 dose (RP2D). If backfill is simply “enroll more patients at any dose that has been cleared,” the result is more data without more insight. If backfill is “enroll more patients at this specific dose because we have a hypothesis about its efficacy and tolerability that requires more evidence to confirm or refute,” it becomes a genuine scientific step.
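One way to make "pre-specified criteria" concrete is to encode the triggers as an explicit rule that is evaluated for each cleared dose level. The following is a minimal sketch, not a rule taken from the cited simulation study: the class, field names, and the `min_responses` threshold are all hypothetical, and a real protocol would define each trigger quantitatively.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DoseLevelStatus:
    """Hypothetical summary of what is known about one dose level."""
    cleared_safety: bool          # passed its initial DLT evaluation
    responses: int                # responses observed so far at this dose
    pk_pd_deviates: bool          # PK/PD readout departs from predictions
    exposure_response_gap: bool   # more data needed before choosing an RP2D


def backfill_triggers(status: DoseLevelStatus, min_responses: int = 1) -> list[str]:
    """Return the pre-specified reasons that justify backfilling this dose.

    An empty list means the dose is not eligible: either it has not cleared
    safety, or no scientific question motivates enrolling more patients.
    """
    if not status.cleared_safety:
        return []
    reasons = []
    if status.responses >= min_responses:
        reasons.append("preliminary efficacy signal")
    if status.pk_pd_deviates:
        reasons.append("PK/PD departs from prediction")
    if status.exposure_response_gap:
        reasons.append("exposure-response characterization")
    return reasons


if __name__ == "__main__":
    cleared_but_uninformative = DoseLevelStatus(True, 0, False, False)
    cleared_with_signal = DoseLevelStatus(True, 2, True, False)
    print(backfill_triggers(cleared_but_uninformative))
    print(backfill_triggers(cleared_with_signal))
```

The design choice worth noting is that clearing safety is necessary but not sufficient: a cleared dose with no documented reason returns an empty list, which mirrors the distinction between opportunistic and hypothesis-driven backfill.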

The BARD design (Bayesian Adaptive Randomization Design), described in a 2026 Statistics in Biopharmaceutical Research paper, takes the backfill logic one step further. BARD treats the entire dose exploration and dose optimization process as a single, integrated trial. In the first stage, it behaves like a conventional Bayesian dose-escalation study, identifying the range of doses that appear to be both tolerable and pharmacologically active. In the second stage, it uses adaptive randomization to allocate patients among multiple promising doses, continuously updating the randomization probabilities based on accumulating efficacy and toxicity data. The practical effect is that patients later in the trial are systematically more likely to be assigned to the dose that appears to offer the best benefit-risk profile based on everything learned so far. This is not just a statistical optimization; it is a direct expression of the Project Optimus philosophy that the goal of early oncology development is to identify the dose that is best for patients, not merely the highest dose they can tolerate.
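The second-stage idea, shifting randomization probabilities toward the dose with the better observed benefit-risk profile, can be illustrated with a deliberately simple rule. This sketch is not the BARD algorithm, whose details are not given here. It assumes independent Beta-binomial models for response and toxicity, a linear utility (response rate minus a weighted toxicity rate), and a minimum allocation floor for each arm; all names and default values are hypothetical.

```python
import random


def posterior_mean(events: int, n: int, a: float = 0.5, b: float = 0.5) -> float:
    """Posterior mean of a rate under a Beta(a, b) prior and binomial data."""
    return (a + events) / (a + b + n)


def randomization_probs(arms: dict, tox_weight: float = 1.0, floor: float = 0.1) -> dict:
    """Map each dose to a randomization probability.

    arms: dose label -> (n_patients, n_responses, n_toxicities)
    floor: minimum probability kept for every arm so that learning continues.
    """
    k = len(arms)
    if k == 0 or floor * k > 1.0:
        raise ValueError("need at least one arm and floor * number_of_arms <= 1")
    utility = {}
    for dose, (n, responses, toxicities) in arms.items():
        u = posterior_mean(responses, n) - tox_weight * posterior_mean(toxicities, n)
        utility[dose] = max(u, 0.0)  # an arm with negative utility gets only the floor
    total = sum(utility.values())
    if total == 0.0:
        return {dose: 1.0 / k for dose in arms}  # no arm looks useful: randomize equally
    return {dose: floor + (1.0 - floor * k) * utility[dose] / total for dose in arms}


def assign_next_patient(probs: dict, rng: random.Random) -> str:
    """Draw the next patient's dose according to the current probabilities."""
    doses = list(probs)
    return rng.choices(doses, weights=[probs[d] for d in doses], k=1)[0]


if __name__ == "__main__":
    # Hypothetical interim data: (patients, responses, toxicities) per dose.
    interim = {"100 mg": (10, 2, 1), "200 mg": (10, 5, 2)}
    probs = randomization_probs(interim)
    print(probs)
    print(assign_next_patient(probs, random.Random(2024)))
```

Recomputing `randomization_probs` after each cohort is what makes the allocation adaptive. The floor is one simple way to keep every candidate dose informative; a real design would also pre-specify rules for dropping a dose entirely.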

A 2026 seamless phase I/II design paper in Pharmaceutical Statistics formalized this integration further. The design uses Bayesian optimal interval (BOIN) boundaries to simultaneously monitor efficacy and toxicity during the dose-escalation phase, selects two candidate doses for a formal comparison phase, and then uses joint monitoring to reach a decision about both the optimal dose and the treatment’s clinical activity in a single uninterrupted trial. The appeal is obvious: eliminating the gap between phase I and phase II reduces the number of patients exposed during the information-gathering period and shortens the timeline to a definitive dose recommendation. The caution is equally clear: seamless designs require more sophisticated statistical oversight, more pre-specified decision rules, and greater trust from regulatory bodies that the early data are reliable enough to direct the later phase without introducing bias.
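For readers unfamiliar with Bayesian optimal interval (BOIN) boundaries, the sketch below computes the standard toxicity-only escalation and de-escalation thresholds. It is not the joint efficacy-toxicity rule of the seamless design described above, whose boundaries are not reproduced here. The conventional defaults (a lower reference rate of 0.6 times the target and an upper reference rate of 1.4 times the target) are assumed, and the function names are illustrative.

```python
from math import log


def boin_boundaries(target: float, low_ratio: float = 0.6, high_ratio: float = 1.4):
    """Return (lambda_e, lambda_d) for a target DLT rate.

    lambda_e: escalate if the observed DLT rate at the current dose is <= this.
    lambda_d: de-escalate if the observed DLT rate is >= this.
    """
    if not 0.0 < target < 1.0 / high_ratio:
        raise ValueError("target must keep both reference rates inside (0, 1)")
    phi1 = low_ratio * target    # highest rate regarded as clearly sub-therapeutic
    phi2 = high_ratio * target   # lowest rate regarded as clearly too toxic
    lambda_e = (log((1 - phi1) / (1 - target))
                / log(target * (1 - phi1) / (phi1 * (1 - target))))
    lambda_d = (log((1 - target) / (1 - phi2))
                / log(phi2 * (1 - target) / (target * (1 - phi2))))
    return lambda_e, lambda_d


def boin_decision(dlts: int, n_patients: int, target: float) -> str:
    """Compare the observed DLT rate at the current dose with the boundaries."""
    lambda_e, lambda_d = boin_boundaries(target)
    observed = dlts / n_patients
    if observed <= lambda_e:
        return "escalate"
    if observed >= lambda_d:
        return "de-escalate"
    return "stay"


if __name__ == "__main__":
    lam_e, lam_d = boin_boundaries(0.30)
    print(f"target 0.30: escalate if rate <= {lam_e:.3f}, de-escalate if rate >= {lam_d:.3f}")
    for dlts in (1, 2, 3):
        print(f"{dlts}/6 DLTs ->", boin_decision(dlts, 6, 0.30))
```

Because the boundaries depend only on the target rate and the two reference rates, they can be tabulated before the trial starts, which is part of what makes interval designs easy to pre-specify in a protocol.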

Preclinical prior information occupies a particularly nuanced role in all of these adaptive designs. Bayesian methods, by their nature, allow the incorporation of prior beliefs, including beliefs derived from animal toxicity studies, in vitro pharmacology, or data from related molecules in the same class. The theoretical efficiency gain is real: if the preclinical data reliably predict human pharmacology, incorporating them as a prior allows the clinical trial to reach confident conclusions with fewer patients. A 2026 Pharmaceutical Statistics paper examined how to integrate preclinical insights into adaptive phase I dose escalation and emphasized both the promise and the hazard. The hazard is specific to drugs where the gap between animal pharmacology and human biology is large: immune activators where cross-reactivity to the human target differs from animal models, antibody-drug conjugates (ADCs) where linker stability or payload distribution varies by species, T-cell engagers where the magnitude of cytokine release in human blood cannot be predicted from mouse studies, and radiopharmaceuticals where organ biodistribution depends on species-specific expression patterns. For all of these, the prior should be treated as a hypothesis to be tested by the first human cohort, not as a reliable prediction that reduces the need for caution. Robust or discounted Bayesian priors, which give less weight to the preclinical information when human data conflict with it, are the appropriate statistical tool, but only if the conflict resolution rules are pre-specified in the protocol.
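The discounting behaviour of a robust prior can be shown with a two-component Beta mixture: one informative component standing in for the preclinical prediction and one vague component. This is a generic illustration, not the method of the cited paper. The Beta parameters, the 0.8 prior weight, and the function names are assumptions chosen for the example.

```python
from math import exp, lgamma


def log_beta(a: float, b: float) -> float:
    """Natural log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def informative_weight_after_data(prior_weight: float,
                                  informative: tuple,
                                  vague: tuple,
                                  dlts: int,
                                  n_patients: int) -> float:
    """Posterior weight on the informative (preclinical) mixture component.

    Each component is a Beta(a, b) prior on the DLT rate. Its weight is updated
    in proportion to its Beta-binomial marginal likelihood; the binomial
    coefficient is common to both components and cancels.
    """
    def marginal(a: float, b: float) -> float:
        return exp(log_beta(a + dlts, b + n_patients - dlts) - log_beta(a, b))

    w_inf = prior_weight * marginal(*informative)
    w_vague = (1.0 - prior_weight) * marginal(*vague)
    return w_inf / (w_inf + w_vague)


if __name__ == "__main__":
    preclinical = (2.0, 18.0)  # hypothetical: animal data suggest a DLT rate near 10%
    vague = (1.0, 1.0)         # uniform prior as the robustifying component
    for dlts in (0, 4):
        w = informative_weight_after_data(0.8, preclinical, vague, dlts, 6)
        print(f"{dlts}/6 DLTs: weight on preclinical prior goes from 0.80 to {w:.2f}")
```

With these illustrative numbers, human data consistent with the preclinical prediction (0 of 6 DLTs) raise the weight on the informative component, while conflicting data (4 of 6 DLTs) cut it sharply. The prior weight and both components are exactly the kind of quantities that would need to be fixed in the protocol in advance.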

The overarching lesson from backfill and adaptive randomization is a shift in how clinical teams should conceptualize what a phase I trial is for. Under the traditional ladder model, phase I ends when the top rung is reached: the MTD is declared and the dose-escalation sequence closes. Under the Project Optimus model, phase I ends when the team can answer the question: “which dose, from among the range of doses that are safe and pharmacologically active, produces the best benefit-risk profile for the patients who will ultimately receive this drug?” Answering that question requires data from across the dose range, not just at the ceiling, and it requires those data to include not just DLT counts but pharmacokinetics, pharmacodynamic biomarkers, early efficacy signals, dose modification rates, and patient-reported tolerability. Backfill and adaptive randomization are the mechanisms by which modern phase I trials generate that richer evidence base. They are not statistical enhancements to an otherwise adequate design; they are structural components of a fundamentally different and more clinically useful type of trial.

中文

傳統第一期腫瘤試驗劑量升階的基本前提,是每個 cohort 都是通往更高劑量的墊腳石。試驗一次往上升一個層級,直到找到 maximum tolerated dose(MTD,最大耐受劑量)或停止規則觸發為止。這種「劑量升階如同爬梯子」的心智模型,有一個邏輯後果——在 Project Optimus 時代才成為問題:當試驗完成升階時,較低和中間劑量層通常只有三到六位病人,每位僅被追蹤一個 DLT 觀察窗。這勉強足以偵測嚴重毒性,卻完全不足以估計較低劑量是否同樣有效、在長期治療期間是否顯著更易耐受,或是否更適合作為組合療法方案的基礎。

回填(backfill)——在已通過初步安全門檻的劑量層額外入組病人——是解決這個資料缺口的結構性方案。這個詞聽起來簡單,但挑戰在於:回填只有在設計時心中有特定科學問題的前提下,才能產生有用的資訊。2025 年 Statistics in Biopharmaceutical Research 的一篇模擬研究比較了多種回填策略,得出了一個應該約束臨床團隊思考這個工具方式的結論:回填應由預先指定的標準觸發,而不是由機會性的病人可及性觸發。有意義的觸發點是臨床性的——某個劑量層級顯示初步療效訊號、藥物動力學或藥效學讀出偏離預測、或在確立 RP2D 之前需要描述暴露-反應關係。如果回填只是「在任何通過安全門檻的劑量多收些病人」,結果是更多資料但沒有更多洞見。如果回填是「在這個特定劑量多收病人,因為我們對其療效和耐受性有一個需要更多證據來確認或否定的假說」,它就成為了一個真正的科學步驟。

BARD 設計(Bayesian Adaptive Randomization Design,貝氏適應性隨機化設計)於 2026 年 Statistics in Biopharmaceutical Research 論文中描述,把回填邏輯更推進了一步。BARD 把整個劑量探索和劑量最佳化過程視為一個整合的單一試驗。在第一階段,它的行為像常規的貝氏劑量升階研究,識別出似乎既可耐受又有藥理活性的劑量範圍。在第二階段,它使用適應性隨機化在多個有希望的劑量之間分配病人,根據不斷累積的療效和毒性資料持續更新隨機化機率。實際效果是:試驗後期的病人系統性地更可能被分配到根據目前所有學習顯示效益-風險比最佳的劑量。這不只是統計最佳化——它是 Project Optimus 哲學的直接表達:早期腫瘤藥物開發的目標是找到對病人最好的劑量,而不僅是最高可耐受的劑量。

2026 年 Pharmaceutical Statistics 的 seamless phase I/II 設計論文進一步正式化了這種整合。該設計在劑量升階階段使用 BOIN 最佳邊界同時監測療效和毒性,選出兩個候選劑量進行正式比較,然後用聯合監測在一個連續不中斷的試驗中,同時做出關於最佳劑量和治療臨床活性的決策。其吸引力顯而易見:消除第一期和第二期之間的間隔,減少了資訊收集期間暴露的病人數量,並加快了確定性劑量建議的時間線。謹慎之處同樣清晰:seamless 設計需要更複雜的統計監督、更多預先指定的決策規則,以及監管機構更大的信任——信任早期資料足夠可靠,可以指導後期試驗而不引入偏差。

先臨床的 prior information(先驗資訊)在所有這些適應性設計中扮演特別微妙的角色。貝氏方法從本質上允許納入先驗信念——包括來自動物毒性研究、體外藥理學或同一藥物類別相關分子資料的信念。理論效率增益是真實的:若非臨床資料能可靠預測人體藥理,把它作為 prior 納入,可以讓臨床試驗用更少病人達到可信結論。2026 年 Pharmaceutical Statistics 一篇論文研究如何把非臨床洞見整合進適應性第一期劑量升階,強調了其承諾與危害。危害對動物藥理與人體生物學差距大的藥物特別針對性——免疫活化藥物(與人類目標的交叉反應不同於動物模型)、ADC(連接子穩定性或 payload 分佈因物種而異)、T-cell engager(人類血液中細胞激素釋放的程度無法從小鼠研究預測),以及放射性藥物(器官生物分佈取決於物種特異性表達模式)。對所有這些,prior 應被視為需由第一批人體 cohort 驗證的假說,而不是減少謹慎需求的可靠預測。Robust 或折扣貝氏 prior——當人體資料與之衝突時給予非臨床資訊更少權重——是適當的統計工具,但前提是衝突解決規則已在 protocol 中預先指定。

從回填和適應性隨機化中浮現的整體教訓,是臨床團隊應如何概念化第一期試驗用途的轉變。在傳統梯子模型下,第一期在到達最高階梯時結束——當 MTD 宣布、劑量升階序列關閉。在 Project Optimus 模型下,第一期在團隊能回答以下問題時才結束:「在安全且有藥理活性的劑量範圍中,哪個劑量為最終將接受這種藥物的病人產生最佳效益-風險比?」回答這個問題需要整個劑量範圍的資料,而不只是在上限處的資料;且這些資料需要包含的不只是 DLT 計數,還有藥物動力學、藥效學生物標記、早期療效訊號、劑量調整率和病人報告的耐受性。回填和適應性隨機化是現代第一期試驗生成這個更豐富證據基礎的機制。它們不是對其他方面已足夠設計的統計增強——它們是一種從根本上不同且臨床上更有用試驗類型的結構性組成部分。

Key Concepts | 核心概念

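Term | Definition
Backfill | Enrolling additional patients at dose levels that have already cleared the safety threshold
BARD | Bayesian Adaptive Randomization Design
Adaptive randomization | Dynamically adjusting each arm's randomization probability according to accumulating efficacy and toxicity data
Seamless phase I/II | A design that joins phase I and phase II without a pause between them
Preclinical prior | Prior information from animal or in vitro studies that enters the Bayesian model
Robust prior | A discounted prior whose weight is automatically reduced when human data conflict with the animal data
RP2D | Recommended Phase 2 Dose
Joint monitoring | A statistical strategy that monitors efficacy and toxicity at the same time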