Path2Prot offers a new way for breast tumor subtyping and treatment response prediction from AI-inferred proteomic biomarkers
Presenter: Saugato Rahman Dhruba, PhD
Session: Digital Pathology 1
Time: 4/19/2026 2:00:00 PM → 4/19/2026 5:00:00 PM
Authors
Saugato Rahman Dhruba 1, Danh-Tai Hoang 1, Sumit Mukherjee 1, Amos Stemmer 1, Eldad Shulman 1, Ranjan Kumar Barman 2, Sanna Madan 2, Sanju Sinha 3, Kenneth D. Aldape 4, Eytan Ruppin 5
1 National Cancer Institute - Cancer Data Science Laboratory (CDSL), Bethesda, MD; 2 National Cancer Institute - Cancer Data Science Laboratory (CDSL), Rockville, MD; 3 Sanford Burnham Prebys, La Jolla, CA; 4 Professor, Dept. of Pathology, Chair, NCI-CCR, Bethesda, MD; 5 Cedars-Sinai Medical Center, Los Angeles, CA
Abstract
Background: AI is reshaping precision medicine, including digital pathology, where large foundation models (FMs) are applied to extract genomic and transcriptomic patterns from tumor whole-slide histopathology images (WSIs). In contrast, fewer studies have attempted to derive direct functional insights via proteomics from tumor morphology in WSIs, partly due to data scarcity. Here, we propose a weakly supervised deep learning model called Path2Prot to infer the relative abundance of 413 clinically relevant proteomic biomarkers in breast cancer (BC) from tumor H&E slide images. We show the clinical utility of such models in tumor subtyping and treatment response prediction by leveraging the inferred proteomic markers.
Methods: Path2Prot is composed of two stages. First, each WSI is preprocessed via a standard pipeline into a set of 512 × 512 tile images at 20x magnification, which are fed to a transformer-based FM to extract morphological features. Next, using these features with matched patient-level proteomics, a multilayer perceptron is trained to infer the proteomic marker levels. For training, we used 2,074 WSIs from 841 TCGA-BRCA patients and the matched reverse-phase protein array (RPPA) data for 413 proteins (total + post-translationally modified). We leveraged both available WSI types by building three distinct models: the FFPE model, trained on 893 formalin-fixed paraffin-embedded WSIs (used for diagnosis); the FF model, trained on 1,181 fresh-frozen WSIs (better RNA quality); and the Combo model, combining the predictions of both models.
Results: We assessed model performance with the Pearson correlation (R) between inferred and measured proteomics, where proteins with R ≥ 0.4 are referred to as well-predicted proteins (WPPs).
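The WPP criterion above can be illustrated with a minimal sketch. It assumes R is computed per protein across patients, which the abstract does not state explicitly; the function names `pearson_r` and `well_predicted` and the toy values are hypothetical and are not taken from Path2Prot.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for a constant vector
    return sxy / sqrt(sxx * syy)

def well_predicted(inferred, measured, threshold=0.4):
    """Return ({protein: R}, fraction of proteins with R >= threshold).

    Assumption: each dict maps a protein name to per-patient values,
    with patients in the same order in both dicts.
    """
    r = {p: pearson_r(inferred[p], measured[p]) for p in measured}
    n_wpp = sum(1 for v in r.values() if v >= threshold)
    return r, n_wpp / len(r)

# Toy data (made up): two proteins measured in four patients.
measured = {"HER2": [1, 2, 3, 4], "ER": [1, 2, 3, 4]}
inferred = {"HER2": [1.1, 1.9, 3.2, 3.8], "ER": [4, 1, 3, 2]}

r_by_protein, wpp_fraction = well_predicted(inferred, measured)
print(r_by_protein, wpp_fraction)
```

In this toy case only one of the two proteins clears the R ≥ 0.4 cutoff, so the WPP fraction is one half; the percentages reported in the abstract are the same kind of fraction taken over the 413 proteins.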
The Combo model performed best, with 23.7% WPPs (mean R = 0.31) in cross-validation, and generalized across platforms to mass spectrometry proteomics in external validation on CPTAC-BRCA, with 27.1% WPPs (mean R = 0.28; overlap in WPPs = 71.8%). We further dichotomized the inferred HER2 and ER levels to identify their immunohistochemistry status and assigned patient tumors to clinically actionable subtypes (HER2+, ER+ and TNBC) across TCGA-BRCA (n = 733), CPTAC-BRCA (n = 89), TransNEO (n = 160) and IMPRESS (n = 126). Subtype assignment achieved area under the curve (AUC) values of 0.69-0.72 for HER2+, 0.72-0.77 for ER+ and 0.83-0.88 for TNBC. Finally, the inferred protein targets estimated anti-HER2 response in the TransNEO (n = 60, AUC = 0.71) and CSHS-BRCA (n = 20, AUC = 1.0) cohorts, and anti-PD1 response in CSHS-BRCA (n = 16, AUC = 0.78).
Conclusion: Our analysis reveals that a clinically important subset of proteins in breast cancer can be robustly predicted from routine WSIs for clinical application. Larger proteomics datasets can be expected to improve substantially upon these results.
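As an illustration of how an inferred protein level can be scored against a binary immunohistochemistry status, the sketch below computes the AUC as the probability that a positive case receives a higher inferred level than a negative one. The helper names `roc_auc` and `dichotomize`, the cutoff of 0.5 and all values are hypothetical; the abstract does not say how the cutoffs were chosen.

```python
def roc_auc(scores, labels):
    """AUC via pairwise comparison: P(positive score > negative score), ties count 0.5."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def dichotomize(scores, cutoff):
    """Turn continuous inferred levels into 0/1 calls (the cutoff is an assumption)."""
    return [int(s >= cutoff) for s in scores]

# Toy data (made up): inferred HER2 level and measured HER2 IHC status per patient.
inferred_her2 = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
ihc_status = [1, 1, 1, 0, 0, 0]

auc = roc_auc(inferred_her2, ihc_status)
calls = dichotomize(inferred_her2, cutoff=0.5)
print(auc, calls)
```

The AUC does not depend on the cutoff, which is why it is a convenient summary when the dichotomization rule may differ between cohorts.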
Disclosure
S. Dhruba, None. D. Hoang, None. S. Mukherjee, None. A. Stemmer, None. E. Shulman, None. R. K. Barman, None. S. Madan, None. S. Sinha, None. K. D. Aldape, None. E. Ruppin, Medaware Ltd. Other, Eytan Ruppin is a cofounder of Medaware Ltd. Metabomed Other, Eytan Ruppin is a cofounder of Metabomed. Pangea Biomed Other, Eytan Ruppin is a cofounder (divested) and non-paid scientific consultant of Pangea Biomed. GSK Other, Eytan Ruppin is a scientific advisory board member of GSK Oncology. WIN Consortium Other, Eytan Ruppin is a scientific advisory board member of WIN Consortium. ProCan Program Other, Eytan Ruppin is a scientific advisory board member of ProCan program.
Control: 7325 · Presentation Id: 3091 · Meeting 21436