DNA methylation differentiates early-stage lung adenocarcinoma cells-of-origin with distinct smoking history and grade
Presenter: Xuan Li, BA, PhD
Session: DNA Methylation
Time: 4/20/2026 9:00:00 AM → 4/20/2026 12:00:00 PM
Authors
Xuan Cindy Li¹, Diego Almanza², Phuc Huu Hoang¹, Thi-Van-Trinh Tranh¹, Nicholas H. Juul², Maximilian Diehn², Tushar J. Desai², Maria Teresa Landi¹
¹ Division of Cancer Epidemiology & Genetics, National Cancer Institute, Rockville, MD
² Stanford University, Stanford, CA
Abstract
Lung cancer is a major cause of cancer-related mortality worldwide. Approximately 10%-20% of lung cancers occur in patients with no smoking history, and most of these are lung adenocarcinoma (LUAD). Studies in mouse models showed that LUAD cases arising from alveolar epithelial type 1 (AT1) cells transitioning to alveolar epithelial type 2 (AT2) cells are less aggressive than those that originate directly from AT2 cells, but the origin of human LUAD remains unclear. Leveraging bulk methylation array data from the EAGLE and Sherlock-Lung studies, two large multi-omics studies of lung cancer in people with and without smoking history, respectively, we infer the cells-of-origin for early-stage tumor samples and examine their associations with smoking history and grade. We impute cells-of-origin from bulk methylation array data of tumor and normal samples collected from 388 stage I LUAD cases without smoking history and 238 with smoking history. For the methylation reference profiles of normal lung cell types, we combine a published normal human cell methylation data set (Loyfer et al. 2023) generated by whole-genome bisulfite sequencing (WGBS) with enzymatically converted methylation profiles of purified AT1 and AT2 cells sequenced in-house. We then perform principled feature engineering to select methylation sites that best distinguish cell types and represent each cell type in a balanced manner. To estimate the proportion of each lung cell type in the samples, we formulate and solve the deconvolution as a quadratic program. The inferred compositions of tumor origins are then evaluated in relation to smoking history and to tumor grade based on the International Association for the Study of Lung Cancer (IASLC) grading system.
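The site selection and quadratic-program deconvolution described above can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' pipeline: the marker criterion in `select_markers`, the non-negativity and sum-to-one constraints, the cell-type label "other", and all parameter values are assumptions made for the example, and SciPy's SLSQP solver stands in for whatever quadratic-programming solver was actually used.

```python
import numpy as np
from scipy.optimize import minimize


def select_markers(reference, per_type=30):
    """Pick the same number of sites for every cell type (hypothetical criterion).

    reference: (n_sites, n_cell_types) array of methylation beta values.
    A site scores highly for a cell type when that type's methylation
    differs most from the mean of the remaining types.
    """
    n_types = reference.shape[1]
    chosen = set()
    for k in range(n_types):
        others = np.delete(reference, k, axis=1).mean(axis=1)
        gap = np.abs(reference[:, k] - others)
        top = np.argsort(gap)[::-1][:per_type]
        chosen.update(top.tolist())
    return np.array(sorted(chosen))


def deconvolve(reference, bulk):
    """Estimate cell-type proportions w for one bulk sample.

    Solves  min_w ||reference @ w - bulk||^2  subject to  w >= 0, sum(w) = 1
    (the constraints are assumed here; they are not stated in the abstract).
    """
    reference = np.asarray(reference, dtype=float)
    bulk = np.asarray(bulk, dtype=float)
    n_types = reference.shape[1]

    # Standard quadratic-program form: 0.5 * w^T Q w - c^T w
    Q = reference.T @ reference
    c = reference.T @ bulk

    def objective(w):
        return 0.5 * w @ Q @ w - c @ w

    def gradient(w):
        return Q @ w - c

    result = minimize(
        objective,
        x0=np.full(n_types, 1.0 / n_types),  # start from equal proportions
        jac=gradient,
        method="SLSQP",
        bounds=[(0.0, 1.0)] * n_types,
        constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}],
    )
    if not result.success:
        raise RuntimeError(result.message)
    return result.x


# Synthetic demonstration: random reference profiles and a known mixture.
rng = np.random.default_rng(0)
cell_types = ["AT1", "AT2", "other"]  # "other" is a placeholder label
reference = rng.uniform(0.0, 1.0, size=(500, len(cell_types)))
true_w = np.array([0.3, 0.5, 0.2])
bulk = np.clip(reference @ true_w + rng.normal(0.0, 0.01, size=500), 0.0, 1.0)

markers = select_markers(reference, per_type=30)
estimated = deconvolve(reference[markers], bulk[markers])
for name, w in zip(cell_types, estimated):
    print(f"{name}: {w:.3f}")
```

Restricting the fit to a balanced marker set keeps any single cell type from dominating the least-squares objective, which is one plausible reading of representing each cell type "in a balanced manner".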
The tumor samples of LUAD cases without smoking history (n=238) have 11% higher AT1 presence compared to those with smoking history (n=388, p
Our study establishes a principled deconvolution framework to infer tumor origins in LUAD using DNA methylation data. Mirroring findings in mouse models, our results indicate that AT1 cells may generate LUAD in humans, and that this appears to occur more frequently in patients without smoking history and with more favorable tumor grades. Our work adds a new dimension to the understanding of early-stage LUAD and may inform patient stratification, prognostic evaluation, and therapeutic targeting of tumor origin-specific vulnerabilities.
Disclosure
X. C. Li, None. D. Almanza, Natera, Inc. Employment. P. Hoang, None. T. Tranh, None. N. H. Juul, None. M. Diehn, Foresight Diagnostics Stock, Patent. AstraZeneca Independent Contractor. Regeneron Pharmaceuticals, Inc. Independent Contractor. Roche Sequencing Solutions, Inc. Patent. CiberMed Stock. Perception Medicine Stock. Gritstone Bio Stock Option. T. J. Desai, None. M. Landi, None.
Control: 4548 · Presentation Id: 2192 · Meeting 21436