An MCP-enabled AI agent for automated multimodal genomics analysis in the Isabl Platform

Presenter: Juan Arango Ossa, M Eng
Session: Agentic AI in Cancer
Time: 4/19/2026 2:00:00 PM → 4/19/2026 5:00:00 PM

Authors

Juan E. Arango Ossa 1, Dylan Domenico 2, Asher Preska Steinberg 1, Eliyahu Havasov 2, Gunes Gundem 2, Konstantinos Liosis 2, Alessandro Grande 1, Jesús Gutierrez-Abril 3, Elli Papaemmanuil 2, Sohrab Shah 2, Andrew William McPherson 2

1 Computational Oncology, Memorial Sloan Kettering Cancer Center, New York, NY
2 Memorial Sloan Kettering Cancer Center, New York, NY
3 Memorial Sloan Kettering Cancer Center, New York

Abstract

Background: Modern precision oncology pipelines generate petabyte-scale multimodal genomics data requiring coordinated access to databases, APIs, and reproducible workflows. The Isabl Platform (Medina et al., BMC Bioinformatics 2020) is broadly adopted across multiple groups at Memorial Sloan Kettering Cancer Center (MSKCC) and external research institutions. In the Halvorsen Center for Computational Oncology at MSKCC, the platform manages 470 projects, 105k sequencing experiments from 70k individuals, and 580k analyses totaling more than 4.5 PB of data. Although powerful, its depth and schema complexity create a learning curve for analysts, clinicians, and computational researchers. Large Language Models (LLMs) reduce this barrier by translating natural-language questions into actionable queries, retrieving documentation, and safely interacting with evolving scientific tools.

Methods: We developed the Isabl AI agent and an Isabl MCP, enabling any MCP client to access Isabl tools through a standardized interface. The agent integrates Retrieval-Augmented Generation (RAG), which retrieves relevant documentation to ground outputs, with the Model Context Protocol (MCP), a common standard for tool discovery and safe execution in LLM-based applications. GitBook documentation, the OpenAPI schema, CLI references, laboratory pipeline definitions, and core Isabl modules are indexed into a multi-vector semantic store. A ReAct-style agentic controller selects MCP tools such as call_isabl_api or run_isabl_app, supported by recursive chunking and modern embeddings.

Results: The agent handles analytical tasks such as:
• Cohort discovery, e.g., “Identify pediatric B-ALL tumors with IKZF1 deletions and available RNA-seq.” It retrieves samples, summarizes counts, and reports assay availability.
• Multi-step reasoning, e.g., “How many high-risk neuroblastomas have 17q gain?” It finds eligible cases, locates CNV analyses, extracts 17q21-17q25 copy-number values, applies thresholds, and reports frequencies.
• Workflow execution, e.g., “Launch the whole-genome variant-calling pipelines for newly added pediatric sarcoma patients with matched germline controls.” It identifies tumor-normal pairs, checks existing analyses, and submits pipelines.

These tasks show reduced onboarding time, intuitive schema navigation, and improved execution through natural language.

Conclusions: Isabl AI Agent + MCP demonstrates how RAG, MCP’s standardized interface, and agentic reasoning simplify access to complex genomic systems. As MCP adoption grows in AI for scientific discovery, an Isabl MCP enables domain-specific capabilities to integrate into general-purpose AI models, providing a sustainable path for AI copilots that accelerate translational genomics research.
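As an illustration of the multi-step reasoning task described in the Results (17q gain in high-risk neuroblastoma), the filter, threshold, and frequency steps might look like the minimal Python sketch below. Everything in it is an assumption made for illustration: the `Case` record, its field names, the `GAIN_THRESHOLD` value, and the toy records are hypothetical and are not the Isabl schema, the agent's actual logic, or real data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Case:
    """Hypothetical, simplified record; not the Isabl data model."""
    case_id: str
    diagnosis: str
    risk_group: str
    # Mean 17q copy number from a CNV analysis; None if no analysis exists.
    copy_number_17q: Optional[float]


# Hypothetical cutoff chosen only for this example.
GAIN_THRESHOLD = 2.5


def gain_17q_frequency(cases, threshold=GAIN_THRESHOLD):
    """Count eligible, evaluable, and gained cases, then report the frequency."""
    # Step 1: find eligible cases (high-risk neuroblastoma).
    eligible = [
        c for c in cases
        if c.diagnosis == "neuroblastoma" and c.risk_group == "high"
    ]
    # Step 2: keep only cases that have a CNV-derived copy-number value.
    evaluable = [c for c in eligible if c.copy_number_17q is not None]
    # Step 3: apply the threshold.
    gained = [c for c in evaluable if c.copy_number_17q >= threshold]
    # Step 4: report the frequency among evaluable cases (None if there are none).
    frequency = len(gained) / len(evaluable) if evaluable else None
    return {
        "eligible": len(eligible),
        "evaluable": len(evaluable),
        "gained": len(gained),
        "frequency": frequency,
    }


# Made-up toy records, for demonstration only.
cases = [
    Case("C1", "neuroblastoma", "high", 3.1),
    Case("C2", "neuroblastoma", "high", 2.0),
    Case("C3", "neuroblastoma", "high", None),
    Case("C4", "neuroblastoma", "low", 3.4),
    Case("C5", "sarcoma", "high", 2.9),
]

summary = gain_17q_frequency(cases)
print(summary)
```

Reporting the eligible and evaluable counts alongside the frequency keeps the denominator explicit, which matters when some eligible cases lack a CNV analysis.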

Disclosure

J. E. Arango Ossa, None. A. Preska Steinberg, None. E. Havasov, None. K. Liosis, None. A. Grande, None.

Control: 5131 · Presentation Id: 2496 · Meeting 21436