PRISM: Pathway-Informed Response Inference from Multi-omics using Graph Neural Networks for Cancer Drug Sensitivity Prediction

Hong Joo Ryoo, Johns Hopkins University

0000-0002-4032-0215

ACCESS Allocation Request MED260007

Abstract:	PRISM is a Graph Neural Network (GNN) framework for predicting cancer drug sensitivity (IC50 values) that incorporates biological pathway structure to generalize to unseen cancer types. The system encodes protein-protein interaction networks from STRING directly into the model architecture, enabling discovery of pathway-level patterns that transfer across cancer histologies.

The core challenge: traditional ML (XGBoost, Random Forest) achieves reasonable performance on random splits but fails catastrophically (negative R²) on held-out cancer subtypes, which is the clinically relevant scenario. GNNs offer a principled solution by learning over biological graph structures rather than memorizing feature correlations.

Research Goals:
1. Benchmark GNN architectures (GAT, GCN, GIN) against XGBoost/Random Forest baselines on histology-based and tissue-based generalization splits using GDSC (Genomics of Drug Sensitivity in Cancer) data
2. Develop a multi-task architecture for 265 drugs with a shared biological encoder, using Ray Tune distributed hyperparameter optimization with ASHA early stopping
3. Evaluate biological prior integration strategies using STRING protein-protein interactions and KEGG pathway membership for graph construction

Planned use of ACCESS resources:
GPU Computing: PyTorch Geometric GNN training with graph attention over ~5,000 gene nodes; Ray Tune parallel architecture search across layers, hidden dimensions, and attention heads
CPU Computing: Data preprocessing, STRING graph construction (~20K proteins, millions of edges), XGBoost/Random Forest baseline training
Storage: Multi-omics dataset (~2 GB), processed PyTorch tensors (~5 GB), model checkpoints from architecture search (~20 GB), logs (~5 GB). Total: ~35 GB
Benchmarking: Compare GNN generalization performance against traditional ML on challenging cross-cancer-type prediction splits

Anticipated software requirements:
Containers: Singularity/Apptainer for HPC compatibility
GNN Stack: Python 3.10+, PyTorch 2.0+, PyTorch Geometric 2.4+, Ray Tune
ML Baselines: scikit-learn, XGBoost
Data Processing: pandas, NumPy, NetworkX, h5py

We request 400,000 ACCESS credits for initial architecture search and benchmarking.
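
Illustration of the held-out evaluation described in the abstract: the sketch below shows, on toy data, how a histology-based split holds out one cancer type at a time and why R² can go negative in that setting. It is a minimal sketch, not the project's pipeline. The histology labels and log-IC50 values are hypothetical, the helper names `leave_one_group_out` and `r2_score` are introduced here for illustration, and a constant mean predictor stands in for a trained baseline such as XGBoost.

```python
from collections import defaultdict


def leave_one_group_out(groups):
    """Yield (held_out_group, train_idx, test_idx), holding out each group once."""
    by_group = defaultdict(list)
    for idx, group in enumerate(groups):
        by_group[group].append(idx)
    for held_out in sorted(by_group):
        test_idx = by_group[held_out]
        train_idx = sorted(
            i for group, idxs in by_group.items() if group != held_out for i in idxs
        )
        yield held_out, train_idx, test_idx


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy data (not GDSC): one histology label and log-IC50 per cell line.
histology = ["lung", "lung", "lung", "breast", "breast", "skin", "skin"]
log_ic50 = [1.0, 1.2, 0.8, 3.0, 3.4, -1.0, -0.6]

results = {}
for held_out, train_idx, test_idx in leave_one_group_out(histology):
    # Stand-in "model": predict the training-set mean for every held-out sample.
    train_mean = sum(log_ic50[i] for i in train_idx) / len(train_idx)
    y_true = [log_ic50[i] for i in test_idx]
    y_pred = [train_mean] * len(test_idx)
    results[held_out] = r2_score(y_true, y_pred)
    print(f"held out {held_out}: R^2 = {results[held_out]:.2f}")
```

Because R² compares the model against the mean of the held-out targets, any predictor whose outputs are shifted away from that mean scores below zero. A model that has only learned the response distribution of the training histologies lands in this regime, which is the failure mode the benchmarking goal is meant to measure.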

Allocations:

2026	NCSA Delta GPU	1,000.0 GPU Hours
The estimated value of these awarded resources is $527.19. The allocation of these resources represents a considerable investment by the NSF in advanced computing infrastructure for the U.S. The dollar value of the allocation is estimated from the NSF awards supporting the allocated resources.

There are no other allocations for this project.

Other Titles:

There are no prior titles for this project.