PIGE: Pathway-Informed Graph Explanation Framework for Drug Response Interpretability

Overview

PIGE (Pathway-Informed Graph Explanation) is a deep learning framework that predicts anti-cancer drug response and explains each prediction in terms of the underlying biological mechanisms. PIGE helps answer the following questions:

Why is this cell line sensitive or resistant to this drug?
Which pathways/genes/interactions drive the drug response or lack thereof?
What happens to drug response if we block a specific pathway/gene/interaction?

Precomputed results can be explored in the Interactive PIGE Atlas: https://www.pigeatlas.com

Approach

PIGE integrates multi-omics data (mutations, copy number alterations, gene expression) with drug chemical structures through a biologically informed pathway interaction network. The model architecture processes genomic features through this pathway graph, creating interpretable pathway-level representations. By systematically knocking out pathways and genes and measuring changes in model predictions, PIGE generates importance scores that identify influential biological mechanisms.
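The knockout idea described above can be illustrated with a minimal sketch. It assumes that a "knockout" means setting a pathway's gene features to zero and that importance is the resulting change in the predicted response; the actual procedure lives in src/model_interpretability/ and may differ. The function `knockout_importance`, the toy linear predictor, and the gene-to-pathway assignments are hypothetical and exist only for illustration.

```python
from typing import Callable, Dict, List

Features = Dict[str, float]


def knockout_importance(
    predict: Callable[[Features], float],
    features: Features,
    pathways: Dict[str, List[str]],
) -> Dict[str, float]:
    """Return prediction change (knocked-out minus baseline) per pathway."""
    baseline = predict(features)
    scores = {}
    for name, genes in pathways.items():
        knocked = dict(features)  # copy so the baseline input is untouched
        for gene in genes:
            if gene in knocked:
                knocked[gene] = 0.0  # assumption: knockout == zeroed feature
        scores[name] = predict(knocked) - baseline
    return scores


# Toy stand-in for a trained model: a fixed linear predictor (hypothetical).
WEIGHTS = {"EGFR": 0.8, "KRAS": -0.5, "TP53": 0.1}


def toy_predict(features: Features) -> float:
    return sum(WEIGHTS[g] * v for g, v in features.items())


features = {"EGFR": 1.0, "KRAS": 1.0, "TP53": 1.0}
pathways = {
    "ERBB signaling": ["EGFR"],
    "RAS signaling": ["KRAS"],
    "DNA damage response": ["TP53"],
}

scores = knockout_importance(toy_predict, features, pathways)
# Rank pathways by the magnitude of their effect on the prediction.
ranked = sorted(scores, key=lambda name: abs(scores[name]), reverse=True)
for name in ranked:
    print(f"{name}: {scores[name]:+.2f}")
```

The sign of each score indicates whether removing the pathway raises or lowers the predicted response, and the magnitude gives the ranking.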

Key Results

Improved Prediction Performance and Generalizability
- Spearman ρ=0.84 in 5-fold cross-validation on CTRPv2 training data
- Spearman ρ=0.56 on external GDSC2 validation (excluding all overlapping cell line-drug pairs)
- Outperforms DrugCell (ρ=0.52), DRPreter (ρ=0.51), and other state-of-the-art methods on external validation
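For readers unfamiliar with the evaluation protocol, the sketch below shows one way to compute a Spearman correlation on an external set after excluding cell line-drug pairs seen in training. It is an illustration only: the pair names and response values are made up, and the repository's own evaluation code in src/train_test/ may handle ties, filtering, and aggregation differently.

```python
import math
from typing import Dict, List, Set, Tuple

Pair = Tuple[str, str]  # (cell line, drug)


def average_ranks(values: List[float]) -> List[float]:
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x: List[float], y: List[float]) -> float:
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x: List[float], y: List[float]) -> float:
    """Spearman rho is the Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: pairs seen in training and (observed, predicted) values.
train_pairs: Set[Pair] = {("cell_1", "drug_a")}
external: Dict[Pair, Tuple[float, float]] = {
    ("cell_1", "drug_a"): (0.3, 0.3),  # overlaps with training, excluded
    ("cell_2", "drug_a"): (0.2, 0.1),
    ("cell_3", "drug_a"): (0.5, 0.6),
    ("cell_2", "drug_b"): (0.9, 0.8),
    ("cell_3", "drug_b"): (0.7, 0.9),
}

held_out = {p: v for p, v in external.items() if p not in train_pairs}
observed = [v[0] for v in held_out.values()]
predicted = [v[1] for v in held_out.values()]
rho = spearman(observed, predicted)
print(f"pairs evaluated: {len(held_out)}, Spearman rho: {rho:.2f}")
```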
Enhanced Interpretability
- 74% pathway hit rate at K=25 vs 50% for CRISPR differential essentiality, 60% for GSEA, and 29% for DrugCell
- Over 60% gene hit rate at K=100 (representing 4.8% of all genes) vs 41% for CRISPR screens
- Recovers specific mechanisms of action including non-obvious targets (e.g. CX3CL1 as a resistance hub to chemotherapy in triple-negative breast cancer, only discovered experimentally in 2017)
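This README does not define how the hit rate is computed, so the following sketch rests on an assumption: a drug counts as a "hit" if at least one of its known targets appears among the top K ranked pathways or genes, and the hit rate is the fraction of drugs with a hit. The function `hit_rate_at_k` and all identifiers in the toy data are hypothetical; the paper's definition may differ.

```python
from typing import Dict, List, Set


def hit_rate_at_k(
    rankings: Dict[str, List[str]],
    known_targets: Dict[str, Set[str]],
    k: int,
) -> float:
    """Fraction of drugs with at least one known target in their top-k list.

    Assumed definition, for illustration only. Drugs without any known
    target are skipped rather than counted as misses.
    """
    hits = 0
    evaluated = 0
    for drug, ranked in rankings.items():
        targets = known_targets.get(drug)
        if not targets:
            continue
        evaluated += 1
        if targets & set(ranked[:k]):
            hits += 1
    return hits / evaluated if evaluated else float("nan")


# Toy rankings (most important first) and toy ground-truth targets.
rankings = {
    "drug_a": ["g1", "g2", "g3"],
    "drug_b": ["g3", "g1", "g2"],
    "drug_c": ["g2", "g3", "g1"],
}
known_targets = {
    "drug_a": {"g2"},
    "drug_b": {"g2"},
    "drug_c": {"g1", "g3"},
}

rate_at_2 = hit_rate_at_k(rankings, known_targets, k=2)
rate_at_3 = hit_rate_at_k(rankings, known_targets, k=3)
print(f"hit rate at K=2: {rate_at_2:.2f}")
print(f"hit rate at K=3: {rate_at_3:.2f}")
```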
Enhanced Clinical Translation Potential
- Validated on BeatAML ex vivo patient samples
- Generalizes to unseen drugs and drug classes (e.g. MDM2 inhibitors not present in training data)
- Identifies patient-specific resistance mechanisms that suggest rational combination therapies

Architecture

PIGE is based on a biologically informed pathway interaction network constructed from Gene Ontology (GO) biological processes and OmniPath protein-protein interactions. The architecture consists of three integrated components:

This design mirrors biology: each pathway's influence on a cell's response is determined by its interactions with upstream pathways, the cell's genomics, and the drug's properties.
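As a rough illustration of how a pathway interaction network could be derived from pathway gene sets and protein-protein interactions, the sketch below adds a directed edge from one pathway to another whenever a gene in the first interacts with a gene in the second. This edge rule is an assumption, not the documented construction; the real logic is in src/pathway_interaction_setup/. The function `build_pathway_graph` and the toy identifiers are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def build_pathway_graph(
    pathway_genes: Dict[str, Set[str]],
    interactions: Iterable[Tuple[str, str]],
) -> Dict[Tuple[str, str], int]:
    """Map (source pathway, target pathway) to the number of supporting
    gene-level interactions. Self-loops are skipped."""
    # Invert the annotation: which pathways does each gene belong to?
    gene_to_pathways = defaultdict(set)
    for pathway, genes in pathway_genes.items():
        for gene in genes:
            gene_to_pathways[gene].add(pathway)

    edges = defaultdict(int)
    for src_gene, dst_gene in interactions:
        for src_pathway in gene_to_pathways.get(src_gene, ()):
            for dst_pathway in gene_to_pathways.get(dst_gene, ()):
                if src_pathway != dst_pathway:
                    edges[(src_pathway, dst_pathway)] += 1
    return dict(edges)


# Toy inputs standing in for GO gene sets and OmniPath-style directed pairs.
pathway_genes = {
    "pathway_A": {"G1", "G2"},
    "pathway_B": {"G3"},
    "pathway_C": {"G2", "G4"},
}
interactions = [("G1", "G3"), ("G2", "G3"), ("G3", "G4"), ("G1", "G2")]

graph = build_pathway_graph(pathway_genes, interactions)
for (src, dst), weight in sorted(graph.items()):
    print(f"{src} -> {dst} (supporting interactions: {weight})")
```

Counting supporting interactions is one simple way to weight edges; a thresholded or unweighted graph would be an equally plausible choice.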

Citation

If you use PIGE in your research, please cite:

[Will be added when paper is published]

Web Application

Interactive PIGE Atlas: https://www.pigeatlas.com - Explore 45,000+ pathway graphs across 70 drugs and 800+ cell lines

Project Structure

The project consists of the following folders:

data/ - Contains all data on which models were built
- input_data/ - Raw input data (omics, drug response, pathway data)
- intermediate_data/ - Processed intermediate files (pathway graphs, precomputed features)
- output_data/ - Model outputs and trained model checkpoints
src/ - Source code for PIGE
- data_processing/ - Data preprocessing and feature generation
- models/ - Model architectures (Structural Causal Model, GAT layers)
- pathway_interaction_setup/ - Pathway graph construction from GO and OmniPath
- model_interpretability/ - Virtual knockout analysis and interpretability tools
- pige_atlas/ - Interactive graph visualization and atlas generation
- train/ - Training scripts and experiment management
- train_test/ - External validation and evaluation scripts
notebooks/ - Jupyter notebook tutorials and examples
- 01_train_erlotinib.ipynb - Train PIGE on a single drug
- 02_knockout_analysis.ipynb - Virtual knockout analysis
- 03_generate_pige_graphs.ipynb - Generate interactive pathway graphs

Quick Start

Virtual Environment Setup

We recommend using a virtual environment to manage dependencies:

# Create virtual environment (using conda)
conda create -n pige python=3.10
conda activate pige

# Or using venv
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Installation

# Install PyTorch (adjust CUDA version as needed)
# The cu116 wheels pinned below target CUDA 11.6
pip install torch==1.12.0+cu116 torchvision==0.13.0+cu116 torchaudio==0.12.0 -f https://download.pytorch.org/whl/torch_stable.html

# Install PyTorch Geometric and dependencies
pip install torch-scatter torch-sparse torch-cluster torch-spline-conv -f https://data.pyg.org/whl/torch-1.12.0+cu116.html
pip install torch-geometric

# Install other dependencies
pip install -r requirements.txt
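After installing, a quick sanity check can confirm that the interpreter version and key packages are in place. The snippet below is a small optional helper, not part of the repository; `check_environment` is a hypothetical name, and the default package list (torch, torch_geometric, yaml) is inferred from the commands and imports shown in this README.

```python
import importlib.util
import sys
from typing import Dict, Sequence, Tuple


def check_environment(
    min_python: Tuple[int, int] = (3, 10),
    packages: Sequence[str] = ("torch", "torch_geometric", "yaml"),
) -> Dict[str, bool]:
    """Report whether Python is recent enough and each package is importable.

    Uses find_spec so that nothing heavy is actually imported.
    """
    report = {"python_ok": sys.version_info[:2] >= min_python}
    for name in packages:
        report[name] = importlib.util.find_spec(name) is not None
    return report


report = check_environment()
for item, ok in report.items():
    print(f"{item}: {'ok' if ok else 'MISSING'}")
```

If torch is reported as present, `torch.cuda.is_available()` can then be used to confirm that the GPU build is active.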

Basic Usage

Start with the tutorial notebooks in notebooks/:

Train a model: notebooks/01_train_erlotinib.ipynb (~15 minutes)
Run interpretability: notebooks/02_knockout_analysis.ipynb (~5 minutes)
Generate graphs: notebooks/03_generate_pige_graphs.ipynb (~5 minutes)

For programmatic usage, use the main pipeline:

from src.pipeline.pipeline_orchestrator import PipelineOrchestrator
from src.main import resolve_config_variables
import yaml

# Load and resolve config
with open('src/configs/config.yaml', 'r') as f:
    config = yaml.safe_load(f)
config = resolve_config_variables(config)

# Run pipeline
orchestrator = PipelineOrchestrator(config)
orchestrator.run()

Notebooks

Interactive tutorials demonstrating PIGE's capabilities:

| Tutorial | Description | Runtime | Link |
| --- | --- | --- | --- |
| Train PIGE | Train PIGE on a single drug (Erlotinib) using the full 5-stage pipeline | ~15 min | `notebooks/01_train_erlotinib.ipynb` |
| Virtual Knockout Analysis | Identify key pathways and genes driving drug sensitivity vs resistance | ~5 min | `notebooks/02_knockout_analysis.ipynb` |
| Generate PIGE Graphs | Create interactive visualizations of pathway crosstalk and mechanisms | ~5 min | `notebooks/03_generate_pige_graphs.ipynb` |

Data

Data Sources

PIGE uses the following publicly available datasets:

CTRPv2 - Cancer Therapeutics Response Portal v2 (drug response data)
GDSC - Genomics of Drug Sensitivity in Cancer (external validation)
DepMap - Dependency Map (omics data: mutations, CNAs, expression)
Gene Ontology - Biological process annotations (pathway definitions)
OmniPath - Protein-protein interaction network (pathway crosstalk)

Requirements

System Requirements

Python: 3.10+
PyTorch: 1.12+ (with CUDA support recommended)
CUDA: 11.0+ (GPU recommended, 8GB+ VRAM)
- Training and interpretability will work on CPU but will be significantly slower
Disk Space: ~30GB for data and intermediate files (if training on all 68 drugs)
RAM: 16GB+ needed for interpretability analysis

Contact

For questions, issues, or collaboration inquiries:

Email: cbahl076@uottawa.ca
Issues: Please open an issue on GitHub for bug reports or feature requests
