Our in-house version of AF2, including AF2-Multimer

Kuhlman-Lab/alphafold2

Getting Started

Steps to set up a local installation of our modified version of AlphaFold2.

Prerequisites

Anaconda must be installed to manage the dependencies.

Installation

  1. Clone the repo:

    git clone https://github.com/Kuhlman-Lab/alphafold.git
  2. Download the AlphaFold2 model weights using the script: https://github.com/Kuhlman-Lab/alphafold/blob/main/setup/download_alphafold_params.sh

  3. Set up conda environment:

    conda env create -n alphafold -f setup/af2_env.yml
    conda activate alphafold
    pip3 install --upgrade jax==0.3.25 jaxlib==0.3.25+cuda11.cudnn805 -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
    python3 -m pip install /path/to/alphafold/alphafold/
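
After the environment is set up, it can be useful to confirm that JAX imports and to see which devices it detects. The snippet below is a minimal sketch, not a script from this repository; the function name `describe_jax_backend` is hypothetical, and it only uses `jax.devices()` and `jax.__version__`.

```python
# Sanity check for the conda environment (a sketch, not part of the repo).
def describe_jax_backend() -> str:
    """Return a one-line summary of the JAX install, or a hint if JAX is missing."""
    try:
        import jax
    except ImportError:
        return "jax is not importable; is the 'alphafold' conda environment active?"
    devices = jax.devices()  # accelerators (or CPUs) that JAX can see
    platforms = sorted({d.platform for d in devices})
    return (
        f"jax {jax.__version__}: {len(devices)} device(s), "
        f"platform(s): {', '.join(platforms)}"
    )


if __name__ == "__main__":
    print(describe_jax_backend())
```

If only `cpu` is reported on a GPU machine, the CUDA build of `jaxlib` from the step above may not have been installed correctly.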

Usage

The input file should be a CSV file. Each line in the CSV is treated as a separate query and generates its own prediction. Each line should begin with a comma followed by the sequence, with the chains of a multimer separated by commas.
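
As an illustration, the sketch below writes an input CSV by reading the description above literally: one query per line, a leading comma, then the chains joined by commas. That layout is an assumption, so check it against a known-good input file before relying on it. The helper names, the file name `input.csv`, and the short sequences are all hypothetical.

```python
from pathlib import Path


def format_query(chains):
    """Build one CSV line: a leading comma, then chains joined by commas.

    The leading comma follows a literal reading of the README (an assumption).
    """
    if not chains or any(not chain for chain in chains):
        raise ValueError("each query needs at least one non-empty chain")
    return "," + ",".join(chains)


def write_input_csv(path, queries):
    """Write one line per query; each query is a list of chain sequences."""
    lines = [format_query(chains) for chains in queries]
    Path(path).write_text("\n".join(lines) + "\n")


if __name__ == "__main__":
    # Made-up sequences: a monomer query and a two-chain query.
    write_input_csv("input.csv", [["MKVLAAGIT"], ["MKVLAAGIT", "GSHMTEYKL"]])
```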

Find below the flags that can be provided to the model runner, grouped by functionality:

Input/Output

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --input_dir | str | '' | Directory containing input files. |
| --output_dir | str | '' | Directory to store prediction results. Default is ./prediction_{datetime}. |
| --params_dir | str | '' | Path to parameter weights directory. Default is ../alphafold/data/. |
| --compress_output | flag | False | Compress result dictionaries per query. |
| --dont_write_pdbs | flag | False | Skip writing PDB files. |
| --save_timing | flag | False | Save runtime metrics as a .pkl file. |
| --batch_predictions | flag | False | Enable batched predictions. |
| --no_logging | flag | False | Disable logging output. |
| --design_run | flag | False | Suppress most outputs unless overridden. |
| --initial_guess | str | None | Path to initial guess PDB file. |

Sequence control

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --min_length | int | 1 | Minimum chain length for prediction. |
| --max_pad_size | int | None | Pad short sequences to this length to avoid recompilation. |
| --max_length | int | 2500 | Maximum single-chain length. |
| --max_multimer_length | int | 2500 | Maximum total multimer length. Note: validation limited above 1536 residues. |

MSA generation

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --msa_mode | str | 'MMseqs2-U+E' | MSA mode. Options: MMseqs2-U+E, MMseqs2-U, single_sequence. |
| --custom_msa_path | str | None | Path to custom .a3m files. msa_mode should be single_sequence to use this flag. |
| --insert_msa_gaps | flag | False | Add gaps to custom MSAs based on sequence differences. |
| --update_msa_query_seq | float | 1.0 | Sequence identity threshold for updating custom MSAs. |

Template usage

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --use_templates | flag | False | Use templates from MMseqs2. |
| --max_template_date | str | '2100-01-01' | Max release date for templates (YYYY-MM-DD). |
| --custom_template_path | str | None | Path to custom template PDBs. |
| --rm_template_seq | flag | False | Remove sequences from templates. |
| --dont_mask_template_interchain | flag | False | Enable interchain template usage. |
| --permute_templates | flag | False | Enable symmetric chain template permutation. |

Model control

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --use_ptm | flag | False | Use pTM models for predicted aligned error (PAE). |
| --no_multimer_models | flag | False | Use monomer models even for multimers. |
| --use_multimer_v1 | flag | False | Use original v1 multimer weights. |
| --use_multimer_v2 | flag | False | Use v2 multimer weights (instead of v3). |
| --num_models | int | 5 | Number of models to run (1–5). |
| --num_ensemble | int | 1 | Number of MSA ensemble runs per model (CASP14 used 8). |
| --random_seed | int | None | Random seed for reproducibility. |
| --num_seeds | int | 1 | Number of seeds per model/sequence. |
| --is_training | flag | False | Enable dropout to sample diverse structures. |
| --max_recycle | int | 3 | Max recycling steps for refinement. |
| --recycle_tol | float | 0.0 | Early stop recycling if Ca-RMS change < tol. |

Other options

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --use_amber | flag | False | Run Amber relaxation (improves geometry but increases runtime). |
| --__override_disable | flag | False | Internal override for debugging/disabling features. |

Running and Analysis

To make a prediction, provide the sequences in the input CSV file. Set the options in the af2.flags file. Then, run the commands:

conda activate alphafold
python /path/to/alphafold/run/run_af2.py @af2.flags
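
The flags file can also be generated programmatically. The sketch below is an assumption-laden illustration rather than a tool shipped with the repository: it writes one flag per line with its value on the same line, matching the layout of the examples under "Specific examples". The helper `render_flags` and the directory names are hypothetical; the flag names come from the tables above.

```python
from pathlib import Path


def render_flags(options):
    """Render a dict of options as flags-file text, one flag per line.

    True  -> bare switch (e.g. --use_amber)
    False / None -> omitted
    other -> "--name value"
    """
    lines = []
    for name, value in options.items():
        if value is True:
            lines.append(f"--{name}")
        elif value is False or value is None:
            continue
        else:
            lines.append(f"--{name} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    options = {
        "input_dir": "inputs/",      # hypothetical directory
        "output_dir": "outputs/",    # hypothetical directory
        "msa_mode": "single_sequence",
        "num_models": 5,
        "use_amber": True,
    }
    Path("af2.flags").write_text(render_flags(options))
```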

Running the code produces an output directory containing PDB files with the predicted structures and .pbz2 files with compressed prediction data, such as confidence metrics. To generate pLDDT plots and PAE heatmaps, navigate to a directory containing .pbz2 files and run:

conda activate alphafold
python /path/to/alphafold/run/plots.py
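
To inspect a result file directly, something like the following may work. It assumes that a .pbz2 file is a bz2-compressed pickle, which is a common convention for that extension but is not confirmed here. The function name `load_pbz2` and the file name in the usage line are hypothetical.

```python
import bz2
import pickle


def load_pbz2(path):
    """Load a bz2-compressed pickle (assumed format of the .pbz2 outputs)."""
    with bz2.open(path, "rb") as handle:
        return pickle.load(handle)


if __name__ == "__main__":
    results = load_pbz2("results.pbz2")  # hypothetical file name
    # Print the top-level structure without assuming any particular keys.
    if isinstance(results, dict):
        print(sorted(results.keys()))
    else:
        print(type(results))
```

Unpickling executes code embedded in the file, so only load .pbz2 files that you generated yourself.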

Specific examples

Specify your AF2 options in af2.flags.

To turn off MSA generation:

--msa_mode single_sequence  

To use a precomputed MSA:

--msa_mode single_sequence  
--custom_msa_path /path/to/directory/with/a3m/files/

To use custom templates:

--use_templates  
--custom_template_path templates

Because we have modified the original AF2 code to allow custom template databases, each .pdb file in the templates folder must have a four-character name made up of letters and numbers, mimicking a PDB code (the name does not have to be a real PDB code). Examples: “temp.pdb”, “1uzg.pdb”, “PPPP.pdb”.
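
A quick way to catch badly named template files before a run is sketched below. It encodes the naming rule as "exactly four ASCII letters or digits followed by .pdb", which is one reading of the rule above; the helper names are hypothetical and the check is not part of the repository.

```python
import re
from pathlib import Path

# Four ASCII letters or digits, as in a PDB code (our reading of the rule).
PDB_LIKE = re.compile(r"[A-Za-z0-9]{4}")


def is_valid_template_name(filename):
    """True if the file is named like '1uzg.pdb': a 4-character stem plus .pdb."""
    path = Path(filename)
    return path.suffix == ".pdb" and PDB_LIKE.fullmatch(path.stem) is not None


def find_bad_templates(directory):
    """Return the names of .pdb files in `directory` that break the rule."""
    return sorted(
        p.name
        for p in Path(directory).glob("*.pdb")
        if not is_valid_template_name(p.name)
    )


if __name__ == "__main__":
    for name in find_bad_templates("templates"):
        print(f"rename before running: {name}")
```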
