From 17385726b07483d60557f05510700943f6acd3ba Mon Sep 17 00:00:00 2001
From: amrismil <113579384+amrismil@users.noreply.github.com>
Date: Fri, 23 Jan 2026 18:53:22 +0100
Subject: [PATCH 1/8] Update example_2.md

Solve broken links.
---
 manuals/example_2.md | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/manuals/example_2.md b/manuals/example_2.md
index 7a1031c4..6002c1b8 100644
--- a/manuals/example_2.md
+++ b/manuals/example_2.md
@@ -6,7 +6,7 @@
 ## 1st step: compute multiple sequence alignment (MSA) and template features (run on CPUs)
-Firstly, download sequences of L (Uniprot: [O09705](https://www.uniprot.org/uniprotkb/O09705/entry)) and Z(uniprot:[O73557](https://www.uniprot.org/uniprotkb/O73557/entry)) proteins. The result is [`example_2_sequences.fasta`](../example_data/example_2_sequences.fasta)
+Firstly, download sequences of L (Uniprot: [O09705](https://www.uniprot.org/uniprotkb/O09705/entry)) and Z(uniprot:[O73557](https://www.uniprot.org/uniprotkb/O73557/entry)) proteins. The result is [`example_2_sequences.fasta`](./example_data/example_2_sequences.fasta)
 Now run:
@@ -29,15 +29,15 @@ taken as the description of the protein and **please be aware** that any specia
 ## 1.1 Explanation about the parameters
-See [Example 1](https://github.com/KosinskiLab/AlphaPulldown/blob/main/example_1.md#11-explanation-about-the-parameters)
+See [Example 1](./example_1.md#explanation-about-the-parameters)
 ## 2nd step: Predict structures (run on GPU)
 #### **Task 1**
-We want to predict the structure of full-length L protein together with Z protein. However, as the L protein is very long, many users would not have a GPU card with sufficient memory. Moreover, when attempting modeling the full L-Z, the resulting model does not match the known cryo-EM structure. In [Example 1](https://github.com/KosinskiLab/AlphaPulldown/blob/main/example_1.md), we showed how to use AlphaPulldown to find the interaction site by screening fragments using the ```pullldown``` mode. Here, to demonstrate the ```custom``` mode, we will assume the we know the interaction site and model the fragment using this mode, as demonstrated in the figure below ![custom_demo_2.png](./custom_demo_2.png):
+We want to predict the structure of full-length L protein together with Z protein. However, as the L protein is very long, many users would not have a GPU card with sufficient memory. Moreover, when attempting modeling the full L-Z, the resulting model does not match the known cryo-EM structure. In [Example 1](./example_1.md), we showed how to use AlphaPulldown to find the interaction site by screening fragments using the ```pullldown``` mode. Here, to demonstrate the ```custom``` mode, we will assume the we know the interaction site and model the fragment using this mode, as demonstrated in the figure below ![custom_demo_2.png](./custom_demo_2.png):
-Different proteins are seperated by ```;```. If a particular region is wanted from one protein, simply add ```,``` after that protein and followed by the region. Region comes in the format of ```number1-number2```. An example input file is: [`custom_mode.txt`](../example_data/custom_mode.txt)
+Different proteins are seperated by ```;```. If a particular region is wanted from one protein, simply add ```,``` after that protein and followed by the region. Region comes in the format of ```number1-number2```. An example input file is: [`custom_mode.txt`](./example_data/custom_mode.txt)
 The command line interface for using custom mode will then become:
@@ -125,7 +125,7 @@ or
 ```
 #### **Task 2**
-This taks is to determine the oligomer state of SSB protein [(Uniprot:P0AGE0)](https://www.uniprot.org/uniprotkb/P0AGE0/entry#function) by modelling its monomeric, homodimeric, homotrimeric, and homoquatrameric structures. Thus, homo-oligomer mode is needed. An oligomer state file will tell the programme the number of units. An example is: [`example_oligomer_state_file.txt`](../example_data/example_oligomer_state_file.txt)
+This taks is to determine the oligomer state of SSB protein [(Uniprot:P0AGE0)](https://www.uniprot.org/uniprotkb/P0AGE0/entry#function) by modelling its monomeric, homodimeric, homotrimeric, and homoquatrameric structures. Thus, homo-oligomer mode is needed. An oligomer state file will tell the programme the number of units. An example is: [`example_oligomer_state_file.txt`](./example_data/example_oligomer_state_file.txt)
 In the file, oligomeric states of the corresponding proteins should be separated by ```,``` e.g. ```protein_A,3```means a homotrimer for protein_A ![homo-oligomer_demo](./homooligomer_demo.png)
@@ -268,6 +268,9 @@ jupyter-lab output.ipynb
 We have also provided a singularity image called ```alpha-analysis.sif```to generate a CSV table with structural properties and scores. Firstly, download the singularity image:
+> [!CAUTION]
+> The signluarity images linked below are currently down.
+
 ⚠️ If your results are from AlphaPulldown prior version 1.0.0: [alpha-analysis_jax_0.3.sif](https://www.embl-hamburg.de/AlphaPulldown/downloads/alpha-analysis_jax_0.3.sif).
 ⚠️ If your results are from AlphaPulldown with version >=1.0.0: [alpha-analysis_jax_0.4.sif](https://www.embl-hamburg.de/AlphaPulldown/downloads/alpha-analysis_jax_0.4.sif).
@@ -293,7 +296,7 @@ By default, you will have a csv file named ```predictions_with_good_interpae.csv
 ## Appendix: Instructions on running in `all_vs_all` mode
-As the name suggest, all_vs_all means predict all possible combinations within a single input file. The input can be either full-length proteins or regions of a protein, as illustrated in the [`example_all_vs_all_list.txt`](../example_data/example_all_vs_all_list.txt) and the figure below:
+As the name suggest, all_vs_all means predict all possible combinations within a single input file. The input can be either full-length proteins or regions of a protein, as illustrated in the [`example_all_vs_all_list.txt`](./example_data/example_all_vs_all_list.txt) and the figure below:
 ![plot](./all_vs_all_demo.png)
 The corresponding command is:

From 49a6a7a2277e020707aae054e35dc20de5ce4c9f Mon Sep 17 00:00:00 2001
From: amrismil <113579384+amrismil@users.noreply.github.com>
Date: Mon, 2 Feb 2026 20:28:58 +0100
Subject: [PATCH 2/8] Update example_2.md

Singularity images are up again.
---
 manuals/example_2.md | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/manuals/example_2.md b/manuals/example_2.md
index 6002c1b8..0af58f2a 100644
--- a/manuals/example_2.md
+++ b/manuals/example_2.md
@@ -268,9 +268,6 @@ jupyter-lab output.ipynb
 We have also provided a singularity image called ```alpha-analysis.sif```to generate a CSV table with structural properties and scores. Firstly, download the singularity image:
-> [!CAUTION]
-> The signluarity images linked below are currently down.
-
 ⚠️ If your results are from AlphaPulldown prior version 1.0.0: [alpha-analysis_jax_0.3.sif](https://www.embl-hamburg.de/AlphaPulldown/downloads/alpha-analysis_jax_0.3.sif).
 ⚠️ If your results are from AlphaPulldown with version >=1.0.0: [alpha-analysis_jax_0.4.sif](https://www.embl-hamburg.de/AlphaPulldown/downloads/alpha-analysis_jax_0.4.sif).
From 60fd44eba8274ea61503a05275235b58ccfad18e Mon Sep 17 00:00:00 2001
From: amrismil <113579384+amrismil@users.noreply.github.com>
Date: Mon, 2 Feb 2026 20:31:33 +0100
Subject: [PATCH 3/8] Update example_1.md

---
 manuals/example_1.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/manuals/example_1.md b/manuals/example_1.md
index 03462392..76f6817f 100644
--- a/manuals/example_1.md
+++ b/manuals/example_1.md
@@ -1,4 +1,4 @@
-# Example1
+# Example 1
 # Aim: Find proteins involving human translation pathway that might interact with eIF4G2

From aa0c3733bfed5532b5e311f139ebcee684c45eed Mon Sep 17 00:00:00 2001
From: amrismil <113579384+amrismil@users.noreply.github.com>
Date: Mon, 2 Feb 2026 20:31:47 +0100
Subject: [PATCH 4/8] Update example_2.md

---
 manuals/example_2.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/manuals/example_2.md b/manuals/example_2.md
index 0af58f2a..c8944c9b 100644
--- a/manuals/example_2.md
+++ b/manuals/example_2.md
@@ -1,6 +1,6 @@
 # AlphaPulldown manual:
-# Example2
+# Example 2
 # Aims: Model interactions between Lassa virus L protein and Z matrix protein; Determine the oligomer state of _E.coli_ Single-stranded DNA-binding protein (SSB)

From 58293a9508a5a793946ce4b21aa176a21cb73a89 Mon Sep 17 00:00:00 2001
From: amrismil <113579384+amrismil@users.noreply.github.com>
Date: Mon, 2 Feb 2026 20:32:00 +0100
Subject: [PATCH 5/8] Update example_3.md

---
 manuals/example_3.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/manuals/example_3.md b/manuals/example_3.md
index 7a997689..b15aa301 100644
--- a/manuals/example_3.md
+++ b/manuals/example_3.md
@@ -1,6 +1,6 @@
 # AlphaPulldown manual:
-# Example3
+# Example 3
 # Aims: Model activation of phosphoinositide 3-kinase by the influenza A virus NS1 protein (PDB: 3L4Q)
 ## 1st step: compute multiple sequence alignment (MSA) and template features using provided pbd templates (run on CPU)

From 048e086b267f2913f5faab0f9d02685336f1e18d Mon Sep 17 00:00:00 2001
From: amrismil <113579384+amrismil@users.noreply.github.com>
Date: Mon, 2 Feb 2026 20:40:19 +0100
Subject: [PATCH 6/8] Update example_1.md

Direct references to the correct (current) directory.
---
 manuals/example_1.md | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/manuals/example_1.md b/manuals/example_1.md
index 76f6817f..3020a628 100644
--- a/manuals/example_1.md
+++ b/manuals/example_1.md
@@ -1,10 +1,12 @@
+# AlphaPulldown manual:
+
 # Example 1
 # Aim: Find proteins involving human translation pathway that might interact with eIF4G2
 ## 1st step: compute multiple sequence alignment (MSA) and template features (run on CPUs)
-For the purpose of this manual, the expected file is already provided here: [`example_1_sequences.fasta`](../example_data/example_1_sequences.fasta). If you want to run a smaller test, you can use [`example_1_sequences_shorter.fasta`](../example_data/example_1_sequences_shorter.fasta) instead.
+For the purpose of this manual, the expected file is already provided here: [`example_1_sequences.fasta`](./example_data/example_1_sequences.fasta). If you want to run a smaller test, you can use [`example_1_sequences_shorter.fasta`](./example_data/example_1_sequences_shorter.fasta) instead.
 :memo: *The example file was generated by downloading all 294 proteins that belong to human translation pathway from: [Reactome](https://reactome.org/PathwayBrowser/#/R-HSA-72766&DTAB=MT). eIF4G2 sequence was downloaded from (Uniprot:[P78344](https://www.uniprot.org/uniprot/P78344)).*
@@ -32,7 +34,7 @@ MMSeqs2 and ColabFold allow for much quicker calculation of MSAs than the defaul
 ### Expected output
-```create_individual_features.py``` will compute necessary features each protein in [`example_1_sequences.fasta`](../example_data/example_1_sequences.fasta) and store them in the ```output_dir```. Please be aware that everything after ```>``` will be
+```create_individual_features.py``` will compute necessary features each protein in [`example_1_sequences.fasta`](./example_data/example_1_sequences.fasta) and store them in the ```output_dir```. Please be aware that everything after ```>``` will be
 taken as the description of the protein and **please be aware** that any special symbol, such as ```| : ; #```, after ```>``` will be replaced with ```_```. The name of the pickles will be the same as the descriptions of the sequences in fasta files (e.g. ">protein_A" in the fasta file will yield "protein_A.pkl")
@@ -160,9 +162,9 @@ different number if you wish to run an array of jobs in parallel then the progra
 #### **Run in pulldown mode**
-Inspired by pull-down assays, one can specify one or more proteins as "bait" and another list of proteins as "candidates". Then the programme will use AlphafoldMultimerV2 to predict interactions between baits (as in [`example_data/baits.txt`](../example_data/baits.txt)) and candidates (as in [`example_data/candidates.txt`](../example_data/candidates.txt)).
+Inspired by pull-down assays, one can specify one or more proteins as "bait" and another list of proteins as "candidates". Then the programme will use AlphafoldMultimerV2 to predict interactions between baits (as in [`example_data/baits.txt`](./example_data/baits.txt)) and candidates (as in [`example_data/candidates.txt`](./example_data/candidates.txt)).
-**Note** If you want to save time and run fewer jobs, you can use [`example_data/candidates_shorter.txt`](../example_data/candidates_shorter.txt) instead of [`example_data/candidates.txt`](../example_data/candidates.txt)
+**Note** If you want to save time and run fewer jobs, you can use [`example_data/candidates_shorter.txt`](./example_data/candidates_shorter.txt) instead of [`example_data/candidates.txt`](./example_data/candidates.txt)
 In this example, we selected pulldown mode and made eIF4G2 (Uniprot:[P78344](https://www.uniprot.org/uniprot/P78344)) as a bait while the other 294 proteins as candidates. Thus, in total, there will be 1 * 294 = 294 predictions.
@@ -184,7 +186,7 @@ run_multimer_jobs.py --mode=pulldown \
 --remove_result_pickles=True
 ```
-:memo: To reproduce the results of Lassa virus Z protein vs L protein fragments written in our paper, simply use [`baits_Z_protein.txt`](../example_data/baits_Z_protein.txt) and [`L_protein_fragments.txt`](../example_data/L_protein_fragments.txt) as the ```--protein_lists```inputs. This example shows also how to run the interaction screen for fragments of proteins, keeping the original full-length residue numbering in the output!
+:memo: To reproduce the results of Lassa virus Z protein vs L protein fragments written in our paper, simply use [`baits_Z_protein.txt`](./example_data/baits_Z_protein.txt) and [`L_protein_fragments.txt`](./example_data/L_protein_fragments.txt) as the ```--protein_lists```inputs. This example shows also how to run the interaction screen for fragments of proteins, keeping the original full-length residue numbering in the output!
 ✨ **New Features** Now AlphaPulldown supports integrative structural modelling if the user has experimental cross-link data. Please refer to [this manual](run_with_AlphaLink2.md) if you'd like to model your protein complexes with cross-link MS data as extra input.
@@ -346,7 +348,7 @@ By default, you will have a csv file named ```predictions_with_good_interpae.csv
 ## Appendix: Instructions on running in `all_vs_all` mode
-As the name suggest, all_vs_all means predict all possible pairwise comparisons within a single input file. The input can be either full-length proteins or regions of a protein, as illustrated in the [`example_all_vs_all_list.txt`](../example_data/example_all_vs_all_list.txt) and the figure below:
+As the name suggest, all_vs_all means predict all possible pairwise comparisons within a single input file. The input can be either full-length proteins or regions of a protein, as illustrated in the [`example_all_vs_all_list.txt`](./example_data/example_all_vs_all_list.txt) and the figure below:
 ![plot](./all_vs_all_demo.png)
 The corresponding command is:

From b363a9120ccd72d9e1366c15d6f7fea0256e1200 Mon Sep 17 00:00:00 2001
From: amrismil <113579384+amrismil@users.noreply.github.com>
Date: Mon, 2 Feb 2026 20:44:31 +0100
Subject: [PATCH 7/8] Update example_3.md

---
 manuals/example_3.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/manuals/example_3.md b/manuals/example_3.md
index b15aa301..83a5988a 100644
--- a/manuals/example_3.md
+++ b/manuals/example_3.md
@@ -43,14 +43,14 @@ It is also possible to combine all your fasta files into a single fasta file.
 ------------------------
-## 1.1 Explanation about the parameters
+## Explanation about the parameters
-See [Example 1](https://github.com/KosinskiLab/AlphaPulldown/blob/main/manuals/example_1.md#11-explanation-about-the-parameters)
+See [Example 1](./example_1.md#explanation-about-the-parameters)
 ## 2nd step: Predict structures (run on GPU)
 #### **Task 1**
-To predict structure we can use the usual ```run_multimer_jobs.py``` in custom mode (See [Example 2](https://github.com/KosinskiLab/AlphaPulldown/blob/main/manuals/example_2.md#2nd-step-predict-structures-run-on-gpu)) with an extra ```--multimeric_mode=True``` flag, that deactivates per-chain multimeric binary mask.
+To predict structure we can use the usual ```run_multimer_jobs.py``` in custom mode (See [Example 2](./example_2.md#2nd-step-predict-structures-run-on-gpu)) with an extra ```--multimeric_mode=True``` flag, that deactivates per-chain multimeric binary mask.
 The user can also specify the depth of the MSA that is taken for modelling to increase the influence of the template on the predicted model. This can be done by using the flag ```--msa_depth```. It's always recommended running with all 5 AlphaFold Multimer settings but if you want to save time, you could specify the model name(s) you want to run, use the following flag: ```--model_names=model_1_multimer_v3,model_2_multimer_v3``` (for models 1 and 2). If you do not know the exact MSA depth, there is another flag ```--gradient_msa_depth=True``` for exploring the desired MSA depth. This flag generates a set of logarithmically distributed points (denser at lower end) with the number of points equal to the number of predictions. The MSA depth (```num_msa```) starts from 16 and ends with the maximum value taken from the model config file. The ```extra_num_msa``` is always calculated as ```4*num_msa```.
 The command line interface for using custom mode will then become:
@@ -187,4 +187,4 @@ or
 --models_to_relax=all
 ```
-After the successful run one can evaluate and visualise the results in a usual manner (see e.g. [Example 2](https://github.com/KosinskiLab/AlphaPulldown/blob/main/manuals/example_2.md#2nd-step-predict-structures-run-on-gpu))
+After the successful run one can evaluate and visualise the results in a usual manner (see e.g. [Example 2](./example_2.md#2nd-step-predict-structures-run-on-gpu))

From b9eaf53f4fe84e9efaa13d98b0d3bbd3aa697114 Mon Sep 17 00:00:00 2001
From: amrismil <113579384+amrismil@users.noreply.github.com>
Date: Mon, 2 Feb 2026 20:44:53 +0100
Subject: [PATCH 8/8] Update example_2.md

---
 manuals/example_2.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/manuals/example_2.md b/manuals/example_2.md
index c8944c9b..bf49efd2 100644
--- a/manuals/example_2.md
+++ b/manuals/example_2.md
@@ -27,7 +27,7 @@ taken as the description of the protein and **please be aware** that any specia
 ------------------------
-## 1.1 Explanation about the parameters
+## Explanation about the parameters
 See [Example 1](./example_1.md#explanation-about-the-parameters)