It is recommended to run the scripts in an Ubuntu LTS environment; we used Ubuntu Server LTS. Bash is required to execute the scripts in this project.
Prokka and PEPPAN are set up through Conda, so please install Conda first. Afterwards, open a terminal and create a new conda environment:
conda create --name <name>
Follow the prompts and select the default parameters.
Next, enter the conda environment:
conda activate <name>
Then install Prokka and the PEPPAN dependencies:
conda install -c conda-forge -c bioconda -c defaults prokka
conda config --add channels defaults
conda config --add channels conda-forge
conda config --add channels bioconda
conda install mmseqs2
conda install blast
conda install diamond
conda install rapidnj
conda install fasttree
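Once the installs finish, it can help to confirm that each tool is reachable from the active environment. The following is a minimal sketch and not part of the project scripts: the helper name missing_tools is hypothetical, and the executable names (mmseqs for the mmseqs2 package, blastn for blast, FastTree for fasttree) are assumptions that may differ on your system.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every listed command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Executable names are assumptions; adjust them to match your installation.
missing=$(missing_tools prokka mmseqs blastn diamond rapidnj FastTree)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Not found on PATH:" $missing
else
    echo "All tools found."
fi
```

If any name is reported, check that the conda environment is activated before re-running the install commands.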
For Roary and FastTree, we used the Ubuntu packages available through apt:
# for roary
sudo apt-get install bedtools cd-hit ncbi-blast+ mcl parallel cpanminus prank mafft fasttree
sudo cpanm -f Bio::Roary
# for fasttree
sudo apt install fasttree
Lastly, install R, which is used to download the genome files:
sudo apt install r-base
Download the FASTA files for the bacterial and plasmid genomes by running the s0_ script with Rscript:
Rscript ./code/s0_download_Fasta.R
Next, run Prokka on the downloaded FASTA files by executing the s1_ bash script with either the bacterial or the plasmid argument (for the bacterial and plasmid genomes, respectively):
bash ./code/s1_prokka.sh plasmid
bash ./code/s1_prokka.sh bacterial
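The two Prokka runs can also be driven from a single loop that stops at the first failure. This is only a sketch: the function name run_for_each is hypothetical, and it assumes the script takes the dataset name as its only argument, as in the commands above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a script once per dataset name,
# stopping as soon as one run exits with a non-zero status.
run_for_each() {
    script=$1
    shift
    for dataset in "$@"; do
        echo "Running $script $dataset"
        if ! bash "$script" "$dataset"; then
            echo "Run failed for dataset: $dataset" >&2
            return 1
        fi
    done
}
```

With the function defined in the current shell, both annotation runs become one call:
run_for_each ./code/s1_prokka.sh plasmid bacterial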
For the bacterial genomes, execute the s2_ script:
./code/s2_roary.sh bacterial
For the plasmid genomes, execute the s3_ script:
./code/s3_peppan.sh plasmid
For the bacterial genomes, the resulting Newick file is available at ./results/bacterial/fasttree_res/tree.newick
For the plasmid genomes, the resulting Newick file is available at ./results/plasmid/peppan_res/PEPPAN.PEPPAN.gene_content.nwk
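To confirm that both pipelines produced a tree, a small check for non-empty output files may be useful. This is a sketch rather than part of the project: check_output is a hypothetical helper, and it only tests that the file exists and is non-empty, not that it contains valid Newick.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the given file exists and is non-empty.
check_output() {
    if [ -s "$1" ]; then
        echo "OK: $1"
    else
        echo "MISSING or empty: $1"
        return 1
    fi
}
```

With the function defined in the current shell, run it from the project root on the two paths given above:
check_output ./results/bacterial/fasttree_res/tree.newick
check_output ./results/plasmid/peppan_res/PEPPAN.PEPPAN.gene_content.nwk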