Create files to display the splicing data in the genome browser.
There are three inputs: the GFF annotation, a list of samples to exclude, and the BAMs. The outputs are bigWig (for exonic reads) and bed (for splice junctions) files that can be loaded into a JBrowse2 browser.
The input sequencing data must first be processed with the bulk_align pipeline to yield the aligned BAMs.
The "master script" is src/generate_browser_data.sh, which:
- in R/sj_to_bed, takes the SJ.out.tab files generated by STAR, generates bigBed files with the counts
- if needed, merge the BAMs (not needed in bsn12), remove multimappers (bsn9 uses stringtie_quantif/src/prep_alignments.sh, need update)
- in R/bam_to_bigwig, takes the combined BAMs, generates bigWig files with exonic counts
- exports everything in a xxx_browser.tar.gz archive, to be transferred to the genome browser (see splicing_website)
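To illustrate the first step, here is a minimal sketch of the coordinate conversion from STAR's SJ.out.tab to BED. It is not the code in R/sj_to_bed: the function name `sj_to_bed6`, the `SJ_<n>` feature names, and the choice to put the unique read count in the BED score column are all assumptions made for this example.

```shell
# Hypothetical helper: convert STAR SJ.out.tab records (stdin) to BED6 (stdout).
# SJ.out.tab columns: chrom, intron start (1-based), intron end (1-based),
# strand (0 = undefined, 1 = +, 2 = -), intron motif, annotated flag,
# uniquely mapping reads, multimapping reads, maximum overhang.
sj_to_bed6() {
  awk 'BEGIN { FS = OFS = "\t" }
  {
    strand = ($4 == 1) ? "+" : (($4 == 2) ? "-" : ".")
    # BED is 0-based and half-open: shift the start by one, keep the end.
    # Assumption: the unique read count ($7) is used as the score.
    print $1, $2 - 1, $3, "SJ_" NR, $7, strand
  }'
}

# Two made-up junctions, for illustration only
printf 'I\t1001\t1500\t1\t1\t1\t12\t0\t40\nII\t200\t350\t2\t1\t0\t3\t1\t25\n' | sj_to_bed6
```

Note that standard BED scores are expected to stay within 0 to 1000, so a real conversion may need to store the counts differently.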
Beforehand, run the bulk_align pipeline.
In older versions (with bsn9), we used the "augmented" annotation created in stringtie_quantif; now (with bsn12) we use the Wormbase annotation directly.
If needed, update the data/outliers_to_ignore.txt list based on QC.
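As a rough sketch of how an exclusion list like this can be applied, the snippet below drops listed samples from a list of sample IDs. It assumes one sample ID per line in both the list and the input, which may not match the actual format of data/outliers_to_ignore.txt; the function name `keep_samples` and the sample IDs are made up.

```shell
# Hypothetical helper: print the sample IDs read from stdin that are NOT
# listed in the exclusion file given as first argument.
# Assumption: one sample ID per line, in both inputs.
keep_samples() {
  exclusion_file=$1
  # -F: fixed strings, -x: match whole lines only, -v: invert the match
  grep -v -x -F -f "$exclusion_file"
}

# Made-up sample IDs, for illustration only
tmp_list=$(mktemp)
printf 'sample_B\nsample_D\n' > "$tmp_list"
printf 'sample_A\nsample_B\nsample_C\nsample_D\n' | keep_samples "$tmp_list"
rm -f "$tmp_list"
```

Matching whole lines (`-x`) avoids dropping a sample whose ID merely contains an excluded ID as a substring.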
Edit the parameters at the beginning of src/generate_browser_data.sh, then run it. If everything goes well, the slurm log ends with a message such as:
Send to vps with:
scp /home/aw853/ycga_project/splicing_browser/data/outs/231121_browser.tar.gz cengen-vps:/var/www/public_data/splicing
Use this command to upload the archive, then see the splicing_website repo for next steps.
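Before uploading, it can be worth checking that the archive contains both kinds of tracks. The sketch below is only a suggestion: the function name `check_browser_archive` and the file extensions it looks for (.bw/.bigwig and .bb/.bigbed/.bed) are assumptions, so adjust them to the names the pipeline actually produces.

```shell
# Hypothetical helper: succeed only if the archive lists at least one
# coverage track and at least one junction track.
# Assumption: tracks are recognizable by their file extension.
check_browser_archive() {
  archive=$1
  listing=$(tar -tzf "$archive") || return 1
  printf '%s\n' "$listing" | grep -q -i -E '\.(bw|bigwig)$' \
    || { echo "no bigWig files in $archive" >&2; return 1; }
  printf '%s\n' "$listing" | grep -q -i -E '\.(bb|bigbed|bed)$' \
    || { echo "no junction files in $archive" >&2; return 1; }
}

# Demo on a throwaway archive with empty placeholder files
demo_dir=$(mktemp -d)
mkdir "$demo_dir/browser"
: > "$demo_dir/browser/sample_A.bw"
: > "$demo_dir/browser/sample_A.bb"
tar -czf "$demo_dir/demo_browser.tar.gz" -C "$demo_dir" browser
check_browser_archive "$demo_dir/demo_browser.tar.gz" && echo "archive looks complete"
rm -rf "$demo_dir"
```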