Usage Examples

Practical examples for using SQANTI-browser.

🚀 Quick Start

Most Basic Command

python -m sqanti_browser \
    --gtf corrected.gtf \
    --classification classification.txt \
    --output my_hub \
    --genome hg38

That's it! This creates a track hub in my_hub/ ready to upload.

Recommended Command (With Tables)

python -m sqanti_browser \
    --gtf corrected.gtf \
    --classification classification.txt \
    --output my_hub \
    --genome hg38 \
    --tables

Adds interactive HTML reports for exploring your data offline.

📦 After Generation → Upload

See Hosting Guide for detailed upload instructions:

GitHub (easiest)
Institutional server
Cloud storage (AWS S3, Google Cloud)

Quick summary: Upload files → Get public URL to hub.txt → Load in UCSC (My Data → Track Hubs)
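
As an alternative to pasting the URL into My Data → Track Hubs, UCSC accepts a hub URL directly in a browser link through the `hubUrl` query parameter. The sketch below builds such a link; the helper name `ucsc_hub_link` and the `example.org` address are placeholders for illustration, and it assumes a UCSC-hosted assembly such as hg38.

```python
from urllib.parse import urlencode


def ucsc_hub_link(hub_url: str, genome: str = "hg38") -> str:
    """Return a UCSC Genome Browser link that attaches the hub at hub_url."""
    # db selects the assembly; hubUrl points at the public hub.txt file.
    query = urlencode({"db": genome, "hubUrl": hub_url})
    return f"https://genome.ucsc.edu/cgi-bin/hgTracks?{query}"


# Placeholder URL: replace with the public location of your uploaded hub.txt.
print(ucsc_hub_link("https://example.org/my_hub/hub.txt"))
```

The link can be shared with collaborators so they open the browser with the hub already attached.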

🎯 Common Use Cases

1. Add All Validation Tracks (Recommended!)

python -m sqanti_browser \
    --gtf corrected.gtf \
    --classification classification.txt \
    --genome hg38 \
    --output my_hub \
    --refGTF reference.gtf \
    --star-sj SJ.out.tab \
    --CAGE-peak CAGE_peaks.bed \
    --polyA-peak polyA_peaks.bed \
    --tables

What this does:

Shows your transcripts alongside reference annotation
Adds short-read junction support
Validates transcription start sites (CAGE)
Validates termination sites (polyA)
Creates interactive HTML tables

When to use: Publication-quality visualization with full validation
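
When not every validation file is available for a sample, the command can be assembled programmatically so that only existing inputs are passed. This is a minimal sketch, not part of SQANTI-browser: the helper `build_command` and its `exists` parameter are hypothetical, and the file names are the ones used in the command above.

```python
import os
from typing import Callable

# Optional validation inputs, keyed by the flags shown in the command above.
OPTIONAL_INPUTS = {
    "--refGTF": "reference.gtf",
    "--star-sj": "SJ.out.tab",
    "--CAGE-peak": "CAGE_peaks.bed",
    "--polyA-peak": "polyA_peaks.bed",
}


def build_command(
    genome: str = "hg38",
    output: str = "my_hub",
    exists: Callable[[str], bool] = os.path.isfile,
) -> list[str]:
    """Build the argument list, adding each optional track only if its file exists."""
    cmd = [
        "python", "-m", "sqanti_browser",
        "--gtf", "corrected.gtf",
        "--classification", "classification.txt",
        "--genome", genome,
        "--output", output,
    ]
    for flag, path in OPTIONAL_INPUTS.items():
        if exists(path):
            cmd += [flag, path]
    cmd.append("--tables")
    return cmd


# Print the command that would be run from the current directory.
print(" ".join(build_command()))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` to launch the run.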

2. Custom Genome (Non-Model Organism)

python -m sqanti_browser \
    --gtf corrected.gtf \
    --classification classification.txt \
    --genome my_species_v1 \
    --twobit genome.2bit \
    --output my_hub

When to use: Working with species not in UCSC (e.g., plants, non-model animals)

Note: --twobit automatically extracts chromosome sizes. See Custom Genomes for details.

3. Large Dataset (Faster Processing)

python -m sqanti_browser \
    --gtf corrected.gtf \
    --classification classification.txt \
    --genome hg38 \
    --output my_hub \
    --no-category-tracks

When to use: Datasets with more than 50,000 transcripts where processing time matters

Trade-off: Only creates main track (all transcripts), skips individual category tracks
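
To decide whether the flag is worth using, the number of transcripts can be counted from the classification file first. The sketch below assumes the usual SQANTI3 layout of one header line followed by one row per isoform; the helper names and the constant are illustrative, with the threshold taken from the guidance above.

```python
from typing import Iterable

LARGE_DATASET_THRESHOLD = 50_000  # the ">50K transcripts" guidance above


def count_isoforms(lines: Iterable[str]) -> int:
    """Count data rows, treating the first non-blank line as the header."""
    non_blank = sum(1 for line in lines if line.strip())
    return max(non_blank - 1, 0)


def extra_flags(n_isoforms: int) -> list[str]:
    """Suggest --no-category-tracks only above the threshold."""
    if n_isoforms > LARGE_DATASET_THRESHOLD:
        return ["--no-category-tracks"]
    return []


# Usage with a real file:
#   with open("classification.txt") as handle:
#       print(extra_flags(count_isoforms(handle)))
```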

4. Subset of Category Tracks

python -m sqanti_browser \
    --gtf corrected.gtf \
    --classification classification.txt \
    --genome hg38 \
    --output my_hub \
    --category-tracks FSM,ISM,NIC

When to use: Want only FSM, ISM, and NIC tracks (not all nine categories)

Valid abbreviations: FSM, ISM, NIC, NNC, antisense, genic_intron, genic_genomic (or genic), intergenic, fusion
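
A typo in the category list is easy to miss, so it can help to check the value before launching a run. The following sketch validates a comma-separated string against the abbreviations listed above; the function name is hypothetical, and treating the names as case-sensitive is an assumption rather than documented behavior.

```python
# Abbreviations listed above ("genic" is the short form of "genic_genomic").
VALID_CATEGORIES = {
    "FSM", "ISM", "NIC", "NNC", "antisense",
    "genic_intron", "genic_genomic", "genic", "intergenic", "fusion",
}


def check_category_tracks(value: str) -> list[str]:
    """Split a --category-tracks value and reject unknown abbreviations."""
    names = [name.strip() for name in value.split(",") if name.strip()]
    unknown = [name for name in names if name not in VALID_CATEGORIES]
    if unknown:
        raise ValueError(f"Unknown category abbreviation(s): {', '.join(unknown)}")
    return names


print(check_category_tracks("FSM,ISM,NIC"))
```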

5. Sort by Expression (Show Top Expressed First)

python -m sqanti_browser \
    --gtf corrected.gtf \
    --classification classification.txt \
    --genome hg38 \
    --output my_hub \
    --sort-by FL

When to use: Want to prioritize viewing highly expressed isoforms

Available sort options: length, FL, diff_to_TSS, diff_to_TTS, dist_to_CAGE_peak, dist_to_polyA_site

See Isoform Ordering for details.
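
To preview which isoforms would appear first when sorting by FL, the classification file can be ranked directly. This is only an illustration of the idea, not the tool's own ordering code; it assumes the standard SQANTI3 column names `isoform` and `FL`, and it skips rows where FL is missing or not numeric (for example `NA`).

```python
import csv
from typing import Iterable


def top_by_fl(lines: Iterable[str], n: int = 5) -> list[tuple[str, float]]:
    """Return the n isoforms with the highest FL count, highest first."""
    ranked = []
    for row in csv.DictReader(lines, delimiter="\t"):
        try:
            ranked.append((row["isoform"], float(row["FL"])))
        except (KeyError, TypeError, ValueError):
            continue  # column absent, or value such as "NA"
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked[:n]


# Usage with a real file:
#   with open("classification.txt", newline="") as handle:
#       for isoform, fl in top_by_fl(handle):
#           print(isoform, fl)
```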

🔬 Validation Tracks Deep Dive

STAR Junctions (Short-Read Support)

--star-sj path/to/SJ.out.tab

What it does: Shows which junctions have short-read support from STAR alignment

When to use: You have paired short+long read data
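
Before adding the track, it can be useful to see how many junctions in the file carry read support at all. The sketch below reads the standard nine-column STAR `SJ.out.tab` layout (column 4 is the strand code, column 7 the number of uniquely mapping reads); the function name and the `min_unique` cutoff are illustrative choices, not SQANTI-browser options.

```python
from typing import Iterable

# STAR encodes strand as 0 (undefined), 1 (+) or 2 (-).
STRAND_SYMBOL = {"0": ".", "1": "+", "2": "-"}


def supported_junctions(lines: Iterable[str], min_unique: int = 1) -> set:
    """Return (chrom, start, end, strand) for junctions with enough unique reads."""
    junctions = set()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # skip blank or malformed lines
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        unique_reads = int(fields[6])
        if unique_reads >= min_unique:
            junctions.add((chrom, start, end, STRAND_SYMBOL.get(fields[3], ".")))
    return junctions


# Usage with a real file:
#   with open("SJ.out.tab") as handle:
#       print(len(supported_junctions(handle)), "junctions with unique-read support")
```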

CAGE Peaks (TSS Validation)

--CAGE-peak path/to/CAGE_peaks.bed

What it does: Validates transcription start sites (5' ends)

Where to get: See SQANTI3 CAGE documentation

When to use: Assess 5' end accuracy

PolyA Peaks (TTS Validation)

--polyA-peak path/to/polyA_peaks.bed

What it does: Validates transcription termination sites (3' ends)

Where to get: See SQANTI3 polyA documentation

When to use: Assess 3' end accuracy
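
A common pitfall with CAGE and polyA peak files is a chromosome naming mismatch with the transcript GTF (for example `22` versus `chr22`), which would leave the peak tracks without overlapping features. The check below is a generic sketch, not something SQANTI-browser is documented to perform; the function names are hypothetical.

```python
from typing import Iterable


def chromosomes(lines: Iterable[str]) -> set:
    """Collect the first column of a GTF or BED file, skipping header lines."""
    names = set()
    for line in lines:
        if not line.strip() or line.startswith(("#", "track ", "browser ")):
            continue
        names.add(line.split()[0])
    return names


def shared_chromosomes(gtf_lines: Iterable[str], bed_lines: Iterable[str]) -> set:
    """Chromosome names present in both files; empty means a naming mismatch."""
    return chromosomes(gtf_lines) & chromosomes(bed_lines)


# Usage with real files:
#   with open("corrected.gtf") as gtf, open("CAGE_peaks.bed") as bed:
#       if not shared_chromosomes(gtf, bed):
#           print("No shared chromosome names: check 'chr' prefixes")
```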

Reference Annotation

--refGTF path/to/reference.gtf

What it does: Shows reference annotation alongside your transcripts for comparison

Where to get:

Same file you used with SQANTI3
GENCODE (human/mouse)
Ensembl (all species)
SQANTI3 example (chr22 test data)

When to use: Almost always! Shows context for novel isoforms

🛠️ Testing & Debugging

Command	What It Does	When to Use
`--validate-only`	Check inputs, don't create hub	Test before full run
`--dry-run`	Process data, skip bigBed creation	Test logic/filters
`--keep-temp`	Keep temporary files	Debug issues

Example:

# Quick validation (1 second)
python -m sqanti_browser \
    --gtf corrected.gtf \
    --classification classification.txt \
    --genome hg38 \
    --output my_hub \
    --validate-only

# Full dry run (processes data but doesn't create bigBeds)
python -m sqanti_browser \
    --gtf corrected.gtf \
    --classification classification.txt \
    --genome hg38 \
    --output my_hub \
    --dry-run
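
The validate-then-run pattern can be scripted so the full run only starts once validation has passed. The sketch below assumes that a failed `--validate-only` check exits with a non-zero status, which is not confirmed by this page; `staged_commands` and `run_staged` are hypothetical helper names.

```python
import subprocess
from typing import Callable, Optional, Sequence


def staged_commands(base_cmd: Sequence[str]) -> list[list[str]]:
    """Return the validation command followed by the full command."""
    base = list(base_cmd)
    return [base + ["--validate-only"], base]


def run_staged(base_cmd: Sequence[str], runner: Optional[Callable] = None) -> None:
    """Run each stage in order; an exception from the runner stops later stages."""
    if runner is None:
        # check=True raises CalledProcessError on a non-zero exit status.
        runner = lambda cmd: subprocess.run(cmd, check=True)
    for cmd in staged_commands(base_cmd):
        runner(cmd)


BASE = [
    "python", "-m", "sqanti_browser",
    "--gtf", "corrected.gtf",
    "--classification", "classification.txt",
    "--genome", "hg38",
    "--output", "my_hub",
]
# run_staged(BASE)  # uncomment to execute both stages
```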

📊 Standalone HTML Reports

Generate reports without creating a track hub:

python src/filter_isoforms.py \
    --classification classification.txt \
    --output-dir reports

With ORF sequences:

python src/filter_isoforms.py \
    --classification classification.txt \
    --output-dir reports \
    --include-sequences

When to use: Share data tables with collaborators who don't need UCSC visualization

🎓 Example Dataset

Run the example workflow script to see commands and parameters:

python example/example_usage.py

Try SQANTI-browser with the included example:

# Basic example
python -m sqanti_browser \
    --gtf example/SQANTI3_QC_output/example_corrected.gtf \
    --classification example/SQANTI3_QC_output/example_classification.txt \
    --output example_output \
    --genome hg38 \
    --tables

# With all validation tracks
python -m sqanti_browser \
    --gtf example/SQANTI3_QC_output/example_corrected.gtf \
    --classification example/SQANTI3_QC_output/example_classification.txt \
    --genome hg38 \
    --output example_output \
    --chrom-sizes example/SQANTI3_QC_output/chrom.sizes \
    --star-sj example/SQANTI3_QC_output/exampleSJ.out.tab \
    --CAGE-peak example/SQANTI3_QC_output/chr22.human.refTSS_v3.1.hg38.bed \
    --polyA-peak example/SQANTI3_QC_output/polyApeaks.atlas.GRCh38.bed \
    --tables

💡 Quick Tips

Tip	Why
Always use `--tables`	Interactive HTML reports let you explore your data offline
Add `--refGTF`	Compare with reference annotation
Use `--validate-only` first	Catch errors before full run
Run `hubCheck hub.txt`	Validate before upload
Sort by `FL`	See highest expressed isoforms first
Use `--no_highlight`	Disable highlighted isoforms

📚 See Also

Quick Reference - One-page cheat sheet
Command Line Reference - All options explained
Hosting Guide - Upload your hub
Custom Genomes - Working with .2bit files
SQANTI-reads Integration - Multi-sample experiments
