This repository generates a fusion circos plot from an Excel table of fusion
calls (for example MYH9::USP6). The workflow:
- Prepares plotting inputs through a dedicated data-flow module.
- Renders a circos plot through a dedicated plotting module.
- Saves the rendered figure as a PDF with a center summary and a legend.
  When weighted mode is enabled, only the top-fusion partner genes are
  annotated on the outer ring. The center summary shows total samples, unique
  fusions, and up to three ranked fusions (Top, Second, Third), depending on
  how many unique fusions are available.
Data-flow sequence: Read Excel -> Resolve fusion column -> Split fusion names -> Resolve gene loci -> Aggregate frequencies -> Rank fusion keys -> Build color map.
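The middle of the data-flow sequence above (resolve fusion column, split fusion names, aggregate frequencies, rank fusion keys) can be sketched with the standard library alone. This is a minimal illustration, not the code in `fusion_circos/dataflow.py` or `fusion_circos/fusion_io.py`: the function names `canonical`, `resolve_fusion_column`, `split_fusion`, and `rank_fusions` are hypothetical, and Excel reading, locus resolution, and color-map construction are left out.

```python
from collections import Counter


def canonical(name):
    """Lowercase and drop spaces/underscores, e.g. 'Fusion_Name' -> 'fusionname'."""
    return "".join(ch for ch in str(name).lower() if ch not in " _")


def resolve_fusion_column(columns, requested):
    """Return the exact column if present, else the first canonical match."""
    if requested in columns:
        return requested
    wanted = canonical(requested)
    for column in columns:
        if canonical(column) == wanted:
            return column
    raise KeyError(f"no column matching {requested!r}")


def split_fusion(fusion):
    """Split 'MYH9::USP6' into ('MYH9', 'USP6')."""
    gene1, sep, gene2 = fusion.strip().partition("::")
    if not sep or not gene1.strip() or not gene2.strip():
        raise ValueError(f"not a gene1::gene2 fusion string: {fusion!r}")
    return gene1.strip(), gene2.strip()


def rank_fusions(fusions):
    """Count each directed fusion pair and rank by descending frequency."""
    counts = Counter(split_fusion(f) for f in fusions)
    # most_common() keeps first-seen order among pairs with equal counts.
    return counts.most_common()


# GENEA::GENEB is a placeholder, not a real fusion call.
ranked = rank_fusions(["MYH9::USP6", "GENEA::GENEB", "MYH9::USP6"])
```

Keeping the pair as a directed tuple means `gene1::gene2` and `gene2::gene1` are counted separately, which matches the README's note that reciprocal directions stay distinguishable.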
Rendering sequence: Build base tracks -> Draw fusion links -> Annotate top-fusion partner genes (weighted mode only) -> Apply inner circle -> Add center summary -> Finalize layout -> Add legend -> Save PDF.
Color policy:
- Chromosome sectors use a rainbow sequence ordered as Y, 1..22, X.
- Fusion link/ribbon colors are determined by the source gene's chromosome
  (gene1::gene2 uses the gene1 chromosome color).
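One way to picture the color policy is the sketch below. It assumes evenly spaced HSV hues as the "rainbow sequence"; the project may well use a different colormap, and the names `CHROMOSOMES`, `rainbow_colors`, and `link_color` are hypothetical rather than taken from `fusion_circos/style.py`.

```python
import colorsys

# Sector order stated in the color policy: Y, 1..22, X.
CHROMOSOMES = ["Y"] + [str(n) for n in range(1, 23)] + ["X"]


def rainbow_colors(names):
    """Spread hues evenly over the names and return {name: '#rrggbb'}."""
    colors = {}
    for index, name in enumerate(names):
        hue = index / len(names)  # 0.0 (red) up to, but not including, 1.0
        red, green, blue = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
        colors[name] = "#{:02x}{:02x}{:02x}".format(
            round(red * 255), round(green * 255), round(blue * 255)
        )
    return colors


def link_color(gene1_chromosome, sector_colors):
    """A gene1::gene2 link takes the color of gene1's chromosome."""
    return sector_colors[gene1_chromosome]


sector_colors = rainbow_colors(CHROMOSOMES)
```

Because the link color depends only on the chromosome of gene1, two reciprocal fusions between genes on different chromosomes get different colors.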
circos_plot/
|-- fusion_plot.py # CLI entrypoint and orchestration
|-- interactive_fusion_plot.py # Interactive HTML CLI entrypoint and orchestration
`-- fusion_circos/
    |-- cli.py # Argument parsing and primitive CLI type conversion
    |-- config.py # Central runtime configuration values
    |-- dataflow.py # Data preparation pipeline before plotting
    |-- fusion_io.py # Fusion-column resolution and fusion-string splitting
    |-- gene_locator.py # Ensembl-based gene-locus lookup with JSON cache
    |-- genome.py # Genome metadata and sector construction
    |-- geometry.py # Coordinate transforms and edge clamping
    |-- interactive/
    |   |-- cli.py # Argument parsing for interactive HTML export
    |   `-- plotting.py # Plotly-based interactive circos rendering helpers
    |-- plotting.py # Rendering helpers for tracks, links, optional top-fusion outer labels, center summary, and legend
    |-- style.py # Backend-agnostic style tokens and visual mappings
    `-- locus_type.py # Core genomic interval dataclass
Use Conda to create a reproducible environment.
conda env create -f environment.yml
conda activate fusion_circos
Notes:
- numpy is pinned to <2 to avoid ABI incompatibility issues.
- On osx-arm64, some dependencies are more reliable via pip (already listed
  under pip in environment.yml).
Verify installation:
python -c "import numpy,pandas,openpyxl,matplotlib,pycirclize,plotly,pyensembl; print('ok', numpy.__version__)"
Run:
python fusion_plot.py \
--excel_file source_data/nodular_fasciitis.xlsx \
--excel_sheet Sheet1 \
--fusion_column_name Fusions \
--output_pdf nodular_fasciitis_circos.pdf \
--save_next_to_script true \
--connection_style ribbon \
--frequency_weighted_links true
Note: --ensembl_release defaults to 110, so it is omitted above.
Run:
python interactive_fusion_plot.py \
--excel_file source_data/nodular_fasciitis.xlsx \
--excel_sheet Sheet1 \
--fusion_column_name Fusions \
--output_html nodular_fasciitis_interactive.html \
--save_next_to_script true \
--connection_style ribbon \
--frequency_weighted_links true
This command writes one standalone offline HTML file. You can open it directly by double-clicking it in Finder/File Explorer (no running server required).
- fusion_plot.py: Thin orchestration entrypoint (CLI, path resolution, cache wiring, pipeline invocation).
- fusion_circos/dataflow.py: Owns pre-render data preparation and returns DataflowResult.
- fusion_circos/plotting.py: Owns rendering implementation (tracks, links, optional top-fusion outer labels, center summary, legend, figure save helpers).
- interactive_fusion_plot.py + fusion_circos/interactive/: Own the interactive HTML pipeline (Plotly renderer + standalone HTML export).
Arguments:
- --excel_file (str): Path to input Excel file.
- --excel_sheet (int | str): Sheet index or sheet name.
- --fusion_column_name (str): Column storing fusion strings. Canonicalized fallback matching is always enabled (case/space/underscore tolerant).
- --output_pdf (str): Output PDF path or filename.
- --save_next_to_script (bool): Save output next to fusion_plot.py if true.
- --connection_style (line | ribbon): Fusion link rendering style.
- --frequency_weighted_links (bool): true scales ribbon width by fusion frequency (line mode remains uniform-width for both true and false). In PDF output, this mode also annotates only the top-fusion partner genes on the outer ring, and legend (n=...) counts are shown only when this option is true. false uses uniform-width links for all observed fusion pairs while keeping reciprocal directions visually separable.
- --ensembl_release (int): Optional Ensembl release for locus lookup. Default is 110.
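The argument types listed above (string booleans, a sheet that may be an index or a name, a two-value style choice) can be expressed with `argparse` roughly as follows. This is a sketch, not the contents of `fusion_circos/cli.py`: the helpers `parse_bool`, `parse_sheet`, and `build_parser` are hypothetical, and every default and `required=True` setting other than `--ensembl_release 110` is an assumption.

```python
import argparse


def parse_bool(text):
    """Convert 'true'/'false' (any case) to bool; reject anything else."""
    value = text.strip().lower()
    if value == "true":
        return True
    if value == "false":
        return False
    raise argparse.ArgumentTypeError(f"expected true or false, got {text!r}")


def parse_sheet(text):
    """Treat a purely numeric value as a sheet index, otherwise as a sheet name."""
    return int(text) if text.isdigit() else text


def build_parser():
    parser = argparse.ArgumentParser(description="Fusion circos plot (sketch)")
    parser.add_argument("--excel_file", required=True)
    parser.add_argument("--excel_sheet", type=parse_sheet, default=0)  # assumed default
    parser.add_argument("--fusion_column_name", required=True)
    parser.add_argument("--output_pdf", required=True)
    parser.add_argument("--save_next_to_script", type=parse_bool, default=False)  # assumed default
    parser.add_argument("--connection_style", choices=["line", "ribbon"], default="ribbon")  # assumed default
    parser.add_argument("--frequency_weighted_links", type=parse_bool, default=False)  # assumed default
    parser.add_argument("--ensembl_release", type=int, default=110)  # default stated in the README
    return parser


args = build_parser().parse_args([
    "--excel_file", "source_data/nodular_fasciitis.xlsx",
    "--excel_sheet", "Sheet1",
    "--fusion_column_name", "Fusions",
    "--output_pdf", "nodular_fasciitis_circos.pdf",
    "--frequency_weighted_links", "true",
])
```

An explicit converter is used for booleans because `type=bool` in `argparse` would treat any non-empty string, including `false`, as true.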
- Plot PDF at the path passed to --output_pdf.
- Interactive standalone HTML at the path passed to --output_html.
- Runtime cache files under runtime/:
  - runtime/pyensembl_cache/
  - runtime/gene_locus_cache.json
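The idea behind a JSON gene-locus cache such as runtime/gene_locus_cache.json can be sketched as below. The on-disk layout used by `fusion_circos/gene_locator.py` is not documented here, so the class `JsonLocusCache`, its `get_or_lookup` method, and the flat `{gene: locus}` structure are all assumptions made for illustration; the `lookup` callable stands in for whatever Ensembl query the project performs.

```python
import json
from pathlib import Path


class JsonLocusCache:
    """Gene -> locus mapping persisted to a JSON file (illustrative only)."""

    def __init__(self, path):
        self.path = Path(path)
        if self.path.exists():
            self.entries = json.loads(self.path.read_text(encoding="utf-8"))
        else:
            self.entries = {}

    def get_or_lookup(self, gene, lookup):
        """Return the cached locus, calling lookup(gene) only on a miss."""
        if gene not in self.entries:
            self.entries[gene] = lookup(gene)
            # Create runtime/ on first write, then persist the whole mapping.
            self.path.parent.mkdir(parents=True, exist_ok=True)
            self.path.write_text(json.dumps(self.entries, indent=2), encoding="utf-8")
        return self.entries[gene]
```

With this shape, a second run that constructs the cache from the same file answers repeated gene lookups from disk instead of repeating the slower lookup.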
This project is licensed under the Apache License 2.0. See LICENSE for
details.