Upgrades to v2 #73

@halexand

Description

Improvements to codify in the most recent version of eukrhythmic:

  • MAD clustering parameters: shift the default coverage/pid to 0.95/0.95 for MAD clustering
  • MAD clustering parameters: add pid and coverage parameters for mmseqs clustering to config.yaml
  • Sample naming in MAD: add project prefix to config.yaml for renaming in MAD
  • Abundance filtering: make salmon filtering of the initial MAD the default behavior (remove any contigs with no reads recruited)
  • Abundance filtering: add an option in config.yaml for the user to pass a higher cutoff (e.g. remove genes with fewer than 10 reads recruited)
  • Software output file: add final output file with version info / environment hashes used in the creation of MAD output. Perhaps add parameters as well? Call it methods-section.txt @shu251 :p
  • MAD-info file: add a final output info file containing: number of contigs in the MAD (and the salmon-filtered MAD), CAGs, etc. Add additional info like length distribution and other QUAST-type info?
  • Merged annotation table: create final merged annotation table that combines emapper + eukulele outputs
  • File clean up: Try to reduce total folder size as much as possible. Maintain CAG final assemblies and some other things but remove intermediate mapping and assembly folders.
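
A minimal sketch of the abundance-filtering items above, assuming the filter reads a standard salmon `quant.sf` table (tab-separated, with `Name` and `NumReads` columns). The function name and the `min_reads` parameter are hypothetical stand-ins for whatever option ends up in config.yaml; this is not the pipeline's implementation.

```python
import csv
import io


def contigs_passing_filter(quant_sf, min_reads=0.0):
    """Return names of contigs to keep, given an open salmon quant.sf file.

    A contig is kept when it recruited any reads (NumReads > 0) and at
    least `min_reads` reads. `min_reads` is a hypothetical stand-in for
    the proposed config.yaml cutoff.
    """
    reader = csv.DictReader(quant_sf, delimiter="\t")
    keep = []
    for row in reader:
        reads = float(row["NumReads"])  # salmon reports fractional read counts
        if reads > 0 and reads >= min_reads:
            keep.append(row["Name"])
    return keep


# Tiny in-memory example in quant.sf layout (values are made up).
example = (
    "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
    "contig_1\t900\t650.0\t12.5\t42.0\n"
    "contig_2\t500\t250.0\t0.0\t0.0\n"
    "contig_3\t700\t450.0\t1.1\t3.0\n"
)
default_keep = contigs_passing_filter(io.StringIO(example))
strict_keep = contigs_passing_filter(io.StringIO(example), min_reads=10)
print(default_keep)
print(strict_keep)
```

With the default, only contigs with zero recruited reads are dropped; passing `min_reads=10` mirrors the "fewer than 10 reads" example cutoff.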

Additions to the readthedocs:

  • Impact of coverage/pid on MAD: include stats / plot from GO-SHIP and others on the impact of clustering parameters on the MAD
  • Discussion of salmon:transdecoder interactions?
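
For the MAD-info file and for comparing MADs built with different clustering parameters, basic contig statistics could be computed along these lines. This is a sketch only: the function names and the choice of statistics (contig count, total length, min/max length, N50) are assumptions, and the input is assumed to be a plain nucleotide FASTA.

```python
import io


def contig_lengths(fasta):
    """Yield the sequence length of each record in an open FASTA file."""
    length = None
    for line in fasta:
        line = line.strip()
        if line.startswith(">"):
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def n50(lengths):
    """Length L such that contigs of length >= L cover at least half the total."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0


def summarize(fasta):
    """Collect simple QUAST-like summary statistics for one assembly."""
    lengths = list(contig_lengths(fasta))
    return {
        "n_contigs": len(lengths),
        "total_bp": sum(lengths),
        "min_len": min(lengths, default=0),
        "max_len": max(lengths, default=0),
        "n50": n50(lengths),
    }


# Made-up three-contig example; the first record spans two lines.
example = ">a\nACGTACGT\nAC\n>b\nACGT\n>c\nACGTAC\n"
stats = summarize(io.StringIO(example))
print(stats)
```

Running `summarize` on the MAD before and after salmon filtering, or on MADs clustered at different coverage/pid values, would give directly comparable rows for a summary table.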

Metadata

Labels

enhancement: New feature or request
