Description
Improvements to codify in the most recent version of eukrhythmic:
- MAD clustering parameters: shift the default coverage/pid to 0.95/0.95 for MAD clustering
- MAD clustering parameters: add parameters to `config.yaml` for mmseqs clustering for pid and coverage
- Sample naming in MAD: add project prefix to `config.yaml` for renaming in MAD
- Abundance filtering: default behavior to run salmon filtering on initial MAD (removal of any contigs with no reads recruited)
- Abundance filtering: add option in `config.yaml` for user to pass a higher cutoff (e.g. genes with fewer than 10 reads recruiting)
- Software output file: add final output file with version info / environment hashes used in the creation of MAD output. Perhaps add parameters as well? Call it methods-section.txt @shu251 :p
- MAD-info file: add final output info file containing: number of contigs in MAD (and salmon-filtered MAD), CAGs, etc. Add additional info like length distribution and other QUAST-type info?
- Merged annotation table: create final merged annotation table that combines emapper + eukulele outputs
- File clean up: Try to reduce total folder size as much as possible. Maintain CAG final assemblies and some other things but remove intermediate mapping and assembly folders.
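
For the two abundance filtering items above, here is a minimal sketch of how the cutoff could work, assuming the filter reads salmon's `quant.sf` table (tab-separated, with `Name` and `NumReads` columns). The function name, the `min_reads` parameter, and the toy table are hypothetical and not part of eukrhythmic; the cutoff would presumably be read from a new `config.yaml` key.

```python
import csv
import io

# Toy stand-in for a salmon quant.sf file (made-up values, for illustration only).
TOY_QUANT_SF = "\n".join([
    "Name\tLength\tEffectiveLength\tTPM\tNumReads",
    "contig_1\t1200\t1000.0\t10.5\t25.0",
    "contig_2\t800\t600.0\t0.0\t0.0",
    "contig_3\t950\t750.0\t2.1\t4.0",
])


def contigs_passing_filter(quant_sf, min_reads=0.0):
    """Return the names of contigs to keep after abundance filtering.

    quant_sf  -- open text handle on a salmon quant.sf table
    min_reads -- hypothetical user cutoff; contigs with fewer reads are dropped

    Contigs with no reads recruited are always dropped, which corresponds to
    the proposed default behavior when min_reads is left at 0.
    """
    reader = csv.DictReader(quant_sf, delimiter="\t")
    kept = []
    for row in reader:
        num_reads = float(row["NumReads"])  # salmon reports estimated read counts as floats
        if num_reads > 0 and num_reads >= min_reads:
            kept.append(row["Name"])
    return kept


default_kept = contigs_passing_filter(io.StringIO(TOY_QUANT_SF))
strict_kept = contigs_passing_filter(io.StringIO(TOY_QUANT_SF), min_reads=10)
```

On the toy table, the default drops only `contig_2` (zero reads), while `min_reads=10` also drops `contig_3`. The resulting name list could then be used to subset the MAD FASTA.
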
Additions to the readthedocs:
- Impact of coverage/pid on MAD: include stats / plot from GO-SHIP and others on the impact of clustering parameters on the MAD
- Discussion of salmon:transdecoder interactions?