high complexity reads removed #3

@theo-allnutt-bioinformatics

Description

Hi,

I have an issue with high-complexity reads being removed. I am using FASTA input because I am only interested in complexity, not quality.

The files are here:
https://drive.google.com/open?id=1IHfERmzQauE3XNVVzaPU4yLg7dWe5Bmo

command:
InfoTrim.py 69.fasta -o 69_infotrim.fasta --fasta --min_length 100 -p 12

The file 69_trimmed.fasta contains the reads from the original (69.fasta) that are absent from 69_infotrim.fasta, i.e. the reads that were removed.

A quick look at these reads shows that they are not low complexity. I really only want to remove or trim sequence that is 'mostly' homopolymer or simple repeat.
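
For anyone who wants to spot-check the removed reads independently, here is a rough sketch that scores each read by the Shannon entropy of its k-mer counts. This is not InfoTrim's own scoring, which I have not checked; the function names `kmer_entropy` and `read_fasta` and the default `k=2` are my own assumptions for illustration. A homopolymer scores 0 bits and a simple dinucleotide repeat stays below 1 bit, while a read with a mixed composition scores noticeably higher.

```python
import math
from collections import Counter


def kmer_entropy(seq, k=2):
    """Shannon entropy (in bits) of the k-mer distribution of a sequence.

    Illustrative helper only; not taken from InfoTrim.
    """
    seq = seq.upper()
    n = len(seq) - k + 1  # number of overlapping k-mers
    if n <= 0:
        return 0.0
    counts = Counter(seq[i:i + k] for i in range(n))
    # sum of p * log2(1/p) over the observed k-mers
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file (multi-line records allowed)."""
    header, chunks = None, []
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif line:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)
```

Looping over `read_fasta("69_trimmed.fasta")` and printing `kmer_entropy(seq)` for each record should show whether the removed reads really sit near the homopolymer or simple-repeat end of the scale.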

Thanks,

Theo
