Dear StringTie Development Team,
First, thank you for developing and maintaining such a powerful and widely used tool.
I am planning to use StringTie's hybrid mode to assemble a comprehensive transcriptome by integrating short-read (Illumina) and long-read (Oxford Nanopore) RNA-seq data from the same set of biological samples. My question concerns the potential for batch effects between these two technologies. Because the short-read and long-read data were generated from separate library preparations and sequenced on different platforms, they inherently contain non-biological, technical variation.
Could you please clarify how StringTie's hybrid mode handles such technical discrepancies? Specifically, I would like to know:
- Does the hybrid assembly algorithm implicitly account for systematic differences in coverage or representation between the two technologies?
- Are there recommended best practices for pre-processing or normalizing the data (e.g., the BAM files) before passing them to StringTie to mitigate these batch effects?
- Alternatively, is it considered better practice to perform assemblies separately and then merge the results, rather than using a direct hybrid approach when such strong technical biases are expected?
Any insights or recommendations you could provide would be greatly appreciated. Thank you for your time and for your contributions to the community.
Cheers,
Ting