Scalability Issues of CellOracle #236

@cconnors8

Hey,

First of all, I want to say that this has been a wonderful and very helpful tool, so thank you for that. However, as we try to scale it to larger datasets, the compute requirements become untenable. We had been running it on 18k cells and 7k HVGs (which I understand is over the recommended size, but it worked fine). When we moved to a larger version of the same dataset and scaled back the HVGs (55k cells, 3k HVGs), the kernel crashed repeatedly at the estimate transition prob step, even with very large compute resources (we tried m7a.48xlarge AWS instances, for reference). That is far more than we needed before: I had previously used an m7a.8xlarge, and even that was more of a default choice, so I most likely could have gone lower. Almost all of the kernel crashes I have hit in the past were caused by memory use in this step, so if there is some way to reduce its memory load, that would be very helpful for using this tool on larger datasets!
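A back-of-envelope estimate may help show why the cell count could matter more than the HVG count here. This is only a sketch under an unverified assumption: that the step holds at least one dense cell-by-cell float64 array in memory. Nothing below calls CellOracle, and the helper name `dense_matrix_gib` is hypothetical.

```python
# Rough memory estimate for dense float64 arrays.
# ASSUMPTION (not confirmed against the CellOracle source): the transition
# probability step materializes at least one dense cell-by-cell array.

def dense_matrix_gib(n_rows, n_cols, bytes_per_value=8):
    """Return the size in GiB of a dense n_rows x n_cols array."""
    return n_rows * n_cols * bytes_per_value / 2**30

# The two dataset sizes mentioned in this issue: (cells, HVGs).
datasets = {"18k cells, 7k HVGs": (18_000, 7_000),
            "55k cells, 3k HVGs": (55_000, 3_000)}

for label, (n_cells, n_genes) in datasets.items():
    cell_by_gene = dense_matrix_gib(n_cells, n_genes)
    cell_by_cell = dense_matrix_gib(n_cells, n_cells)
    print(f"{label}: cell x gene ~ {cell_by_gene:.1f} GiB, "
          f"cell x cell ~ {cell_by_cell:.1f} GiB per array")

# Growth of a single cell-by-cell array between the two runs.
growth = dense_matrix_gib(55_000, 55_000) / dense_matrix_gib(18_000, 18_000)
print(f"cell x cell growth factor: {growth:.1f}x")
```

If that assumption holds, cutting HVGs from 7k to 3k barely changes the cell-by-gene footprint, while any cell-by-cell array grows with the square of the cell count, and several such arrays held at once would multiply the total.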
