Inconsistent error with tensor mismatch

Dear Team Decima,


We are Sanne Jansen and Orfeas Gkourlias, both students in the Master Data Science for Life Sciences programme at the Hanze University of Applied Sciences in Groningen (Netherlands). We are currently working on a project in which we would like to use your model, Decima, specifically to predict the effects of variants (predict_variant_effect) through your Python API.

For this, we have performed runs on single-cell eQTL data. In Decima, we specify the cell type of interest as a task, e.g.:
df_variant = pd.read_table(args.input)
predict_variant_effect(df_variant,
                       output_pq=args.output,
                       device=device,
                       tasks=f"cell_type == 'B cell'",
                       genome="path/to/hg38/hg38.fa")


To avoid computational overhead, we split our eQTL TSV files (df_variant) per chromosome for each cell type. Variant effect prediction with Decima succeeded for dendritic cells, CD4+ T-cells and CD8+ T-cells. However, for NK-cells, monocytes and B-cells, the run for chromosome 19 fails, while all other chromosomes successfully produce output.
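For reference, the per-chromosome split can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not our exact script: the function name `split_tsv_by_chromosome` and the column name `chrom` are assumptions made for the example, and the output file naming is arbitrary.

```python
import csv
from collections import defaultdict
from pathlib import Path


def split_tsv_by_chromosome(input_path, output_dir, chrom_column="chrom"):
    """Write one TSV per chromosome and return {chromosome: output path}.

    `chrom_column` is an assumed column name; adjust it to the input file.
    """
    groups = defaultdict(list)
    with open(input_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        fieldnames = reader.fieldnames
        for row in reader:
            # Group rows by the value in the chromosome column.
            groups[row[chrom_column]].append(row)

    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    written = {}
    for chrom, rows in groups.items():
        out_path = output_dir / f"{chrom}.tsv"
        with open(out_path, "w", newline="") as handle:
            writer = csv.DictWriter(
                handle, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
            )
            writer.writeheader()
            writer.writerows(rows)
        written[chrom] = out_path
    return written
```

Each resulting file is then read with pd.read_table and passed to predict_variant_effect separately, so a failure is confined to a single chromosome and cell type.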

For these cell types, we find the following error:

RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/path/to/miniconda3/lib/python3.13/site-packages/torch/utils/data/_utils/worker.py", line 349, in _worker_loop
    data = fetcher.fetch(index)  # type: ignore[possibly-undefined]
  File "/path/to/miniconda3/lib/python3.13/site-packages/torch/utils/data/_utils/fetch.py", line 52, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
            ~~~~~~~~~~~~^^^^^
  File "/path/to/miniconda3/lib/python3.13/site-packages/decima/data/dataset.py", line 441, in __getitem__
    inputs = torch.vstack([seq, mask])
RuntimeError: Sizes of tensors must match except in dimension 0. Expected size 524289 but got size 524288 for tensor number 1 in the list.

We wonder whether this is a known problem. It appears to be inconsistent across cell types, since we did receive proper output for three of the six cell types we tried.
For completeness, these are the tasks we specified to the model for the failing cell types: 'classical monocyte', 'B cell', 'NK'.
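If it helps with reproduction, one way to narrow the chromosome 19 input down to a single offending variant is a simple bisection over the rows. The sketch below is only an illustration under two assumptions: that the error is triggered by individual rows rather than by a combination of rows, and that a hypothetical callback `fails(rows)` is available which returns True when a run on that subset raises the error (for example, by wrapping the prediction call in a try/except for RuntimeError). Neither the helper nor the callback is part of Decima.

```python
def find_failing_row(rows, fails):
    """Return one row whose presence makes `fails` return True, or None.

    `fails` is a hypothetical callback: it takes a list of rows and returns
    True if processing that subset raises the error. The search assumes the
    failure is caused by individual rows, not by combinations of rows.
    """
    if not rows or not fails(rows):
        return None

    lo, hi = 0, len(rows)
    # Invariant: rows[lo:hi] contains at least one failing row.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(rows[lo:mid]):
            hi = mid  # a failing row is in the first half
        else:
            lo = mid  # otherwise it must be in the second half
    return rows[lo]
```

This needs on the order of log2(n) runs for n rows, which keeps the search manageable even when each run is slow.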

Operating system: Rocky Linux 9.5
Python version: 3.13.5

Of course, it would be ideal if we could solve this specific problem. Thank you in advance for your time, and we hope to hear from you soon.

Inconsistent error with tensor mismatch #48
