Is It Possible to Run Inference on Consumer-Grade GPUs? #1

@cellzero


First of all, amazing work — thank you for sharing this!

I noticed in the blog post that inference was performed using an H200 GPU. While that's impressive, such hardware is far beyond the reach of most individual users.

I'm wondering if it's possible to run inference on more accessible, consumer-grade GPUs — for example, an RTX 4090 with 24GB of VRAM. Would that be sufficient? Are there any recommended optimizations or settings for running on such hardware?
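
To make the question concrete, here is the kind of setup I was imagining. This is just a sketch of what I would try, assuming the released checkpoint loads through Hugging Face transformers (the model id below is a placeholder, not the real one): 4-bit quantization via bitsandbytes, which I understand is a common way to fit a large model into 24GB of VRAM at some cost in quality and speed.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit NF4 quantization (bitsandbytes): a common way to shrink the
# memory footprint of a large model enough to fit in 24GB of VRAM.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_id = "your-org/your-model"  # placeholder: the checkpoint from this repo
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # spill layers to CPU if they don't fit on the GPU
)

prompt = "Hello"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

Would something along these lines be workable for this model, or does inference here require something more specialized?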

Looking forward to your advice, and thanks again for the great work!
