First of all, amazing work — thank you for sharing this!
I noticed in the blog post that inference was performed using an H200 GPU. While that's impressive, such hardware is far beyond the reach of most individual users.
I'm wondering if it's possible to run inference on more accessible, consumer-grade GPUs — for example, an RTX 4090 with 24GB of VRAM. Would that be sufficient? Are there any recommended optimizations or settings for running on such hardware?
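For context, this is roughly the kind of setup I was imagining on a 24GB card: a minimal sketch assuming the model can be loaded through Hugging Face transformers with bitsandbytes 4-bit quantization (the model id below is just a placeholder, since I don't know which loader this project actually uses):

```python
# Hypothetical sketch: 4-bit quantized loading to fit within 24 GB of VRAM.
# "your-org/your-model" is a placeholder, not the real checkpoint name.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "your-org/your-model"  # placeholder model id

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize weights to 4-bit
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16 for speed/stability
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # offload layers to CPU if they don't all fit on the GPU
)

prompt = "Hello!"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

Would something along these lines work here, or is there a recommended path (e.g. a smaller checkpoint, FP8, or offloading settings) that you'd suggest instead?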
Looking forward to your advice, and thanks again for the great work!