We list various cloud options and how to set them up. The cheapest option is RunPod.

AWS
Step 1: Launch a GPU Instance
- Log into AWS: Go to the AWS Management Console. Make an account or log in if you have one.
- Create a new EC2 instance:
- Go to EC2 > Launch Instance.
- Choose a name for the instance.
- Select an AMI (Amazon Machine Image) that is Unix based, supports GPU, and has CUDA and PyTorch installed, e.g. `Deep Learning OSS Nvidia Driver AMI GPU PyTorch 2.7 (Ubuntu 22.04)`.
- Choose an Instance Type:
- Select an instance type with a GPU that has at least 16 GB VRAM and 32 GB RAM, e.g. `g4dn.2xlarge`, which has an NVIDIA T4
- Choose a key pair
- If you do not have a key pair, click Create new key pair and download the .pem key
- Configure the Instance:
- Set up security groups to allow SSH (port 22) and a specific port for protocol communication (port 49200 by default, though you can change this)
- We recommend a minimum of 80 GB of storage
- Review and Launch the instance.
Step 2: Edit Security Group
- Follow the AWS section in the network guide to configure port 49200 to be accessible for external connections.
Step 3: Connect to the Instance
- SSH into your instance:
- Find your public ip and connect via SSH:
ssh -i your-key.pem ubuntu@your-ec2-public-ip
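To avoid retyping the key path and IP, you can add a host alias to your SSH config. This is an optional convenience; the alias name, IP, and key path below are placeholders you should replace with your own values:

```shell
# Optional: add a host alias so that `ssh ec2-gpu` works.
# "ec2-gpu", "your-ec2-public-ip", and the key path are placeholders.
mkdir -p ~/.ssh
cat >> ~/.ssh/config <<'EOF'
Host ec2-gpu
    HostName your-ec2-public-ip
    User ubuntu
    IdentityFile ~/.ssh/your-key.pem
EOF
```

Before the first connection, also run `chmod 400 your-key.pem`, since ssh refuses private keys that other users can read.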
Google Cloud
The cheapest option is an NVIDIA T4 (16 GB VRAM) with an n1-standard-8 machine type (8 vCPU, 4 core, 30 GB memory) for $0.51/hour.
Step 1: Create a GPU-enabled VM
- Log into Google Cloud: Go to the Google Cloud Console. Make an account or log in if you have one.
- Create a new VM instance:
- Go to Compute Engine > VM instances > Create Instance.
- Choose a name and region for the instance.
- Change from General Purpose to GPUs and select a GPU and Machine Type
- E.g. choose 1 `NVIDIA T4` and `n1-standard-8` (8 vCPU, 4 core, 30 GB memory)
- In the OS and storage tab change the image to one that is Unix based, supports GPU, and has CUDA and PyTorch installed
- E.g. OS `Deep Learning on Linux` and image `Deep Learning VM for PyTorch 2.4 with CUDA 12.4 M129`
- In the Security tab click Manage Access. Under Add manually generated SSH keys click Add item, enter your SSH public key, and click Save.
- Click Create
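If you do not yet have a public key to paste into the Security tab, you can generate one locally first. A minimal sketch; the file name and comment below are just examples:

```shell
# Generate an ed25519 keypair; -N "" means no passphrase (use one in practice).
# The file name "gcp_gpu_example" and comment "your-username" are placeholders.
mkdir -p ~/.ssh
ssh-keygen -t ed25519 -N "" -C "your-username" -f ~/.ssh/gcp_gpu_example
# The public half is what you paste into the console:
cat ~/.ssh/gcp_gpu_example.pub
```

Note that GCP derives your login username from the key's comment field, so the comment you pass with `-C` is the username you will later SSH in as.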
Step 2: Edit Firewall settings
- Follow the GCP section in the network guide to configure port 49200 to be accessible for external connections.
Step 3: Connect to the Instance
- Set up SSH Keys (skip if you already added a key in the Security tab when creating the instance)
- Go into your instance and click Edit
- Under SSH Keys click Add item, enter your SSH public key, and click Save.
- SSH into your VM:
- Find your external ip and username (this is linked with your SSH key) under the instance details and connect via SSH:
ssh your-username@your-external-ip
RunPod
The cheapest option is an RTX 2000 Ada: 16 GB VRAM, 31 GB RAM, 6 vCPUs for $0.23/hour.
RunPod launches your workspace within a Docker container, so it is difficult to launch Docker inside that container. We recommend using conda instead; see the installing guide for how to install conda.
RunPod also assigns random external port mappings, so we need to find and specify that external port. See the RunPod section in the network guide.
Finally, if you need to install anything else with RunPod, note that most standard packages are not installed, so run `apt update` first.
Step 1: Launch a GPU Pod
- Log into RunPod: Go to the RunPod Console. Make an account or log in if you have one.
- Set SSH Keys:
- Go to Settings and under SSH Public Keys add your public SSH key. If you have not made an SSH key yet, follow this guide from RunPod.
- Create a new Pod:
- Go to Pods to see available pods, and choose a Pod
- E.g. RTX 2000 Ada: 16 GB VRAM, 31 GB RAM, 6 vCPUs for $0.23/hour.
- Choose a Pod name and Pod Template
- You want one with CUDA and PyTorch installed; the default `RunPod Pytorch 2.1` works.
- Ensure that SSH Terminal Access is enabled
- Click Deploy On-Demand
Step 2: Edit the Pod
- Follow the RunPod section in the network guide to edit the Pod to expose a TCP port.
Step 3: Connect to the Pod
- SSH into your Pod:
- Go to Connect and in the SSH tab look at the ssh command under SSH over exposed TCP.
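The command shown there has roughly the following shape. The IP and port below are placeholders you copy from the console (the external port is randomly mapped, not 22), and RunPod pods typically log you in as root:

```shell
# Placeholder values copied from the RunPod console:
POD_IP="your-pod-public-ip"
POD_PORT="12345"   # the randomly mapped external TCP port, NOT 22
# The resulting command has this shape:
echo "ssh root@${POD_IP} -p ${POD_PORT} -i ~/.ssh/id_ed25519"
```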
Tensordock
Tensordock offers low-cost consumer GPUs, as low as an RTX A4000 for $0.105/hr.
The distributed compute option in Tensordock also assigns random external port mappings, so we need to find and specify that external port. See the Tensordock section in the network guide.
- Log into Tensordock: Go to the Tensordock Deploy Dashboard. Make an account or log in if you have one.
- Set SSH Keys:
- Go to Secrets and click Add Secret to add your public SSH key. If you have not made a SSH key yet, follow this guide for Windows and this guide for Linux.
- Choose a Name for your SSH Key, choose Type as `SSH Key`, and enter your public key value under Value. The public key value will look something like this: `ssh-rsa ...`
- Deploy a GPU
- Go to Deploy GPU to see available GPUs, and choose a GPU
- E.g. RTX 4000: 16 GB VRAM for $0.105/hour.
- Choose an Instance Name, configure the resource with CPU Cores, RAM and Storage options, choose a location, and select the OS. We recommend `Ubuntu 24.04 LTS`.
- Click Deploy Instance
- Connect to your instance
- Click on My Servers and you should see the newly provisioned GPU instance. You can click the instance to get details about the instance
- Instructions for connecting to the instance using SSH can be found under the Access section
You may need to set up your Tensordock instances with the NVIDIA Container Toolkit and Docker (if using Docker).
To install the NVIDIA Container Toolkit, run the following commands in your instance CLI:
sudo apt update
sudo apt install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
To install Docker, run the following commands in your instance CLI:
sudo apt install -y apt-transport-https ca-certificates curl software-properties-common
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
sudo groupadd docker
sudo usermod -aG docker $USER
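Note that the group change from `usermod` only takes effect on a new login session. A quick way to check whether the current session can already run Docker without sudo (this snippet is just a convenience check, not part of the install):

```shell
# Check whether the current shell session has docker group membership.
if id -nG | grep -qw docker; then
    echo "docker group active; docker works without sudo"
else
    echo "not yet in this session; log out and back in, or run 'newgrp docker'"
fi
```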
Lambda
Lambda does not support 16 GB GPUs; the cheapest option is an RTX 6000 (24 GB VRAM) with 14 vCPUs, 46 GiB RAM, 0.5 TiB SSD for $0.50/hr.
Step 1: Launch a GPU Instance
- Log into Lambda: Go to the Lambda instances page. Make an account or log in if you have one.
- Set SSH Keys:
- Go to SSH Keys and add your public SSH key.
- Create a new Instance:
- Go to Instances and select Launch an Instance to see available instances, and choose an instance
- E.g. 1x RTX 6000 (24 GB), for $0.50/hour.
- Choose a Region and FileSystem. If you don't have a filesystem, select Create a filesystem
- Click Launch
Step 2: Edit the Firewall
- Follow the Lambda Labs section in the network guide to edit the firewall to expose a TCP port.
Step 3: Connect to the Instance
- SSH into your Instance:
- Once the instance has booted, look at the SSH command under SSH Login.