Ephemeral · On-demand · No setup
Your AI coding agent can write CUDA kernels and train models. Now it can run them too.
Why GPUs R Us
Everything runs in our infrastructure. No AWS account, no IAM roles, no VPCs. Your agent gets a GPU shell — you don't touch a thing.
PyTorch, CUDA, and the full ML stack — pre-built and ready to run. No driver installs, no environment setup, no wasted time.
Per-second billing. Your agent can iterate on the same session all day — you're only charged for actual GPU time.
When the work is done, your agent opens a pull request with the results. Your code, your branch, zero manual steps.
How it works
You describe the task. Your agent handles the rest — provisioning, iterating, and cleaning up when done.
One command installs and configures everything. Your agent picks up the GPU tools automatically — no config files, no manual steps.
Describe the task in plain language. Your agent picks the right environment and provisions a GPU — ready in under a minute.
Your agent runs commands, reads output, adjusts, and retries — all in one session. When it's done, it terminates the session and you're billed only for what ran.
Runtime Environments
Pre-built Docker images on NVIDIA-optimized instances. No setup, no driver installs — containers just start and run.
The NVIDIA NGC stack — PyTorch, RAPIDS, and Jupyter ready to go. Everything needed for training, fine-tuning, and inference workflows.
Full CUDA toolkit with nvcc, CMake, GDB, and build essentials. For writing, compiling, and profiling custom CUDA kernels.
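As a concrete sketch of what an agent might do in the CUDA environment: write a minimal kernel and build it with the pre-installed nvcc. The file name and SAXPY kernel below are illustrative, not something shipped with the image.

```shell
# Write a minimal SAXPY kernel to a file (illustrative example)
cat > saxpy.cu <<'EOF'
#include <cstdio>

// y[i] = a * x[i] + y[i], one thread per element
__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

int main() {
    const int n = 1 << 20;
    float *x, *y;
    cudaMallocManaged(&x, n * sizeof(float));
    cudaMallocManaged(&y, n * sizeof(float));
    for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }
    saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, x, y);
    cudaDeviceSynchronize();
    printf("y[0] = %f\n", y[0]);  // expect 4.0 = 2*1 + 2
    cudaFree(x); cudaFree(y);
    return 0;
}
EOF

# On the GPU instance, the pre-installed toolkit compiles it directly:
# nvcc -O2 saxpy.cu -o saxpy && ./saxpy
```

The agent can then profile or debug the same binary with the bundled GDB and profiling tools, all inside one session.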
Pricing
Buy credits upfront, spend them as your agents run. No subscriptions, no minimums. Credits never expire.
Starter: $10 (10 credits)
Builder: $50
Team: $100
FAQ
Do I need my own AWS account or cloud credentials?
No. GPU instances run entirely in our infrastructure. You never provide cloud credentials, manage IAM roles, or configure networking. We handle all of that — you just use the MCP tools.
How do I connect this to Claude Code?
Run claude mcp add @gpu-runners/mcp in your terminal. After authenticating with your GPUs R Us account, Claude Code discovers all GPU tools automatically the next time you start a session.
Which GPU instance types are available?
We currently support the g4dn family on AWS: g4dn.xlarge (1× T4, 16 GB system RAM), g4dn.2xlarge (1× T4, 32 GB system RAM), and g4dn.12xlarge (4× T4, 192 GB system RAM). Each T4 has 16 GB of VRAM. More instance types — including A10G and H100 — are on the roadmap.
Do sessions run on spot instances?
Not yet — today, sessions run on on-demand instances. Transparent spot interruption handling is coming soon. When it ships, your workspace will be automatically checkpointed to a git branch and restored on a new instance — Claude sees a brief delay but never loses context.
Can Claude open a pull request with the results?
Yes. Connect your GitHub account in settings and Claude can use the create_pr MCP tool to open pull requests with results, trained models, or any artifacts from the session.
How does billing work?
Credits are deducted from your balance as GPU time accrues, billed per second. Your dashboard shows your real-time balance and any charges accruing on active sessions. When a session terminates, the final charge is settled. You can always see your current spend in the top bar.
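As a worked example of per-second billing — the 0.60 credits per GPU-hour rate below is hypothetical, chosen only to show the arithmetic, not a published price:

```shell
# Hypothetical rate: 0.60 credits per GPU-hour (not a published price)
RATE_PER_HOUR=0.60
SECONDS_USED=5400   # a 90-minute session

# Per-second billing: charge = seconds * (hourly rate / 3600)
awk -v r="$RATE_PER_HOUR" -v s="$SECONDS_USED" \
    'BEGIN { printf "%.2f credits\n", s * r / 3600 }'
# Prints: 0.90 credits
```

A session left idle still accrues time, so terminating promptly (which the agent does automatically) is what keeps per-second billing cheap.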
Can I use this with agents other than Claude Code?
The MCP server is purpose-built for Claude Code, but the underlying REST API is open. If you want to integrate GPU provisioning into your own agent or workflow, get in touch.
One command sets everything up. Then just describe your task.
# Install and configure in one step
$ curl -fsSL https://gpusr.us/install.sh | bash

# Then just describe your task to Claude
$ claude
> Train a ResNet-50 on CIFAR-10 and open a PR with the results