Run your first GPU session
Install the MCP tool on your machine, connect it to Claude Code, and run a real CUDA program or library on a GPU in minutes.
Prerequisites
You'll need a few things before you start:
- Claude Code — the Anthropic CLI. Install it here if you haven't already.
- Node.js 18+ — Install it here if you don't have it.
- A GPU Runners account with credits. Sign in or create one.
Install the MCP server
Run the installer in your terminal. It installs the GPU Runners MCP package, walks you through a quick browser login, and automatically adds the server to your Claude Code config.
```shell
curl -fsSL https://gpusr.us/install.sh | bash
```
The installer will:
- Install `@gpu-runners/mcp` globally via npm
- Open `https://gpusr.us/auth/device` in your browser and prompt you to enter a short code
- Save your credentials to `~/.gpu-runners/config.json`
- Register the MCP server in `~/.claude.json`
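For reference, the entry the installer adds to `~/.claude.json` should look roughly like the sketch below. The server name, command, and arguments here are assumptions for illustration; check your own file for the actual values the installer wrote.

```json
{
  "mcpServers": {
    "gpu-runners": {
      "command": "npx",
      "args": ["-y", "@gpu-runners/mcp"]
    }
  }
}
```

If the entry is missing, re-running the install script is the simplest fix.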
Verify the connection
Open a new Claude Code session and ask it to list your sessions:
```
You: List my GPU sessions.

Claude: I'll check your active sessions using the GPU Runners tools.
[calls list_sessions]
You have no active sessions right now.
```
If Claude responds using the tool, you're wired up. If it says it doesn't have GPU tools available, make sure you restarted Claude Code after the install; you can also run `claude mcp list` in your terminal to confirm the server is registered.
Run CUDA Hello World
Launch Claude Code inside the repo you want to work on.
The `detect_repo` tool reads your local `.git/config` to find your GitHub remote; that's how the GPU instance knows where to clone your code from. If you're not in a Git repo, file syncing won't work.
Don't have a repo ready? Use our CUDA hello world example — clone it, cd in, and launch Claude Code from there.
```shell
git clone https://github.com/GPUs-R-Us/cuda-hello-world
cd cuda-hello-world
claude
```
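If you're curious what you'll be compiling, a CUDA hello world along these lines prints one line per GPU thread. This is a minimal sketch, not necessarily the exact source in the example repo; the kernel name and thread count are illustrative.

```cuda
// hello.cu — compile with: nvcc hello.cu -o hello
#include <cstdio>

// Each GPU thread runs this kernel and prints its own thread index.
__global__ void hello() {
    printf("Hello from GPU thread %d!\n", threadIdx.x);
}

int main() {
    hello<<<1, 32>>>();       // launch one block of 32 threads
    cudaDeviceSynchronize();  // wait for the kernel so its printf output flushes
    return 0;
}
```

The `cudaDeviceSynchronize()` call matters: kernel launches are asynchronous, and without it the program can exit before the device-side `printf` output reaches your terminal.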
Then give Claude a natural language prompt:
```
You: Provision a GPU, compile and run the CUDA hello world program, then terminate the session.

Claude: I'll provision a GPU session, push the code, compile it, and run it. Let me get started.
[calls detect_repo] → repo: github.com/you/cuda-hello-world, branch: main
[calls provision_gpu with profile "g4dn-xlarge"] → session abc123 is provisioning...
[calls wait_for_ready] → session ready (took ~90s)
[calls exec: "nvcc hello.cu -o hello && ./hello"]
→ Hello from GPU thread 0!
  Hello from GPU thread 1!
  Hello from GPU thread 2!
  ...
  Hello from GPU thread 31!
[calls terminate] → session abc123 terminated.
```
Claude handles the full lifecycle — provisioning, code sync, compilation, execution, and teardown — using the MCP tools in the background. You just describe what you want.
What to try next
Once you've got hello world running, the same workflow handles real workloads:
- Iterative debugging — keep the session alive between execs. Claude can edit a file, recompile, and rerun without reprovisioning.
- ML training — provision a session, push a training script, exec it, and read the results back — all without leaving Claude Code.
- Long runs — sessions stay alive until you terminate them or the TTL expires (default 2 hours). Tell Claude the session ID to reconnect to a running instance.
Ready to run on a GPU?
Create an account and run the install script — you'll be provisioning GPUs in minutes.