Getting Started

Run your first GPU session

Install the MCP tool on your machine, connect it to Claude Code, and run a real CUDA program or library workload on a GPU in minutes.

01

Prerequisites

You'll need a few things before you start:

  • Claude Code installed and working
  • Node.js and npm (the installer installs the MCP package via npm)
  • A browser, for a quick device login
  • A Git repository with a GitHub remote (or use the example repo from step 04)

02

Install the MCP server

Run the installer in your terminal. It installs the GPU Runners MCP package, walks you through a quick browser login, and automatically adds the server to your Claude Code config.

curl -fsSL https://gpusr.us/install.sh | bash

The installer will:

  • Install @gpu-runners/mcp globally via npm
  • Open https://gpusr.us/auth/device in your browser and prompt you to enter a short code
  • Save your credentials to ~/.gpu-runners/config.json
  • Register the MCP server in ~/.claude.json
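
If you want to check that last step by hand, the entry added to ~/.claude.json looks roughly like the following. This is a sketch: the field layout follows Claude Code's standard MCP server config shape, but the exact command and args the installer writes may differ.

```json
{
  "mcpServers": {
    "gpu-runners": {
      "command": "npx",
      "args": ["-y", "@gpu-runners/mcp"]
    }
  }
}
```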

After install: restart Claude Code (or start a new session) so it picks up the new MCP server.

03

Verify the connection

Open a new Claude Code session and ask it to list your sessions:

Claude Code
You:  List my GPU sessions.

Claude: I'll check your active sessions using the GPU Runners tools.
        [calls list_sessions]
        You have no active sessions right now.

If Claude responds using the tool, you're wired up. If it says it doesn't have GPU tools available, double-check that you restarted Claude Code after the install.

04

Run CUDA Hello World

Launch Claude Code inside the repo you want to work on.

Important: Claude Code must be launched from inside a Git repo. The detect_repo tool reads your local .git/config to find your GitHub remote — that's how the GPU instance knows where to clone your code from. If you're not in a Git repo, file syncing won't work.
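
You can preview what detect_repo will find before launching Claude Code. It reads the remote URL out of .git/config, which is the same value these commands print:

```shell
# From the repo root: print the GitHub remote recorded in .git/config,
# which is what detect_repo uses to tell the GPU instance where to clone from.
git config --get remote.origin.url
# Equivalent, via the porcelain command:
git remote get-url origin
```

If either command prints nothing, add a remote (git remote add origin <url>) before launching Claude Code.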

Don't have a repo ready? Use our CUDA hello world example — clone it, cd in, and launch Claude Code from there.

git clone https://github.com/GPUs-R-Us/cuda-hello-world
cd cuda-hello-world
claude
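
For reference, a hello.cu that produces the output in the transcript below is the classic one-block CUDA program. This is a sketch; the example repo's actual file may differ slightly:

```cuda
#include <cstdio>

// Each GPU thread prints its index within the block.
__global__ void hello() {
    printf("Hello from GPU thread %d!\n", threadIdx.x);
}

int main() {
    hello<<<1, 32>>>();        // launch 1 block of 32 threads
    cudaDeviceSynchronize();   // wait for the kernel and flush its printf output
    return 0;
}
```

Compiled and run with nvcc hello.cu -o hello && ./hello, it prints one line per thread.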

Then give Claude a natural language prompt:

Claude Code — inside your repo
You:  Provision a GPU, compile and run the CUDA hello world
      program, then terminate the session.

Claude: I'll provision a GPU session, push the code, compile it, and
        run it. Let me get started.

        [calls detect_repo]
        → repo: github.com/you/cuda-hello-world, branch: main

        [calls provision_gpu with profile "g4dn-xlarge"]
        → session abc123 is provisioning...

        [calls wait_for_ready]
        → session ready (took ~90s)

        [calls exec: "nvcc hello.cu -o hello && ./hello"]
        → Hello from GPU thread 0!
           Hello from GPU thread 1!
           Hello from GPU thread 2!
           ...
           Hello from GPU thread 31!

        [calls terminate]
        → session abc123 terminated.

Claude handles the full lifecycle — provisioning, code sync, compilation, execution, and teardown — using the MCP tools in the background. You just describe what you want.

05

What to try next

Once you've got hello world running, the same workflow handles real workloads:

  • Iterative debugging — keep the session alive between execs. Claude can edit a file, recompile, and rerun without reprovisioning.
  • ML training — provision a session, push a training script, exec it, and read the results back — all without leaving Claude Code.
  • Long runs — sessions stay alive until you terminate them or the TTL expires (default 2 hours). Tell Claude the session ID to reconnect to a running instance.

Ready to run on a GPU?

Create an account and run the install script — you'll be provisioning GPUs in minutes.