doctor
Diagnose the GPU, driver, toolkit, runtime, build chain, and validation risk.
Documentation for checking whether a machine can build and run CUDA workloads correctly.
Example session
$ cuda-doctor doctor
Risk: high
GPU: RTX 5090 (Blackwell / sm_120)
Problem: local toolchain cannot target sm_120
PyTorch: installed wheel does not match the local runtime
$ cuda-doctor doctor auto
Repair plan: reconcile toolkit, flags, and runtime pairing
$ cuda-doctor validate
Status: passed
$ cuda-doctor build
Architecture target: sm_120
What it handles
RTX 5000-series and Blackwell readiness
Missing `sm_120` support in toolchains or build flags
Outdated CUDA, PyTorch, or runtime stacks
Wrong Linux kernel module flavor
Driver and runtime mismatches
Fake-success installs that fail during GPU execution
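Several of the failure modes above come down to one question: is the installed toolkit new enough to target the GPU in front of you? A minimal sketch of that check, not the tool's actual implementation; the version table is illustrative rather than exhaustive (`sm_120` support landed in CUDA 12.8, the earlier entry is an example):

```python
# Minimum toolkit version needed to emit code for an architecture.
# Illustrative table, not exhaustive.
MIN_TOOLKIT_FOR_ARCH = {
    "sm_90": (11, 8),   # Hopper
    "sm_120": (12, 8),  # Blackwell (RTX 50-series)
}

def toolkit_can_target(arch: str, toolkit_version: tuple) -> bool:
    """True when the local toolkit is new enough to emit code for `arch`."""
    needed = MIN_TOOLKIT_FOR_ARCH.get(arch)
    if needed is None:
        return False  # unknown architecture: treat as unsupported
    return toolkit_version >= needed
```

A CUDA 12.4 install, for instance, cannot target Blackwell: `toolkit_can_target("sm_120", (12, 4))` is `False`, which is exactly the "local toolchain cannot target sm_120" finding from the example session.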
Command flow
doctor
Diagnose the GPU, driver, toolkit, runtime, build chain, and validation risk.
doctor auto
Repair what is compatible and refuse to call it fixed until validation passes.
validate
Prove memory transfer and kernel execution work on the local GPU.
build
Compile with the right architecture and toolchain settings for the machine.
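The `validate` step starts with the most basic proof: can the process load the CUDA runtime and enumerate devices at all? A sketch of that kind of probe, assuming nothing beyond the real `cudaGetDeviceCount` entry point in `libcudart` (this is an illustration, not cuda-doctor's actual code path):

```python
import ctypes

def probe_device_count():
    """First step of a validation pass: load the CUDA runtime and ask
    how many GPUs are visible. Returns (ok, count); ok is False when
    the runtime is missing or the call fails (e.g. a driver mismatch)."""
    rt = None
    for name in ("libcudart.so", "libcudart.so.12"):
        try:
            rt = ctypes.CDLL(name)
            break
        except OSError:
            continue
    if rt is None:
        return (False, 0)  # no CUDA runtime on this machine
    count = ctypes.c_int(0)
    status = rt.cudaGetDeviceCount(ctypes.byref(count))  # 0 == cudaSuccess
    return (status == 0, count.value)
```

A probe like this is what separates a real pass from a fake-success install: an environment can import cleanly and still fail here, which is why `validate` refuses to report success on import alone.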
Documentation map
Start here
Install cuda-doctor, diagnose the machine, repair what is compatible, then prove GPU execution works.
Diagnose
Run a full environment diagnosis for the GPU, driver, toolkit, runtime, build chain, and validation risk.
Repair
Apply compatible repairs to a broken or misleading CUDA environment and refuse success until validation passes.
Execution
Prove that device selection, memory transfer, kernel launch, and runtime behavior work on the local GPU.
Build
Build CUDA code in the current project with the correct toolkit, compiler, and architecture settings for the local machine.
Repository reference
Browse a descriptive page for each tracked file in the cuda-doctor repo, including placeholders and planned subsystems.
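The diagnose and repair pages above both hinge on driver and runtime pairing: the CUDA version the driver supports must be at least the runtime version the application links against. A rough sketch of that comparison, with illustrative risk labels (the thresholds here are an assumption for the example, not the tool's exact policy):

```python
def pairing_risk(driver_cuda: str, runtime_cuda: str) -> str:
    """Heuristic pairing check on version strings like "12.8".
    The driver's supported CUDA version must be >= the runtime's."""
    drv = tuple(int(p) for p in driver_cuda.split("."))
    rt = tuple(int(p) for p in runtime_cuda.split("."))
    if drv < rt:
        return "high"    # runtime newer than driver: launches may fail
    if drv[0] > rt[0]:
        return "medium"  # major-version gap: usually works, worth updating
    return "low"
```

For example, a driver reporting CUDA 12.4 paired with a 12.8 runtime yields `"high"`, the same risk level shown in the example session at the top of this page.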
From the blog
In September 2025, the hard part was not discovering that Blackwell support existed somewhere. The hard part was proving that the machine in front of me could actually build and run real CUDA workloads end to end.
Repo guide
If you are new here, think of the project in a few simple parts: the main code, the CUDA test programs, the CLI package, and the scripts and environments used to build and test everything.
src/ main program logic
include/ shared headers for the native code
kernels/ small CUDA test programs
cuda_doctor/ Python CLI package
tests/ automated tests
docker/ build and test environments
scripts/ setup and helper scripts