doctor
Diagnose the GPU, driver, toolkit, runtime, build chain, and validation risk.
Documentation for checking whether a machine can build and run CUDA workloads correctly.
Example session
$ cuda-doctor doctor
Risk: high
GPU: RTX 5090 (Blackwell / sm_120)
Problem: local toolchain cannot target sm_120
PyTorch: installed wheel does not match the local runtime
$ cuda-doctor doctor auto
Repair plan: reconcile toolkit, flags, and runtime pairing
$ cuda-doctor validate
Status: passed
$ cuda-doctor build
Architecture target: sm_120
What it handles
RTX 5000-series and Blackwell readiness
Missing `sm_120` support in toolchains or build flags
Outdated CUDA, PyTorch, or runtime stacks
Wrong Linux kernel module flavor
Driver and runtime mismatches
Fake-success installs that fail during GPU execution
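Several of the failure modes above come down to one question: is the installed toolkit new enough to target the GPU in front of you? A minimal sketch of that check, not the tool's actual implementation; the version table is illustrative rather than exhaustive (`sm_120` support landed in CUDA 12.8, the earlier entry is an example):

```python
# Minimum toolkit version needed to emit code for an architecture.
# Illustrative table, not exhaustive.
MIN_TOOLKIT_FOR_ARCH = {
    "sm_90": (11, 8),   # Hopper
    "sm_120": (12, 8),  # Blackwell (RTX 50-series)
}

def toolkit_can_target(arch: str, toolkit_version: tuple) -> bool:
    """True when the local toolkit is new enough to emit code for `arch`."""
    needed = MIN_TOOLKIT_FOR_ARCH.get(arch)
    if needed is None:
        return False  # unknown architecture: treat as unsupported
    return toolkit_version >= needed
```

A CUDA 12.4 install, for instance, cannot target Blackwell: `toolkit_can_target("sm_120", (12, 4))` is `False`, which is exactly the "local toolchain cannot target sm_120" finding from the example session.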
Command flow
doctor
Diagnose the GPU, driver, toolkit, runtime, build chain, and validation risk.
doctor auto
Repair what is compatible and refuse to call it fixed until validation passes.
validate
Prove memory transfer and kernel execution work on the local GPU.
build
Compile with the right architecture and toolchain settings for the machine.
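The `validate` step starts with the most basic proof: can the process load the CUDA runtime and enumerate devices at all? A sketch of that kind of probe, assuming nothing beyond the real `cudaGetDeviceCount` entry point in `libcudart` (this is an illustration, not cuda-doctor's actual code path):

```python
import ctypes

def probe_device_count():
    """First step of a validation pass: load the CUDA runtime and ask
    how many GPUs are visible. Returns (ok, count); ok is False when
    the runtime is missing or the call fails (e.g. a driver mismatch)."""
    rt = None
    for name in ("libcudart.so", "libcudart.so.12"):
        try:
            rt = ctypes.CDLL(name)
            break
        except OSError:
            continue
    if rt is None:
        return (False, 0)  # no CUDA runtime on this machine
    count = ctypes.c_int(0)
    status = rt.cudaGetDeviceCount(ctypes.byref(count))  # 0 == cudaSuccess
    return (status == 0, count.value)
```

A probe like this is what separates a real pass from a fake-success install: an environment can import cleanly and still fail here, which is why `validate` refuses to report success on import alone.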
Documentation map
Start here
Install cuda-doctor, diagnose the machine, repair what is compatible, then prove GPU execution works.
Diagnose
Run a full environment diagnosis for the GPU, driver, toolkit, runtime, build chain, and validation risk.
Repair
Apply compatible repairs to a broken or misleading CUDA environment and refuse success until validation passes.
Execution
Prove that device selection, memory transfer, kernel launch, and runtime behavior work on the local GPU.
Build
Build CUDA code in the current project with the correct toolkit, compiler, and architecture settings for the local machine.
Repository reference
Browse a descriptive page for each tracked file in the cuda-doctor repo, including placeholders and planned subsystems.
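The diagnose and repair pages above both hinge on driver and runtime pairing: the CUDA version the driver supports must be at least the runtime version the application links against. A rough sketch of that comparison, with illustrative risk labels (the thresholds here are an assumption for the example, not the tool's exact policy):

```python
def pairing_risk(driver_cuda: str, runtime_cuda: str) -> str:
    """Heuristic pairing check on version strings like "12.8".
    The driver's supported CUDA version must be >= the runtime's."""
    drv = tuple(int(p) for p in driver_cuda.split("."))
    rt = tuple(int(p) for p in runtime_cuda.split("."))
    if drv < rt:
        return "high"    # runtime newer than driver: launches may fail
    if drv[0] > rt[0]:
        return "medium"  # major-version gap: usually works, worth updating
    return "low"
```

For example, a driver reporting CUDA 12.4 paired with a 12.8 runtime yields `"high"`, the same risk level shown in the example session at the top of this page.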
From the blog
In September 2025, the hard part was not discovering that Blackwell support existed somewhere. The hard part was proving that the machine in front of me could actually build and run real CUDA workloads end to end.
Repo guide
If you are new here, think of the project in a few simple parts: the main code, the CUDA test programs, the CLI package, and the scripts and environments used to build and test everything.
src/ main program logic
include/ shared headers for the native code
kernels/ small CUDA test programs
cuda_doctor/ Python CLI package
tests/ automated tests
docker/ build and test environments
scripts/ setup and helper scripts