Setting Up an NVIDIA GPU Development Environment on Ubuntu

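1 Problem

Deep learning development on Linux requires PyTorch (or libTorch) and the CUDA Toolkit in order to use the NVIDIA GPU for training and inference. In practice I ran into a series of problems during installation: the GPU was not detected, system dependencies were missing, the CUDA version was incompatible with the GPU driver, and PyTorch failed at runtime. This guide records the steps needed to install and configure everything reliably.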

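2 Approach

2.1 Prerequisites

  • An NVIDIA GPU
  • A recent python3 and pip
  • A reasonably recent gcc (with C++17 support or newer)
  • Other basic build tools, such as cmake

2.2 Version compatibility

2.2.1 GPU and driver version compatibility

Start by choosing the driver version carefully. To avoid compatibility problems, the driver is usually installed in one of two ways:

  1. Installer file: search for your GPU model on the NVIDIA driver download site, then download and run the matching xxx.run installer.

  2. System packages: on Ubuntu, for example, once the system has detected the GPU, the following commands list and install the available drivers.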

    sudo ubuntu-drivers list --gpgpu
    sudo ubuntu-drivers install --gpgpu

I personally recommend the installer-file approach.

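2.2.2 GPU architectures and version numbers

NVIDIA has many architecture code names. Using the sm_xx targets as the reference, compatibility across architectures is as follows:

Architecture  Targets              Notes
Fermi         sm_20                Not supported by CUDA 9 and later
Kepler        sm_30, sm_35, sm_37  Adds unified memory programming, dynamic parallelism, and more registers; sm_30 is not supported by CUDA 11 and later, sm_35 and sm_37 are not supported by CUDA 12 and later
Maxwell       sm_50, sm_52, sm_53  Still supported by CUDA 11 and CUDA 12 (the CUDA 12.x builds listed in section 2.2.3 include sm_50)
Pascal        sm_60, sm_61, sm_62  Supported by CUDA 8 and later
Volta         sm_70, sm_72         Supported by CUDA 9 and later
Turing        sm_75                Supported by CUDA 10 and later
Ampere        sm_80, sm_86         Supported by CUDA 11 and later; binaries compiled for 8.0 also run on 8.6, but compiling for 8.6 is recommended for FP32-heavy code
Hopper        sm_90                Supported by CUDA 11.8 and later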
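
As a convenience, the table above can be turned into a small lookup. The following is only a sketch: the dictionary is transcribed from this table (it is not exhaustive), and the function names are hypothetical helpers, not part of any NVIDIA tool.

```python
# Lookup transcribed from the architecture table above (illustrative, not exhaustive).
ARCH_BY_SM = {
    "sm_20": "Fermi",
    "sm_30": "Kepler", "sm_35": "Kepler", "sm_37": "Kepler",
    "sm_50": "Maxwell", "sm_52": "Maxwell", "sm_53": "Maxwell",
    "sm_60": "Pascal", "sm_61": "Pascal", "sm_62": "Pascal",
    "sm_70": "Volta", "sm_72": "Volta",
    "sm_75": "Turing",
    "sm_80": "Ampere", "sm_86": "Ampere",
    "sm_90": "Hopper",
}


def sm_from_capability(major: int, minor: int) -> str:
    """Convert a compute capability such as (6, 1) into the 'sm_61' spelling."""
    return f"sm_{major}{minor}"


def architecture_of(sm: str) -> str:
    """Return the architecture name for an sm_xx target, or 'unknown'."""
    return ARCH_BY_SM.get(sm.lower(), "unknown")


if __name__ == "__main__":
    print(architecture_of(sm_from_capability(6, 1)))
```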

There are two main ways to find the sm_xx value for your GPU:

  1. Look it up online: https://developer.nvidia.com/cuda-gpus
  2. After installing the CUDA compiler, run the __nvcc_device_query command
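
Once a driver is installed, the same information can usually be read from nvidia-smi as well. The sketch below is an alternative to the two methods above, not a replacement; it assumes a driver recent enough that nvidia-smi accepts the compute_cap query field, and the function names are hypothetical.

```python
import shutil
import subprocess


def parse_compute_caps(csv_text: str) -> list[tuple[str, str]]:
    """Parse 'name, compute_cap' CSV rows into (name, 'sm_XY') pairs."""
    gpus = []
    for line in csv_text.strip().splitlines():
        name, _, cap = line.rpartition(",")
        if not name:
            continue  # skip rows that do not contain both fields
        gpus.append((name.strip(), "sm_" + cap.strip().replace(".", "")))
    return gpus


def query_gpus() -> list[tuple[str, str]]:
    """Ask nvidia-smi for every GPU's name and compute capability."""
    if shutil.which("nvidia-smi") is None:
        return []  # driver tools are not installed or not on PATH
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,compute_cap", "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    return parse_compute_caps(result.stdout) if result.returncode == 0 else []


if __name__ == "__main__":
    for name, sm in query_gpus():
        print(f"{name}: {sm}")
```

An empty result means either that nvidia-smi is missing or that this driver does not support the query, in which case the two methods above still apply.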

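2.2.3 torch versions and GPU architecture compatibility

When choosing a pytorch/libtorch build, also check that it was compiled for your GPU architecture. The pytorch_compute_capabilities repository lists this information; an excerpt follows: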

package architectures
pytorch-2.5.1-py3.12_cuda12.4_cudnn9.1.0_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90, sm_90a
pytorch-2.5.1-py3.12_cuda12.1_cudnn9.1.0_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90, sm_90a
pytorch-2.5.1-py3.12_cuda11.8_cudnn9.1.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.5.1-py3.11_cuda12.4_cudnn9.1.0_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90, sm_90a
pytorch-2.5.1-py3.11_cuda12.1_cudnn9.1.0_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90, sm_90a
pytorch-2.5.1-py3.11_cuda11.8_cudnn9.1.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.5.1-py3.10_cuda12.4_cudnn9.1.0_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90, sm_90a
pytorch-2.5.1-py3.10_cuda12.1_cudnn9.1.0_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90, sm_90a
pytorch-2.5.1-py3.10_cuda11.8_cudnn9.1.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.5.1-py3.9_cuda12.4_cudnn9.1.0_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90, sm_90a
pytorch-2.5.1-py3.9_cuda12.1_cudnn9.1.0_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90, sm_90a
pytorch-2.5.1-py3.9_cuda11.8_cudnn9.1.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.5.0-py3.12_cuda12.4_cudnn9.1.0_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90, sm_90a
pytorch-2.5.0-py3.12_cuda12.1_cudnn9.1.0_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90, sm_90a
pytorch-2.5.0-py3.12_cuda11.8_cudnn9.1.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.5.0-py3.11_cuda12.4_cudnn9.1.0_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90, sm_90a
pytorch-2.5.0-py3.11_cuda12.1_cudnn9.1.0_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90, sm_90a
pytorch-2.5.0-py3.11_cuda11.8_cudnn9.1.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.5.0-py3.10_cuda12.4_cudnn9.1.0_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90, sm_90a
pytorch-2.5.0-py3.10_cuda12.1_cudnn9.1.0_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90, sm_90a
pytorch-2.5.0-py3.10_cuda11.8_cudnn9.1.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.5.0-py3.9_cuda12.4_cudnn9.1.0_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90, sm_90a
pytorch-2.5.0-py3.9_cuda12.1_cudnn9.1.0_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90, sm_90a
pytorch-2.5.0-py3.9_cuda11.8_cudnn9.1.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.4.1-py3.12_cuda12.4_cudnn9.1.0_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.4.1-py3.12_cuda12.1_cudnn9.1.0_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.4.1-py3.12_cuda11.8_cudnn9.1.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.4.1-py3.11_cuda12.4_cudnn9.1.0_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.4.1-py3.11_cuda12.1_cudnn9.1.0_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.4.1-py3.11_cuda11.8_cudnn9.1.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.4.1-py3.10_cuda12.4_cudnn9.1.0_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.4.1-py3.10_cuda12.1_cudnn9.1.0_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.4.1-py3.10_cuda11.8_cudnn9.1.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.4.1-py3.9_cuda12.4_cudnn9.1.0_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.4.1-py3.9_cuda12.1_cudnn9.1.0_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.4.1-py3.9_cuda11.8_cudnn9.1.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.4.0-py3.12_cuda12.4_cudnn9.1.0_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.4.0-py3.12_cuda12.1_cudnn9.1.0_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.4.0-py3.12_cuda11.8_cudnn9.1.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.4.0-py3.11_cuda12.4_cudnn9.1.0_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.4.0-py3.11_cuda12.1_cudnn9.1.0_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.4.0-py3.11_cuda11.8_cudnn9.1.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.4.0-py3.10_cuda12.4_cudnn9.1.0_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.4.0-py3.10_cuda12.1_cudnn9.1.0_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.4.0-py3.10_cuda11.8_cudnn9.1.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.4.0-py3.9_cuda12.4_cudnn9.1.0_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.4.0-py3.9_cuda12.1_cudnn9.1.0_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.4.0-py3.9_cuda11.8_cudnn9.1.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.3.1-py3.12_cuda12.1_cudnn8.9.2_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.3.1-py3.12_cuda11.8_cudnn8.7.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.3.1-py3.11_cuda12.1_cudnn8.9.2_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.3.1-py3.11_cuda11.8_cudnn8.7.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.3.1-py3.10_cuda12.1_cudnn8.9.2_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.3.1-py3.10_cuda11.8_cudnn8.7.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.3.1-py3.9_cuda12.1_cudnn8.9.2_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.3.1-py3.9_cuda11.8_cudnn8.7.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.3.0-py3.12_cuda12.1_cudnn8.9.2_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.3.0-py3.12_cuda11.8_cudnn8.7.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.3.0-py3.11_cuda12.1_cudnn8.9.2_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.3.0-py3.11_cuda11.8_cudnn8.7.0_0
pytorch-2.3.0-py3.10_cuda12.1_cudnn8.9.2_0
pytorch-2.3.0-py3.10_cuda11.8_cudnn8.7.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.3.0-py3.9_cuda12.1_cudnn8.9.2_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.3.0-py3.9_cuda11.8_cudnn8.7.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.2.2-py3.12_cuda12.1_cudnn8.9.2_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.2.2-py3.12_cuda11.8_cudnn8.7.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.2.2-py3.11_cuda12.1_cudnn8.9.2_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.2.2-py3.11_cuda11.8_cudnn8.7.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.2.2-py3.10_cuda12.1_cudnn8.9.2_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.2.2-py3.10_cuda11.8_cudnn8.7.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.2.2-py3.9_cuda12.1_cudnn8.9.2_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.2.2-py3.9_cuda11.8_cudnn8.7.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.2.1-py3.12_cuda12.1_cudnn8.9.2_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.2.1-py3.12_cuda11.8_cudnn8.7.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.2.1-py3.11_cuda12.1_cudnn8.9.2_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.2.1-py3.11_cuda11.8_cudnn8.7.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.2.1-py3.10_cuda12.1_cudnn8.9.2_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.2.1-py3.10_cuda11.8_cudnn8.7.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.2.1-py3.9_cuda12.1_cudnn8.9.2_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.2.1-py3.9_cuda11.8_cudnn8.7.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.2.0-py3.12_cuda12.1_cudnn8.9.2_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.2.0-py3.12_cuda11.8_cudnn8.7.0_0 sm_37, sm_50, sm_60, sm_70, sm_80, sm_86, sm_90
pytorch-2.2.0-py3.11_cuda12.1_cudnn8.9.2_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.2.0-py3.11_cuda11.8_cudnn8.7.0_0 sm_37, sm_50, sm_60, sm_70, sm_80, sm_86, sm_90
pytorch-2.2.0-py3.10_cuda12.1_cudnn8.9.2_0 sm_50, sm_60, sm_70, sm_80, sm_86, sm_90
pytorch-2.2.0-py3.10_cuda11.8_cudnn8.7.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.2.0-py3.9_cuda12.1_cudnn8.9.2_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.2.0-py3.9_cuda11.8_cudnn8.7.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.1.2-py3.11_cuda12.1_cudnn8.9.2_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.1.2-py3.11_cuda11.8_cudnn8.7.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.1.2-py3.10_cuda12.1_cudnn8.9.2_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.1.2-py3.10_cuda11.8_cudnn8.7.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.1.2-py3.9_cuda12.1_cudnn8.9.2_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.1.2-py3.9_cuda11.8_cudnn8.7.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.1.1-py3.11_cuda12.1_cudnn8.9.2_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.1.1-py3.11_cuda11.8_cudnn8.7.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.1.1-py3.10_cuda12.1_cudnn8.9.2_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.1.1-py3.10_cuda11.8_cudnn8.7.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.1.1-py3.9_cuda12.1_cudnn8.9.2_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.1.1-py3.9_cuda11.8_cudnn8.7.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.1.0-py3.11_cuda12.1_cudnn8.9.2_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.1.0-py3.11_cuda11.8_cudnn8.7.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.1.0-py3.10_cuda12.1_cudnn8.9.2_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.1.0-py3.10_cuda11.8_cudnn8.7.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.1.0-py3.9_cuda12.1_cudnn8.9.2_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.1.0-py3.9_cuda11.8_cudnn8.7.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.0.1-py3.11_cuda11.8_cudnn8.7.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.0.1-py3.11_cuda11.7_cudnn8.5.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-2.0.1-py3.10_cuda11.8_cudnn8.7.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.0.1-py3.10_cuda11.7_cudnn8.5.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-2.0.1-py3.9_cuda11.8_cudnn8.7.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.0.1-py3.9_cuda11.7_cudnn8.5.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-2.0.0-py3.10_cuda11.8_cudnn8.7.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.0.0-py3.10_cuda11.7_cudnn8.5.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-2.0.0-py3.9_cuda11.8_cudnn8.7.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90
pytorch-2.0.0-py3.9_cuda11.7_cudnn8.5.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-1.13.1-py3.10_cuda11.7_cudnn8.5.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-1.13.1-py3.10_cuda11.6_cudnn8.3.2_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-1.13.1-py3.9_cuda11.7_cudnn8.5.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-1.13.1-py3.9_cuda11.6_cudnn8.3.2_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-1.13.0-py3.10_cuda11.7_cudnn8.5.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-1.13.0-py3.10_cuda11.6_cudnn8.3.2_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-1.13.0-py3.9_cuda11.7_cudnn8.5.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-1.13.0-py3.9_cuda11.6_cudnn8.3.2_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-1.12.1-py3.10_cuda11.6_cudnn8.3.2_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-1.12.1-py3.10_cuda11.3_cudnn8.3.2_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-1.12.1-py3.10_cuda10.2_cudnn7.6.5_0 sm_35, sm_37, sm_50, sm_60, sm_61, sm_70, sm_75
pytorch-1.12.1-py3.9_cuda11.6_cudnn8.3.2_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-1.12.1-py3.9_cuda11.3_cudnn8.3.2_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-1.12.1-py3.9_cuda10.2_cudnn7.6.5_0 sm_35, sm_37, sm_50, sm_60, sm_61, sm_70, sm_75
pytorch-1.12.0-py3.10_cuda11.6_cudnn8.3.2_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-1.12.0-py3.10_cuda11.3_cudnn8.3.2_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-1.12.0-py3.10_cuda10.2_cudnn7.6.5_0 sm_35, sm_37, sm_50, sm_60, sm_61, sm_70, sm_75
pytorch-1.12.0-py3.9_cuda11.6_cudnn8.3.2_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-1.12.0-py3.9_cuda11.3_cudnn8.3.2_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-1.12.0-py3.9_cuda10.2_cudnn7.6.5_0 sm_35, sm_37, sm_50, sm_60, sm_61, sm_70, sm_75
pytorch-1.11.0-py3.10_cuda11.5_cudnn8.3.2_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-1.11.0-py3.10_cuda11.3_cudnn8.2.0_0 sm_35, sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-1.11.0-py3.10_cuda11.1_cudnn8.0.5_0 sm_35, sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-1.11.0-py3.10_cuda10.2_cudnn7.6.5_0 sm_35, sm_37, sm_50, sm_60, sm_61, sm_70, sm_75
pytorch-1.11.0-py3.9_cuda11.5_cudnn8.3.2_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-1.11.0-py3.9_cuda11.3_cudnn8.2.0_0 sm_35, sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-1.11.0-py3.9_cuda11.1_cudnn8.0.5_0 sm_35, sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-1.11.0-py3.9_cuda10.2_cudnn7.6.5_0 sm_35, sm_37, sm_50, sm_60, sm_61, sm_70, sm_75
pytorch-1.10.2-py3.9_cuda11.3_cudnn8.2.0_0 sm_35, sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-1.10.2-py3.9_cuda11.1_cudnn8.0.5_0 sm_35, sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-1.10.2-py3.9_cuda10.2_cudnn7.6.5_0 sm_35, sm_37, sm_50, sm_60, sm_61, sm_70, sm_75
pytorch-1.10.1-py3.9_cuda11.3_cudnn8.2.0_0 sm_35, sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-1.10.1-py3.9_cuda11.1_cudnn8.0.5_0 sm_35, sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-1.10.1-py3.9_cuda10.2_cudnn7.6.5_0 sm_35, sm_37, sm_50, sm_60, sm_61, sm_70, sm_75
pytorch-1.10.0-py3.9_cuda11.3_cudnn8.2.0_0 sm_35, sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-1.10.0-py3.9_cuda11.1_cudnn8.0.5_0 sm_35, sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-1.10.0-py3.9_cuda10.2_cudnn7.6.5_0 sm_35, sm_37, sm_50, sm_60, sm_61, sm_70, sm_75
pytorch-1.9.1-py3.9_cuda11.1_cudnn8.0.5_0 sm_35, sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-1.9.1-py3.9_cuda10.2_cudnn7.6.5_0 sm_35, sm_37, sm_50, sm_60, sm_61, sm_70, sm_75
pytorch-1.9.0-py3.9_cuda11.1_cudnn8.0.5_0 sm_35, sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-1.9.0-py3.9_cuda10.2_cudnn7.6.5_0 sm_35, sm_37, sm_50, sm_60, sm_61, sm_70, sm_75
pytorch-1.8.1-py3.9_cuda11.1_cudnn8.0.5_0 sm_35, sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-1.8.1-py3.9_cuda10.2_cudnn7.6.5_0 sm_35, sm_37, sm_50, sm_60, sm_61, sm_70, sm_75
pytorch-1.8.1-py3.9_cuda10.1_cudnn7.6.3_0 sm_35, sm_37, sm_50, sm_60, sm_61, sm_70, sm_75
pytorch-1.8.0-py3.9_cuda11.1_cudnn8.0.5_0 sm_35, sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86
pytorch-1.8.0-py3.9_cuda10.2_cudnn7.6.5_0 sm_35, sm_37, sm_50, sm_60, sm_61, sm_70, sm_75
pytorch-1.8.0-py3.9_cuda10.1_cudnn7.6.3_0 sm_35, sm_37, sm_50, sm_60, sm_61, sm_70, sm_75
pytorch-1.7.1-py3.9_cuda11.0.221_cudnn8.0.5_0 sm_35, sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80
pytorch-1.7.1-py3.9_cuda10.2.89_cudnn7.6.5_0 sm_35, sm_37, sm_50, sm_60, sm_61, sm_70, sm_75
pytorch-1.7.1-py3.9_cuda10.1.243_cudnn7.6.3_0 sm_35, sm_37, sm_50, sm_60, sm_61, sm_70, sm_75
pytorch-1.7.1-py3.9_cuda9.2.148_cudnn7.6.3_0 sm_35, sm_37, sm_50, sm_60, sm_61, sm_70
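
Scanning this table by eye is error-prone, so a short filter can help. The sketch below assumes each row has the form "package sm_a, sm_b, ..." exactly as in the excerpt above; ROWS holds three rows copied from it, and the function names are hypothetical.

```python
def parse_row(row: str) -> tuple[str, set[str]]:
    """Split 'package sm_a, sm_b, ...' into (package, {architectures})."""
    package, _, archs = row.strip().partition(" ")
    return package, {a.strip() for a in archs.split(",") if a.strip()}


def builds_supporting(rows: list[str], sm: str) -> list[str]:
    """Return the packages whose architecture list contains `sm`, in input order."""
    return [pkg for pkg, archs in map(parse_row, rows) if sm in archs]


# Three rows copied from the table above.
ROWS = [
    "pytorch-2.5.1-py3.12_cuda12.4_cudnn9.1.0_0 sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90, sm_90a",
    "pytorch-2.5.1-py3.12_cuda11.8_cudnn9.1.0_0 sm_37, sm_50, sm_60, sm_61, sm_70, sm_75, sm_80, sm_86, sm_90",
    "pytorch-1.12.1-py3.10_cuda10.2_cudnn7.6.5_0 sm_35, sm_37, sm_50, sm_60, sm_61, sm_70, sm_75",
]

if __name__ == "__main__":
    for package in builds_supporting(ROWS, "sm_61"):
        print(package)
```

Rows that list no architectures at all are parsed as an empty set, so they never match.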

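3 Implementation

3.1 Checking the GPU

  1. After connecting the GPU to the motherboard, check whether the system can see it:

    lspci | grep -i nvidia

  2. If the GPU does not seem to be working (for example, nvidia-smi returns nothing), inspect the kernel log for errors:

    sudo dmesg

For example, when the GPU power cables are not connected, the output looks like this: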

[  301.584104] NVRM: GPU 0000:02:00.0: GPU does not have the necessary power cables connected.
[  301.584597] NVRM: GPU 0000:02:00.0: RmInitAdapter failed! (0x24:0x1c:1512)
[  301.584642] NVRM: GPU 0000:02:00.0: rm_init_adapter failed, device minor number 0
[  304.095280] NVRM: GPU 0000:02:00.0: GPU does not have the necessary power cables connected.
[  304.095722] NVRM: GPU 0000:02:00.0: RmInitAdapter failed! (0x24:0x1c:1512)
[  304.095765] NVRM: GPU 0000:02:00.0: rm_init_adapter failed, device minor number 0
[  342.790660] NVRM: GPU 0000:02:00.0: GPU does not have the necessary power cables connected.
[  342.791129] NVRM: GPU 0000:02:00.0: RmInitAdapter failed! (0x24:0x1c:1512)
[  342.791182] NVRM: GPU 0000:02:00.0: rm_init_adapter failed, device minor number 0
[  343.163675] NVRM: GPU 0000:02:00.0: GPU does not have the necessary power cables connected.
[  343.164110] NVRM: GPU 0000:02:00.0: RmInitAdapter failed! (0x24:0x1c:1512)
[  343.164151] NVRM: GPU 0000:02:00.0: rm_init_adapter failed, device minor number 0
[  349.650912] NVRM: GPU 0000:02:00.0: GPU does not have the necessary power cables connected.
[  349.651390] NVRM: GPU 0000:02:00.0: RmInitAdapter failed! (0x24:0x1c:1512)
[  349.651445] NVRM: GPU 0000:02:00.0: rm_init_adapter failed, device minor number 0
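
Because the driver retries, the same three messages repeat many times. A small filter that keeps each distinct NVRM message once makes the log easier to read. This is a sketch that assumes the "[timestamp] NVRM: message" layout shown above; SAMPLE is copied from that output and the function name is hypothetical.

```python
import re

# Matches kernel log lines such as "[  301.584104] NVRM: <message>".
NVRM_LINE = re.compile(r"^\[\s*(?P<ts>\d+\.\d+)\]\s+NVRM:\s+(?P<msg>.*)$")


def unique_nvrm_messages(dmesg_text: str) -> list[str]:
    """Return each distinct NVRM message once, in order of first appearance."""
    seen, messages = set(), []
    for line in dmesg_text.splitlines():
        match = NVRM_LINE.match(line.strip())
        if match and match["msg"] not in seen:
            seen.add(match["msg"])
            messages.append(match["msg"])
    return messages


SAMPLE = """\
[  301.584104] NVRM: GPU 0000:02:00.0: GPU does not have the necessary power cables connected.
[  301.584597] NVRM: GPU 0000:02:00.0: RmInitAdapter failed! (0x24:0x1c:1512)
[  301.584642] NVRM: GPU 0000:02:00.0: rm_init_adapter failed, device minor number 0
[  304.095280] NVRM: GPU 0000:02:00.0: GPU does not have the necessary power cables connected.
"""

if __name__ == "__main__":
    for message in unique_nvrm_messages(SAMPLE):
        print(message)
```

To use it on a real machine, pass the text captured from sudo dmesg instead of SAMPLE.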

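3.2 Installing the GPU driver

  1. Download the correct .run file.
  2. Run it and follow the prompts to complete the installation.
  3. Run nvidia-smi to verify the result.

See section 2.2.1, "GPU and driver version compatibility", for details.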

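3.3 Installing the CUDA Toolkit

Install the CUDA Toolkit:

  1. Download and install it by following the steps on the official website.
  2. Add the CUDA installation paths to the environment (on 64-bit Linux the libraries are in lib64):
    export PATH=/usr/local/cuda/bin:$PATH
    export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
  3. Run nvcc --version to verify the result.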
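
The last check can also be scripted, which is handy when comparing the installed toolkit with the CUDA version of a pytorch build. The sketch below assumes nvcc prints a line containing "release X.Y", as current toolkits do; the function names are hypothetical.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_nvcc_release(version_text: str) -> Optional[str]:
    """Extract the 'X.Y' release number from `nvcc --version` output."""
    match = re.search(r"release\s+(\d+\.\d+)", version_text)
    return match.group(1) if match else None


def installed_cuda_release() -> Optional[str]:
    """Return the toolkit release reported by nvcc, or None if nvcc is not found."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None  # PATH probably does not include /usr/local/cuda/bin
    result = subprocess.run([nvcc, "--version"], capture_output=True, text=True, check=False)
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    print("CUDA Toolkit release:", installed_cuda_release())
```

A result of None most likely means the PATH export from step 2 has not taken effect in the current shell.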

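3.4 Installing pytorch

  1. Pick a pytorch build that is compatible with your GPU architecture (see section 2.2.3). For example, our GPU is sm_61, and the newest build in the table that lists sm_61 is pytorch-2.5.1-py3.12_cuda12.4_cudnn9.1.0_0.
  2. Go to the PyTorch website, https://pytorch.org/get-started/previous-versions, and find the install command for that version: pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1
  3. Install with pip.

If pip refuses to install system-wide and asks you to create a Python virtual environment (as on Ubuntu 24.04), do the following first:

  1. Choose a suitable directory and run python3 -m venv torch_env.
  2. Add source /path/to/torch_env/bin/activate to ~/.bashrc.
  3. Run source ~/.bashrc to activate the environment. From then on, every new shell activates it automatically and all Python packages are installed into the torch_env directory.

The following short script tests the installation: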
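
To confirm that the shell really picked up the virtual environment before installing anything, a quick check from Python is enough. This is a minimal sketch that relies on the standard behavior of venv, where sys.prefix differs from sys.base_prefix inside an environment; the function name is hypothetical.

```python
import sys


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv (its prefix differs from the base one)."""
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("virtual environment active:", in_virtualenv())
```

If the interpreter path does not point into torch_env, the activate line in ~/.bashrc has not been applied to the current shell.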

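import torch

print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
print("CUDA version:", torch.version.cuda)
print("GPU count:", torch.cuda.device_count())
# get_device_name(0) raises an error when no GPU is visible, so guard the call.
if torch.cuda.is_available():
    print("Current GPU name:", torch.cuda.get_device_name(0))

If torch.cuda.is_available() returns True, PyTorch has been configured with GPU support.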
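
A True result does not by itself show that the installed build contains kernels for this GPU's architecture, which is the compatibility issue from section 2.2.3. The sketch below compares the GPU's compute capability with the build's architecture list. It assumes the usual CUDA rule that a binary built for sm_XY also runs on a GPU of the same major version with an equal or higher minor version, it ignores PTX (compute_XY) entries, and the function names are hypothetical.

```python
import importlib.util
import re


def build_covers(arch_list, major, minor):
    """True if some 'sm_XY' entry has the same major version and a minor version <= the GPU's."""
    for arch in arch_list:
        match = re.fullmatch(r"sm_(\d+)", arch)
        if match is None:
            continue  # ignore PTX entries ('compute_XY') and suffixed ones such as 'sm_90a'
        built_major, built_minor = divmod(int(match.group(1)), 10)
        if built_major == major and built_minor <= minor:
            return True
    return False


def check_first_gpu():
    """Return None when the check cannot run, otherwise whether GPU 0 is covered."""
    if importlib.util.find_spec("torch") is None:
        return None  # torch is not installed in this interpreter
    import torch
    if not torch.cuda.is_available():
        return None
    major, minor = torch.cuda.get_device_capability(0)
    return build_covers(torch.cuda.get_arch_list(), major, minor)


if __name__ == "__main__":
    print("GPU 0 covered by this torch build:", check_first_gpu())
```

A False result suggests picking a different build from the table in section 2.2.3.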

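3.5 Installing pytorch extensions

When installing extensions, make sure they match the torch version. For torch 2.5.1 built against CUDA 12.4, for example, the install commands are: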

pip install torch-cluster -f https://data.pyg.org/whl/torch-2.5.1+cu124.html
pip install pyg-lib -f https://data.pyg.org/whl/torch-2.5.1+cu124.html
pip install torch-sparse -f https://data.pyg.org/whl/torch-2.5.1+cu124.html
pip install torch-spline-conv -f https://data.pyg.org/whl/torch-2.5.1+cu124.html
pip install torchmetrics -f https://data.pyg.org/whl/torch-2.5.1+cu124.html
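
The -f URL in these commands encodes both the torch version and the CUDA tag, so it has to change whenever either one changes. The sketch below builds the URL from the two values, following the pattern used in the commands above; the function names are hypothetical.

```python
def cuda_tag_from_version(cuda_version: str) -> str:
    """Turn a CUDA version such as '12.4' into the wheel tag 'cu124'."""
    major, minor = cuda_version.split(".")[:2]
    return f"cu{major}{minor}"


def pyg_wheel_index(torch_version: str, cuda_version: str) -> str:
    """Build the wheel index URL in the same form as the commands above."""
    tag = cuda_tag_from_version(cuda_version)
    return f"https://data.pyg.org/whl/torch-{torch_version}+{tag}.html"


if __name__ == "__main__":
    print(pyg_wheel_index("2.5.1", "12.4"))
```

With torch installed, the two arguments can be taken from torch.__version__.split("+")[0] and torch.version.cuda.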

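3.6 Installing libtorch

To use torch from C++, download the libtorch library, again choosing a build that matches your torch and CUDA versions. libtorch 2.5.1 is published for cu118, cu121, and cu124, so for the pytorch 2.5.1 and CUDA 12.4 combination used above the download command is:

wget https://download.pytorch.org/libtorch/cu124/libtorch-cxx11-abi-shared-with-deps-2.5.1%2Bcu124.zip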

After extracting the archive, set the following environment variables to use it:

export Torch_ROOT=/path/to/libtorch
export LD_LIBRARY_PATH=/path/to/libtorch/lib:$LD_LIBRARY_PATH
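
Before pointing a C++ build at the extracted directory, it can be worth checking that the directory looks complete. The sketch below tests for a few paths that the libtorch archive normally contains; the EXPECTED list is an assumption based on the usual archive layout, and the function name is hypothetical.

```python
import os
from pathlib import Path

# Paths normally present in an extracted libtorch archive (assumed layout).
EXPECTED = (
    "lib/libtorch.so",
    "share/cmake/Torch/TorchConfig.cmake",
    "include/torch",
)


def missing_libtorch_files(root: str) -> list[str]:
    """Return the expected paths that do not exist under `root`."""
    base = Path(root)
    return [rel for rel in EXPECTED if not (base / rel).exists()]


if __name__ == "__main__":
    root = os.environ.get("Torch_ROOT", "/path/to/libtorch")
    print("missing under", root, ":", missing_libtorch_files(root))
```

An empty list means all three paths were found under the directory that Torch_ROOT points to.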

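3.7 Installing libtorch extensions

Extension components generally have to be built from source. The process is as follows, using torch-scatter as an example: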

git clone https://github.com/rusty1s/pytorch_scatter.git
cd pytorch_scatter
mkdir build && cd build
cmake ..
ccmake ..  # adjust the install prefix and other options
make -j8
make install
# Add to ~/.bashrc:
export TorchScatter_DIR=/path/to/pytorch_scatter/install
Author: zcp
Reprint policy: Unless otherwise stated, all articles on this blog are licensed under CC BY 4.0. If you reproduce this article, please credit the source: zcp.