  • Install the Nvidia driver
    You can look up a suitable driver here: https://www.nvidia.com/Download/index.aspx?lang=en-us
    sudo add-apt-repository ppa:graphics-drivers/ppa -y
    sudo apt -y update
    sudo apt install -y nvidia-384
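    You may run into problems, though, such as the installer refusing to run while the X server is active.
    Press [Ctrl] + [Alt] + [F1] to switch to a text console,
    then stop the X server: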
    sudo service lightdm stop
    sudo init 3
    Only at this point should you run NVIDIA's .run installer.
     
    If it still fails, you probably need to disable nouveau:
    sudo vi /etc/modprobe.d/blacklist-nouveau.conf
    
    # Add these lines
    blacklist nouveau
    options nouveau modeset=0
    
    # Then run
    sudo update-initramfs -u
    
    Once this is set up, reboot.
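After the reboot it can help to confirm that nouveau really is no longer loaded. The following is a minimal sketch, not part of the original steps: it assumes a Linux system that exposes `/proc/modules`, and the helper names `loaded_modules` and `nouveau_is_loaded` are hypothetical names chosen for illustration.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            # The first whitespace-separated field is the module name.
            names.add(fields[0])
    return names


def nouveau_is_loaded(proc_modules_text):
    """True if a module named exactly 'nouveau' is listed."""
    return "nouveau" in loaded_modules(proc_modules_text)


if __name__ == "__main__":
    try:
        with open("/proc/modules") as f:
            text = f.read()
    except (IOError, OSError):
        # Not on Linux, or /proc is unavailable: treat as "nothing loaded".
        text = ""
    print("nouveau loaded:", nouveau_is_loaded(text))
```

If this still reports that nouveau is loaded, the blacklist file or the `update-initramfs` step probably did not take effect.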
     
  • Install CUDA
    Download the CUDA 9.0 local installer package for Ubuntu 16.04 from NVIDIA,
    then run the following commands:
    sudo dpkg -i cuda-repo-ubuntu1604-9-0-local_9.0.176-1_amd64-deb
    sudo apt-key add /var/cuda-repo-9-0-local/7fa2af80.pub
    sudo apt-get -y update
    sudo apt-get -y install cuda libcupti-dev
  • Add the environment variable
    export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
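The `${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}` part of that export is easy to misread: it appends `:` plus the old value only when the variable is already set and non-empty, so you never end up with a stray trailing colon. The sketch below mirrors that logic in Python purely for illustration; `prepend_library_path` is a hypothetical helper, not something the shell or CUDA provides.

```python
import os


def prepend_library_path(new_dir, current):
    """Mimic NEW${CURRENT:+:${CURRENT}} from the shell.

    The ':' separator and the old value are added only when the old
    value is set and non-empty.
    """
    if current:
        return new_dir + ":" + current
    return new_dir


if __name__ == "__main__":
    current = os.environ.get("LD_LIBRARY_PATH")
    print(prepend_library_path("/usr/local/cuda/lib64", current))
```

To make the setting permanent you would normally add the `export` line to a shell startup file such as `~/.bashrc`; this snippet only shows what value the expansion produces.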
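  • Install cuDNN v7
    After downloading the cuDNN package, install it. Pick the package built for the CUDA version you installed; the file name below is the one I used.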
    sudo dpkg -i libcudnn7_7.4.1.5-1+cuda9.2_amd64.deb
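cuDNN `.deb` file names encode both the cuDNN version and the CUDA version they were built against. The file above is tagged `cuda9.2`, while the CUDA step installs 9.0, so it is worth checking the two before installing. This is a rough sketch that only parses names shaped like the one above; `parse_cudnn_deb` and `matches_cuda` are hypothetical helpers, and the pattern is an assumption based on that single file name.

```python
import re

# Assumed name layout: libcudnn<major>[-dev|-doc]_<cudnn>-<rev>+cuda<cuda>_<arch>.deb
_DEB_PATTERN = re.compile(
    r"^libcudnn(\d+)(?:-dev|-doc)?_([\d.]+)-\d+\+cuda([\d.]+)_(\w+)\.deb$"
)


def parse_cudnn_deb(filename):
    """Return the versions encoded in a cuDNN .deb name, or None if it does not match."""
    m = _DEB_PATTERN.match(filename)
    if m is None:
        return None
    return {
        "major": int(m.group(1)),
        "cudnn": m.group(2),
        "cuda": m.group(3),
        "arch": m.group(4),
    }


def matches_cuda(filename, installed_cuda):
    """True if the package name says it was built for the given CUDA version."""
    info = parse_cudnn_deb(filename)
    return info is not None and info["cuda"] == installed_cuda


if __name__ == "__main__":
    name = "libcudnn7_7.4.1.5-1+cuda9.2_amd64.deb"
    print(parse_cudnn_deb(name))
    print("built for CUDA 9.0:", matches_cuda(name, "9.0"))
```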
  • Install Python and virtualenv
    sudo apt-get install -y python-pip python-dev python-virtualenv
    Create a Python virtual environment:
    virtualenv --system-site-packages ~/tensorflow
    Activate the virtual environment:
    source ~/tensorflow/bin/activate
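If you are not sure whether the activation worked, you can ask the interpreter itself. The check below is a small sketch using only the standard library; `in_virtualenv` is a hypothetical helper name. It covers both the older virtualenv releases (which set `sys.real_prefix`) and venv or newer virtualenv releases (which make `sys.base_prefix` differ from `sys.prefix`).

```python
import sys


def in_virtualenv():
    """Best-effort check for whether this interpreter runs inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix on the interpreter.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv releases leave sys.base_prefix pointing at the system Python.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    print("inside a virtual environment:", in_virtualenv())
    print("interpreter prefix:", sys.prefix)
```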
  • Install NVIDIA TensorRT
    Install TensorRT from the .deb repository package (the file used below is TensorRT 5.0.0.10 RC for CUDA 9.0):
    sudo dpkg -i nv-tensorrt-repo-ubuntu1604-cuda9.0-trt5.0.0.10-rc-20180906_1-1_amd64.deb
    sudo apt-get update
    sudo apt-get install tensorrt libcudnn7
    sudo apt-get install uff-converter-tf graphsurgeon-tf
    sudo apt-get install libcudnn7=7.3.0.29-1+cuda9.0 libcudnn7-dev=7.3.0.29-1+cuda9.0
    sudo apt-mark hold libcudnn7 libcudnn7-dev
     
    Install PyCUDA:
    pip install 'pycuda>=2017.1.1'
    
     
    Alternatively, install from the tar archive:
    tar -zxvf TensorRT-5.0.0.10.Ubuntu-16.04.4.x86_64-gnu.cuda-10.0.cudnn7.3.tar.gz
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$(pwd)/TensorRT-5.0.0.10/lib
    cd TensorRT-5.0.0.10/python
    sudo pip2 install tensorrt-5.0.0.10-py2.py3-none-any.whl
    cd ../uff
    sudo pip2 install uff-0.5.1-py2.py3-none-any.whl
    cd ../graphsurgeon
    sudo pip2 install graphsurgeon-0.2.2-py2.py3-none-any.whl
  • Install the GPU build of TensorFlow
    Upgrade pip first:
    easy_install -U pip
    Install TensorFlow:
    pip install --upgrade tensorflow-gpu
    If that version gives you trouble, try 1.5 instead:
    pip install tensorflow-gpu==1.5
    
  • Hello World
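    Once everything is installed, run a small program to check that it works.
    # Python
    import tensorflow as tf
    hello = tf.constant('Hello, TensorFlow!')
    sess = tf.Session()
    print(sess.run(hello))
    If GPUs show up in the output, you should be all set.
    Here is my output (showing off a bit):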
    2018-10-30 23:27:53.582391: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Device peer to peer matrix
    2018-10-30 23:27:53.582698: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1126] DMA: 0 1 2 3 4 5 6 7 
    2018-10-30 23:27:53.582708: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1136] 0:   Y N N N N N N N 
    2018-10-30 23:27:53.582713: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1136] 1:   N Y N N N N N N 
    2018-10-30 23:27:53.582718: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1136] 2:   N N Y N N N N N 
    2018-10-30 23:27:53.582722: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1136] 3:   N N N Y N N N N 
    2018-10-30 23:27:53.582727: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1136] 4:   N N N N Y N N N 
    2018-10-30 23:27:53.582740: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1136] 5:   N N N N N Y N N 
    2018-10-30 23:27:53.582751: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1136] 6:   N N N N N N Y N 
    2018-10-30 23:27:53.582758: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1136] 7:   N N N N N N N Y 
    2018-10-30 23:27:53.582770: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: P106-100, pci bus id: 0000:01:00.0, compute capability: 6.1)
    2018-10-30 23:27:53.582778: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:1) -> (device: 1, name: P106-100, pci bus id: 0000:02:00.0, compute capability: 6.1)
    2018-10-30 23:27:53.582785: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:2) -> (device: 2, name: P106-100, pci bus id: 0000:03:00.0, compute capability: 6.1)
    2018-10-30 23:27:53.582791: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:3) -> (device: 3, name: P106-100, pci bus id: 0000:04:00.0, compute capability: 6.1)
    2018-10-30 23:27:53.582797: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:4) -> (device: 4, name: P106-100, pci bus id: 0000:05:00.0, compute capability: 6.1)
    2018-10-30 23:27:53.582803: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:5) -> (device: 5, name: P106-100, pci bus id: 0000:06:00.0, compute capability: 6.1)
    2018-10-30 23:27:53.582809: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:6) -> (device: 6, name: P106-100, pci bus id: 0000:09:00.0, compute capability: 6.1)
    2018-10-30 23:27:53.582815: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:7) -> (device: 7, name: P106-100, pci bus id: 0000:0b:00.0, compute capability: 6.1)
    Hello, TensorFlow!
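With eight cards the log is long, so a quick way to confirm that every GPU was picked up is to pull the device lines out of it. This is only a sketch that matches the log format shown above; other TensorFlow versions word these lines differently, and `gpus_from_log` is a hypothetical helper name.

```python
import re

# Matches lines such as:
#   ... Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: P106-100, ...)
_DEVICE_LINE = re.compile(
    r"Creating TensorFlow device \((/device:GPU:\d+)\) -> \(device: (\d+), name: ([^,]+),"
)


def gpus_from_log(log_text):
    """Return a list of (device string, GPU name) pairs found in the log text."""
    return [(m.group(1), m.group(3)) for m in _DEVICE_LINE.finditer(log_text)]


if __name__ == "__main__":
    sample = (
        "2018-10-30 23:27:53.582770: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] "
        "Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: P106-100, "
        "pci bus id: 0000:01:00.0, compute capability: 6.1)\n"
    )
    for device, name in gpus_from_log(sample):
        print(device, name)
```

To use it on a real run, save the program's stderr to a file and pass the file contents to `gpus_from_log`; the length of the returned list should equal the number of cards installed.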
    
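  • When it won't run
    If everything is installed but the program crashes as soon as you run it with
    tensorflow-gpu illegal instruction (core dumped)
    consider trying Anaconda instead. Download and run the installer from here: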
     
    https://www.anaconda.com/download/#linux
     
    Once it is installed, run the following commands to create and activate a tensorflow environment:
    conda create -n tensorflow
    conda activate tensorflow
    Then install tensorflow-gpu:
    conda install tensorflow-gpu -n tensorflow
    Install the related packages:
    conda install -c anaconda matplotlib cudatoolkit _tflow_190_select PIL keras msgpack
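One common cause of the `illegal instruction` crash is a CPU without AVX support: the official TensorFlow pip binaries from 1.6 onward are built with AVX instructions, which would also explain why falling back to 1.5 can help. The sketch below checks for the flag by reading `/proc/cpuinfo`. It assumes an x86 Linux machine, and `cpu_flags` and `has_avx` are hypothetical helper names; a missing flag is a strong hint, not proof, that this is the cause on your machine.

```python
def cpu_flags(cpuinfo_text):
    """Return the CPU feature flags from the first 'flags' line of /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            _, _, value = line.partition(":")
            return set(value.split())
    return set()


def has_avx(cpuinfo_text):
    """True if the 'avx' feature flag is present."""
    return "avx" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except (IOError, OSError):
        # Not on Linux, or /proc is unavailable.
        text = ""
    print("CPU reports AVX:", has_avx(text))
```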
  • Create a GPU environment
    conda create -n tensorflow_gpuenv tensorflow-gpu
    conda activate tensorflow_gpuenv
    conda install -c anaconda matplotlib cudatoolkit _tflow_190_select PIL keras msgpack 
    Install the remaining packages with pip:
    pip install --upgrade pip
    pip install glob2 opencv-python
    
  • Check the TensorFlow version
    python -c "import tensorflow as tf; print(tf.__version__)"
    