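I installed CentOS 7 partly because it supports CUDA 10.0. I had heard about TensorFlow and GPU programming for a long time but never had the chance to try them.
So let's try setting up a TensorFlow environment with GPU support.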
1. Install Docker
Uninstall old versions

    sudo yum remove docker \
                      docker-client \
                      docker-client-latest \
                      docker-common \
                      docker-latest \
                      docker-latest-logrotate \
                      docker-logrotate \
                      docker-selinux \
                      docker-engine-selinux \
                      docker-engine
 

    # Install the packages that the repository setup needs
    sudo yum install -y yum-utils \
      device-mapper-persistent-data \
      lvm2
    # Add the Docker CE repository
    sudo yum-config-manager \
        --add-repo \
        https://download.docker.com/linux/centos/docker-ce.repo
    # Install Docker CE, start the daemon, and verify the installation
    sudo yum install docker-ce
    sudo systemctl start docker
    sudo docker run hello-world
 
Manage Docker as a non-root user

    # Create the docker group.
    sudo groupadd docker
    # Add your user to the docker group.
    sudo usermod -aG docker $USER
    # Log out and log back in so that your group membership is re-evaluated.
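After logging back in, it can be worth confirming that the group change actually took effect. The following is only a minimal sketch and not part of the original steps: the helper name `in_group` is hypothetical, and it simply scans the group names printed by `id -nG`.

```shell
# Hypothetical helper: succeeds if the group name $1 appears in the
# space-separated group list $2.
in_group() {
  wanted="$1"
  for g in $2; do
    if [ "$g" = "$wanted" ]; then
      return 0
    fi
  done
  return 1
}

# id -nG prints the names of all groups of the current session's user.
if in_group docker "$(id -nG)"; then
  echo "docker group is active; docker should work without sudo"
else
  echo "docker group not active yet; log out and back in (or run 'newgrp docker')"
fi
```

If the second message appears right after `usermod`, that is expected: group membership is only re-read when a new login session starts.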
 
2. Install CUDA (in fact, installing only the CUDA driver is enough)
2.1 Download the CUDA rpm
2.2 Verify You Have a CUDA-Capable GPU
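No command is listed for this step. NVIDIA's Linux installation guide checks for a GPU with `lspci | grep -i nvidia`; the sketch below wraps that idea. The helper name `has_nvidia_gpu` is hypothetical, and it only looks for the word "NVIDIA" in `lspci`-style output on standard input, so whether the card found is CUDA-capable still has to be checked against NVIDIA's list of supported GPUs.

```shell
# Hypothetical helper: succeeds if any line on stdin mentions NVIDIA
# (case-insensitive), as lspci output does for an NVIDIA card.
has_nvidia_gpu() {
  grep -qi nvidia
}

# lspci comes from the pciutils package; skip the check if it is missing.
if command -v lspci >/dev/null 2>&1; then
  if lspci | has_nvidia_gpu; then
    echo "NVIDIA device found"
  else
    echo "no NVIDIA device listed by lspci"
  fi
else
  echo "lspci not found; try: sudo yum install pciutils"
fi
```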
2.3 Verify You Have a Supported Version of Linux

    uname -m && cat /etc/*release
 
2.4 Verify the System Has gcc Installed
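This step also lists no command. Running `gcc --version` is the usual way to check; the sketch below guards it so that a missing compiler produces a hint instead of an error. The helper name `have_cmd` is hypothetical.

```shell
# Hypothetical helper: succeeds if the command named $1 is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd gcc; then
  # The first line of the output carries the version number.
  gcc --version | head -n 1
else
  echo "gcc not found; try: sudo yum install gcc"
fi
```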
2.5 Verify the System has the Correct Kernel Headers and Development Packages Installed

    uname -r
    sudo yum install kernel-devel-$(uname -r) kernel-headers-$(uname -r)
 
The NVIDIA CUDA Toolkit is available at http://developer.nvidia.com/cuda-downloads.

    # Install repository meta-data
    sudo rpm --install cuda-repo-<distro>-<version>.<architecture>.rpm
    # Clean Yum repository cache
    sudo yum clean expire-cache
    # Install CUDA
    sudo yum install cuda
    # Add the CUDA 10.0 binaries to PATH
    export PATH=/usr/local/cuda-10.0/bin${PATH:+:${PATH}}
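The `${PATH:+:${PATH}}` part of the export line may look odd. It expands to a colon followed by the old `PATH` only when `PATH` is already set and non-empty, so an empty `PATH` does not leave a trailing colon (which the shell would treat as the current directory). A small sketch of the same expansion, using a hypothetical helper `prepend_dir`:

```shell
# Hypothetical helper: print directory $1 prepended to the path list $2,
# using the same ${var:+...} expansion as the export line.
prepend_dir() {
  dir="$1"
  current="$2"
  printf '%s\n' "${dir}${current:+:${current}}"
}

prepend_dir /usr/local/cuda-10.0/bin "/usr/bin:/bin"   # /usr/local/cuda-10.0/bin:/usr/bin:/bin
prepend_dir /usr/local/cuda-10.0/bin ""                # /usr/local/cuda-10.0/bin
```

The export only lasts for the current shell session; adding the line to a file such as `~/.bashrc` is one common way to make it persistent.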
 
3. Install nvidia-docker

    # If you have nvidia-docker 1.0 installed: we need to remove it and all existing GPU containers
    docker volume ls -q -f driver=nvidia-docker | xargs -r -I{} -n1 docker ps -q -a -f volume={} | xargs -r docker rm -f
    sudo yum remove nvidia-docker
    # Add the package repositories
    distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
    curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo | \
      sudo tee /etc/yum.repos.d/nvidia-docker.repo
    # Install nvidia-docker2 and reload the Docker daemon configuration
    sudo yum install -y nvidia-docker2
    sudo pkill -SIGHUP dockerd
    # Test that containers can see the GPU
    docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi
    # Note: this one-liner uses the TensorFlow 1.x API; on 2.x images,
    # use tf.random.normal and drop tf.enable_eager_execution()
    docker run --runtime=nvidia -it --rm tensorflow/tensorflow:latest-gpu \
       python -c "import tensorflow as tf; tf.enable_eager_execution(); print(tf.reduce_sum(tf.random_normal([1000, 1000])))"
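The `distribution=...` line sources `/etc/os-release` in a subshell and concatenates its `ID` and `VERSION_ID` fields; on CentOS 7 this gives `centos7`, which then selects the repository file in the `curl` URL. The sketch below reproduces that step against a sample file so it can be tried on any machine. The helper name `distro_id` and the temporary sample file are illustrative assumptions, not part of nvidia-docker.

```shell
# Hypothetical helper: print ID followed by VERSION_ID from an
# os-release style file ($1, default /etc/os-release). The subshell
# keeps the sourced variables out of the calling shell.
distro_id() {
  ( . "${1:-/etc/os-release}" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}

# Try it on a sample file with the values CentOS 7 ships.
sample=$(mktemp)
printf 'ID="centos"\nVERSION_ID="7"\n' > "$sample"
distro_id "$sample"    # prints: centos7
rm -f "$sample"
```

Calling `distro_id` without an argument reads the real `/etc/os-release`, which is a quick way to see which repository path the install commands will request.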
 
References
- Docker tutorial http://www.runoob.com/docker/docker-tutorial.html
- TensorFlow https://tensorflow.google.cn/
- nvidia-docker https://github.com/NVIDIA/nvidia-docker