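I installed CentOS 7 because it supports CUDA 10.0. I had heard about TensorFlow and GPU programming for a long time but never had the chance to try them.
So let's try setting up a GPU-enabled TensorFlow environment.
1. Install Docker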
Uninstall old versions
    sudo yum remove docker \
        docker-client \
        docker-client-latest \
        docker-common \
        docker-latest \
        docker-latest-logrotate \
        docker-logrotate \
        docker-selinux \
        docker-engine-selinux \
        docker-engine
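    # Install yum-utils (provides yum-config-manager) and the packages needed by the devicemapper storage driver
    sudo yum install -y yum-utils \
        device-mapper-persistent-data \
        lvm2
    # Add the Docker CE repository
    sudo yum-config-manager \
        --add-repo \
        https://download.docker.com/linux/centos/docker-ce.repo
    # Install Docker CE, start the daemon, and run a test container
    sudo yum install docker-ce
    sudo systemctl start docker
    sudo docker run hello-world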
Manage Docker as a non-root user
    # Create the docker group.
    sudo groupadd docker
    # Add your user to the docker group.
    sudo usermod -aG docker $USER
    # Log out and log back in so that your group membership is re-evaluated.
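Group membership only takes effect in a new login session, so it can help to confirm it before running docker without sudo. A minimal sketch, assuming a hypothetical helper `has_group` (not part of Docker) that looks for a group name in a space-separated list such as the output of `id -nG`:

```shell
# has_group is a hypothetical helper, not a Docker command.
# Usage: has_group "<space-separated group list>" <group>
has_group() {
    for g in $1; do
        [ "$g" = "$2" ] && return 0
    done
    return 1
}

# Check the groups of the current login session.
if has_group "$(id -nG)" docker; then
    echo "docker group is active; docker should work without sudo"
else
    echo "docker group is not active yet; log out and log back in"
fi
```

The helper compares whole words, so a group named `dockerx` would not be mistaken for `docker`.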
2. Install CUDA (in fact, only the CUDA driver needs to be installed)
2.1 Download the CUDA rpm
2.2 Verify You Have a CUDA-Capable GPU
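NVIDIA's installation guide suggests `lspci | grep -i nvidia` for this check. A small sketch that wraps it in a hypothetical helper `has_nvidia_gpu`, which reads `lspci`-style output on standard input (`lspci` comes from the `pciutils` package):

```shell
# has_nvidia_gpu is a hypothetical helper: it succeeds if any input line mentions NVIDIA.
has_nvidia_gpu() {
    grep -qi nvidia
}

if lspci 2>/dev/null | has_nvidia_gpu; then
    echo "NVIDIA device found"
else
    echo "no NVIDIA device listed (or lspci is not installed)"
fi
```

Finding a device only shows that an NVIDIA card is present; whether it is CUDA-capable still has to be checked against NVIDIA's list of supported GPUs.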
2.3 Verify You Have a Supported Version of Linux
    uname -m && cat /etc/*release
2.4 Verify the System Has gcc Installed
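The CUDA installation guide checks this with `gcc --version`. A sketch using a hypothetical helper `require_cmd` that tests whether a command is on the `PATH`:

```shell
# require_cmd is a hypothetical helper: it succeeds if the named command is on PATH.
require_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if require_cmd gcc; then
    gcc --version | head -n 1
else
    echo "gcc not found; on CentOS 7 it can be installed with: sudo yum install gcc"
fi
```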
2.5 Verify the System has the Correct Kernel Headers and Development Packages Installed
    # Show the running kernel version
    uname -r
    # Install the headers and development packages that match it
    sudo yum install kernel-devel-$(uname -r) kernel-headers-$(uname -r)
The NVIDIA CUDA Toolkit is available at http://developer.nvidia.com/cuda-downloads.
    # Install repository meta-data
    sudo rpm --install cuda-repo-<distro>-<version>.<architecture>.rpm
    # Clean Yum repository cache
    sudo yum clean expire-cache
    # Install CUDA
    sudo yum install cuda
    # Add the CUDA 10.0 binaries to PATH
    export PATH=/usr/local/cuda-10.0/bin${PATH:+:${PATH}}
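The `${PATH:+:${PATH}}` form in the `export` line appends `:$PATH` only when `PATH` is already set and non-empty, which avoids a stray trailing colon (an empty `PATH` entry is treated as the current directory). A sketch of the same expansion with a hypothetical helper `prepend_path`, so the behaviour can be seen in isolation:

```shell
# prepend_path is a hypothetical helper mirroring the expansion in the export line.
# Usage: prepend_path <new-dir> <current-value>
prepend_path() {
    current=$2
    printf '%s\n' "$1${current:+:${current}}"
}

prepend_path /usr/local/cuda-10.0/bin "/usr/bin:/bin"   # /usr/local/cuda-10.0/bin:/usr/bin:/bin
prepend_path /usr/local/cuda-10.0/bin ""                # /usr/local/cuda-10.0/bin
```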
3. Install nvidia-docker
    # If you have nvidia-docker 1.0 installed: we need to remove it and all existing GPU containers
    docker volume ls -q -f driver=nvidia-docker | xargs -r -I{} -n1 docker ps -q -a -f volume={} | xargs -r docker rm -f
    sudo yum remove nvidia-docker
    # Add the package repositories
    distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
    curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo | \
        sudo tee /etc/yum.repos.d/nvidia-docker.repo
    # Install nvidia-docker2 and reload the Docker daemon configuration
    sudo yum install -y nvidia-docker2
    sudo pkill -SIGHUP dockerd
    # Test GPU access from inside a container
    docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi
    # Note: the Python snippet uses the TensorFlow 1.x API; on a TensorFlow 2.x image,
    # eager execution is already the default and the call is tf.random.normal
    docker run --runtime=nvidia -it --rm tensorflow/tensorflow:latest-gpu \
        python -c "import tensorflow as tf; tf.enable_eager_execution(); print(tf.reduce_sum(tf.random_normal([1000, 1000])))"
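The `distribution=...` line works by sourcing `/etc/os-release` in a subshell and concatenating its `ID` and `VERSION_ID` fields; on CentOS 7 that gives `centos7`, which selects the matching repository file. A sketch with a hypothetical helper `distro_tag` that takes the file path as an argument, so it can be tried on a sample file:

```shell
# distro_tag is a hypothetical helper; it prints ID followed by VERSION_ID
# from an os-release style file.
distro_tag() {
    # The parentheses run a subshell, so the sourced variables do not leak out.
    ( . "$1" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}

# Try it on a sample file with CentOS 7 style fields.
sample=$(mktemp)
printf 'ID="centos"\nVERSION_ID="7"\n' > "$sample"
distro_tag "$sample"    # centos7
rm -f "$sample"
```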
References
- Docker tutorial http://www.runoob.com/docker/docker-tutorial.html
- Tensorflow https://tensorflow.google.cn/
- nvidia-docker https://github.com/NVIDIA/nvidia-docker