Setting up a Deep Learning Machine from Scratch (Software)
A detailed guide to setting up your machine for deep learning. Includes instructions to install drivers, tools and various deep learning frameworks. This was tested on a 64-bit machine with an Nvidia Titan X, running Ubuntu 14.04.
There are several great guides with a similar goal. Some are limited in scope, while others are not up to date. This guide is based on (with some portions copied verbatim from):
- Caffe Installation for Ubuntu
- Running a deep learning Dream machine
Table of Contents
- Basics
- Nvidia Drivers
- CUDA
- CuDNN
- Python Packages
- TensorFlow
- OpenBLAS
- Common Tools
- Caffe
- Theano
- Keras
- Torch
- X2Go
Basics
First, open a terminal and run the following commands to make sure your OS is up to date
sudo apt-get update
sudo apt-get upgrade
sudo apt-get install build-essential cmake g++ gfortran git pkg-config python-dev software-properties-common wget
sudo apt-get autoremove
sudo rm -rf /var/lib/apt/lists/*
Nvidia Drivers
Find your graphics card model
lspci | grep -i nvidia
Go to the Nvidia website and find the latest drivers for your graphics card and system setup. You can download the driver from the website and install it, but doing so makes updating to newer drivers and uninstalling them a little messy. Also, doing this requires you to quit your X server session and install from a terminal session, which is a hassle.
We'll install the drivers using apt-get. Check whether your latest driver exists in the "Proprietary GPU Drivers" PPA. Note that the latest drivers are not necessarily the most stable, so it is advisable to install the driver version recommended on that page. Add the "Proprietary GPU Drivers" PPA repository. At the time of this writing, the latest version was 361.42; however, the recommended version was 352:
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update
sudo apt-get install nvidia-352
Restart your system
sudo shutdown -r now
Check to ensure the correct version of NVIDIA drivers is installed
cat /proc/driver/nvidia/version
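If you would rather script this check than read the output by eye, the following is a minimal sketch. It assumes the version is the first dotted number on the "NVRM version" line of that file; the function names driver_version and driver_major_is are hypothetical helpers, not part of any Nvidia tool.

```shell
# Sketch: extract the driver version from the NVRM line of the version file.
# Assumption: the version is the first dotted number on that line.
driver_version() {
    local file="${1:-/proc/driver/nvidia/version}"
    grep -m1 'NVRM version' "$file" 2>/dev/null \
        | grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' \
        | head -n1
}

# Hypothetical helper: succeeds when the major version equals the expected one.
driver_major_is() {
    local expected="$1"
    local file="${2:-/proc/driver/nvidia/version}"
    local version
    version="$(driver_version "$file")"
    [ "${version%%.*}" = "$expected" ]
}

if driver_major_is 352; then
    echo "Driver 352 is loaded: $(driver_version)"
else
    echo "Driver 352 not detected (found: $(driver_version || true))"
fi
```

Both helpers accept an alternative file path as their last argument, which makes it possible to try them on a saved copy of the version file.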
CUDA
Download CUDA 7.5 from Nvidia. Go to the Downloads directory and install CUDA
sudo dpkg -i cuda-repo-ubuntu1404*amd64.deb
sudo apt-get update
sudo apt-get install cuda
Add CUDA to the environment variables
echo 'export PATH=/usr/local/cuda/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc
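Running the echo commands a second time adds duplicate lines to ~/.bashrc. One way around that is to append a line only when it is not already present. The sketch below is one possible approach; append_once is a hypothetical helper, and the demo writes to a scratch file rather than to your real ~/.bashrc.

```shell
# Sketch: append a line to a file only if that exact line is not already there.
append_once() {
    local line="$1"
    local file="$2"
    touch "$file"
    # -x matches whole lines only, -F treats the text literally (no regex).
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Demo on a scratch file; point rc_file at ~/.bashrc to use it for real.
rc_file="$(mktemp)"
append_once 'export PATH=/usr/local/cuda/bin:$PATH' "$rc_file"
append_once 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH' "$rc_file"
# Calling it again leaves the file unchanged.
append_once 'export PATH=/usr/local/cuda/bin:$PATH' "$rc_file"
cat "$rc_file"
```

The single quotes matter here: they keep $PATH unexpanded so that it is evaluated each time the shell starts, not once when the line is written.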
Check to ensure the correct version of CUDA is installed
nvcc -V
Restart your computer
sudo shutdown -r now
Checking your CUDA installation (Optional)
Note: -j $(($(nproc) + 1)) runs the make command in parallel, using one more job than the number of cores in your machine, so the compilation is faster
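To make the arithmetic in that flag explicit, here is a small sketch that computes the job count step by step. The variable names cores and njobs are only for illustration, and the make command is echoed rather than run.

```shell
# nproc (GNU coreutils) prints the number of available processing units.
cores="$(nproc)"

# One extra job keeps the CPU busy while another job waits on disk I/O.
njobs=$(( cores + 1 ))

# Echo only: run this inside a directory with a Makefile to build for real.
echo "cores=${cores}, equivalent command: make -j ${njobs}"
```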
CuDNN
CuDNN is a GPU-accelerated library for DNNs. It can help speed up execution in many cases. To download the CuDNN library, you need to register on the Nvidia website at https://developer.nvidia.com/cudnn. Approval can take anywhere from a few hours to a couple of working days. Once your registration is approved, download CuDNN v4 for Linux. The latest version is CuDNN v5; however, not all toolkits support it yet.
Extract and copy the files
cd ~/Downloads/
tar xvf cudnn*.tgz
cd cuda
sudo cp */*.h /usr/local/cuda/include/
sudo cp */libcudnn* /usr/local/cuda/lib64/
sudo chmod a+r /usr/local/cuda/lib64/libcudnn*
Check
- You can check that everything is good so far using the nvidia-smi command. This should output some stats about your GPU
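nvidia-smi reports on the driver and the GPU, not on CuDNN. To confirm which CuDNN version was copied into place, a rough sketch is to read the version macros from the installed header. This assumes cudnn.h defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL as plain "#define NAME value" lines; cudnn_version is a hypothetical helper name.

```shell
# Sketch: read the CuDNN version from the installed header.
# Assumption: the header contains "#define CUDNN_MAJOR <n>" style lines.
cudnn_version() {
    local header="${1:-/usr/local/cuda/include/cudnn.h}"
    [ -r "$header" ] || return 1
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { if (major == "") exit 1; print major "." minor "." patch }
    ' "$header"
}

if version="$(cudnn_version)"; then
    echo "CuDNN header version: ${version}"
else
    echo "cudnn.h not found or unreadable under /usr/local/cuda/include"
fi
```

If the reported major version is not 4, the wrong archive was probably extracted.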
Python Packages
Install some useful Python packages using apt-get. There are some version incompatibilities when using pip install with TensorFlow (see https://github.com/tensorflow/tensorflow/issues/2034)
sudo apt-get update && sudo apt-get install -y python-numpy python-scipy python-nose python-h5py python-skimage python-matplotlib python-pandas python-sklearn python-sympy
sudo apt-get clean && sudo apt-get autoremove
sudo rm -rf /var/lib/apt/lists/*
TensorFlow
This installs v0.8 with GPU support. The instructions below are from here
sudo apt-get install python-pip python-dev
sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.8.0-cp27-none-linux_x86_64.whl
Run a test to ensure your TensorFlow installation is successful. When you execute the import command, there should be no warnings or errors.
python
>>> import tensorflow as tf
>>> exit()
OpenBLAS
OpenBLAS is a linear algebra library and is faster than ATLAS. This step is optional, but note that some of the following steps assume that OpenBLAS is installed. You'll need to install gfortran to compile it.
mkdir ~/git
cd ~/git
git clone https://github.com/xianyi/OpenBLAS.git
cd OpenBLAS
make FC=gfortran -j $(($(nproc) + 1))
sudo make PREFIX=/usr/local install
Add the path to your LD_LIBRARY_PATH variable
echo 'export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH' >> ~/.bashrc
Common Tools
Caffe
The following instructions are from here. The first step is to install the prerequisites
sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev protobuf-compiler
sudo apt-get install --no-install-recommends libboost-all-dev
sudo apt-get install libgflags-dev libgoogle-glog-dev liblmdb-dev
Clone the Caffe repo
cd ~/git
git clone https://github.com/BVLC/caffe.git
cd caffe
cp Makefile.config.example Makefile.config
If you installed CuDNN, uncomment the USE_CUDNN := 1 line in Makefile.config
sed -i 's/# USE_CUDNN := 1/USE_CUDNN := 1/' Makefile.config
If you installed OpenBLAS, change the value of the BLAS parameter to open
sed -i 's/BLAS := atlas/BLAS := open/' Makefile.config
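A sed substitution that matches nothing exits silently, so it is worth confirming that the edits took effect before starting a long build. The following sketch applies both edits and then checks for the resulting lines. configure_caffe is a hypothetical helper, and the demo runs on a scratch file that mimics the two relevant lines and does not touch your real Makefile.config.

```shell
# Sketch: apply both Makefile.config edits, then verify that they landed.
configure_caffe() {
    local config="$1"
    # -i.bak edits in place and keeps a backup copy next to the file.
    sed -i.bak \
        -e 's/^# USE_CUDNN := 1/USE_CUDNN := 1/' \
        -e 's/^BLAS := atlas/BLAS := open/' \
        "$config"
    grep -q '^USE_CUDNN := 1' "$config" && grep -q '^BLAS := open' "$config"
}

# Demo on a scratch file with the two lines in their unedited form (assumed).
demo_config="$(mktemp)"
printf '%s\n' '# USE_CUDNN := 1' 'BLAS := atlas' > "$demo_config"

if configure_caffe "$demo_config"; then
    echo "config updated:"
    cat "$demo_config"
else
    echo "expected lines not found; inspect the file by hand"
fi
```

To use it on the real file, run configure_caffe Makefile.config from the caffe directory. The original is preserved as Makefile.config.bak.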
Install the requirements, build Caffe, build the tests, run the tests and ensure that all tests pass. Note that all this takes a while
sudo pip install -r python/requirements.txt
make all -j $(($(nproc) + 1))
make test -j $(($(nproc) + 1))
make runtest -j $(($(nproc) + 1))
Build PyCaffe, the Python interface to Caffe
make pycaffe -j $(($(nproc) + 1))
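Add Caffe to your environment variables. Run this from the caffe directory; the first line uses double quotes so that $(pwd) is expanded now, not every time a shell starts
echo "export CAFFE_ROOT=$(pwd)" >> ~/.bashrc
echo 'export PYTHONPATH=$CAFFE_ROOT/python:$PYTHONPATH' >> ~/.bashrc
source ~/.bashrc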
Test to ensure that your Caffe installation is successful. There should be no warnings/errors when the import command is executed.
ipython
>>> import caffe
>>> exit()
Theano
Install the prerequisites and install Theano. These instructions are sourced from here
sudo apt-get install python-numpy python-scipy python-dev python-pip python-nose g++ python-pygments python-sphinx python-nose
sudo pip install Theano
Test your Theano installation. There should be no warnings/errors when the import command is executed.
python
>>> import theano
>>> exit()
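The interactive import tests for TensorFlow, Caffe and Theano can also be run non-interactively in one pass. The sketch below only checks whether each import exits successfully; it does not inspect warnings, so it complements the manual check but does not replace it. check_import and the PYTHON override variable are hypothetical names introduced for this example.

```shell
# Sketch: report whether a Python module can be imported.
# PYTHON is a hypothetical override for the interpreter name (default: python).
check_import() {
    local module="$1"
    "${PYTHON:-python}" -c "import ${module}" >/dev/null 2>&1
}

for module in tensorflow caffe theano; do
    if check_import "$module"; then
        echo "ok:      ${module}"
    else
        echo "missing: ${module}"
    fi
done
```

A "missing" result for caffe often just means that PYTHONPATH has not been picked up yet; open a new terminal or run source ~/.bashrc and try again.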
Keras
Torch
The instructions to install Torch below are sourced from here. The installation takes a little while
git clone https://github.com/torch/distro.git ~/git/torch --recursive
cd ~/git/torch; bash install-deps; ./install.sh
X2Go
- If your deep learning machine is not your primary work desktop, it helps to be able to access it remotely. X2Go is a fantastic remote access solution. You can install the X2Go server on your Ubuntu machine using the instructions below.
sudo apt-get install software-properties-common
sudo add-apt-repository ppa:x2go/stable
sudo apt-get update
sudo apt-get install x2goserver x2goserver-xsession
- X2Go does not support the Unity desktop environment (the default in Ubuntu). I have found XFCE to work pretty well. More details on the supported environments here
sudo apt-get update
sudo apt-get install -y xfce4 xfce4-goodies xubuntu-desktop
- Find the IP of your machine using
hostname -I
- You can install a client on your main machine to connect to your deep learning server using the above IP. More instructions here depending on your client OS
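On a machine with several network interfaces, hostname -I prints more than one address on a single line. As a small sketch, the first address can be picked out as shown below. first_ip is a hypothetical helper, and the first address is not necessarily the one reachable from your client, so treat the result as a starting point.

```shell
# Sketch: print the first whitespace-separated field read from standard input.
first_ip() {
    awk '{ print $1; exit }'
}

# hostname -I lists the addresses assigned to this host, separated by spaces.
server_ip="$(hostname -I 2>/dev/null | first_ip || true)"
echo "Try connecting the X2Go client to: ${server_ip:-<no address found>}"
```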