GPU-Accelerated NLP Task (Theano + CUDA)

Tags: theano, gtx

While studying CNNs, I followed Yoon Kim's (2014) paper on using a CNN for text classification. The network structure is simple and works well, but the paper does not report the training time, which is worth looking into.

Yoon Kim's code: https://github.com/yoonkim/CNN_sentence

I trained with the author's source code on my machine; the average training time per cross-validation fold (vertical axis: min/CV) was as follows, for reference:

Machine configuration: Intel(R) Core(TM) i3-4150 CPU @ 3.50GHz, 32 GB RAM, x64

Obviously, training is very slow! On the CPU, a full 10-fold CV takes more than 10 hours. This was confirmed by email with Yoon Kim, who said it really is that slow; no wonder the paper gives no training-time figures. ~.~

To speed this up, one option is multithreading: the convolution layers can be parallelized, but that code is not easy to write :( So I turned to GPU acceleration instead.

Process: 1. install the NVIDIA driver; 2. install and configure CUDA; 3. modify the program to run on the GPU.

1. Install the NVIDIA driver

  (0) Check whether you have a CUDA-capable video card: lspci | grep -i nvidia (see the reference tutorial).

  (1) Download the NVIDIA driver for your video card: http://www.nvidia.com/Download/index.aspx?lang=en-us

My GPU is a GeForce GTX 660 Ti, and the corresponding driver download is NVIDIA-Linux-x86_64-352.63.run.

(2) Add executable permission: sudo chmod +x NVIDIA-Linux-x86_64-352.63.run

(3) Stop the X server: sudo service lightdm stop, then switch to TTY1 with Ctrl+Alt+F1.

(4) Install the driver: sudo ./NVIDIA-Linux-x86_64-352.63.run. Follow the prompts to install; you may also need to set --compat32-libdir.

(5) Restart the X server: sudo service lightdm start.

(6) Verify that the driver installed successfully: cat /proc/driver/nvidia/version

2. Install and configure CUDA

  (1) Installation guide: http://docs.nvidia.com/cuda/cuda-getting-started-guide-for-linux/index.html#ubuntu-installation

(2) Download the CUDA toolkit: https://developer.nvidia.com/cuda-downloads. Select the download that matches your configuration: cuda-repo-ubuntu1404-7-5-local_7.5-18_amd64.deb

(3) Note that the installation commands differ between systems; the following are for Ubuntu 14.04. If you run into problems, the guide above covers them.

sudo dpkg -i cuda-repo-ubuntu1404-7-5-local_7.5-18_amd64.deb
sudo apt-get update
sudo apt-get install cuda

(4) Verify that the toolkit installed successfully: nvcc -V

(5) Configure the paths: vim ~/.bashrc

PATH=$PATH:/usr/local/cuda-7.0/bin
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-7.0/lib64
export PATH
export LD_LIBRARY_PATH

3. Modify the program to run on the GPU

  Following the official Theano documentation: http://deeplearning.net/software/theano/tutorial/using_gpu.html

The following code tests whether CUDA is configured correctly and the GPU can be used.

from theano import function, config, shared, sandbox
import theano.tensor as T
import numpy
import time

vlen = 10 * 30 * 768  # 10 x #cores x # threads per core
iters = 1000

rng = numpy.random.RandomState(22)
x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
f = function([], T.exp(x))
print(f.maker.fgraph.toposort())
t0 = time.time()
for i in xrange(iters):
    r = f()
t1 = time.time()
print("Looping %d times took %f seconds" % (iters, t1 - t0))
print("Result is %s" % (r,))
if numpy.any([isinstance(x.op, T.Elemwise) for x in f.maker.fgraph.toposort()]):
    print('Used the cpu')
else:
    print('Used the gpu')

Save the above code as check_gpu.py and test it with the following commands (the example output below, taken from the Theano tutorial, uses the file name check1.py). The results show whether the GPU can be used; if you get an error, it is probably a problem with the path configuration above.

$ THEANO_FLAGS=mode=FAST_RUN,device=cpu,floatX=float32 python check1.py
[Elemwise{exp,no_inplace}(<TensorType(float32, vector)>)]
Looping 1000 times took 3.06635117531 seconds
Result is [ 1.23178029  1.61879337  1.52278066 ...,  2.20771813  2.29967761  1.62323284]
Used the cpu

$ THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python check1.py
Using gpu device 0: GeForce GTX 580
[GpuElemwise{exp,no_inplace}(<CudaNdarrayType(float32, vector)>), HostFromGpu(GpuElemwise{exp,no_inplace}.0)]
Looping 1000 times took 0.638810873032 seconds
Result is [ 1.23178029  1.61879349  1.52278066 ...,  2.20771813  2.29967761  1.62323296]
Used the gpu
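As an additional sanity check (a minimal sketch of my own, not from the Theano tutorial; the script name flags_check.py is made up), you can print the device and float type that Theano actually picked up from THEANO_FLAGS:

# flags_check.py -- run it with the same THEANO_FLAGS prefix as above,
# since these config values are fixed when theano is imported.
import theano

print("device: %s" % theano.config.device)   # expect 'gpu' or 'gpu0'
print("floatX: %s" % theano.config.floatX)   # expect 'float32'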

Since NVIDIA GPUs are primarily optimized for 32-bit (float32) floating-point computation, the data and variable types in the code have to be changed to float32.
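The usual Theano idiom for this (a generic sketch, not code from Yoon Kim's repository) is to cast NumPy arrays with theano.config.floatX when creating shared variables, so the precision follows the floatX flag:

import numpy
import theano

# NumPy data defaults to float64; cast it to the precision selected by floatX
# (float32 under the flags used here) before wrapping it in a shared variable,
# so it can actually be stored on the GPU.
data = numpy.random.rand(1000, 300)
data = numpy.asarray(data, dtype=theano.config.floatX)
W = theano.shared(data, name='W', borrow=True)
print(W.dtype)  # float32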

Specifically, make the following changes to the code:

(1) process_data.py

In the lines where the embedding matrix W is built, set the dtype explicitly:

W = np.zeros(shape=(vocab_size+1, k), dtype='float32')
W[0] = np.zeros(k, dtype='float32')

After making this change, run the following command to generate the (float32) word vector for each word.

python process_data.py GoogleNews-vectors-negative300.bin
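To confirm the change took effect, you can inspect the pickle that process_data.py writes. This is only a sketch; it assumes the output file is named mr.p and stores [revs, W, W2, word_idx_map, vocab], as in the original CNN_sentence repository.

import cPickle

# Assumption: process_data.py dumps [revs, W, W2, word_idx_map, vocab] to mr.p,
# as in the original repository.
with open('mr.p', 'rb') as f:
    revs, W, W2, word_idx_map, vocab = cPickle.load(f)

print("W dtype: %s, W2 dtype: %s" % (W.dtype, W2.dtype))  # both should be float32
print("W shape: %s" % (W.shape,))                         # (vocab_size + 1, 300)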

(2) conv_net_sentence.py

Add allow_input_downcast=True so that inputs produced as float64 elsewhere in the program are automatically cast down to float32 when passed to the compiled functions.
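A minimal standalone illustration of what the flag does (not part of conv_net_sentence.py): without allow_input_downcast=True, Theano raises a TypeError rather than silently downcasting float64 input to a float32 graph.

import numpy
import theano
import theano.tensor as T

x = T.fvector('x')                                  # graph expects float32
f = theano.function([x], x * 2, allow_input_downcast=True)

data = numpy.random.rand(5)                         # float64 by default
print(f(data))                                      # accepted: downcast to float32
# Without allow_input_downcast=True, the same call raises a TypeError because
# float64 -> float32 would lose precision.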

set_zero = theano.function([zero_vec_tensor],
    updates=[(Words, T.set_subtensor(Words[0,:], zero_vec_tensor))],
    allow_input_downcast=True)

# line 131
val_model = theano.function([index], classifier.errors(y),
    givens={
        x: val_set_x[index * batch_size: (index + 1) * batch_size],
        y: val_set_y[index * batch_size: (index + 1) * batch_size]},
    allow_input_downcast=True)

# line 137
test_model = theano.function([index], classifier.errors(y),
    givens={
        x: train_set_x[index * batch_size: (index + 1) * batch_size],
        y: train_set_y[index * batch_size: (index + 1) * batch_size]},
    allow_input_downcast=True)

# line 141
train_model = theano.function([index], cost, updates=grad_updates,
    givens={
        x: train_set_x[index * batch_size: (index + 1) * batch_size],
        y: train_set_y[index * batch_size: (index + 1) * batch_size]},
    allow_input_downcast=True)

# line 155
test_model_all = theano.function([x, y], test_error, allow_input_downcast=True)

(3) Run the program

THEANO_FLAGS=mode=FAST_RUN,device=gpu0,floatX=float32,warn_float64=raise python conv_net_sentence.py -static -word2vec
THEANO_FLAGS=mode=FAST_RUN,device=gpu0,floatX=float32,warn_float64=raise python conv_net_sentence.py -nonstatic -word2vec
THEANO_FLAGS=mode=FAST_RUN,device=gpu0,floatX=float32,warn_float64=raise python conv_net_sentence.py -nonstatic -rand

(4) The result was amazing: training is roughly 20x faster.

This was my first time running on a GPU; if there are any oversights in the process above, corrections are welcome.

References:

1. Theano configuration: http://deeplearning.net/software/theano/library/config.html

2. Installing Theano + CUDA on Ubuntu: http://www.linuxidc.com/linux/2014-10/107503.htm
