Hinton Open-Sources CapsNet


Much of the current theory of deep learning was established by Geoffrey Hinton around 2007, but he now believes that the core CNN design, in which feature-extraction (convolutional) layers alternate with subsampling (pooling) layers that lump together the outputs of adjacent feature detectors of the same type, is deeply problematic.
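
To make the concern concrete, here is a small NumPy illustration of my own (not from the paper): max pooling keeps only the strongest response in each window, so two inputs whose detector responses are arranged very differently can pool to exactly the same output, discarding the spatial relationships between features.

import numpy as np

def max_pool_2x2(x):
    # Naive 2x2 max pooling with stride 2 over a 2D feature map.
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# Two feature maps with the same detector responses in different positions...
a = np.array([[9, 0, 0, 0],
              [0, 0, 0, 7],
              [0, 5, 0, 0],
              [0, 0, 0, 3]])
b = np.array([[0, 0, 7, 0],
              [0, 9, 0, 0],
              [0, 0, 3, 0],
              [5, 0, 0, 0]])

# ...pool to identical outputs, so the pose information is lost.
print(max_pool_2x2(a))
print(max_pool_2x2(b))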


Last September, in a media interview in Toronto, Hinton declared that he was ready to abandon backpropagation and rebuild artificial intelligence from scratch. In October, the long-awaited capsule paper, "Dynamic Routing Between Capsules," was finally unveiled.



In the paper, Hinton defines a capsule as a group of neurons whose activity vector represents the instantiation parameters of a specific type of entity. The experiments show that a discriminatively trained capsule system achieves state-of-the-art performance on the MNIST handwritten digit dataset and is considerably better than a CNN at recognizing highly overlapping digits.
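
For readers new to capsules, below is a minimal NumPy sketch of the squashing nonlinearity described in the paper (the function name and example values are mine): the output vector's length is kept below 1 so it can be read as the probability that the entity is present, while its orientation carries the instantiation parameters.

import numpy as np

def squash(s, axis=-1, eps=1e-8):
    # v = (|s|^2 / (1 + |s|^2)) * (s / |s|): short vectors shrink toward
    # zero, long vectors approach unit length.
    squared_norm = np.sum(np.square(s), axis=axis, keepdims=True)
    norm = np.sqrt(squared_norm + eps)
    return (squared_norm / (1.0 + squared_norm)) * (s / norm)

v = squash(np.array([2.0, 1.0, 2.0]))
print(np.linalg.norm(v))       # ~0.9: the entity is probably present
print(v / np.linalg.norm(v))   # unit direction: the instantiation parameters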


Recently, Sara Sabour, the paper's first author, finally released the code on GitHub. Five days after launch, the project had 217 stars and had been forked 14,218 times. Let's take a look at Sara Sabour's open-source code.


This repository contains the code for the capsule model used in the following paper:


"Dynamic Routing between Capsules" by Sara Sabour, Nickolas frosst, Geoffrey E. Hinton.


Requirements


TensorFlow (see http://www.tensorflow.org for installation and upgrade instructions)

NumPy (refer to http://www.numpy.org/)

GPU
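
Before running the tests, a quick sanity check of my own (assuming a TensorFlow 1.x installation, which is what this code targets) can confirm that TensorFlow, NumPy, and a usable GPU are all visible:

import numpy as np
import tensorflow as tf

print("TensorFlow:", tf.__version__)
print("NumPy:", np.__version__)
# In TF 1.x this returns True only if a CUDA-enabled GPU can be used.
print("GPU available:", tf.test.is_gpu_available())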


Run the test code to verify that your setup is correct, for example:


python layers_test.py


Quick MNIST test results:


Download and extract the MNIST records to $DATA_DIR/ from: https://storage.googleapis.com/capsule_toronto/mnist_data.tar.gz

Download and extract the MNIST model checkpoint to $CKPT_DIR, then run:


python experiment.py --data_dir=$DATA_DIR/mnist_data/ --train=false \
--summary_dir=/tmp/ --checkpoint=$CKPT_DIR/mnist_checkpoint/model.ckpt-1


Quick CIFAR10 Ensemble test results:


Download and extract the CIFAR10 binary version to $DATA_DIR/ from: https://www.cs.toronto.edu/~kriz/cifar.html

Download and extract the CIFAR10 model checkpoint to $CKPT_DIR from: https://storage.googleapis.com/capsule_toronto/cifar_checkpoints.tar.gz

Pass the directory into which the binaries were extracted as data_dir ($DATA_DIR):


python experiment.py --data_dir=$DATA_DIR --train=false --dataset=cifar10 \
--hparams_override=num_prime_capsules=64,padding=same,leaky=true,remake=false \
--summary_dir=/tmp/ --checkpoint=$CKPT_DIR/cifar/cifar{}/model.ckpt-600000 \
--num_trials=7
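
The --hparams_override flag takes a comma-separated list of name=value pairs. As a rough sketch of how such a string is typically applied on top of default hyperparameters in TF 1.x (the default values below are illustrative, not the repository's actual ones, and the use of tf.contrib.training.HParams here is my assumption):

import tensorflow as tf

# Illustrative defaults only -- the real defaults live in the experiment code.
hparams = tf.contrib.training.HParams(
    num_prime_capsules=32,
    padding="VALID",
    leaky=False,
    remake=True,
)
# HParams.parse understands the same comma-separated name=value syntax.
hparams.parse("num_prime_capsules=64,padding=same,leaky=true,remake=false")
print(hparams.values())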


Sample CIFAR10 Training Command:


python experiment.py --data_dir=$DATA_DIR --dataset=cifar10 --max_steps=600000 \
--hparams_override=num_prime_capsules=64,padding=same,leaky=true,remake=false \
--summary_dir=/tmp/


Sample MNIST full training command:


python experiment.py --data_dir=$DATA_DIR/mnist_data/ --max_steps=300000 \
--summary_dir=/tmp/attempt0/


Sample MNIST baseline training command:


python experiment.py --data_dir=$DATA_DIR/mnist_data/ --max_steps=300000 \
--summary_dir=/tmp/attempt1/ --model=baseline


To test on the validation set while training the above models:


Notes for running validation continuously alongside training:


Pass --validate=true during training as well.

A total of 2 GPUs are required: one for training and one for validation.

If training and validation run on the same machine, you need to limit the memory each job consumes, because TensorFlow allocates all of the available memory to the first job and the second job cannot start.
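
One way to impose such a limit, sketched here under the assumption that the jobs use standard TF 1.x sessions (this is not the repository's own mechanism), is to cap the fraction of GPU memory each process may claim:

import tensorflow as tf

# Cap this process at roughly 45% of the GPU's memory so that a second
# process (e.g. the validation job) can still start on the same machine.
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.45)
config = tf.ConfigProto(gpu_options=gpu_options)

with tf.Session(config=config) as sess:
    sess.run(tf.global_variables_initializer())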


Testing/training on MultiMNIST:


--num_targets=2
--data_dir=$DATA_DIR/multitest_6shifted_mnist.tfrecords@10


The code that generates the MultiMNIST/MNIST records is located in input_data/mnist/mnist_shift.py.
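
The @10 suffix in the data_dir flag presumably indicates that the records are split across 10 shard files. The sketch below shows one common way such a pattern is expanded and read in TF 1.x; the exact shard-naming convention is my assumption, not taken from the repository:

import tensorflow as tf

def expand_shards(pattern):
    # Expand "name.tfrecords@N" into "name.tfrecords-00000-of-0000N"-style
    # shard names (assumed convention, not read from the repository).
    base, num = pattern.rsplit("@", 1)
    num = int(num)
    return ["%s-%05d-of-%05d" % (base, i, num) for i in range(num)]

filenames = expand_shards("multitest_6shifted_mnist.tfrecords@10")
dataset = tf.data.TFRecordDataset(filenames)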

