I've recently been working with both MXNet and TensorFlow. I've used TensorFlow more: it generally runs smoothly, the documentation is rich, and once you're familiar with the API, writing code is no problem.
Today I compared the two platforms by running MNIST. TensorFlow was ten to twenty times slower than MXNet: MXNet finished in about half a minute, while TensorFlow took 13 minutes.
How do you run it in MXNet?

    cd mxnet/example/image-classification
    python train_mnist.py

I'm using the latest version of MXNet. The script downloads the dataset automatically, and then the training log starts scrolling by.
Let's look at how this script is written, to get a feel for the MXNet programming model:
    import find_mxnet
    import mxnet as mx
    import argparse
    import os, sys
    import train_model

    def _download(data_dir):
        if not os.path.isdir(data_dir):
            os.system("mkdir " + data_dir)
        os.chdir(data_dir)
        if (not os.path.exists('train-images-idx3-ubyte')) or \
           (not os.path.exists('train-labels-idx1-ubyte')) or \
           (not os.path.exists('t10k-images-idx3-ubyte')) or \
           (not os.path.exists('t10k-labels-idx1-ubyte')):
            os.system("wget http://data.dmlc.ml/mxnet/data/mnist.zip")
            os.system("unzip -u mnist.zip; rm mnist.zip")
        os.chdir("..")

    def get_loc(data, attr={'lr_mult': '0.01'}):
        """
        the localisation network in lenet-stn; it will increase acc by about 1%
        when num-epoch >= 15
        """
        loc = mx.symbol.Convolution(data=data, num_filter=30, kernel=(5, 5), stride=(2, 2))
        loc = mx.symbol.Activation(data=loc, act_type='relu')
        loc = mx.symbol.Pooling(data=loc, kernel=(2, 2), stride=(2, 2), pool_type='max')
        loc = mx.symbol.Convolution(data=loc, num_filter=60, kernel=(3, 3), stride=(1, 1), pad=(1, 1))
        loc = mx.symbol.Activation(data=loc, act_type='relu')
        loc = mx.symbol.Pooling(data=loc, global_pool=True, kernel=(2, 2), pool_type='avg')
        loc = mx.symbol.Flatten(data=loc)
        loc = mx.symbol.FullyConnected(data=loc, num_hidden=6, name="stn_loc", attr=attr)
        return loc

    def get_mlp():
        """
        multi-layer perceptron
        """
        data = mx.symbol.Variable('data')
        fc1 = mx.symbol.FullyConnected(data=data, name='fc1', num_hidden=128)
        act1 = mx.symbol.Activation(data=fc1, name='relu1', act_type="relu")
        fc2 = mx.symbol.FullyConnected(data=act1, name='fc2', num_hidden=64)
        act2 = mx.symbol.Activation(data=fc2, name='relu2', act_type="relu")
        fc3 = mx.symbol.FullyConnected(data=act2, name='fc3', num_hidden=10)
        mlp = mx.symbol.SoftmaxOutput(data=fc3, name='softmax')
        return mlp

    def get_lenet(add_stn=False):
        """
        LeCun, Yann, Leon Bottou, Yoshua Bengio, and Patrick
        Haffner. "Gradient-based learning applied to document recognition."
        Proceedings of the IEEE (1998)
        """
        data = mx.symbol.Variable('data')
        if add_stn:
            data = mx.sym.SpatialTransformer(data=data, loc=get_loc(data), target_shape=(28, 28),
                                             transform_type="affine", sampler_type="bilinear")
        # first conv
        conv1 = mx.symbol.Convolution(data=data, kernel=(5, 5), num_filter=20)
        tanh1 = mx.symbol.Activation(data=conv1, act_type="tanh")
        pool1 = mx.symbol.Pooling(data=tanh1, pool_type="max",
                                  kernel=(2, 2), stride=(2, 2))
        # second conv
        conv2 = mx.symbol.Convolution(data=pool1, kernel=(5, 5), num_filter=50)
        tanh2 = mx.symbol.Activation(data=conv2, act_type="tanh")
        pool2 = mx.symbol.Pooling(data=tanh2, pool_type="max",
                                  kernel=(2, 2), stride=(2, 2))
        # first fully connected
        flatten = mx.symbol.Flatten(data=pool2)
        fc1 = mx.symbol.FullyConnected(data=flatten, num_hidden=500)
        tanh3 = mx.symbol.Activation(data=fc1, act_type="tanh")
        # second fully connected
        fc2 = mx.symbol.FullyConnected(data=tanh3, num_hidden=10)
        # loss
        lenet = mx.symbol.SoftmaxOutput(data=fc2, name='softmax')
        return lenet

    def get_iterator(data_shape):
        def get_iterator_impl(args, kv):
            data_dir = args.data_dir
            if '://' not in args.data_dir:
                _download(args.data_dir)
            flat = False if len(data_shape) == 3 else True

            train = mx.io.MNISTIter(
                image=data_dir + "train-images-idx3-ubyte",
                label=data_dir + "train-labels-idx1-ubyte",
                input_shape=data_shape,
                batch_size=args.batch_size,
                shuffle=True,
                flat=flat,
                num_parts=kv.num_workers,
                part_index=kv.rank)

            val = mx.io.MNISTIter(
                image=data_dir + "t10k-images-idx3-ubyte",
                label=data_dir + "t10k-labels-idx1-ubyte",
                input_shape=data_shape,
                batch_size=args.batch_size,
                flat=flat,
                num_parts=kv.num_workers,
                part_index=kv.rank)

            return (train, val)
        return get_iterator_impl

    def parse_args():
        parser = argparse.ArgumentParser(description='train an image classifier on mnist')
        parser.add_argument('--network', type=str, default='mlp',
                            choices=['mlp', 'lenet', 'lenet-stn'],
                            help='the cnn to use')
        parser.add_argument('--data-dir', type=str, default='mnist/',
                            help='the input data directory')
        parser.add_argument('--gpus', type=str,
                            help='the gpus to be used, e.g. "0,1,2,3"')
        parser.add_argument('--num-examples', type=int, default=60000,
                            help='the number of training examples')
        parser.add_argument('--batch-size', type=int, default=128,
                            help='the batch size')
        parser.add_argument('--lr', type=float, default=.1,
                            help='the initial learning rate')
        parser.add_argument('--model-prefix', type=str,
                            help='the prefix of the model to load/save')
        parser.add_argument('--save-model-prefix', type=str,
                            help='the prefix of the model to save')
        parser.add_argument('--num-epochs', type=int, default=10,
                            help='the number of training epochs')
        parser.add_argument('--load-epoch', type=int,
                            help="load the model on an epoch using the model-prefix")
        parser.add_argument('--kv-store', type=str, default='local',
                            help='the kvstore type')
        parser.add_argument('--lr-factor', type=float, default=1,
                            help='multiply the lr by a factor every lr-factor-epoch epochs')
        parser.add_argument('--lr-factor-epoch', type=float, default=1,
                            help='the number of epochs to factor the lr, could be .5')
        return parser.parse_args()

    if __name__ == '__main__':
        args = parse_args()
        if args.network == 'mlp':
            data_shape = (784,)
            net = get_mlp()
        elif args.network == 'lenet-stn':
            data_shape = (1, 28, 28)
            net = get_lenet(True)
        else:
            data_shape = (1, 28, 28)
            net = get_lenet()
        # train
        train_model.fit(args, net, get_iterator(data_shape))
Look at the main function: it reads the configuration parameters, builds the network structure, sets the data shape, and then hands those three things to the existing train_model helper. Training starts from there.
The programming architecture is quite clear, and the modularity is done well.
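The train_model.fit helper lives in a separate file that isn't shown here, so purely as a conceptual illustration: a "fit" call boils down to looping over epochs and, for each example, doing a forward pass, computing gradients, and updating parameters. Here is a minimal pure-Python sketch of that idea on a toy 1-D linear regression; the fit name and everything inside it are hypothetical, not MXNet's actual implementation:

```python
# Conceptual sketch of what a "fit" helper does:
# loop over epochs -> forward pass -> gradient -> parameter update.
def fit(data, labels, lr=0.1, num_epochs=200):
    w, b = 0.0, 0.0                  # model parameters
    for _ in range(num_epochs):
        for x, y in zip(data, labels):
            pred = w * x + b         # forward pass
            grad = pred - y          # gradient of 0.5*(pred - y)**2 w.r.t. pred
            w -= lr * grad * x       # SGD update
            b -= lr * grad
    return w, b

# Learn y = 2x + 1 from four points.
w, b = fit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(w, b)  # converges toward w = 2.0, b = 1.0
```

The real helper additionally wires in the kvstore, learning-rate schedule, and checkpointing that the command-line flags above configure, but the epoch loop is the core of it.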
Now look at parameter handling. The script takes a lot of configuration options; essentially everything that Caffe sets in its proto files is set here: the dataset path, batch size, learning rate, loss function, and so on. Then look at the network structure: reading it is like stacking building blocks, layer by layer, parameterized by the configuration read earlier or defined by you. Once the blocks are stacked, training begins.
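The layer-by-layer chaining in get_mlp and get_lenet works because each mx.symbol call merely records a node in a computation graph; nothing is computed until the graph is later bound to data. A tiny pure-Python sketch of that deferred-graph idea (the Symbol class and helper names here are made up for illustration, not MXNet's real internals):

```python
# Toy sketch of symbolic "building blocks": each call only records graph
# structure; nothing is computed until the graph is later bound and run.
class Symbol:
    def __init__(self, name, inputs=()):
        self.name = name
        self.inputs = inputs

def FullyConnected(data, name, num_hidden):
    return Symbol("%s(fc,%d)" % (name, num_hidden), (data,))

def Activation(data, name, act_type):
    return Symbol("%s(%s)" % (name, act_type), (data,))

# Stack layers the same way get_mlp does, just by chaining calls.
data = Symbol("data")
net = FullyConnected(data, "fc1", 128)
net = Activation(net, "relu1", "relu")
net = FullyConnected(net, "fc2", 10)

def ancestry(sym):
    """Walk back from the output to the input, listing each layer."""
    chain = [sym.name]
    while sym.inputs:
        sym = sym.inputs[0]
        chain.append(sym.name)
    return chain

print(ancestry(net))  # ['fc2(fc,10)', 'relu1(relu)', 'fc1(fc,128)', 'data']
```

This is why the script can swap the whole network with a single --network flag: the functions just return different graphs built from the same blocks.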
One drawback of Caffe is that it isn't flexible enough: after all, you don't write your own code, only configuration files, which always feels limiting. MXNet and TensorFlow are more convenient, providing APIs that let you invoke them and define the network structure in your own way. Overall, both of these frameworks do modularity well, exposing low-level APIs that support writing your own network; with Caffe it's hard to write your own network layers.
MXNet in Practice (1): Getting Started and Running the MNIST Dataset