I've recently been working with both MXNet and TensorFlow. I've used TensorFlow more: it generally runs smoothly, the documentation is rich, and once you're familiar with the API, writing code is no problem.
Today I compared the two platforms by running MNIST. TensorFlow was ten to twenty times slower than MXNet: MXNet finished in about half a minute, while TensorFlow took 13 minutes.
How do you run it in MXNet?

    cd mxnet/example/image-classification
    python train_mnist.py

I'm using the latest version of MXNet. The script downloads the dataset automatically, and then the training log starts scrolling by.
Let's look at how this script is written, to get a feel for the MXNet programming model:
    import find_mxnet
    import mxnet as mx
    import argparse
    import os, sys
    import train_model

    def _download(data_dir):
        if not os.path.isdir(data_dir):
            os.system("mkdir " + data_dir)
        os.chdir(data_dir)
        if (not os.path.exists('train-images-idx3-ubyte')) or \
           (not os.path.exists('train-labels-idx1-ubyte')) or \
           (not os.path.exists('t10k-images-idx3-ubyte')) or \
           (not os.path.exists('t10k-labels-idx1-ubyte')):
            os.system("wget http://data.dmlc.ml/mxnet/data/mnist.zip")
            os.system("unzip -u mnist.zip; rm mnist.zip")
        os.chdir("..")

    def get_loc(data, attr={'lr_mult': '0.01'}):
        """
        the localisation network in lenet-stn; it will increase acc by about 1%
        when num-epoch >= 15
        """
        loc = mx.symbol.Convolution(data=data, num_filter=30, kernel=(5, 5), stride=(2, 2))
        loc = mx.symbol.Activation(data=loc, act_type='relu')
        loc = mx.symbol.Pooling(data=loc, kernel=(2, 2), stride=(2, 2), pool_type='max')
        loc = mx.symbol.Convolution(data=loc, num_filter=60, kernel=(3, 3), stride=(1, 1), pad=(1, 1))
        loc = mx.symbol.Activation(data=loc, act_type='relu')
        loc = mx.symbol.Pooling(data=loc, global_pool=True, kernel=(2, 2), pool_type='avg')
        loc = mx.symbol.Flatten(data=loc)
        loc = mx.symbol.FullyConnected(data=loc, num_hidden=6, name="stn_loc", attr=attr)
        return loc

    def get_mlp():
        """
        multi-layer perceptron
        """
        data = mx.symbol.Variable('data')
        fc1 = mx.symbol.FullyConnected(data=data, name='fc1', num_hidden=128)
        act1 = mx.symbol.Activation(data=fc1, name='relu1', act_type="relu")
        fc2 = mx.symbol.FullyConnected(data=act1, name='fc2', num_hidden=64)
        act2 = mx.symbol.Activation(data=fc2, name='relu2', act_type="relu")
        fc3 = mx.symbol.FullyConnected(data=act2, name='fc3', num_hidden=10)
        mlp = mx.symbol.SoftmaxOutput(data=fc3, name='softmax')
        return mlp

    def get_lenet(add_stn=False):
        """
        LeCun, Yann, Leon Bottou, Yoshua Bengio, and Patrick
        Haffner. "Gradient-based learning applied to document recognition."
        Proceedings of the IEEE (1998)
        """
        data = mx.symbol.Variable('data')
        if add_stn:
            data = mx.sym.SpatialTransformer(data=data, loc=get_loc(data), target_shape=(28, 28),
                                             transform_type="affine", sampler_type="bilinear")
        # first conv
        conv1 = mx.symbol.Convolution(data=data, kernel=(5, 5), num_filter=20)
        tanh1 = mx.symbol.Activation(data=conv1, act_type="tanh")
        pool1 = mx.symbol.Pooling(data=tanh1, pool_type="max",
                                  kernel=(2, 2), stride=(2, 2))
        # second conv
        conv2 = mx.symbol.Convolution(data=pool1, kernel=(5, 5), num_filter=50)
        tanh2 = mx.symbol.Activation(data=conv2, act_type="tanh")
        pool2 = mx.symbol.Pooling(data=tanh2, pool_type="max",
                                  kernel=(2, 2), stride=(2, 2))
        # first fully connected
        flatten = mx.symbol.Flatten(data=pool2)
        fc1 = mx.symbol.FullyConnected(data=flatten, num_hidden=500)
        tanh3 = mx.symbol.Activation(data=fc1, act_type="tanh")
        # second fully connected
        fc2 = mx.symbol.FullyConnected(data=tanh3, num_hidden=10)
        # loss
        lenet = mx.symbol.SoftmaxOutput(data=fc2, name='softmax')
        return lenet

    def get_iterator(data_shape):
        def get_iterator_impl(args, kv):
            data_dir = args.data_dir
            if '://' not in args.data_dir:
                _download(args.data_dir)
            flat = False if len(data_shape) == 3 else True

            train = mx.io.MNISTIter(
                image=data_dir + "train-images-idx3-ubyte",
                label=data_dir + "train-labels-idx1-ubyte",
                input_shape=data_shape,
                batch_size=args.batch_size,
                shuffle=True,
                flat=flat,
                num_parts=kv.num_workers,
                part_index=kv.rank)

            val = mx.io.MNISTIter(
                image=data_dir + "t10k-images-idx3-ubyte",
                label=data_dir + "t10k-labels-idx1-ubyte",
                input_shape=data_shape,
                batch_size=args.batch_size,
                flat=flat,
                num_parts=kv.num_workers,
                part_index=kv.rank)

            return (train, val)
        return get_iterator_impl

    def parse_args():
        parser = argparse.ArgumentParser(description='train an image classifier on mnist')
        parser.add_argument('--network', type=str, default='mlp',
                            choices=['mlp', 'lenet', 'lenet-stn'],
                            help='the cnn to use')
        parser.add_argument('--data-dir', type=str, default='mnist/',
                            help='the input data directory')
        parser.add_argument('--gpus', type=str,
                            help='the gpus to be used, e.g. "0,1,2,3"')
        parser.add_argument('--num-examples', type=int, default=60000,
                            help='the number of training examples')
        parser.add_argument('--batch-size', type=int, default=128,
                            help='the batch size')
        parser.add_argument('--lr', type=float, default=.1,
                            help='the initial learning rate')
        parser.add_argument('--model-prefix', type=str,
                            help='the prefix of the model to load/save')
        parser.add_argument('--save-model-prefix', type=str,
                            help='the prefix of the model to save')
        parser.add_argument('--num-epochs', type=int, default=10,
                            help='the number of training epochs')
        parser.add_argument('--load-epoch', type=int,
                            help="load the model on an epoch using the model-prefix")
        parser.add_argument('--kv-store', type=str, default='local',
                            help='the kvstore type')
        parser.add_argument('--lr-factor', type=float, default=1,
                            help='multiply the lr by a factor every lr-factor-epoch epochs')
        parser.add_argument('--lr-factor-epoch', type=float, default=1,
                            help='the number of epochs to factor the lr, could be .5')
        return parser.parse_args()

    if __name__ == '__main__':
        args = parse_args()
        if args.network == 'mlp':
            data_shape = (784,)
            net = get_mlp()
        elif args.network == 'lenet-stn':
            data_shape = (1, 28, 28)
            net = get_lenet(True)
        else:
            data_shape = (1, 28, 28)
            net = get_lenet()
        # train
        train_model.fit(args, net, get_iterator(data_shape))
Look at the main function: it reads the configuration parameters, builds the network structure, sets the data shape, and then hands those three things to the existing train_model helper. Training starts from there.
The programming architecture is quite clear, and the modularity is done well.
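The train_model.fit helper lives in a separate file that isn't shown here, so purely as a conceptual illustration: a "fit" call boils down to looping over epochs and, for each example, doing a forward pass, computing gradients, and updating parameters. Here is a minimal pure-Python sketch of that idea on a toy 1-D linear regression; the fit name and everything inside it are hypothetical, not MXNet's actual implementation:

```python
# Conceptual sketch of what a "fit" helper does:
# loop over epochs -> forward pass -> gradient -> parameter update.
def fit(data, labels, lr=0.1, num_epochs=200):
    w, b = 0.0, 0.0                  # model parameters
    for _ in range(num_epochs):
        for x, y in zip(data, labels):
            pred = w * x + b         # forward pass
            grad = pred - y          # gradient of 0.5*(pred - y)**2 w.r.t. pred
            w -= lr * grad * x       # SGD update
            b -= lr * grad
    return w, b

# Learn y = 2x + 1 from four points.
w, b = fit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(w, b)  # converges toward w = 2.0, b = 1.0
```

The real helper additionally wires in the kvstore, learning-rate schedule, and checkpointing that the command-line flags above configure, but the epoch loop is the core of it.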
Now look at parameter handling. The script takes a lot of configuration options; essentially everything that Caffe sets in its proto files is set here: the dataset path, batch size, learning rate, loss function, and so on. Then look at the network structure: reading it is like stacking building blocks, layer by layer, parameterized by the configuration read earlier or defined by you. Once the blocks are stacked, training begins.
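The layer-by-layer chaining in get_mlp and get_lenet works because each mx.symbol call merely records a node in a computation graph; nothing is computed until the graph is later bound to data. A tiny pure-Python sketch of that deferred-graph idea (the Symbol class and helper names here are made up for illustration, not MXNet's real internals):

```python
# Toy sketch of symbolic "building blocks": each call only records graph
# structure; nothing is computed until the graph is later bound and run.
class Symbol:
    def __init__(self, name, inputs=()):
        self.name = name
        self.inputs = inputs

def FullyConnected(data, name, num_hidden):
    return Symbol("%s(fc,%d)" % (name, num_hidden), (data,))

def Activation(data, name, act_type):
    return Symbol("%s(%s)" % (name, act_type), (data,))

# Stack layers the same way get_mlp does, just by chaining calls.
data = Symbol("data")
net = FullyConnected(data, "fc1", 128)
net = Activation(net, "relu1", "relu")
net = FullyConnected(net, "fc2", 10)

def ancestry(sym):
    """Walk back from the output to the input, listing each layer."""
    chain = [sym.name]
    while sym.inputs:
        sym = sym.inputs[0]
        chain.append(sym.name)
    return chain

print(ancestry(net))  # ['fc2(fc,10)', 'relu1(relu)', 'fc1(fc,128)', 'data']
```

This is why the script can swap the whole network with a single --network flag: the functions just return different graphs built from the same blocks.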
One drawback of Caffe is that it isn't flexible enough: after all, you don't write your own code, only configuration files, which always feels limiting. MXNet and TensorFlow are more convenient, providing APIs that let you invoke them and define the network structure in your own way. Overall, both of these frameworks do modularity well, exposing low-level APIs that support writing your own network; with Caffe it's hard to write your own network layers.
MXNet in Practice (1): Getting Started and Running the MNIST Dataset