MXNet is the foundation and Gluon is the high-level wrapper on top of it, much like TensorFlow and Keras. Thanks to the dynamic graph mechanism, however, the interaction between the two is far more convenient than between TensorFlow and Keras. The basic operations are very similar to PyTorch's, with quite a few added conveniences, so anyone with a PyTorch background will find it easy to get started.
Library imports:
from mxnet import ndarray as nd
from mxnet import autograd
from mxnet import gluon
import mxnet as mx
MXNet
mxnet.ndarray is the foundation of the whole scientific computing system; its API is broadly consistent with NumPy's ndarray, much as in PyTorch. But unlike PyTorch, which has several built-in data types such as Variable and Tensor, MXNet keeps just one, NDArray, and differentiation is done directly through mxnet.autograd, which is very convenient.
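A few examples of the NumPy-flavored API (a quick sketch; the values are arbitrary):

x = nd.ones((2, 3))                  # like np.ones
y = nd.random.normal(0, 1, shape=(2, 3))
print(x + y)                         # elementwise addition
print(nd.dot(x, y.T))                # matrix product, like np.dot
print(y.exp(), y.sum(axis=1))        # elementwise exp, row sums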
Automatic differentiation
x = nd.arange(4).reshape((4, 1))
# mark the variable that needs a gradient
x.attach_grad()
# computations to be differentiated must be recorded
with autograd.record():
    y = 2 * nd.dot(x.T, x)
# backpropagate from the output
y.backward()
# read the gradient
print('x.grad:', x.grad)
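As a sanity check: y = 2xᵀx, so the analytical gradient is 4x, and x.grad should come out as [0, 4, 8, 12]ᵀ:

assert (x.grad == 4 * x).asnumpy().all()   # gradient of 2 * xᵀx is 4x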
Converting an NDArray to a Python number
x.asscalar()
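A quick usage sketch (the value is arbitrary); note that asscalar() only works on NDArrays with exactly one element:

a = nd.array([3.5])
s = a.asscalar()
print(s, type(s))   # 3.5 <class 'numpy.float32'>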
Converting between NDArray and NumPy arrays
y = nd.array(x)    # NumPy array to NDArray
z = y.asnumpy()    # NDArray to NumPy array
Memory-saving addition
nd.elemwise_add(x, y, out=z)
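The out argument writes the result into z's existing memory instead of allocating a new array, which can be checked via id() (a small sketch with illustrative shapes):

x = nd.ones((3,))
y = nd.arange(3)
z = nd.zeros_like(y)
before = id(z)
nd.elemwise_add(x, y, out=z)   # result is written in place into z
assert id(z) == before         # no new NDArray was allocated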
Layer Implementation
ReLU activation
def relu(x):
    return nd.maximum(x, 0)
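For example:

x = nd.array([[-1.0, 2.0], [0.5, -3.0]])
print(relu(x))   # negative entries clamp to 0: [[0. 2.] [0.5 0.]]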
Fully connected layer
# create the variables
w = nd.random.normal(scale=1, shape=(num_inputs, 1))
b = nd.zeros(shape=(1,))
params = [w, b]
# attach gradient buffers to the variables
for param in params:
    param.attach_grad()
# the fully connected layer itself
def net(X, w, b):
    return nd.dot(X, w) + b
SGD implementation
def sgd(params, lr, batch_size):
    for param in params:
        param[:] = param - lr * param.grad / batch_size
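Putting the pieces above together, one manual training step could look like this (a sketch; X, y, and the hyperparameters are illustrative):

lr, batch_size = 0.03, 10                              # illustrative hyperparameters
X = nd.random.normal(shape=(batch_size, num_inputs))   # a dummy mini-batch
y = nd.random.normal(shape=(batch_size, 1))
with autograd.record():
    loss = (net(X, w, b) - y) ** 2 / 2                 # squared loss per example
loss.backward()                                        # implicitly sums over the batch
sgd(params, lr, batch_size)                            # batch-averaged gradient update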
Loading an in-memory dataset with Gluon
import mxnet as mx
from mxnet import autograd, nd
import numpy as np

num_inputs = 2
num_examples = 1000
true_w = [2, -3.4]
true_b = 4.2
features = nd.random.normal(scale=1, shape=(num_examples, num_inputs))
labels = true_w[0] * features[:, 0] + true_w[1] * features[:, 1] + true_b
labels += nd.random.normal(scale=0.01, shape=labels.shape)

from mxnet.gluon import data as gdata
batch_size = 10
dataset = gdata.ArrayDataset(features, labels)
data_iter = gdata.DataLoader(dataset, batch_size, shuffle=True)
for X, y in data_iter:
    print(X, y)
    break
[[-1.74047375  0.26071024]
 [ 0.65584248 -0.50490594]
 [-0.97745866 -0.01658815]
 [-0.55589193  0.30666101]
 [-0.61393601 -2.62473822]
 [ 0.82654613 -0.00791582]
 [ 0.29560572 -1.21692061]
 [-0.35985938 -1.37184834]
 [-1.69631028 -1.74014604]
 [ 1.31199837 -1.96280086]]
<NDArray 10x2 @cpu(0)>
[-0.14842382  7.22247267  2.30917668  2.0601418  11.89551163  5.87866735
  8.94194221  8.15139961  6.72600317 13.50252151]
<NDArray 10 @cpu(0)>
Model definition
- Create a Sequential model
- Add layers
- Initialize the model parameters
net = gluon.nn.Sequential()
with net.name_scope():
    net.add(gluon.nn.Dense(1))
# initialize model parameters from a normal distribution
net.collect_params().initialize(mx.init.Normal(sigma=1))
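Note that Dense(1) does not specify an input dimension: Gluon defers the actual parameter allocation until the first forward pass, when the input shape becomes known. A quick check (assuming 2-dimensional inputs, as in the dataset above):

net(nd.random.normal(shape=(10, 2)))   # first forward pass triggers shape inference
print(net[0].weight.data().shape)      # (1, 2)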
Optimizer
The wd parameter adds L2 regularization (weight decay) to the model; the update rule becomes: w = w - lr * (grad + wd * w).
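For intuition, a minimal sketch of the equivalent manual update (plain SGD without momentum; sgd_with_wd is an illustrative name, not a library function):

def sgd_with_wd(params, lr, wd, batch_size):
    for param in params:
        grad = param.grad / batch_size
        param[:] = param - lr * (grad + wd * param)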
trainer = gluon.Trainer(net.collect_params(), 'sgd', {
    'learning_rate': learning_rate, 'wd': weight_decay})
trainer.step(batch_size) must be called after each backward pass so that the parameters actually get updated. A simulated training run looks like this:
for e in range(epochs):
    for data, label in data_iter_train:
        with autograd.record():
            output = net(data)
            loss = square_loss(output, label)
        loss.backward()
        trainer.step(batch_size)
    train_loss.append(test(net, X_train, y_train))
    test_loss.append(test(net, X_test, y_test))
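The loop above assumes a squared-loss helper and an evaluation helper roughly like the following (illustrative sketches, not the post's actual definitions):

def square_loss(y_hat, y):
    return (y_hat - y.reshape(y_hat.shape)) ** 2 / 2

def test(net, X, y):
    return square_loss(net(X), y).mean().asscalar()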
Layer class API
Flatten
nn.Flatten()
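For example, merging everything except the batch axis (shapes are illustrative):

layer = gluon.nn.Flatten()
x = nd.random.normal(shape=(10, 3, 28, 28))   # e.g. a batch of images
print(layer(x).shape)                          # (10, 2352)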
Fully connected layer
gluon.nn.Dense(units, activation="relu")
The units parameter gives the number of output nodes.
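For example (the sizes are illustrative):

layer = gluon.nn.Dense(256, activation="relu")   # 256 output nodes with ReLU
layer.initialize()
x = nd.random.normal(shape=(10, 20))
print(layer(x).shape)   # (10, 256); the input dimension (20) is inferred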
Loss function class API
Cross Entropy
from mxnet.gluon import loss as gloss
loss = gloss.SoftmaxCrossEntropyLoss()
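A quick usage sketch (shapes and values are illustrative): the loss takes unnormalized scores and class-index labels and returns one loss value per example.

logits = nd.random.normal(shape=(4, 10))   # unnormalized scores for 10 classes
labels = nd.array([0, 3, 9, 1])            # ground-truth class indices
print(loss(logits, labels))                # shape (4,): one loss per example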
"MXNet" First play _ Basic operation and common layer implementation