"MXNet" First play _ Basic operation and common layer implementation

Source: Internet
Author: User
Tags: pytorch, mxnet, keras

MXNet is the foundation and Gluon is the high-level wrapper on top of it, much like TensorFlow and Keras; thanks to the dynamic-graph mechanism, however, the two interoperate far more conveniently than TensorFlow and Keras do. The basic operations are very similar to PyTorch's, with a number of extra conveniences, so it is easy to get started if you already know PyTorch.

The library imports used below:

from mxnet import ndarray as nd
from mxnet import autograd
from mxnet import gluon
import mxnet as mx

MXNet

mxnet.ndarray is the foundation of the whole scientific-computing stack, and its API is largely consistent with NumPy's ndarray, much as in PyTorch. Unlike PyTorch, which has Variable, Tensor and other built-in data types, MXNet keeps just the single NDArray type, and gradients can be computed on it directly through mxnet.autograd, which is very convenient.
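As a quick illustration of how closely the API tracks NumPy, here is a minimal sketch (the shapes and variable names are just examples):

from mxnet import ndarray as nd

a = nd.ones((2, 3))                        # 2x3 array of ones, like np.ones
b = nd.random.normal(0, 1, shape=(2, 3))   # standard-normal samples
c = a + b                                  # element-wise arithmetic, NumPy-style
print(c.shape, c.sum().asscalar())         # shape tuple and the sum as a Python float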

Automatic differentiation
x = nd.arange(4).reshape((4, 1))
# mark the variable whose gradient is needed
x.attach_grad()
# record the computation graph needed for automatic differentiation
with autograd.record():
    y = 2 * nd.dot(x.T, x)
# back-propagation
y.backward()
# read the gradient
print('x.grad:', x.grad)
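As a sanity check: since y = 2·xᵀx, the gradient is dy/dx = 4x, so the printed x.grad should equal 4 * x.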
Converting an NDArray to a Python number

y.asscalar()  # only valid when the NDArray contains a single element

Converting between NDArray and NumPy arrays

y = nd.array(x)    # NumPy array converted to NDArray

z = y.asnumpy()    # NDArray converted to NumPy array

Memory-saving (in-place) addition

nd.elemwise_add(x, y, out=z)
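The point of out=z is that the sum is written into z's existing memory instead of allocating a new output array. A minimal check, assuming x, y and z are NDArrays of the same shape:

before = id(z)
nd.elemwise_add(x, y, out=z)   # result is written into the buffer z already owns
print(id(z) == before)         # True: z is still the same NDArray, nothing new was allocated
# z[:] = x + y is also in-place for z, but still creates a temporary array for x + y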

Layer Implementation

ReLU activation

def relu(x):
    return nd.maximum(x, 0)
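A quick check of this relu on example values:

print(relu(nd.array([-2.0, 0.0, 3.0])))   # negative entries are clamped to 0: [0. 0. 3.]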

Fully connected Layer

# parameter creation
w = nd.random.normal(scale=1, shape=(num_inputs, 1))
b = nd.zeros(shape=(1,))
params = [w, b]
# attach gradient storage to the parameters
for param in params:
    param.attach_grad()
# the fully connected layer itself
def net(X, w, b):
    return nd.dot(X, w) + b
SGD implementation
def sgd(params, lr, batch_size):
    for param in params:
        param[:] = param - lr * param.grad / batch_size
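Putting the pieces above together, one epoch of from-scratch training looks roughly like this. This is a sketch: square_loss is an assumed helper, lr and batch_size are example values, and data_iter is a mini-batch iterator over (features, labels) like the one built in the Gluon section below.

def square_loss(y_hat, y):                 # assumed helper, not from the source
    return (y_hat - y.reshape(y_hat.shape)) ** 2 / 2

lr, batch_size = 0.03, 10                  # example hyper-parameters
for X, y in data_iter:
    with autograd.record():
        l = square_loss(net(X, w, b), y)   # per-example losses
    l.backward()                           # a non-scalar loss is summed before back-propagation
    sgd(params, lr, batch_size)            # gradients live in param.grad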

  

Gluon in-memory dataset loading
import mxnet as mx
from mxnet import autograd, nd
import numpy as np

num_inputs = 2
num_examples = 1000
true_w = [2, -3.4]
true_b = 4.2
features = nd.random.normal(scale=1, shape=(num_examples, num_inputs))
labels = true_w[0] * features[:, 0] + true_w[1] * features[:, 1] + true_b
labels += nd.random.normal(scale=0.01, shape=labels.shape)

from mxnet.gluon import data as gdata

batch_size = 10
dataset = gdata.ArrayDataset(features, labels)
data_iter = gdata.DataLoader(dataset, batch_size, shuffle=True)

for X, y in data_iter:
    print(X, y)
    break
[[-1.74047375  0.26071024]
 [ 0.65584248 -0.50490594]
 [-0.97745866 -0.01658815]
 [-0.55589193  0.30666101]
 [-0.61393601 -2.62473822]
 [ 0.82654613 -0.00791582]
 [ 0.29560572 -1.21692061]
 [-0.35985938 -1.37184834]
 [-1.69631028 -1.74014604]
 [ 1.31199837 -1.96280086]]
<NDArray 10x2 @cpu(0)>
[-0.14842382   7.22247267   2.30917668   2.0601418   11.89551163   5.87866735   8.94194221   8.15139961   6.72600317  13.50252151]
<NDArray 10 @cpu(0)>
Model definition
    • Create the sequential model
    • Add layers
    • Initialize the model parameters
net = gluon.nn.Sequential()
with net.name_scope():
    net.add(gluon.nn.Dense(1))
net.collect_params().initialize(mx.init.Normal(sigma=1))  # initialize model parameters from a normal distribution
Optimizer

The wd parameter adds L2 regularization (weight decay) to the model; the resulting update is roughly w = w - lr * (grad + wd * w).

trainer = gluon.Trainer(net.collect_params(), 'sgd', {
    'learning_rate': learning_rate, 'wd': weight_decay})
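For intuition, the 'wd' entry above makes the SGD step behave roughly like the manual update below (a sketch of the effective rule, not Gluon's internal code):

def sgd_with_wd(params, lr, wd, batch_size):
    for param in params:
        # the weight-decay term wd * param is added to the (rescaled) gradient before the step
        param[:] = param - lr * (param.grad / batch_size + wd * param)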

trainer.step(batch_size) must be called after each backward pass so that the parameters are updated. A typical training loop looks like this:

for e in range(epochs):
    for data, label in data_iter_train:
        with autograd.record():
            output = net(data)
            loss = square_loss(output, label)
        loss.backward()
        trainer.step(batch_size)
    train_loss.append(test(net, X_train, y_train))
    test_loss.append(test(net, X_test, y_test))
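The loop assumes a square_loss like the one sketched earlier plus a test helper that scores a whole data split; a minimal guess at the latter (assumed, not taken from the source):

def test(net, X, y):
    # mean per-example loss over the split, returned as a Python float
    return square_loss(net(X), y).mean().asscalar()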
Layer Function API

Flatten

nn.Flatten()

Fully connected Layer

gluon.nn.Dense(units, activation="relu")

The units parameter is the number of output nodes.
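These layers compose inside a Sequential block; a small sketch of a two-layer network built this way (the layer sizes are arbitrary examples):

from mxnet import gluon, init

mlp = gluon.nn.Sequential()
with mlp.name_scope():
    mlp.add(gluon.nn.Flatten())                       # e.g. (batch, 1, 28, 28) -> (batch, 784)
    mlp.add(gluon.nn.Dense(256, activation="relu"))   # hidden layer, 256 output units
    mlp.add(gluon.nn.Dense(10))                       # output layer, 10 units, no activation
mlp.collect_params().initialize(init.Normal(sigma=0.01))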

Loss function class API

Cross Entropy

from mxnet.gluon import loss as gloss
loss = gloss.SoftmaxCrossEntropyLoss()
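Typical usage takes raw (un-softmaxed) scores and integer class labels; the shapes below are just an example:

scores = nd.random.normal(shape=(4, 10))   # 4 examples, 10 classes, raw scores
labels = nd.array([0, 3, 9, 1])            # integer class indices
print(loss(scores, labels))                # one cross-entropy value per example, shape (4,)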

"MXNet" First play _ Basic operation and common layer implementation

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.