MXNet: Weight Decay (Gluon Implementation)


Building the dataset
# -*- coding: utf-8 -*-
from mxnet import init
from mxnet import ndarray as nd
from mxnet.gluon import loss as gloss
import gb

n_train = 20
n_test = 100
num_inputs = 200
true_w = nd.ones((num_inputs, 1)) * 0.01
true_b = 0.05
features = nd.random.normal(shape=(n_train + n_test, num_inputs))
labels = nd.dot(features, true_w) + true_b
labels += nd.random.normal(scale=0.01, shape=labels.shape)
train_features, test_features = features[:n_train, :], features[n_train:, :]
train_labels, test_labels = labels[:n_train], labels[n_train:]
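Since every entry of true_w is the constant 0.01 and true_b is 0.05, each label is just 0.01 times the sum of the 200 features, plus 0.05, plus small Gaussian noise. A minimal pure-Python sketch of the same generating process (no MXNet required; the helper name is ours, not part of the original code):

```python
import random

num_inputs = 200
true_w = [0.01] * num_inputs   # same constant weights as the MXNet code
true_b = 0.05

def make_example():
    # One synthetic sample: x ~ N(0, 1), y = w.x + b + noise.
    x = [random.gauss(0, 1) for _ in range(num_inputs)]
    y = sum(wi * xi for wi, xi in zip(true_w, x)) + true_b
    y += random.gauss(0, 0.01)  # label noise with scale=0.01, as above
    return x, y

x, y = make_example()
residual = y - (0.01 * sum(x) + true_b)  # only the tiny noise remains
```

With only 20 such training examples against 200 input dimensions, the model is deliberately easy to overfit, which is what makes weight decay visible in the experiment.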
Data iterator
from mxnet import autograd
from mxnet.gluon import data as gdata

batch_size = 1
num_epochs = 10
learning_rate = 0.003
train_iter = gdata.DataLoader(gdata.ArrayDataset(
    train_features, train_labels), batch_size, shuffle=True)
loss = gloss.L2Loss()
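With batch_size=1 and shuffle=True, the DataLoader simply yields the 20 training examples one at a time, in a fresh random order on each pass. A hand-rolled sketch of that behaviour (a hypothetical stand-in, not the Gluon API):

```python
import random

def data_iter(features, labels, batch_size, shuffle=True):
    # Minimal stand-in for gdata.DataLoader(gdata.ArrayDataset(...)).
    indices = list(range(len(features)))
    if shuffle:
        random.shuffle(indices)  # new order for each pass
    for i in range(0, len(indices), batch_size):
        batch = indices[i:i + batch_size]
        yield ([features[j] for j in batch],
               [labels[j] for j in batch])

# Tiny demonstration with 20 dummy examples, mirroring n_train above.
features = [[float(i)] for i in range(20)]
labels = [2.0 * f[0] for f in features]
batches = list(data_iter(features, labels, batch_size=1))
```

Each element of `batches` is a (features, labels) pair holding a single example, so one epoch makes 20 parameter updates.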
Training and plotting the results

The gb.semilogy function plots the loss on the training and test data.

from mxnet import gluon
from mxnet.gluon import nn

def fit_and_plot(weight_decay):
    net = nn.Sequential()
    net.add(nn.Dense(1))
    net.initialize(init.Normal(sigma=1))
    # Apply L2-norm regularization (weight decay) to the weight parameters.
    trainer_w = gluon.Trainer(net.collect_params('.*weight'), 'sgd', {
        'learning_rate': learning_rate, 'wd': weight_decay})
    # Do not apply L2-norm regularization to the bias parameters.
    trainer_b = gluon.Trainer(net.collect_params('.*bias'), 'sgd', {
        'learning_rate': learning_rate})
    train_ls = []
    test_ls = []
    for _ in range(num_epochs):
        for X, y in train_iter:
            with autograd.record():
                l = loss(net(X), y)
            l.backward()
            # Call step on each of the two Trainer instances.
            trainer_w.step(batch_size)
            trainer_b.step(batch_size)
        train_ls.append(loss(net(train_features),
                             train_labels).mean().asscalar())
        test_ls.append(loss(net(test_features),
                            test_labels).mean().asscalar())
    gb.semilogy(range(1, num_epochs + 1), train_ls, 'epochs', 'loss',
                range(1, num_epochs + 1), test_ls, ['train', 'test'])
    return 'w[:10]:', net[0].weight.data()[:, :10], 'b:', net[0].bias.data()

print(fit_and_plot(5))
  • Gluon's wd hyper-parameter applies weight decay, which helps counter overfitting.
  • We can define multiple Trainer instances to apply different update rules to different groups of model parameters.
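The wd hyper-parameter folds the L2 penalty's gradient into the SGD update: w ← w − lr·(grad + wd·w), which is the same as shrinking w by a factor of (1 − lr·wd) before taking the ordinary gradient step. A quick arithmetic check of that equivalence (scalar values chosen for illustration):

```python
lr, wd = 0.003, 5.0   # the learning rate and wd used in fit_and_plot(5)
w, grad = 0.8, 0.2    # an example weight and its loss gradient

# SGD with weight decay: the gradient of the L2 penalty (wd * w) is added.
w_decay = w - lr * (grad + wd * w)

# Equivalent "shrink, then step" form of the same update.
w_shrink = (1 - lr * wd) * w - lr * grad

difference = abs(w_decay - w_shrink)  # the two forms agree
```

Because the shrink factor multiplies only w, applying wd solely to the weight Trainer (and not the bias Trainer) decays the weights toward zero each step while leaving the bias update unchanged.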
