Theano Study Notes (6): Loading and Saving, Conditions


Load and save

Pickle is Python's standard mechanism for saving class instances and reloading them. Many Theano objects can be serialized (and deserialized) with pickle. However, pickle has a limitation: it does not save the code or data of the class along with the instance being serialized. As a result, reloading objects that were created by an earlier version of a class can cause problems.

The right mechanism therefore depends on how long the data is expected to live between saving and reloading.

For the short term (such as temporary files and network transfers), pickling Theano objects or classes is feasible.

For the long term (for example, saving models from an experiment), you should not rely on pickled Theano objects.

Instead, it is recommended to save and load the underlying shared objects as you would in any other Python program.
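A minimal sketch of that long-term approach, assuming the parameter values are available as NumPy arrays (on a Theano shared variable, `get_value()` returns such an array and `set_value()` writes one back). The names `W_value`, `b_value` and the file name `params.npz` are illustrative only, and the arrays below are stand-ins rather than real shared variables.

```python
import os
import tempfile

import numpy as np

# Stand-ins for the arrays returned by W.get_value() and b.get_value()
# on Theano shared variables (hypothetical names).
W_value = np.arange(6, dtype="float64").reshape(2, 3)
b_value = np.zeros(3)

path = os.path.join(tempfile.mkdtemp(), "params.npz")

# Save plain arrays; the file does not depend on any Theano class.
np.savez(path, W=W_value, b=b_value)

# Load them back; with Theano one would then call W.set_value(W_restored).
with np.load(path) as loaded:
    W_restored = loaded["W"]
    b_restored = loaded["b"]
```

Because the file holds only arrays, it stays readable even if the model class changes later.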

 

Pickle Basics

The pickle and cPickle modules provide the same functionality, but cPickle is implemented in C and is faster.

You can use cPickle.dump to serialize (or "save", or "pickle") an object to a file:

 

    import cPickle

    f = file('obj.save', 'wb')
    cPickle.dump(my_obj, f, protocol=cPickle.HIGHEST_PROTOCOL)
    f.close()


protocol=cPickle.HIGHEST_PROTOCOL selects the most recent binary pickle protocol, which is typically faster and produces smaller files than the default.

Opening the file in binary mode ('b') is required for portability, especially between Unix and Windows.

 

Use cPickle.load to deserialize (or "load", or "unpickle") the file:

    f = file('obj.save', 'rb')
    loaded_obj = cPickle.load(f)
    f.close()
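The snippets in this section target Python 2. On Python 3 there is no separate cPickle module (the standard pickle module uses its C implementation automatically when available), and file() is replaced by open(). A rough Python 3 equivalent of the save and load steps, using a plain dict as a stand-in for my_obj, might look like this:

```python
import os
import pickle
import tempfile

# Stand-in for my_obj; any picklable object works the same way.
my_obj = {"W": [[0.1, 0.2], [0.3, 0.4]], "b": [0.0, 0.0]}
path = os.path.join(tempfile.mkdtemp(), "obj.save")

# 'wb' / 'rb' keep the file binary, as in the Python 2 version.
with open(path, "wb") as f:
    pickle.dump(my_obj, f, protocol=pickle.HIGHEST_PROTOCOL)

with open(path, "rb") as f:
    loaded_obj = pickle.load(f)
```

The with blocks close the file even if dump or load raises, which the explicit f.close() calls do not guarantee.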


Multiple objects can be pickled to the same file, one after another:

    f = file('objects.save', 'wb')
    for obj in [obj1, obj2, obj3]:
        cPickle.dump(obj, f, protocol=cPickle.HIGHEST_PROTOCOL)
    f.close()


They can then be loaded back in the same order:

    f = file('objects.save', 'rb')
    loaded_objects = []
    for i in range(3):
        loaded_objects.append(cPickle.load(f))
    f.close()
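The loading loop here hard-codes the number of objects. When that number is not known in advance, one option is to keep calling load until the file is exhausted, since pickle.load raises EOFError at the end of the file. A small sketch in Python 3 syntax follows; the helper name load_all is hypothetical, and an in-memory buffer stands in for the file.

```python
import io
import pickle


def load_all(f):
    """Yield every object that was pickled back-to-back into binary file f."""
    while True:
        try:
            yield pickle.load(f)
        except EOFError:
            # No more pickled data in the file.
            return


# In-memory stand-in for 'objects.save'.
buf = io.BytesIO()
for obj in ["obj1", [2, 3], {"k": 4}]:
    pickle.dump(obj, buf, protocol=pickle.HIGHEST_PROTOCOL)

buf.seek(0)
loaded_objects = list(load_all(buf))
```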


Short-term serialization

If you are confident that the instance you serialize will be deserialized by a compatible version of the code, pickling the whole model is an adequate solution.

This is the case, for instance, when you save and reload models within the same run of your program, or when the class you are saving has been stable for a long time.

You can define the __getstate__ and __setstate__ methods to control what pickle saves from your object.

This is especially useful if, for example, the model class holds a reference to the dataset currently in use, which you probably do not want to pickle along with every model instance.

    def __getstate__(self):
        state = dict(self.__dict__)
        del state['training_set']
        return state

    def __setstate__(self, d):
        self.__dict__.update(d)
        self.training_set = cPickle.load(file(self.training_set_file, 'rb'))
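To see these two methods at work end to end, here is a self-contained sketch in Python 3 syntax. The class name Model, the helper _load_training_set, and the file name train.pkl are all hypothetical; only the __getstate__/__setstate__ pattern comes from the text above.

```python
import os
import pickle
import tempfile


class Model(object):
    def __init__(self, training_set_file):
        self.training_set_file = training_set_file
        self.training_set = self._load_training_set()
        self.W = [0.5, -0.5]

    def _load_training_set(self):
        with open(self.training_set_file, "rb") as f:
            return pickle.load(f)

    def __getstate__(self):
        state = dict(self.__dict__)
        del state["training_set"]  # do not store the dataset with the model
        return state

    def __setstate__(self, d):
        self.__dict__.update(d)
        # Re-attach the dataset from its own file after unpickling.
        self.training_set = self._load_training_set()


# Write a small stand-in dataset to disk.
data_path = os.path.join(tempfile.mkdtemp(), "train.pkl")
with open(data_path, "wb") as f:
    pickle.dump(list(range(1000)), f)

model = Model(data_path)
blob = pickle.dumps(model, protocol=pickle.HIGHEST_PROTOCOL)
restored = pickle.loads(blob)
```

The pickled blob carries only the file name and the weights; the dataset is read again from its own file when the model is restored.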


Long-term serialization

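If the implementation of the class you want to save is unstable (for example, functions are added or removed, or class members are renamed), you should save and load only the immutable and necessary part of the class.

You can do this by defining __getstate__ and __setstate__ as above, this time listing the attributes you want to save rather than the ones you want to drop.

For example, if you only want to save the weight matrix W and the bias b: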

    def __getstate__(self):
        return (self.W, self.b)

    def __setstate__(self, state):
        W, b = state
        self.W = W
        self.b = b

 

If W and b are later renamed to weights and bias, previously pickled files remain usable as long as these two methods are updated to reflect the new names:

    def __getstate__(self):
        return (self.weights, self.bias)

    def __setstate__(self, state):
        W, b = state
        self.weights = W
        self.bias = b
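The claim can be checked with a small experiment, sketched here in Python 3 syntax. The class name Layer is hypothetical, and redefining it in the same module is only a way to simulate a later version of the code. Pickle stores the class by module and name, so the old data is handed to whichever Layer is current at load time.

```python
import pickle


class Layer(object):  # first version of the class
    def __init__(self, W, b):
        self.W, self.b = W, b

    def __getstate__(self):
        return (self.W, self.b)

    def __setstate__(self, state):
        self.W, self.b = state


# Pickle produced by the old code.
old_blob = pickle.dumps(Layer([[1.0, 2.0]], [0.5]))


class Layer(object):  # later version: attributes renamed
    def __init__(self, weights, bias):
        self.weights, self.bias = weights, bias

    def __getstate__(self):
        return (self.weights, self.bias)

    def __setstate__(self, state):
        W, b = state
        self.weights = W
        self.bias = b


# The old pickle is restored through the new __setstate__.
restored = pickle.loads(old_blob)
```

This works because the saved state is a plain tuple whose layout did not change, not a dict keyed by the old attribute names.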

 

Condition

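- Theano offers two conditional ops: ifelse and switch.

- ifelse takes a boolean condition and two variables as inputs; switch takes a tensor as its condition. Because switch is an elementwise operation, it is more general than ifelse.

- switch evaluates both output variables, whereas ifelse is lazy and evaluates only the one selected by the condition, so switch is slower.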
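The difference between eager and lazy evaluation can be illustrated without Theano. The following plain-Python analogy is not how Theano implements either op; the names branch, eager_select, and lazy_select are hypothetical. It only records which branches get computed in each style.

```python
calls = []


def branch(name, value):
    calls.append(name)  # record that this branch was computed
    return value


def eager_select(cond, then_value, else_value):
    # Both arguments were already computed by the caller (switch-like).
    return then_value if cond else else_value


def lazy_select(cond, then_fn, else_fn):
    # Only the chosen callable is invoked (ifelse-like).
    return then_fn() if cond else else_fn()


eager_result = eager_select(True, branch("then", 1.0), branch("else", 2.0))
eager_calls = list(calls)

calls.clear()
lazy_result = lazy_select(True,
                          lambda: branch("then", 1.0),
                          lambda: branch("else", 2.0))
lazy_calls = list(calls)
```

Both styles return the same value; the eager one pays for the unused branch as well, which is where the cost difference comes from when each branch is an expensive computation.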

 

    from theano import tensor as T
    from theano.ifelse import ifelse
    import theano, time, numpy

    a, b = T.scalars('a', 'b')
    x, y = T.matrices('x', 'y')

    z_switch = T.switch(T.lt(a, b), T.mean(x), T.mean(y))
    z_lazy = ifelse(T.lt(a, b), T.mean(x), T.mean(y))

    f_switch = theano.function([a, b, x, y], z_switch,
                               mode=theano.Mode(linker='vm'))
    f_lazyifelse = theano.function([a, b, x, y], z_lazy,
                                   mode=theano.Mode(linker='vm'))

    val1 = 0.
    val2 = 1.
    big_mat1 = numpy.ones((10000, 1000))
    big_mat2 = numpy.ones((10000, 1000))

    n_times = 10

    tic = time.clock()
    for i in xrange(n_times):
        f_switch(val1, val2, big_mat1, big_mat2)
    print 'time spent evaluating both values %f sec' % (time.clock() - tic)

    tic = time.clock()
    for i in xrange(n_times):
        f_lazyifelse(val1, val2, big_mat1, big_mat2)
    print 'time spent evaluating one value %f sec' % (time.clock() - tic)


Test Results

    time spent evaluating both values 0.200000 sec
    time spent evaluating one value 0.110000 sec

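In this run ifelse is indeed almost twice as fast (0.11 s versus 0.20 s), since it evaluates only one branch. This holds only when vm or cvm is used as the linker; with other linkers ifelse computes both branches and takes as long as switch. cvm is expected to become the default linker in the future.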

