[Machine learning] Quickly split a training set and a validation set in three lines of code

Source: Internet
Author: User
Quickly splitting a dataset into training and validation sets using np.random.choice() and set()

Previously, I generated the random indices by hand before splitting the data into a training set and a validation set, which was clumsy.

The new method I learned is as follows:

import numpy as np
# Generate raw data from a normal distribution (mean 1, standard deviation 0.1, 100 samples)
x = np.random.normal(1, 0.1, 100)
# Split the data 8:2
x_train_index = np.random.choice(len(x), round(len(x) * 0.8), replace=False)
x_valid_index = np.array(list(set(range(len(x))) - set(x_train_index)))

x_train = x[x_train_index]
x_valid = x[x_valid_index]
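
As a hedged sketch, the split above can be wrapped in a small helper so that the same index arrays split features and labels together. The helper name split_indices, its seed argument, and the labels y are illustrative assumptions that do not appear in the original post. The sketch also swaps in np.random.default_rng for reproducibility and sorts the validation indices, because the iteration order of a set is not guaranteed.

```python
import numpy as np

def split_indices(n, train_frac=0.8, seed=0):
    """Return (train_idx, valid_idx): disjoint index arrays that together cover range(n).

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    n_train = round(n * train_frac)
    # Sample training indices without replacement, so no index repeats.
    train_idx = rng.choice(n, size=n_train, replace=False)
    # Everything not drawn for training becomes validation data.
    valid_idx = np.array(sorted(set(range(n)) - set(train_idx.tolist())), dtype=int)
    return train_idx, valid_idx

x = np.random.default_rng(1).normal(1, 0.1, 100)
y = (x > 1).astype(int)  # hypothetical labels, only here to show a paired split

train_idx, valid_idx = split_indices(len(x))
x_train, y_train = x[train_idx], y[train_idx]
x_valid, y_valid = x[valid_idx], y[valid_idx]
```

Because both arrays are indexed with the same train_idx and valid_idx, each sample stays paired with its label.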

Summary 1: np.random.choice()

Definition: choice(a, size=None, replace=True, p=None)

Type: function of None module

Parameters
a: 1-D array-like or int
If an ndarray, a random sample is generated from its elements. If an int, the random sample is generated as if a was np.arange(n)
size: int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.
replace: boolean, optional
Whether the sample is with or without replacement
In other words, whether the result may contain duplicate elements
p: 1-D array-like, optional
The probabilities associated with each entry in a. If not given the sample assumes a uniform distribution over all entries in a.
In other words, the probability distribution used to select elements; the default is uniform
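
The parameters above can be tried out with a short sketch. The seed, the sizes, and the variable names are arbitrary choices made for illustration, not values from the original post.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only to make the run repeatable

# a given as an int: sample from np.arange(5); size=None returns a single value
a = np.random.choice(5)

# replace=False: the three drawn values are all distinct
b = np.random.choice(5, size=3, replace=False)

# size given as a tuple: the output has that shape (drawn with replacement by default)
c = np.random.choice(5, size=(2, 3))

# p: per-entry probabilities; 'a' has probability 0, so it is never selected
d = np.random.choice(['a', 'b', 'c'], size=1000, p=[0.0, 0.5, 0.5])

# Without replacement, asking for more samples than entries raises ValueError
try:
    np.random.choice(5, size=6, replace=False)
    oversample_failed = False
except ValueError:
    oversample_failed = True
```

For a train/validation split, replace=False is the important setting, since a training index drawn twice would duplicate a sample.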

Returns
samples: 1-D ndarray, shape (size,)
The generated random samples

Summary 2: set()
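
A minimal sketch of the set operations the split relies on, using made-up indices for a dataset of 10 items. The variable names are illustrative assumptions.

```python
# All indices of a hypothetical 10-item dataset
all_idx = set(range(10))

# Hypothetical training indices (8 of the 10)
train_idx = {0, 2, 3, 5, 6, 7, 8, 9}

# Set difference: the indices that were not chosen for training
valid_idx = all_idx - train_idx

# Building a set from a list removes duplicate elements
deduplicated = set([3, 1, 3, 2, 1])

# Membership testing
has_three = 3 in deduplicated

# set() needs an iterable: set(10) fails, which is why range(10) is required
try:
    set(10)
    int_is_iterable = True
except TypeError:
    int_is_iterable = False
```

This is why the validation indices are built from set(range(len(x))) rather than set(len(x)).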

A Python set, like sets in other languages, is an unordered collection of distinct elements. Its basic uses include membership testing and eliminating duplicate elements.

Summary 3: Batch training

For batch training, the same method can be used to pick the data for each batch:

batch_size = ...  # choose a batch size
for epoch in range(m):
    rand_index = np.random.choice(len(x_train), size=batch_size)
    rand_x = x_train[rand_index]
    rand_y = y_train[rand_index]
    ...
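
A runnable sketch of this batch-sampling loop, under stated assumptions: the data, the batch size of 16, and the 3 epochs (standing in for m) are made up for illustration, and the training step itself is left as a comment.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed for repeatability

# Hypothetical training data: 80 samples with targets derived from the inputs
x_train = np.random.normal(1, 0.1, 80)
y_train = 2 * x_train + 1

batch_size = 16  # assumed value
n_epochs = 3     # stands in for m

batch_means = []
for epoch in range(n_epochs):
    # Draw batch_size random positions; replace defaults to True,
    # so one batch may contain the same sample more than once.
    rand_index = np.random.choice(len(x_train), size=batch_size)
    rand_x = x_train[rand_index]
    rand_y = y_train[rand_index]
    # A real training step on (rand_x, rand_y) would go here.
    batch_means.append(rand_x.mean())
```

Passing replace=False to np.random.choice would instead guarantee that every sample within a batch is distinct.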
