Fast partitioning of training and validation datasets using numpy.random.choice() and set()
Previously, when splitting data into a training set and a validation set, I generated random indices by hand, which was quite clumsy.
The new method I learned is as follows:
import numpy as np

# Generate raw data from a normal distribution (mean 1, std 0.1, 100 samples)
x = np.random.normal(1, 0.1, 100)

# Split the data 8:2 into training and validation indices
x_train_index = np.random.choice(len(x), round(len(x) * 0.8), replace=False)
x_valid_index = np.array(list(set(range(len(x))) - set(x_train_index)))
x_train = x[x_train_index]
x_valid = x[x_valid_index]
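As a sanity check, and as a slightly more compact alternative to the set() round-trip, the complement of the training indices can also be computed with np.setdiff1d. This sketch assumes the x, x_train_index, and x_valid_index variables from the snippet above:

# Equivalent complement of the training indices (np.setdiff1d returns a sorted array)
x_valid_index_alt = np.setdiff1d(np.arange(len(x)), x_train_index)

# Sanity checks: the two index sets are disjoint and together cover every sample
assert set(x_train_index).isdisjoint(x_valid_index)
assert len(x_train_index) + len(x_valid_index) == len(x)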
Summary 1: np.random.choice()

Definition: choice(a, size=None, replace=True, p=None)
Type: function in the numpy.random module
Parameters

a : 1-D array-like or int
    If an ndarray, a random sample is generated from its elements. If an int, the random sample is generated as if a were np.arange(a).
size : int or tuple of ints, optional
    Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.
replace : boolean, optional
    Whether the sample is drawn with or without replacement, i.e., whether duplicate elements may appear in the result.
p : 1-D array-like, optional
    The probabilities associated with each entry in a. If not given, the sample assumes a uniform distribution over all entries in a.

Returns

samples : 1-D ndarray, shape (size,)
    The generated random samples.
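To make the parameters concrete, here are a few illustrative calls (outputs vary from run to run since the sampling is random; the arrays used are arbitrary examples):

import numpy as np

np.random.choice(5, 3)                                # 3 samples from np.arange(5), with replacement
np.random.choice(5, 3, replace=False)                 # 3 distinct samples, no duplicates
np.random.choice(5, 3, p=[0.1, 0.0, 0.3, 0.6, 0.0])   # weighted sampling; p must sum to 1
np.random.choice(['a', 'b', 'c'], size=(2, 2))        # sampling into a 2x2 output shape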
Summary 2: set()

A Python set, as in other languages, is an unordered collection of distinct elements; it supports relational tests (such as membership and set difference) and eliminates duplicate elements.
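For instance, the standard set operations used above behave as follows (plain Python, no extra libraries):

s = set([1, 2, 2, 3])   # duplicates are removed -> {1, 2, 3}
t = {2, 3, 4}
print(s - t)            # difference: {1}
print(s & t)            # intersection: {2, 3}
print(2 in s)           # membership test: True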
Summary 3: Batch training

For batch training, you can use the same method to pick each mini-batch:
batch_size = ...  # choose a mini-batch size
for epoch in range(m):
    rand_index = np.random.choice(len(x_train), size=batch_size)
    rand_x = x_train[rand_index]
    rand_y = y_train[rand_index]
    ...
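Below is a minimal self-contained sketch of this batching pattern; the synthetic x_train/y_train data, the batch_size of 20, and the iteration count m are illustrative choices, not from the original:

import numpy as np

# Synthetic training data (illustrative only)
x_train = np.random.normal(1, 0.1, 80)
y_train = 2 * x_train + np.random.normal(0, 0.05, 80)

batch_size = 20
m = 100  # number of training iterations (illustrative)
for epoch in range(m):
    # Draw a random mini-batch; sampling is with replacement by default
    rand_index = np.random.choice(len(x_train), size=batch_size)
    rand_x = x_train[rand_index]
    rand_y = y_train[rand_index]
    # ... run one training step on (rand_x, rand_y) here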