Generation of random numbers in machine learning algorithms

Source: Internet
Author: User

NumPy and scikit-learn (sklearn) both provide random data generation capabilities. With them we can generate data for a particular model ourselves, clean and transform it, and then choose a model and an algorithm to fit and predict.

1. NumPy random data generation API

NumPy is better suited to producing simple sampled data. The APIs are in the numpy.random module, and the common ones are:

(1). rand(d0, d1, ..., dn) generates an array of shape d0 x d1 x ... x dn. The values are uniformly distributed over the half-open interval [0, 1).

(2). randn(d0, d1, ..., dn) also generates an array of shape d0 x d1 x ... x dn, but the values follow the standard normal distribution N(0, 1).

If you need values from the normal distribution N(μ, σ²), transform each value x generated by randn to σx + μ.
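As a minimal sketch of rand, randn, and the σx + μ transformation, assuming NumPy is imported as np; the seed and the values mu = 5 and sigma = 2 are illustrative choices, not from the original text:

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only so the run is repeatable

# rand: uniform samples on [0, 1), here with shape 3x2
u = np.random.rand(3, 2)

# randn: standard normal samples N(0, 1), here with shape 3x2
z = np.random.randn(3, 2)

# Scale and shift randn output to N(mu, sigma^2).
# mu and sigma are illustrative values chosen for this sketch.
mu, sigma = 5.0, 2.0
x = sigma * np.random.randn(100000) + mu

print(u.shape, z.shape)
print(x.mean(), x.std())  # should be close to mu and sigma
```

With a large sample, the printed mean and standard deviation should land near the chosen mu and sigma, which is a quick sanity check on the transformation.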

(3). randint(low[, high, size]) generates random integers with the shape given by size, which can be an integer or the dimensions of a matrix or tensor. The values lie in the half-open interval [low, high). If high is omitted, the values lie in [0, low).

For example: np.random.randint(3, size=[2,3,4]) returns data of dimension 2x3x4. The values are integers in [0, 3), so the maximum value is 2.

For example: np.random.randint(3, 6, size=[2,3]) returns data of dimension 2x3. The values are integers in [3, 6).
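The two randint calls can be checked with a short sketch like the following, assuming NumPy is imported as np (the seed is an arbitrary choice for repeatability):

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only so the run is repeatable

# Only one bound given: it is the exclusive upper bound, so values are 0, 1 or 2
a = np.random.randint(3, size=[2, 3, 4])

# Both bounds given: values are 3, 4 or 5 (6 is excluded)
b = np.random.randint(3, 6, size=[2, 3])

print(a.shape, a.min(), a.max())
print(b.shape, b.min(), b.max())
```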

(4). random_integers(low[, high, size]) is similar to randint above; the difference is that the values lie in the closed interval [low, high]. It is deprecated in newer NumPy releases in favor of randint(low, high + 1).
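Because random_integers is deprecated in newer NumPy releases, the same closed interval can be sketched with randint by adding 1 to the upper bound. The bounds 3 and 6 below are illustrative values:

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only so the run is repeatable

low, high = 3, 6  # illustrative bounds for the closed interval [3, 6]

# randint excludes its upper bound, so high + 1 makes `high` reachable
c = np.random.randint(low, high + 1, size=1000)

print(c.min(), c.max())
```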

(5). random_sample([size]) returns random floating-point numbers in the half-open interval [0.0, 1.0). For another interval [a, b), transform the result as (b - a) * random_sample([size]) + a.

For example: (5-2)*np.random.random_sample(3)+2 returns 3 random numbers in [2, 5).
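The interval conversion can be written out as a small sketch, assuming NumPy is imported as np; the names a and b simply mirror the interval [a, b) in the formula:

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only so the run is repeatable

a, b = 2, 5  # target interval [2, 5)

# random_sample gives values in [0, 1); scale by (b - a) and shift by a
s = (b - a) * np.random.random_sample(3) + a

print(s)
```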

2. Introduction to the sklearn random data generation API

sklearn's random data generators are in the datasets module. Compared with NumPy, they can generate data suited to a particular machine learning model. The commonly used APIs are:

(1). Use make_regression to generate regression model data

(2). Use make_hastie_10_2, make_classification or make_multilabel_classification to generate classification model data

(3). Use make_blobs to generate clustering model data

(4). Use make_gaussian_quantiles to generate grouped data with a multidimensional normal distribution
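The generators listed above can be sketched as follows. This is a minimal example assuming scikit-learn is installed; the sample counts, feature counts, noise level, and random_state values are illustrative choices, not from the original text:

```python
from sklearn.datasets import (
    make_blobs,
    make_classification,
    make_gaussian_quantiles,
    make_regression,
)

# Regression data: one feature, Gaussian noise added to the target
X_reg, y_reg = make_regression(
    n_samples=100, n_features=1, noise=10.0, random_state=0
)

# Classification data: 4 features, 2 of them informative, 2 classes
X_clf, y_clf = make_classification(
    n_samples=100, n_features=4, n_informative=2, n_redundant=0,
    n_classes=2, random_state=0,
)

# Clustering data: 3 isotropic Gaussian blobs in 2 dimensions
X_blob, y_blob = make_blobs(
    n_samples=100, n_features=2, centers=3, random_state=0
)

# Normally distributed data split into 3 classes by quantile
X_gq, y_gq = make_gaussian_quantiles(
    n_samples=100, n_features=2, n_classes=3, random_state=0
)

for name, X, y in [
    ("regression", X_reg, y_reg),
    ("classification", X_clf, y_clf),
    ("blobs", X_blob, y_blob),
    ("gaussian_quantiles", X_gq, y_gq),
]:
    print(name, X.shape, y.shape)
```

Each call returns a feature matrix X and a target or label vector y, so the output can be passed directly to a model's fit method.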
