How to Use the Naive Bayes (Bayesian) Algorithm in Python


First, let me repeat why the title says "use" rather than "implement":

First, the implementations provided by professionals are more efficient and more accurate than the ones we would write ourselves.

Second, for those of us who are weak at mathematics, working through a pile of formulas just to implement an algorithm is painful.

Third, unless the existing implementations cannot meet your needs, there is no reason to reinvent the wheel.

Now to the point. If you are not familiar with the Bayesian algorithm, you can look up the relevant material; here is a brief introduction:

1. Bayesian formula:

P(A|B)=P(AB)/P(B)

2. Bayesian inference:

P(A|B)=P(A)×P(B|A)/P(B)

In text:

Posterior probability = prior probability × likelihood / normalizing constant

What the Naive Bayes algorithm has to estimate is the likelihood, that is, the value of P(B|A).
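To make the formula concrete, here is a minimal numeric sketch (the numbers are illustrative and not from the original article):

# Bayes' rule: P(A|B) = P(A) * P(B|A) / P(B)
# Illustrative numbers: prior of 1%, likelihood of 90%, evidence of 10%.
p_a = 0.01          # prior P(A)
p_b_given_a = 0.9   # likelihood P(B|A)
p_b = 0.10          # normalizing constant P(B)

p_a_given_b = p_a * p_b_given_a / p_b
print(p_a_given_b)  # 0.09 -> the posterior probability P(A|B)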

3. The scikit-learn package provides three common Naive Bayes algorithms, described in turn below:

1) Gaussian Naive Bayes: assumes that the attributes/features follow a normal distribution; it is mainly used for numeric features.

 

Use the data in the scikit-learn package. The code and description are as follows:

>>> from sklearn import datasets                 # import the datasets bundled with scikit-learn
>>> iris = datasets.load_iris()                  # load the iris data
>>> iris.feature_names                           # display the feature names
['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']
>>> iris.data                                    # display the data
array([[5.1, 3.5, 1.4, 0.2],
       [4.9, 3. , 1.4, 0.2],
       [4.7, 3.2, 1.3, 0.2],
       ...
>>> iris.data.size                               # data size: 150 samples x 4 features
600
>>> iris.target_names                            # display the category names
array(['setosa', 'versicolor', 'virginica'], dtype='<U10')
>>> from sklearn.naive_bayes import GaussianNB   # import the Gaussian Naive Bayes algorithm
>>> clf = GaussianNB()                           # assign the estimator to a variable for ease of use
>>> clf.fit(iris.data, iris.target)              # train the classifier; for a large sample, use partial_fit instead to avoid loading too much data into memory at once
>>> clf.predict(iris.data[0].reshape(1, -1))     # verify the category; predict expects a 2D array, so the single sample is reshaped
array([0])
>>> import numpy as np
>>> data = np.array([6, 4, 6, 2])                # verify the category of a new sample
>>> clf.predict(data.reshape(1, -1))
array([2])
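The comment on the fit() call above mentions partial_fit for large samples. As a rough illustration (the chunk size and the loop are my own, not from the original), incremental training might look like this:

import numpy as np
from sklearn import datasets
from sklearn.naive_bayes import GaussianNB

iris = datasets.load_iris()
clf = GaussianNB()

# Feed the data in chunks; classes must be passed on the first call to partial_fit.
classes = np.unique(iris.target)
for start in range(0, len(iris.data), 50):            # illustrative chunk size of 50 rows
    X_chunk = iris.data[start:start + 50]
    y_chunk = iris.target[start:start + 50]
    clf.partial_fit(X_chunk, y_chunk, classes=classes)

# Should agree with the batch-fit result above.
print(clf.predict(np.array([6, 4, 6, 2]).reshape(1, -1)))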

Here we have a question: how can we determine that the data follows a normal distribution? In R there are functions for testing this, or you can simply plot the data and look at it, but those approaches work on (x, y) pairs that can be drawn directly in a coordinate system.

I have not yet figured out how to check this for the data in this example; that part will be added later.
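As one possible answer to the open question above (my own suggestion, not part of the original article), scipy's Shapiro-Wilk test can be run on each feature within a class; a p-value above 0.05 means normality cannot be rejected:

from scipy import stats
from sklearn import datasets

iris = datasets.load_iris()

# Test each feature of class 0 (setosa) for normality and print the p-values.
for i, name in enumerate(iris.feature_names):
    values = iris.data[iris.target == 0, i]
    stat, p = stats.shapiro(values)
    print(name, round(p, 4))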

2) Multinomial Naive Bayes: often used for text classification, where the features are words and the values are the number of times each word appears.

# An example is given in the official documentation; see the first example there
>>> import numpy as np
>>> X = np.random.randint(5, size=(6, 100))      # a 6 x 100 array (6 rows, 100 columns) of random integers in [0, 5)
>>> y = np.array([1, 2, 3, 4, 5, 6])
>>> from sklearn.naive_bayes import MultinomialNB
>>> clf = MultinomialNB()
>>> clf.fit(X, y)
MultinomialNB(alpha=1.0, class_prior=None, fit_prior=True)
>>> print(clf.predict(X[2:3]))
[3]
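Since the features are meant to be word counts, a slightly more realistic sketch (the toy sentences and labels below are made up for illustration) pairs MultinomialNB with CountVectorizer:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

texts = ["free prize money now", "meeting schedule for monday",
         "win money prize", "project schedule update"]   # toy corpus
labels = [1, 0, 1, 0]                                    # 1 = spam, 0 = normal

vec = CountVectorizer()            # features are words, values are word counts
X = vec.fit_transform(texts)

clf = MultinomialNB()
clf.fit(X, labels)

print(clf.predict(vec.transform(["free money now"])))    # likely predicts [1] (spam)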

3) Bernoulli Naive Bayes: each feature is Boolean, taking the value 0 or 1, i.e., whether or not the feature appears.

# An example is given in the official documentation; see the first example there
>>> import numpy as np
>>> X = np.random.randint(2, size=(6, 100))      # a 6 x 100 array of random 0/1 values
>>> Y = np.array([1, 2, 3, 4, 4, 5])
>>> from sklearn.naive_bayes import BernoulliNB
>>> clf = BernoulliNB()
>>> clf.fit(X, Y)
BernoulliNB(alpha=1.0, binarize=0.0, class_prior=None, fit_prior=True)
>>> print(clf.predict(X[2:3]))
[3]
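If the raw features are not already 0/1, BernoulliNB can threshold them through its binarize parameter; a minimal sketch (the count data here is illustrative):

import numpy as np
from sklearn.naive_bayes import BernoulliNB

rng = np.random.RandomState(0)
X_counts = rng.randint(5, size=(6, 100))   # count-valued features
Y = np.array([1, 2, 3, 4, 4, 5])

# binarize=0.0 maps every value > 0 to 1, so only presence/absence is used
clf = BernoulliNB(binarize=0.0)
clf.fit(X_counts, Y)
print(clf.predict(X_counts[2:3]))          # typically [3], the label of that training row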

Note: this article is not yet complete. Some of the explanation for Example 1 still needs to be written; I have been busy lately and will improve it later.

That is all for this article. I hope it helps you in your study or work.
