How to Use the Naive Bayes Algorithm in Python
First, a word on why the title says "use" instead of "implement":
First, the algorithms provided by professionals are more efficient and accurate than the ones we would write ourselves.
Second, for people who are weak at mathematics, working through a pile of formulas to implement an algorithm is very painful.
Finally, unless the algorithms others provide cannot meet your needs, there is no reason to reinvent the wheel.
The following is the main content. If you are not familiar with the Bayesian algorithm, you can look up the relevant material; here is a brief introduction:
1. Bayesian formula:
P(A|B)=P(AB)/P(B)
2. Bayesian inference:
P(A|B)=P(A)×P(B|A)/P(B)
In text:
Posterior probability = prior probability × likelihood / normalization constant
What the naive Bayes algorithm solves is the likelihood, that is, the value of P(B|A).
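To make the formula above concrete, here is a minimal numeric sketch of Bayes' rule, P(A|B) = P(A) × P(B|A) / P(B). The spam-filter numbers are my own illustrative assumptions, not from this article:

```python
# Hypothetical spam-filter numbers, purely illustrative.
p_spam = 0.2                 # prior P(A): fraction of mail that is spam
p_word_given_spam = 0.5      # likelihood P(B|A): "free" appears in spam mail
p_word_given_ham = 0.05      # "free" appears in non-spam mail

# Normalization constant P(B), via the law of total probability
p_word = p_word_given_spam * p_spam + p_word_given_ham * (1 - p_spam)

# Posterior P(A|B): probability the mail is spam, given it contains "free"
p_spam_given_word = p_word_given_spam * p_spam / p_word
print(round(p_spam_given_word, 4))  # -> 0.7143
```

Even with a low prior (20% of mail is spam), seeing the word pushes the posterior above 70%, which is exactly the prior × likelihood / normalization pattern of the formula.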
3. Three common naive Bayes variants are provided in the scikit-learn package. The following describes them in turn:
1) Gaussian naive Bayes: assumes that each attribute/feature follows a normal distribution; it is mainly used for numeric features.
It uses the data bundled with the scikit-learn package. The code and notes are as follows:
>>> from sklearn import datasets          # import the datasets bundled with the package
>>> iris = datasets.load_iris()           # load the iris data
>>> iris.feature_names                    # show the feature names
['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']
>>> iris.data                             # show the data
array([[5.1, 3.5, 1.4, 0.2],
       [4.9, 3. , 1.4, 0.2],
       [4.7, 3.2, 1.3, 0.2],
       ......
>>> iris.data.size                        # data size: 150 samples x 4 features
600
>>> iris.target_names                     # show the class names
array(['setosa', 'versicolor', 'virginica'], dtype='<U10')
>>> from sklearn.naive_bayes import GaussianNB   # import the Gaussian naive Bayes algorithm
>>> clf = GaussianNB()                    # assign the estimator to a variable for ease of use
>>> clf.fit(iris.data, iris.target)       # train the classifier. For large samples you can call partial_fit in batches, to avoid loading too much data into memory at once
>>> clf.predict(iris.data[0].reshape(1, -1))  # verify a category. Note: predict expects a 2-D array, and iris.data[0] is a 1-D array, so it must be reshaped
array([0])
>>> import numpy as np
>>> data = np.array([6, 4, 6, 2])         # verify the category of a new sample
>>> clf.predict(data.reshape(1, -1))
array([2])
Here we have a question: how can we determine that the data follows a normal distribution? In the R language there are functions for this, and plotting the data in a coordinate system also lets you see it directly.
I have not yet figured out how to check this for the data in the example. This part will be added later.
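One common approach (my own suggestion, not from this article) is a statistical normality test, such as the D'Agostino-Pearson test in scipy, applied to one feature column at a time. A sketch on synthetic data standing in for a feature column:

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for one numeric feature column (e.g. sepal length).
rng = np.random.default_rng(0)
sample = rng.normal(loc=5.0, scale=0.5, size=150)

# D'Agostino-Pearson normality test: a large p-value means the test
# found no evidence against normality (it cannot *prove* normality).
stat, p_value = stats.normaltest(sample)
print(p_value)
```

In practice you would run this on each column of iris.data; plotting a histogram of the column is a useful visual complement to the test.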
2) Multinomial naive Bayes: often used for text classification, where the features are words and the values are the number of times each word appears.
# An example is provided in the official documentation; for details, see the first example
>>> import numpy as np
>>> X = np.random.randint(5, size=(6, 100))  # random integers in [0, 5): 6 rows, 100 columns
>>> y = np.array([1, 2, 3, 4, 5, 6])
>>> from sklearn.naive_bayes import MultinomialNB
>>> clf = MultinomialNB()
>>> clf.fit(X, y)
MultinomialNB(alpha=1.0, class_prior=None, fit_prior=True)
>>> print(clf.predict(X[2:3]))  # predict expects a 2-D array, so slice rather than index
[3]
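Since the random-integer example above is abstract, here is a toy text-classification sketch of the "features are word counts" idea. The documents and labels are my own illustrative data, not from this article; CountVectorizer builds the word-count matrix that MultinomialNB expects:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Tiny illustrative corpus: two "spam" and two "ham" documents.
train_docs = ["free money now", "cheap free offer",
              "meeting at noon", "lunch at noon"]
train_labels = ["spam", "spam", "ham", "ham"]

vec = CountVectorizer()              # each column = one word, each value = its count
X = vec.fit_transform(train_docs)

clf = MultinomialNB()
clf.fit(X, train_labels)

pred = clf.predict(vec.transform(["free offer now"]))
print(pred[0])  # -> spam
```

The new document shares "free", "offer", and "now" with the spam class, so its word-count vector gets the higher spam likelihood.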
3) Bernoulli naive Bayes: each feature is Boolean, taking the value 0 or 1, i.e. whether the feature appears or not.
# An example is provided in the official documentation; for details, see the first example
>>> import numpy as np
>>> X = np.random.randint(2, size=(6, 100))
>>> Y = np.array([1, 2, 3, 4, 4, 5])
>>> from sklearn.naive_bayes import BernoulliNB
>>> clf = BernoulliNB()
>>> clf.fit(X, Y)
BernoulliNB(alpha=1.0, binarize=0.0, class_prior=None, fit_prior=True)
>>> print(clf.predict(X[2:3]))
[3]
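To show what the presence/absence features mean in practice, here is a small sketch on hand-built binary data (my own illustrative matrix, not from this article). Each column records only whether a word appears, not how often:

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB

# Rows = documents, columns = words; values are presence/absence flags.
X = np.array([[1, 1, 0],   # class-1 documents tend to contain words 0 and 1
              [1, 1, 1],
              [0, 0, 1],   # class-0 documents tend to contain word 2
              [0, 1, 1]])
y = np.array([1, 1, 0, 0])

clf = BernoulliNB()        # binarize=0.0 by default: any count > 0 becomes 1
clf.fit(X, y)

pred = clf.predict(np.array([[1, 1, 0]]))
print(pred[0])  # -> 1
```

Note that unlike MultinomialNB, BernoulliNB also penalizes the *absence* of a feature: here the missing word 2 counts as evidence too.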
Note: this article is not yet complete; some notes for Example 1 still need to be written. I have been busy recently and will improve it later.
The above is all the content of this article. I hope it will help you in your study or work, and I also hope you will continue to support us!