These notes cover the week nine content: anomaly detection and recommender systems.
(i) Anomaly detection (density estimation)
Kernel density estimation: given a data set x(1), x(2), ..., x(m) that is assumed to contain only normal examples, we want to know the probability density p(x) of a new example x(test).
After density estimation, a common approach in anomaly detection is to choose a probability threshold ε and flag x(test) as an anomaly when p(x(test)) < ε.
- Gaussian distribution
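The Gaussian kernel is the kernel function most commonly used in kernel density estimation. The univariate Gaussian probability density function is:
$$p(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$
You can use the data you already have to estimate the population mean μ and variance σ². They are computed as follows:
$$\mu = \frac{1}{m}\sum_{i=1}^{m} x^{(i)}, \qquad \sigma^2 = \frac{1}{m}\sum_{i=1}^{m}\left(x^{(i)}-\mu\right)^2$$
The probability density function of the multivariate Gaussian distribution, for n features with mean vector μ and covariance matrix Σ, is:
$$p(x; \mu, \Sigma) = \frac{1}{(2\pi)^{n/2}\,|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu)\right)$$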
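As a rough illustration of the estimation step, the sketch below fits μ and σ² to a small made-up sample and evaluates the univariate density. The function names `fit_gaussian` and `gaussian_pdf` and the sample values are assumptions for this example, not part of the course material.

```python
import math

def fit_gaussian(xs):
    """Estimate the mean and the variance (dividing by m) of a list of numbers."""
    m = len(xs)
    mu = sum(xs) / m
    sigma2 = sum((x - mu) ** 2 for x in xs) / m
    return mu, sigma2

def gaussian_pdf(x, mu, sigma2):
    """Univariate Gaussian density with mean mu and variance sigma2, evaluated at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

# Made-up sample, for illustration only.
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
mu, sigma2 = fit_gaussian(sample)
density_at_mean = gaussian_pdf(mu, mu, sigma2)
```

The density is highest at the mean and falls off as x moves away from it, which is what makes a threshold on p(x) usable as an anomaly test.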
Note: in machine learning we usually divide by m when computing the variance, rather than by (m-1) as is usual in statistics.
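The difference between the two denominators can be seen with Python's standard `statistics` module, where `pvariance` divides by m and `variance` divides by m-1. The sample values below are made up for illustration.

```python
import statistics

# Made-up sample, for illustration only.
sample = [1.0, 2.0, 3.0, 4.0, 5.0]

var_m = statistics.pvariance(sample)         # divides by m
var_m_minus_1 = statistics.variance(sample)  # divides by m - 1
```

For large m the two estimates are nearly identical, so the choice rarely matters in practice.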
- Anomaly Detection
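In the general Gaussian distribution model, for a given data set x(1), x(2), ..., x(m), we estimate μ_j and σ_j² for each feature j, then compute p(x) from the model as the product of the per-feature densities:
$$p(x) = \prod_{j=1}^{n} p(x_j; \mu_j, \sigma_j^2) = \prod_{j=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_j} \exp\left(-\frac{(x_j-\mu_j)^2}{2\sigma_j^2}\right)$$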
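A minimal sketch of the per-feature model follows, assuming training examples are stored as lists of floats. The helper names, the training data, and the threshold value `epsilon` are all hypothetical; in practice the threshold would be tuned on labelled validation data.

```python
import math

def fit_features(X):
    """Per-feature mean and variance (dividing by m) for a list of rows X."""
    m, n = len(X), len(X[0])
    mus = [sum(row[j] for row in X) / m for j in range(n)]
    sigma2s = [sum((row[j] - mus[j]) ** 2 for row in X) / m for j in range(n)]
    return mus, sigma2s

def density(x, mus, sigma2s):
    """p(x) as the product of the univariate Gaussian densities of each feature."""
    p = 1.0
    for xj, mu, s2 in zip(x, mus, sigma2s):
        p *= math.exp(-((xj - mu) ** 2) / (2 * s2)) / math.sqrt(2 * math.pi * s2)
    return p

def is_anomaly(x, mus, sigma2s, epsilon):
    """Flag x as anomalous when its density falls below the threshold epsilon."""
    return density(x, mus, sigma2s) < epsilon

# Made-up training data with two features, for illustration only.
X_train = [[1.0, 10.0], [2.0, 12.0], [3.0, 11.0], [2.0, 9.0], [2.0, 13.0]]
mus, sigma2s = fit_features(X_train)
epsilon = 1e-3  # hypothetical threshold

typical_flag = is_anomaly([2.0, 11.0], mus, sigma2s, epsilon)
outlier_flag = is_anomaly([6.0, 20.0], mus, sigma2s, epsilon)
```

Here the point at the feature means is not flagged, while the point far from the training data is.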
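For the multivariate Gaussian distribution model, we first compute the mean of every feature, then the covariance matrix, and finally p(x) under the multivariate Gaussian distribution:
$$\mu = \frac{1}{m}\sum_{i=1}^{m} x^{(i)}, \qquad \Sigma = \frac{1}{m}\sum_{i=1}^{m}\left(x^{(i)}-\mu\right)\left(x^{(i)}-\mu\right)^T$$
$$p(x) = \frac{1}{(2\pi)^{n/2}\,|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu)\right)$$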
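The three steps can be sketched in plain Python. To avoid a linear algebra dependency, this sketch assumes exactly two features so that the determinant and inverse of Σ can be written out by hand; the function names and the data are hypothetical.

```python
import math

def fit_multivariate(X):
    """Mean vector and 2x2 covariance matrix (dividing by m) for 2-feature rows."""
    m = len(X)
    mu = [sum(r[0] for r in X) / m, sum(r[1] for r in X) / m]
    s00 = sum((r[0] - mu[0]) ** 2 for r in X) / m
    s11 = sum((r[1] - mu[1]) ** 2 for r in X) / m
    s01 = sum((r[0] - mu[0]) * (r[1] - mu[1]) for r in X) / m
    return mu, [[s00, s01], [s01, s11]]

def multivariate_pdf(x, mu, sigma):
    """Multivariate Gaussian density for n = 2 features."""
    (a, b), (c, d) = sigma
    det = a * d - b * c
    d0, d1 = x[0] - mu[0], x[1] - mu[1]
    # (x - mu)^T Sigma^{-1} (x - mu), using Sigma^{-1} = (1/det) * [[d, -b], [-c, a]]
    quad = (d * d0 * d0 - (b + c) * d0 * d1 + a * d1 * d1) / det
    # For n = 2, (2*pi)^(n/2) = 2*pi and |Sigma|^(1/2) = sqrt(det)
    return math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(det))

# Made-up, positively correlated training data, for illustration only.
X_train = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]]
mu, sigma = fit_multivariate(X_train)

p_along = multivariate_pdf([4.0, 4.0], mu, sigma)    # follows the correlation
p_against = multivariate_pdf([4.0, 2.0], mu, sigma)  # goes against it
```

Both test points are equally far from the mean in each individual feature, but the one that goes against the correlation gets a lower density. This is the case the per-feature model cannot distinguish, since it ignores the off-diagonal terms of Σ.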
(ii) Recommender system
- Content-based
- User-based
Http://www.ccf.org.cn/resources/1190201776262/2010/05/12/h049617016.pdf
Coursera Machine Learning Notes (vii)