This semester we held a machine learning discussion class to prepare for next semester's "Machine Learning and Data Analysis" course.
I will walk through the book The Elements of Statistical Learning in advance. The book is now in its second edition and is highly rated on Douban.
Home page: http://www-stat.stanfo
Let's continue the discussion of Vapnik's book Statistical Learning Theory. At the very beginning of the book, Vapnik describes the fundamental approaches in pattern recognition: the parametric estimation approach and the non-parametric estimation approach. Before introducing the non-parametric approach, to which the support vector machine belongs, Vapnik first addresses the following thr
The number of misclassifications k has an upper bound: when the training data set is linearly separable, the original-form iterative perceptron learning algorithm converges. But the perceptron learning algorithm admits many solutions, which depend both on the choice of initial values and on the order in which misclassified points are selected during the iterations.
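The bound mentioned here is Novikoff's convergence theorem. A standard statement (a sketch from standard references, not quoted from this excerpt) is: if the data are linearly separable by a normalized hyperplane with margin γ, and R bounds the norm of the inputs, then the number of mistakes k satisfies

```latex
k \le \left(\frac{R}{\gamma}\right)^2,
\qquad
R = \max_{1 \le i \le N} \lVert x_i \rVert,
\qquad
\gamma = \min_{1 \le i \le N} y_i \,(w_{\mathrm{opt}} \cdot x_i + b_{\mathrm{opt}}),
\quad \lVert w_{\mathrm{opt}} \rVert = 1 .
```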
The separating hyperplane divides the feature space into two parts: a positive-class part and a negative-class part.
The formulas for the functional margin and the geometric margin of a sample point with respect to the separating hyperplane should be written out.
So the objective is to maximize the margin; see Statistical Learning Methods, formula 7.9.
The margin boundary is a very important concept.
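For reference, the functional margin and geometric margin of a sample point (x_i, y_i) with respect to the hyperplane (w, b), and the resulting maximization problem, can be written as follows (a sketch in the usual textbook notation; the excerpt itself omits the formulas):

```latex
\hat{\gamma}_i = y_i\,(w \cdot x_i + b),
\qquad
\gamma_i = y_i\!\left(\frac{w}{\lVert w \rVert} \cdot x_i + \frac{b}{\lVert w \rVert}\right),
\qquad
\max_{w,\,b}\ \gamma
\ \ \text{s.t.}\ \
y_i\!\left(\frac{w}{\lVert w \rVert} \cdot x_i + \frac{b}{\lVert w \rVert}\right) \ge \gamma,
\ \ i = 1, \dots, N .
```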
The dual algorithm of
These two days I read Statistical Learning Methods and recorded some basic knowledge points.
1. Statistical Learning Method
Starting from a given, finite set of training data, learning assumes that the data are generated independently and identically distributed. In a
The generative method learns the joint distribution and derives the conditional probability distribution P(Y|X) as the prediction model; that is, the generative model. Typical generative models are the naive Bayes method and the hidden Markov model. The discriminative method learns the decision function f(X) or the conditional probability distribution P(Y|X) directly from the data as the prediction model, i.e. the discriminative model: the k-nearest neighbor method, the perceptron, decision trees, logistic regression, the maximum entropy model, support vector machines
Learning with Kernels
A comprehensive textbook on kernel methods and SVMs, with extensive discussion of statistical learning theory.
Applied Multivariate Statistical Analysis (5th ed.)
A good textbook for multivariate statistical analysis
Statis
The conditional random field prediction problem: given the conditional random field P(Y|X) and an input sequence (observation sequence) x, find the output sequence (label sequence) y* with the highest conditional probability; that is, label the observation sequence. In the vector form of the conditional random field, the prediction problem becomes the optimal-path problem of maximizing the non-normalized probability, which is solved by the Viterbi algorithm.
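The Viterbi idea can be sketched in Python over a generic additive scoring formulation (the names `start`, `trans`, and `emit` are illustrative assumptions, not the CRF feature notation of the original):

```python
def viterbi(start, trans, emit):
    """Find the label sequence maximizing the non-normalized score:
    start[y0] + emit[0][y0] + sum_t (trans[y_{t-1}][y_t] + emit[t][y_t])."""
    n_labels = len(start)
    T = len(emit)
    # delta[y]: best score of any partial path ending in label y at step t
    delta = [start[y] + emit[0][y] for y in range(n_labels)]
    psi = []  # backpointers: psi[t-1][y] = best previous label for y at step t
    for t in range(1, T):
        new_delta, back = [], []
        for y in range(n_labels):
            scores = [delta[yp] + trans[yp][y] for yp in range(n_labels)]
            best = max(range(n_labels), key=lambda yp: scores[yp])
            new_delta.append(scores[best] + emit[t][y])
            back.append(best)
        delta = new_delta
        psi.append(back)
    # backtrack from the best final label
    y = max(range(n_labels), key=lambda yy: delta[yy])
    best_score = delta[y]
    path = [y]
    for back in reversed(psi):
        y = back[y]
        path.append(y)
    path.reverse()
    return path, best_score
```

For a linear-chain CRF, `start`, `trans`, and `emit` would be the non-normalized log-potentials; the same dynamic program applies.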
The boosting tree is a boosting method that uses classification trees or regression trees as the base classifiers. Boosting trees are considered one of the best-performing methods in statistical learning. The boosting method actually adopts an additive model (a linear combination of basis functions) and the forward stagewise algorithm. A boosting method based on decision trees is called a boosting tree. Decision trees for classification p
To classify data into K classes, instead of running K-1 rounds of binary classification, we only need each weak classifier to be better than random guessing (i.e. accuracy > 1/K).
Multi-class classification algorithm flow:
Loss function design for multi-class classifiers:
=============== Supplement ===============
The ten classic algorithms of data mining, to be studied slowly later:
C4.5, k-means, SVM, Apriori, EM, PageRank, AdaBoost, kNN, Naive Bayes, CART
=============== Summary ===============
Boosting can be used for variable selection
For misclassified data (x, y), we have -y(w·x + b) > 0 (too lazy to elaborate, says the author). Then the loss function is (proof omitted):
Then the loss function is minimized (-_-zzz):
The perceptron learning algorithm is misclassification-driven (the word "driven" sounds very powerful) and uses the stochastic gradient descent method (to be written about later); compute the partial derivatives with respect to w and b respectively:
After they are obtained
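Spelled out, for the set M of misclassified points the perceptron loss, its gradients, and the resulting stochastic gradient descent update for a single misclassified point are (standard notation, filling in the formulas the excerpt leaves implicit):

```latex
L(w, b) = -\sum_{x_i \in M} y_i \,(w \cdot x_i + b),
\qquad
\nabla_w L = -\sum_{x_i \in M} y_i x_i,
\qquad
\nabla_b L = -\sum_{x_i \in M} y_i,
\qquad
w \leftarrow w + \eta\, y_i x_i,
\quad
b \leftarrow b + \eta\, y_i .
```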
Regression problems
Regression: input and output are both continuous variables. Regression is used to predict the relationship between input variables and output variables, that is, to select a mapping function from input variables to output variables. It is equivalent to function fitting: choose a function curve that fits the known data and predicts unknown data well. By the number of input variables, regression divides into simple regression and multiple regression; by model type, it is di
Statistical Learning Methods (I): Introduction to statistical learning methods; Statistical Learning Methods (II): Statistical learning method o
distributed; each sample has the same initial probability (weight).
2. Obtain M weak classifiers through iterative learning. For the m-th weak classifier:
2.1 Train the classifier Gm on the weighted training set.
2.2 Calculate the weighted error of the weak classifier.
2.3 Calculate the weight of the weak classifier. From the log formula, the smaller the error, the higher the weight, that is, the greater the classifier's role in the final strong classifier.
2.4 Key
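The steps above can be sketched with one-dimensional decision stumps as the weak classifiers (a toy illustration; all function and variable names here are mine, not from the original):

```python
import math

def stump_predict(x, threshold, polarity):
    # polarity +1: predict +1 when x > threshold, else -1 (flipped for -1)
    return polarity if x > threshold else -polarity

def train_stump(xs, ys, w):
    """Step 2.1/2.2: pick the (threshold, polarity) with the smallest
    weighted training error under the current sample weights w."""
    best = None
    for threshold in xs:
        for polarity in (1, -1):
            err = sum(wi for xi, yi, wi in zip(xs, ys, w)
                      if stump_predict(xi, threshold, polarity) != yi)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best

def adaboost(xs, ys, m_rounds=3):
    n = len(xs)
    w = [1.0 / n] * n                    # step 1: uniform initial weights
    classifiers = []
    for _ in range(m_rounds):            # step 2: iterate M rounds
        err, threshold, polarity = train_stump(xs, ys, w)
        err = max(err, 1e-10)            # avoid log(0) on a perfect stump
        alpha = 0.5 * math.log((1 - err) / err)   # step 2.3: classifier weight
        classifiers.append((alpha, threshold, polarity))
        # step 2.4: re-weight samples so misclassified points gain weight
        w = [wi * math.exp(-alpha * yi * stump_predict(xi, threshold, polarity))
             for xi, yi, wi in zip(xs, ys, w)]
        z = sum(w)
        w = [wi / z for wi in w]
    return classifiers

def predict(classifiers, x):
    # weighted vote of the weak classifiers
    s = sum(alpha * stump_predict(x, t, p) for alpha, t, p in classifiers)
    return 1 if s > 0 else -1
```

The re-weighting in step 2.4 multiplies each sample's weight by exp(-alpha * y * G(x)), so points the current weak classifier gets wrong dominate the next round.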
The features used for classification are conditionally independent given the class. The naive Bayes method actually learns the mechanism that generates the data, so it belongs to the generative models. Naive Bayes makes its class decision by the maximum a posteriori probability (MAP) criterion; since the denominator of the posterior is the same for every class, the classifier can be expressed as maximizing the posterior probability, which, when the loss function is the 0-1 loss, is equivalent to minimizing the expected risk.
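Written out, the MAP decision rule described above is (standard notation, consistent with the feature-independence assumption):

```latex
y = \arg\max_{c_k} \; P(Y = c_k) \prod_{j} P\!\left(X^{(j)} = x^{(j)} \mid Y = c_k\right),
```

where the denominator P(X = x) is dropped because it is identical for every class.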
Don't put all your eggs in one basket: scattering them across multiple baskets reduces the risk of breaking them all.
When learning probability models, the model with the greatest entropy is the best model.
The maximum entropy model selects, from the set of models satisfying the constraint conditions, the model with the largest entropy.
Definition: the maximum entropy principle determines the most suitable classification model.
The model with the largest conditional entropy
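Formally, the maximum entropy model is the solution of the following constrained problem (a standard statement of what the excerpt describes):

```latex
\max_{P \in \mathcal{C}} \; H(P) = -\sum_{x,\,y} \tilde{P}(x)\, P(y \mid x) \log P(y \mid x)
\quad \text{s.t.} \quad
E_P(f_i) = E_{\tilde{P}}(f_i), \; i = 1, \dots, n,
\qquad
\sum_{y} P(y \mid x) = 1 .
```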
Python implementation of the perceptron ----- Statistical Learning Methods
Reference: http://shpshao.blog.51cto.com/1931202/1119113
#!/usr/bin/env python
# -*- coding: utf-8 -*-
#
# untitled.py
#
# Copyright 2013 T-dofan
There are still a few questions. The book's update rule is wi = wi + η·yi·xi, so why does the optimization process need to multiply by x[2]?
-----------
First, the k-nearest neighbor algorithm
The k-nearest neighbor method (k-NN) is a basic classification and regression method. The input is an instance's feature vector and the output is the instance's class, which may be one of multiple classes.
Second, the k-nearest neighbor model
2.1 Distance measures
Distance definitions (the Lp distance):
(1) When p = 1, it is called the Manhattan distance.
(2) When p = 2, it is called the Euclidean distance.
(3) When p is infinitely large, it is the maximum of the coordinate-wise distances, max|
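The three distances above are special cases of the Lp (Minkowski) distance; a minimal sketch (the function name is mine):

```python
def minkowski(x, y, p):
    """L_p distance between vectors x and y.
    p=1 -> Manhattan, p=2 -> Euclidean,
    p=float('inf') -> Chebyshev (max coordinate difference)."""
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if p == float('inf'):
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)
```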
estimate.
General steps to find the maximum likelihood estimate:
(1) Write out the likelihood function.
(2) Take the logarithm of the likelihood function and simplify.
(3) Take the derivative and set it equal to 0 to obtain the likelihood equation.
(4) Solve the likelihood equation; the resulting parameter is the estimate sought.
Maximum likelihood estimation is also an example of empirical risk minimization (ERM) in statistical learning.
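The four steps can be illustrated with a Bernoulli sample, where they lead to the closed form that the estimate is the sample mean (a standard worked example, not from the excerpt; function names are mine):

```python
import math

def bernoulli_log_likelihood(theta, data):
    # Steps (1)-(2): log L(theta) = sum_i [x_i*log(theta) + (1-x_i)*log(1-theta)]
    return sum(x * math.log(theta) + (1 - x) * math.log(1 - theta)
               for x in data)

def bernoulli_mle(data):
    # Steps (3)-(4): setting d/dtheta log L = 0 gives theta_hat = mean(data)
    return sum(data) / len(data)
```

The test below just checks that the closed-form estimate scores a higher log-likelihood than nearby parameter values, as the derivation promises.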
'; " #step2. Compute the days between start date and end dates1= ' Date--date= ' $ ' +%s ' s2= ' date +%s ' s3=$ ((($s 2-$s 1)/3600/24 ) #step3. Excute techbbs_core.sh $ timesfor ((i= $s 3; i>0; i--)) do logdate= ' date--date= ' $i days ago "+%y_%m_%d" C2/>techbbs_core.sh $logdatedoneIv. SummaryThrough three parts of the introduction, the site's log analysis work is basically completed, of course, there are many unfinished things, but the general idea has been clear, the follow-up work only n