add1()
drop1()
9. Regression Diagnostics
Does the sample conform to the normal distribution?
Normality test: shapiro.test(X$X1)
The distribution of normality
Are there outliers in the learning set? How can outliers be found?
Is the linear model reasonable? The true relationship may be more complicated.
Do the errors satisfy independence and equal variance (the error is no
classifier will be severely affected, as shown in:
To solve the above two problems, we adjust the optimization problem to:
Note: when ξ > 1, a sample is allowed to be misclassified, and ξ is added to the objective function as a penalty.
Using Lagrange duality again, we get the dual problem:
Surprisingly, after adding the L1 regularization term, the only change to the dual problem is the extra constraint αi ≤ C. Note that the calculation of b* needs to change (see Platt's paper).
KKT d
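As a rough illustration of the slack variable ξ discussed above, here is a minimal Python sketch. It assumes a linear classifier w·x + b and the standard soft-margin objective ½‖w‖² + C Σ ξi; the function names and the toy data are hypothetical and do not come from the original notes.

```python
def slacks(w, b, X, y):
    """Slack per sample: xi_i = max(0, 1 - y_i * (w . x_i + b)).

    xi_i == 0     -> the sample is on or outside the margin
    0 < xi_i <= 1 -> inside the margin but still on the correct side
    xi_i > 1      -> the sample is misclassified
    """
    result = []
    for x_i, y_i in zip(X, y):
        score = sum(w_j * x_j for w_j, x_j in zip(w, x_i)) + b
        result.append(max(0.0, 1.0 - y_i * score))
    return result


def soft_margin_objective(w, b, X, y, C):
    """0.5 * ||w||^2 + C * sum(xi_i): margin term plus slack penalty."""
    margin_term = 0.5 * sum(w_j * w_j for w_j in w)
    return margin_term + C * sum(slacks(w, b, X, y))


# Toy data (assumed): the last point lies on the wrong side of the boundary.
X = [[2.0, 0.0], [0.5, 0.0], [-1.0, 0.0], [0.5, 0.0]]
y = [1, 1, -1, -1]
w, b = [1.0, 0.0], 0.0

xi = slacks(w, b, X, y)
objective = soft_margin_objective(w, b, X, y, C=1.0)
```

A larger C makes each unit of slack more expensive, so the optimizer tolerates fewer margin violations.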
This is what we have learned so far (except the decision tree).
Here is a typical decision tree algorithm, with four choices to make:
Then the CART algorithm is introduced: a decision stump divides the data into two parts, and the criterion for judging a split is the purity of the two resulting subsets (purifying).
The following are measures of purity:
Finally, when to stop:
A decision tree may overfit, so we reduce both Ein and the number of leaves (indicating the complexity
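The purity criterion mentioned above can be sketched in a few lines of Python. This sketch assumes Gini impurity as the purity measure and a single numeric feature; the original notes show their measures in a figure that is not reproduced here, so the choice of Gini and all function names are assumptions.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 - sum_k p_k^2. Zero means the subset is pure."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def weighted_impurity(left, right):
    """Impurity of a two-way split, weighted by subset size."""
    n = len(left) + len(right)
    return (len(left) * gini(left) + len(right) * gini(right)) / n


def best_stump(xs, ys):
    """Try thresholds midway between sorted distinct feature values and
    return (threshold, impurity) for the purest split, or None if the
    feature has only one distinct value."""
    values = sorted(set(xs))
    best = None
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [label for x, label in zip(xs, ys) if x <= threshold]
        right = [label for x, label in zip(xs, ys) if x > threshold]
        score = weighted_impurity(left, right)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best


split = best_stump([1, 2, 3, 4], ["a", "a", "b", "b"])
```

A full tree would apply `best_stump` recursively to each subset until a stopping rule is met.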
In this section, linear models are introduced and compared, and linear regression and logistic regression are used for classification by converting the error function.
More important is this diagram, which explains why linear regression or logistic regression can be used in place of linear classification.
Then stochastic gradient descent is introduced; it is an improvement on gradient descent that greatly improves efficiency.
Finally
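To make the stochastic gradient descent idea concrete, here is a minimal Python sketch for one-dimensional linear regression with squared error. The learning rate, epoch count, seed, and toy data are assumed values chosen for illustration, not settings from the course.

```python
import random


def sgd_linear_regression(xs, ys, lr=0.05, epochs=500, seed=0):
    """Fit y ~ w * x + b by updating on one sample at a time.

    Batch gradient descent sums the gradient over all samples before each
    step; SGD instead takes a step after every single sample, which makes
    each update much cheaper.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the samples in a random order each epoch
        for i in order:
            error = (w * xs[i] + b) - ys[i]
            # Gradient of 0.5 * error^2 with respect to w and b.
            w -= lr * error * xs[i]
            b -= lr * error
    return w, b


# Noise-free toy data generated from y = 2x + 1 (assumed for illustration).
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.0 * x + 1.0 for x in xs]
w, b = sgd_linear_regression(xs, ys)
```

On noisy data the estimates keep fluctuating around the optimum, so in practice the learning rate is often decayed over time.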
This section is about the kernel SVM; Andrew Ng's handout also explains it well.
The first topic is the kernel trick, which avoids computing the high-dimensional feature mapping explicitly by evaluating a kernel function on the low-dimensional inputs. The handout also discusses how to determine a valid kernel function, that is, which functions K can be used with the kernel trick.
In addition, a kernel function can measure the similarity of two feature vectors: the greater the value, the more similar they are.
Next is the polynomial kernel, w
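A small Python sketch can show why the kernel trick saves work. It assumes the degree-2 polynomial kernel K(x, z) = (x·z)² and the explicit feature map φ(x) made of all products xi·xj; the function names are hypothetical.

```python
def dot(x, z):
    """Ordinary inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, z))


def poly_kernel(x, z):
    """K(x, z) = (x . z)^2, computed in O(n) on the original inputs."""
    return dot(x, z) ** 2


def phi(x):
    """Explicit feature map: all n^2 pairwise products x_i * x_j."""
    return [x_i * x_j for x_i in x for x_j in x]


x = [1, 2, 3]
z = [4, 5, 6]

via_kernel = poly_kernel(x, z)          # works on 3-dimensional inputs
via_features = dot(phi(x), phi(z))      # works on 9-dimensional features
```

Both routes give the same number, but the kernel never builds the n²-dimensional vectors, which is the point of the trick.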
. Drawing
t = [0:0.01:0.98]
y1 = sin(2*pi*t)
plot(t, y1) % draw
hold on % keep the first curve so the next plot is added to the same axes
y2 = cos(2*pi*t)
plot(t, y2, 'r')
xlabel('time')
ylabel('value')
legend('sin', 'cos') % legend
title('My Plot')
print -dpng 'myplot.png' % save as an image file
close % close the current figure
figure(1) % create a figure
clf % clear the current figure
subplot(1,2,2) % split the figure into a 1*2 grid and draw in the 2nd cell
axis([0.5 1 -1 1]) % set the axes to x in [0.5, 1], y in [-1, 1]
imagesc(magic()), colorbar, colo
Introduction
Systematically studying a machine learning course has benefited me a lot, and I think it is necessary to understand some basic questions, such as the categories of machine learning algorithms.
Why do I say that? I admit th
This column (Machine Learning) includes single-parameter linear regression, multiple-parameter linear regression, an Octave tutorial, logistic regression, regularization, neural networks, machine learning system design, SVM (support vector machines), clust
Preface: "The foundation determines the height; the height does not determine the foundation!" The book explains the theory of machine learning mainly from the perspectives of programming, data structures, mathematical theory, data processing, and visualization, and then extends to probability theory, numerical analysis, matrix analysis, and other knowledge to guide us into the world of
watch all the course videos at any time, and download handouts and notes from the Stanford CS229 course. The course includes homework and quizzes and mainly explains the relevant linear algebra, using Octave.
The principle of big data (large data rationale)
A large amount of data can improve the final performance of a learning algorithm far more than switching to a more advanced algorithm, hence the saying:
"It's not who has the best algorithm that wins. It's who has the most data."
Of course, this rests on two premises:
1. Assume that the features of the sample ca
Learning notes for "Machine Learning Practice": two application scenarios of k-Nearest Neighbor algorithms, and "Machine Learning Practice" k-
After learning the implementation of the k-Nearest Neighbor Algorithm, I tested the k-
rigorously, because one of the objectives in statistical learning is to maximize the expected probability of a correct prediction, we only consider the common loss functions.
The loss function is an important measure of model quality: the greater the value of the loss function, the greater the prediction error of the model, so what we need to do is update the parameters of the model to minimize the value of the loss
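For illustration, a few common loss functions and the average loss over a sample can be sketched in Python as follows. The passage above does not list which losses it covers, so the three chosen here (0-1, squared, and logistic loss, with labels in {-1, +1} for the classification losses) and all function names are assumptions.

```python
import math


def zero_one_loss(y, score):
    """0-1 loss for labels in {-1, +1}: 1 if the sign of the score is wrong."""
    return 0.0 if y * score > 0 else 1.0


def squared_loss(y, prediction):
    """Squared loss for regression: (y - prediction)^2."""
    return (y - prediction) ** 2


def logistic_loss(y, score):
    """Logistic loss for labels in {-1, +1}: log(1 + exp(-y * score))."""
    return math.log(1.0 + math.exp(-y * score))


def empirical_risk(loss, ys, predictions):
    """Average loss over the sample; training tries to make this small."""
    return sum(loss(y, p) for y, p in zip(ys, predictions)) / len(ys)


risk = empirical_risk(squared_loss, [1.0, 2.0], [1.0, 4.0])
```

Lower empirical risk means the model's predictions are closer to the observed targets on the training sample.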
Caltech Learning from Data (California Institute of Technology): you can take this course on edX, where it is taught by Yaser Abu-Mostafa. All course videos and materials are available on the California Institute of Technolog
Machine learning is a comprehensive, applied discipline that can be used to solve problems in fields such as computer vision, biology, robotics, and everyday language. It grew out of research on artificial intelligence and is designed to give computers the ability to learn as humans
The motivation and applications of machine learning
Tools: MATLAB (requires a license) or Octave (free)
Definition (Arthur Samuel, 1959):
The field of study that gives computers the ability to learn without being explicitly programmed.
Example: Arthur Samuel's checkers program, which calculates the probability of winning for each move and eventually defeated its own author. (It feels like it uses the idea of decision trees.)
Definition 2 (Tom
As a follow-up to the College article (http://xxwenda.com/article/584), the plan is to test each of these individually. Of course, many have already been tested.
Apache Spark itself
1. MLlib
AMPLab
Spark was originally born in the Berkeley AMPLab and is still an AMPLab project; though not in the Apache Spark Foundation, it still has a considerable place in your daily GitHub program.
ML Base
The MLlib of Spark itself is at the bottom of the three
, the minimum value of the cost function jVal that we provide and, of course, returns the solution vector θ.
The above method obviously also applies to regularized logistic regression.
5. Conclusion
Through several recent articles, we can easily find that both linear regression and logistic regression can be solved by constructing polynomials. However, you will gradually find that more powerful non-linear classifiers can be used to solve polynomial reg