(Continued from the first chapter above)
1.2.5 linalg linear algebra library
Building on basic matrix operations, NumPy's linalg library covers most linear algebra operations:
. Determinant of a matrix
. Inverse of a matrix
. Symmetry of a matrix
. Rank of a matrix
. Solving linear equations with an invertible matrix
1. Determinant of a matrix
from ... import *
# determinant of an n-order matrix
A = mat([[1,2,3],[4,5,6],[7,8,9]])
print(det(A))
6.66133814775e-16
2. Inverse of a matrix
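The operations listed above can be sketched with `numpy.linalg` as follows. This is a minimal illustration, not the original author's code: it uses `np.array` instead of `mat`, and the matrix `B` and vector `c` are made-up values. Note that the 3x3 example matrix is singular, so its determinant is only a floating-point value very close to zero and its rank is 2.

```python
import numpy as np

# The matrix from the excerpt; its rows are linearly dependent.
A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=float)
det_A = np.linalg.det(A)            # tiny value close to 0
rank_A = np.linalg.matrix_rank(A)   # rank is 2, so A has no inverse

# B and c are illustrative values (assumptions), chosen so that B is invertible.
B = np.array([[2.0, 1.0], [1.0, 3.0]])
c = np.array([3.0, 5.0])

B_inv = np.linalg.inv(B)            # inverse of an invertible matrix
is_symmetric = np.allclose(B, B.T)  # symmetry check
x = np.linalg.solve(B, c)           # solves B @ x = c

print(det_A, rank_A, is_symmetric, x)
```

For solving linear systems, `np.linalg.solve` is generally preferable to multiplying by an explicit inverse.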
A decision tree classifies by repeatedly selecting the attribute with the greatest information gain. The core idea is to use information gain to judge how well an attribute classifies the samples. Information gain is calculated from the information entropy, which allows multiple categories. Compute the information gain of every attribute and choose the attribute with the largest gain as the root node of the decision tree. Then split the samples into branches and continue by computing the information gain of the remaining attributes. Information gain has
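For reference, the standard ID3 definitions of entropy and information gain can be written as below. The symbols are assumptions chosen for illustration: $D$ is the data set, $p_i$ the fraction of samples in class $i$ out of $m$ classes, and $D_1, \dots, D_k$ the subsets produced by splitting on attribute $R$.

```latex
\mathrm{Info}(D) = -\sum_{i=1}^{m} p_i \log_2 p_i

\mathrm{Info}_R(D) = \sum_{j=1}^{k} \frac{|D_j|}{|D|}\, \mathrm{Info}(D_j)

\mathrm{Gain}(R) = \mathrm{Info}(D) - \mathrm{Info}_R(D)
```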
threshold of the class, and it is saved for clustering. This method of finding Eps mainly takes into account that, for data sets of varying density, an appropriate threshold should be selected for clustering according to the density of each part of the data. Because the parameters used in clustering only determine the density differences within a single class of the clustering result, errors in parameter selection do not greatly affect the clustering result.
2.2 DBSCAN cluster
About the boosting algorithm
Boosting is a family of ensemble learning algorithms based on PAC (probably approximately correct) learning theory. The fundamental idea is to construct a strong classifier with high accuracy by combining several simple weak classifiers, and the PAC
Reference
NB: High efficiency, easy to implement
LR: Few assumptions about the data, strong adaptability, can be used for online learning; requires linearly separable data
Decision tree: Easy to interpret, works whether or not the data is linear; overfits easily, no online learning support
RF: Fast and scalable, with few parameters; may overfit
SVM: High accuracy, can handle linearly inseparable data (high-dimensional data processing); memory consumption, difficult
value. If it becomes smaller, the new solution replaces the original. If it becomes larger, the probability of replacing the old solution with the new one depends on the current temperature. The temperature starts at a relatively high value and decreases slowly, which is why the algorithm is more willing to accept relatively poor solutions in the early stages of execution; this effectively reduces the risk of getting stuck in a local minimum. When the temperature reaches 0, the alg
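The acceptance rule described above can be sketched as follows. The excerpt only says that the probability depends on the temperature; the exponential form `exp(-delta / T)` used here is the common Metropolis criterion and is an assumption, as are the geometric cooling schedule, the function names, and the toy cost function.

```python
import math
import random

def accept(old_cost, new_cost, temperature, rng=random):
    """Always accept improvements; accept worse solutions with a
    probability that shrinks as the temperature drops (assumed rule)."""
    if new_cost < old_cost:
        return True
    if temperature <= 0:
        return False
    return rng.random() < math.exp(-(new_cost - old_cost) / temperature)

def anneal(cost, neighbor, start, t_start=10.0, t_min=1e-3, cooling=0.95, seed=0):
    """Minimal simulated annealing loop with geometric cooling (assumed)."""
    rng = random.Random(seed)
    current, current_cost = start, cost(start)
    best, best_cost = current, current_cost
    t = t_start
    while t > t_min:
        candidate = neighbor(current, rng)
        candidate_cost = cost(candidate)
        if accept(current_cost, candidate_cost, t, rng):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        t *= cooling  # lower the temperature a little each step
    return best, best_cost

# Toy usage: minimize (x - 3)^2 starting from x = 0.
best, best_cost = anneal(
    cost=lambda x: (x - 3) ** 2,
    neighbor=lambda x, rng: x + rng.uniform(-1, 1),
    start=0.0,
)
print(best, best_cost)
```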
Perceptron: This is the simplest machine learning algorithm, but there are a few points to note. The first is the choice of loss function. To minimize the loss function, gradient descent is used in the iterative process, which finally yields the optimal w, b. The intuitive interpretation is to adjust the values of w, b so that the sepa
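A minimal sketch of the perceptron update is shown below, assuming labels in {-1, +1} and the usual rule of updating w and b on each misclassified sample. The function name, learning rate, and the three-point data set are illustrative assumptions.

```python
import numpy as np

def train_perceptron(X, y, lr=1.0, max_epochs=100):
    """Update w and b on every misclassified sample until none remain."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(max_epochs):
        errors = 0
        for xi, yi in zip(X, y):
            # A sample is misclassified when y * (w.x + b) <= 0.
            if yi * (xi @ w + b) <= 0:
                w += lr * yi * xi
                b += lr * yi
                errors += 1
        if errors == 0:
            break
    return w, b

# Toy, linearly separable data (illustrative values).
X = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
y = np.array([1.0, 1.0, -1.0])
w, b = train_perceptron(X, y)
print(w, b)
```

The loop only terminates early when the data is linearly separable; otherwise it stops after `max_epochs`.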
Or the formula obtained after taking the derivative cannot be solved analytically, or the number of unknown parameters is greater than the number of equations. In that case, an iterative algorithm is used to find the optimal solution step by step.
In particular, if the objective function is convex, every local optimum is also a global optimum; if the function is non-convex, there may be many local optima, so the importance of convex optimization is self-evident. People always wan
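As a small illustration of an iterative method on a convex function, the sketch below runs plain gradient descent on f(x) = (x - 2)^2 + 1, whose only minimum is at x = 2. The function, step size, and tolerance are assumptions chosen for the example.

```python
def gradient_descent(grad, x0, lr=0.1, tol=1e-8, max_iter=10_000):
    """Repeatedly step against the gradient until the step is tiny."""
    x = x0
    for _ in range(max_iter):
        step = lr * grad(x)
        x -= step
        if abs(step) < tol:
            break
    return x

# f(x) = (x - 2)^2 + 1 is convex; its gradient is 2 * (x - 2).
x_min = gradient_descent(lambda x: 2 * (x - 2), x0=10.0)
print(x_min)
```

On a non-convex function the same loop can stop at a local optimum that depends on the starting point `x0`.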
]) * double(dy[i])
# sqx = double(dx[i]) ** 2
sumxy = vdot(dx, dy)  # returns the dot product of two vectors
sqx = sum(power(dx, 2))  # sum of squares of the vector: (x - meanx)^2
# calculate slope and intercept
a = sumxy / sqx
b = meany - a * meanx
print(a, b)
# draw a graphic
plotscatter(Xmat, ymat, a, b, plt)
7.1.4 Normal equation method
7.1.5 Code implementation of the normal equation method
# data matrix, category labels
Xarr, yarr = loaddataset("Regdataset.txt")  # import the data file
m = len(Xarr)
# generate x-coordinat
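The slope and intercept computation in the fragment above can be sketched as a self-contained function. The helper name `fit_line` and the sample points are assumptions; the data loading and plotting helpers from the excerpt are not reproduced here.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares line y = a * x + b from deviations about the means."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mean_x, mean_y = x.mean(), y.mean()
    dx = x - mean_x
    dy = y - mean_y
    sum_xy = np.vdot(dx, dy)          # sum of (x - mean_x) * (y - mean_y)
    sq_x = np.sum(np.power(dx, 2))    # sum of (x - mean_x)^2
    a = sum_xy / sq_x                 # slope
    b = mean_y - a * mean_x           # intercept
    return a, b

# Points that lie exactly on y = 2x + 1 (illustrative values).
a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)
```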
value of 3.
For example: np.random.randint(3, 6, size=[2,3]) returns data with a dimension of 2x3. The value range is the half-open interval [3, 6).
(4). random_integers(low[, high, size]), similar to randint above; the difference is that the range of values is the closed interval [low, high].
(5). random_sample([size]), returns random floating-point numbers in the half-open interval [0.0, 1.0). For another interval [a, b), the result can be converted with (b - a) * random_sample([size]) + a.
For example: (5-2) *np.random.ran
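The interval behavior can be checked with a short sketch. Because `random_integers` is deprecated in NumPy, the closed interval is obtained here with `randint(low, high + 1)` instead; the seed, sizes, and bounds are illustrative assumptions.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the run is repeatable

# randint excludes the upper bound: values are drawn from {3, 4, 5}.
ints = np.random.randint(3, 6, size=[2, 3])

# Closed interval [3, 6] via randint with high + 1.
closed = np.random.randint(3, 6 + 1, size=1000)

# random_sample draws from [0.0, 1.0); rescale to [a, b).
a, b = 2.0, 5.0
floats = (b - a) * np.random.random_sample(size=1000) + a

print(ints)
print(closed.min(), closed.max())
print(floats.min(), floats.max())
```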
algorithm to initially estimate the number K.
2) How to choose the initial K points
The common approach is random selection, but the result is often not very good. Alternatively, first use a hierarchical clustering algorithm to form K clusters, and use the centroids of these clusters as the initial centroids.
3) Method of calculating distances
Commonly used measures include Euclidean distance and cosine similarity.
4) Algorithm stop condition
The maximum number of iterat
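The choices listed above can be seen together in a minimal k-means sketch. It assumes random selection of the initial points, Euclidean distance, and two stop conditions (unchanged assignments or a maximum number of iterations); the function name and the sample points are illustrative.

```python
import numpy as np

def kmeans(X, k, max_iter=100, seed=0):
    """Plain k-means with random initial centroids and Euclidean distance."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Pick k distinct samples as the initial centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.full(len(X), -1)
    for _ in range(max_iter):
        # Distance from every point to every centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return centroids, labels

# Two well-separated groups of points (illustrative values).
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]])
centroids, labels = kmeans(X, k=2)
print(centroids, labels)
```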
other.
Suppose we choose attribute R as the split attribute for data set D, and R has K different values {v1, v2, ..., vk}. D is then divided by the value of R into K groups {d1, d2, ..., dk}. After splitting by R, some amount of information is still required to separate the different classes of D. Information gain is defined as the difference between the amounts of information required before and after the split.
The following example uses Python to illustrate constructing a decision tree with the information gain method:
The main steps
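A minimal sketch of the information gain computation is given below. It is not the example from the original article; the function names and the tiny data set are assumptions, and only the attribute selection step of tree construction is shown.

```python
import math
from collections import Counter

def entropy(labels):
    """Information entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr_index):
    """Entropy before the split minus the weighted entropy after it."""
    n = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr_index], []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def best_attribute(rows, labels):
    """Index of the attribute with the largest information gain."""
    return max(range(len(rows[0])),
               key=lambda i: information_gain(rows, labels, i))

# Toy data: attribute 0 determines the class, attribute 1 does not.
rows = [("sunny", "hot"), ("sunny", "cool"), ("rainy", "hot"), ("rainy", "cool")]
labels = ["no", "no", "yes", "yes"]
print(best_attribute(rows, labels))
```

A full tree would apply `best_attribute` recursively to each group until the labels in a group are all the same or no attributes remain.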
application threads have left in the set logs, and updates the corresponding remembered sets. This step needs to pause the application and runs in parallel.
Live data counting and cleanup
Note that in G1, executing the final marking pause does not mean the cleanup step is performed right away. Because cleanup also needs to suspend the application, G1, in order to meet its soft real-time requirements, has to reasonably pla
Logistic regression is used for classification, and linear regression is used for regression.
Linear regression sums the attributes of a sample, each multiplied by a coefficient. The cost function is the sum of squared errors. Therefore, to minimize the cost function, you can take the derivative directly, set it equal to 0, and solve for the coefficients.
Gradient descent can also be used; the gradient has the same form as in logistic regression.
Advantages of linear regression: simple calculatio
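Setting the derivative of the squared-error cost to zero gives the normal equation, (X^T X) w = X^T y. The sketch below solves it with NumPy; the function name, the added bias column, and the sample data are assumptions for illustration.

```python
import numpy as np

def fit_normal_equation(X, y):
    """Solve (Xb^T Xb) w = Xb^T y, where Xb is X with a leading column of ones."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xb = np.column_stack([np.ones(len(X)), X])
    return np.linalg.solve(Xb.T @ Xb, Xb.T @ y)

# Samples that lie exactly on y = 1 + 2x (illustrative values).
X = [[0.0], [1.0], [2.0], [3.0]]
y = [1.0, 3.0, 5.0, 7.0]
weights = fit_normal_equation(X, y)
print(weights)  # intercept first, then the coefficient
```

This only works when `Xb.T @ Xb` is invertible; otherwise `np.linalg.solve` raises `LinAlgError`.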
Continuously updated ...
1. k-Nearest Neighbor algorithm
Advantages: High precision, insensitive to outliers, no assumptions about the input data
Cons: High computational complexity, high space complexity
Applicable data range: Numerical and nominal types
Applicable scenarios:
2. ID3 Decision Tree algorithm
Advantages: Low computational complexity, output that is easy to understand, insensitive to missing intermediate values, can handle irrelevant feature data
Disadvantage: May cause overfitting
can be processed.
Cons: Prone to overfitting.
How to avoid overfitting:
(1) Dimensionality reduction: use the PCA algorithm to reduce the dimension of the samples, so that the model has fewer parameters theta and a lower degree, which helps avoid overfitting;
(2) Regularization: add a regularization term to the cost function.
The purpose of regularization is to prevent the coefficient weights of some attributes from becoming too large, which leads to overfitting.
Note that the way to resolve overfitting i
training samples.
The approaches above work when the matrix inverse exists. But what if the data has more features than sample points, so that the inverse does not exist? Ridge regression solves this problem by adding a regularization term to the matrix before inverting it; the rest is similar to the previous approach.
There is also a method called forward stepwise regression, which at each step increases or decreases one weight by a small value
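A minimal ridge regression sketch follows, using the standard closed form w = (X^T X + lambda * I)^(-1) X^T y. The function name, the value of `lam`, and the sample data are assumptions; no intercept term is fitted, to keep the example short.

```python
import numpy as np

def ridge_fit(X, y, lam=0.2):
    """Ridge weights: adding lam * I makes the matrix invertible."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = X.T @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ y)

# More features (3) than samples (2): X^T X is singular here.
X_wide = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
y_wide = np.array([1.0, 2.0])
rank = np.linalg.matrix_rank(X_wide.T @ X_wide)
w_wide = ridge_fit(X_wide, y_wide, lam=0.1)
print(rank, w_wide)
```

With a very small `lam` on well-conditioned data, the result approaches the ordinary least-squares solution.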
paper is usually Euclidean distance, the Pearson coefficient, or cosine similarity.
Assume a matrix A of size m*n is built, where the m rows are all users and the n columns are all items, and each element of the matrix is a user's rating of an item. Item-based or user-based recommendation then computes the similarity between all columns or all rows, respectively. In real life, this matrix is very sparse.
Topic: Recommend the top-N items for users to buy
The matrix C is an m*n matrix, each row represents each user,
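The column-wise (item-based) similarity can be sketched with cosine similarity as below. The function name and the small rating matrix are assumptions; a real rating matrix would be sparse and would normally use a sparse representation.

```python
import numpy as np

def item_cosine_similarity(ratings):
    """Cosine similarity between every pair of columns (items)."""
    R = np.asarray(ratings, dtype=float)
    norms = np.linalg.norm(R, axis=0)
    norms[norms == 0] = 1.0  # avoid dividing by zero for unrated items
    Rn = R / norms
    return Rn.T @ Rn

# Rows are users, columns are items (illustrative ratings, 0 = not rated).
ratings = np.array([[5, 5, 0],
                    [4, 4, 0],
                    [0, 0, 3]])
item_sim = item_cosine_similarity(ratings)
user_sim = item_cosine_similarity(ratings.T)  # user-based: compare rows instead
print(item_sim)
```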