1. Linear Models and Least Squares
• Linear regression can also be used for simple classification; the decision boundary it produces is linear, so the model will be inaccurate whenever the true boundary is not.
• A remaining problem: which method works better depends on how the data were generated. See ESL p. 13, the two scenarios: each class drawn from a single Gaussian (favoring least squares) versus each class a mixture of Gaussians (favoring nearest neighbors).
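A minimal sketch of regression-as-classification (the synthetic two-class data here are my own illustration, not from the notes): regress 0/1 labels on X and threshold the fitted value at 0.5, which gives a linear decision boundary.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic two-class data: class 0 near the origin, class 1 shifted.
rng = np.random.default_rng(0)
X0 = rng.normal(loc=0.0, scale=0.5, size=(50, 2))
X1 = rng.normal(loc=2.0, scale=0.5, size=(50, 2))
X = np.vstack([X0, X1])
y = np.array([0] * 50 + [1] * 50)

# Regress the 0/1 labels on X; the 0.5 level set of the fitted
# plane is a linear decision boundary.
model = LinearRegression().fit(X, y)
y_hat = (model.predict(X) > 0.5).astype(int)
accuracy = (y_hat == y).mean()
print(accuracy)
```

On well-separated classes like these the linear boundary is adequate; it fails when the optimal boundary is nonlinear.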
Scikit-learn:
sklearn.linear_model.LinearRegression()

class LinearRegression(LinearModel, RegressorMixin):
    """Ordinary least squares Linear Regression.

    Parameters
    ----------
    fit_intercept : boolean, optional
        Whether to calculate the intercept for this model. If set to
        False, no intercept will be used in calculations (e.g. data
        are expected to be already centered).
    normalize : boolean, optional, default False
        If True, the regressors X will be normalized before regression.
    copy_X : boolean, optional, default True
        If True, X will be copied; else, it may be overwritten.
    n_jobs : int, optional, default 1
        The number of jobs to use for the computation. If -1, all CPUs
        are used. This will only provide a speedup for n_targets > 1
        and sufficiently large problems.

    Attributes
    ----------
    coef_ : array, shape (n_features,) or (n_targets, n_features)
        Estimated coefficients (slopes) for the linear regression
        problem. If multiple targets are passed during fit (y is 2D),
        this is a 2D array of shape (n_targets, n_features); if only
        one target is passed, this is a 1D array of length n_features.
    intercept_ : array
        Intercept (independent term) in the linear model.
    """
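A short usage sketch of the attributes listed above (the data are invented for illustration: y = 3x + 1 exactly, so the fit recovers the slope and intercept):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Exactly linear data: y = 3*x + 1, no noise.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = 3.0 * X.ravel() + 1.0

reg = LinearRegression(fit_intercept=True).fit(X, y)
print(reg.coef_)       # estimated slope(s), shape (n_features,)
print(reg.intercept_)  # estimated intercept
```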
2. Nearest Neighbor Model
• Euclidean distance between samples
k-nearest neighbors: as k decreases, the fit follows the training data more closely and the training error rate drops (at k = 1 it is zero), but small k causes overfitting; as k increases the fit is smoother and training error rises. The effective number of parameters is N/k, not k (ESL shows the relationship between N/k, k, and error).
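A small check of the k = 1 behavior (synthetic data of my own, assuming scikit-learn): with k = 1 every training point is its own nearest neighbor, so the training error is zero, while a larger k smooths over the label noise and leaves some training points misclassified.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))
# Noisy labels: a linear rule plus 20% random flips, so classes overlap.
y = (X[:, 0] + X[:, 1] > 0).astype(int)
flip = rng.random(100) < 0.2
y[flip] = 1 - y[flip]

# Training error rate for a flexible (k=1) and a smoother (k=15) fit.
errs = {}
for k in (1, 15):
    knn = KNeighborsClassifier(n_neighbors=k).fit(X, y)
    errs[k] = (knn.predict(X) != y).mean()
print(errs)
```

Zero training error at k = 1 is exactly the overfitting the notes warn about: the model memorizes the flipped labels instead of the underlying rule.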
Scikit-learn:
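The notes leave this entry blank; a minimal sketch of the corresponding scikit-learn class (the tiny dataset is my own illustration). The default metric is Minkowski with p = 2, i.e. the Euclidean distance mentioned above.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Tiny labeled set: class 0 near the origin, class 1 near (5, 5).
X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
y = np.array([0, 0, 1, 1])

# metric='minkowski' with p=2 is the Euclidean distance (the default).
knn = KNeighborsClassifier(n_neighbors=3, metric='minkowski', p=2).fit(X, y)
pred = knn.predict([[0.5, 0.5], [4.5, 5.5]])  # one query near each class
print(pred)
```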
3. Comparison
Least squares: low variance, high bias;
k-nearest neighbors: high variance, low bias.
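One way to see the variance side of this contrast (a simulation I constructed, not from the notes): draw many training sets from the same linear-plus-noise model, refit both methods each time, and compare the spread of their predictions at one fixed test point.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
x_test = np.array([[0.5]])
ls_preds, nn_preds = [], []

# Repeatedly sample training sets from y = 2x + noise and refit both models.
for _ in range(200):
    X = rng.uniform(0, 1, size=(30, 1))
    y = 2 * X.ravel() + rng.normal(scale=0.3, size=30)
    ls_preds.append(LinearRegression().fit(X, y).predict(x_test)[0])
    nn_preds.append(KNeighborsRegressor(n_neighbors=1).fit(X, y).predict(x_test)[0])

# 1-NN predictions fluctuate far more across training sets than least squares,
# which averages the noise over all 30 points.
print(np.var(ls_preds), np.var(nn_preds))
```

The bias half of the trade-off goes the other way: here the truth is linear, so least squares happens to be unbiased; when the true function is nonlinear, its rigid form produces high bias while k-NN can still track the truth locally.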
Least Squares & Nearest Neighbors