These notes are compiled from the lectures of Wang Hao (ShanghaiTech University).
Prerequisites: calculus, linear algebra, statistics, optimization
Machine learning (Tom M. Mitchell): "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E."
• What is experience: historical data
• How to learn: learning models and algorithms
• Performance measure: cost functions (error, penalty)
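To make Mitchell's definition concrete, here is a minimal sketch under assumed choices that are not from the lecture: the task T is predicting f(x) = x^2 on [0, 1], the experience E is a set of n labelled examples, and the performance measure P is the mean squared error on evaluation points that are not training inputs. The learner is a 1-nearest-neighbour predictor, picked only for brevity, and every name below is hypothetical.

```python
def target(x):
    """The function to be predicted (an assumption made for illustration)."""
    return x * x


def make_experience(n):
    """Experience E: n labelled examples (x, y) on an evenly spaced grid over [0, 1]."""
    return [(i / (n - 1), target(i / (n - 1))) for i in range(n)]


def predict(experience, x):
    """Task T: predict y for x by copying the label of the nearest stored example."""
    _, nearest_y = min(experience, key=lambda pair: abs(pair[0] - x))
    return nearest_y


def performance(experience, points):
    """Performance measure P: mean squared error on the given points (lower is better)."""
    errors = [(predict(experience, x) - target(x)) ** 2 for x in points]
    return sum(errors) / len(errors)


# Evaluation inputs at odd multiples of 1/256; none of them lies on the
# training grids used below (spacings 1/2, 1/8 and 1/64).
eval_points = [(2 * j + 1) / 256 for j in range(128)]

for n in (3, 9, 65):
    print(n, performance(make_experience(n), eval_points))
```

With more experience (larger n) the measured error shrinks, which is what "learning" means in the quoted definition.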
Machine learning, a branch of artificial intelligence, concerns the study and construction of systems that can learn and predict from data.
The core of machine learning deals with representation and generalization:
• Representation (explanation) of data instances, and of functions evaluated on these instances, is part of all machine learning systems.
• Generalization (prediction) is the property that the system will perform well on unseen data instances.
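The interplay between representation and generalization can be sketched with two learners trained on the same data. This is an illustrative toy, not part of the lecture: the data follow y = 2x + 1 exactly, and the names `memorizer`, `line`, and `fit_line` are hypothetical. One learner represents the data as a lookup table, the other as a straight line fitted by least squares.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of y = w * x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b


def mse(predictor, xs, ys):
    """Mean squared error of a predictor on (xs, ys)."""
    return sum((predictor(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Seen (training) and unseen (test) instances from the assumed rule y = 2x + 1.
train_x = [float(x) for x in range(10)]
train_y = [2.0 * x + 1.0 for x in train_x]
test_x = [float(x) for x in range(10, 15)]
test_y = [2.0 * x + 1.0 for x in test_x]

# Representation 1: a lookup table that falls back to the mean training label.
table = dict(zip(train_x, train_y))
fallback = sum(train_y) / len(train_y)


def memorizer(x):
    return table.get(x, fallback)


# Representation 2: a straight line with two parameters.
w, b = fit_line(train_x, train_y)


def line(x):
    return w * x + b


print("memorizer train/test MSE:", mse(memorizer, train_x, train_y), mse(memorizer, test_x, test_y))
print("line      train/test MSE:", mse(line, train_x, train_y), mse(line, test_x, test_y))
```

Both learners fit the training data perfectly, but only the line carries over to the unseen inputs; the lookup table has no way to generalize beyond what it stored.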
Machine learning tasks are typically classified into three broad categories:
• Supervised learning: The computer is presented with example inputs and their desired outputs, given by a "teacher", and the goal is to learn a general rule that maps inputs to outputs.
(Semi-supervised learning falls between supervised and unsupervised learning: only some of the training examples carry labels.)
• Unsupervised learning: No labels are given to the learning algorithm, leaving it on its own to find structure in its input. Unsupervised learning can be a goal in itself (discovering hidden patterns in data) or a means towards an end (feature learning).
• Reinforcement learning: A computer program interacts with a dynamic environment in which it must perform a certain goal (such as driving a vehicle), without a teacher explicitly telling it whether it has come close to its goal. Another example is learning to play a game by playing against an opponent.
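As a small illustration of the supervised setting only, the sketch below learns a rule from labelled examples and applies it to new inputs. The nearest-centroid classifier, the toy data, and the labels "small" and "large" are assumptions chosen for brevity, not methods taken from the lecture.

```python
def train_nearest_centroid(examples):
    """Learn one centroid per label from (features, label) pairs."""
    sums, counts = {}, {}
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def classify(centroids, features):
    """Predict the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(centroid):
        return sum((c - f) ** 2 for c, f in zip(centroid, features))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


# The "teacher" supplies the desired output (label) for every example input.
labelled = [
    ((1.0, 1.0), "small"), ((1.5, 0.5), "small"), ((0.5, 1.5), "small"),
    ((5.0, 5.0), "large"), ((5.5, 4.5), "large"), ((4.5, 5.5), "large"),
]
centroids = train_nearest_centroid(labelled)

# The learned rule is then applied to inputs that were not in the training set.
print(classify(centroids, (0.8, 1.1)))
print(classify(centroids, (6.0, 5.0)))
```

In the unsupervised setting the same points would arrive without the label column, and the algorithm would have to discover the two groups by itself.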
Learning tasks:
• Classification
• Regression
• Clustering
• Density estimation
• Dimensionality reduction
Methods: regression, decision trees, K-means algorithm, support vector machines, Apriori algorithm, EM algorithm, PageRank, KNN, naive Bayes, neural networks ...
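To give one of the listed methods a concrete shape, here is a minimal sketch of K-means for the clustering task, restricted to one-dimensional data. The function name `kmeans_1d`, the toy points, and the fixed initial centers are assumptions for illustration; practical implementations handle higher dimensions, random initialization, and convergence checks.

```python
def kmeans_1d(points, initial_centers, iterations=10):
    """Alternate assignment and update steps (Lloyd's algorithm) on 1-D data."""
    centers = list(initial_centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda k: abs(p - centers[k]))
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its cluster;
        # an empty cluster keeps its previous center.
        centers = [
            sum(c) / len(c) if c else centers[k]
            for k, c in enumerate(clusters)
        ]
    return centers, clusters


points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centers, clusters = kmeans_1d(points, initial_centers=[0.0, 2.0])
print(centers)
print(clusters)
```

No labels are involved: the two groups emerge only from the distances between the points.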
The difference between machine learning and data mining: the overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use.
Machine learning also has intimate ties to optimization:
• The three pillars: statistical modeling, feature selection, learning via optimization (Netflix Prize)
• Many learning problems are formulated as the minimization of some loss on a training set of examples
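The last point can be sketched directly: fitting a model means choosing parameters that minimize an average loss over the training examples. The example below is an assumption-laden toy, not the lecture's own code: the model is a line y = w * x + b, the loss is the mean squared error, the data follow y = 2x + 1, and the minimizer is plain gradient descent with a hand-picked learning rate.

```python
def mean_squared_loss(w, b, xs, ys):
    """Average squared residual of the line y = w * x + b on the training set."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_by_gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Minimize the mean squared loss over (w, b) with fixed-step gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        residuals = [w * x + b - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared loss with respect to w and b.
        grad_w = 2.0 * sum(r * x for r, x in zip(residuals, xs)) / n
        grad_b = 2.0 * sum(residuals) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]

w, b = fit_by_gradient_descent(xs, ys)
print(w, b, mean_squared_loss(w, b, xs, ys))
```

The learning rate of 0.05 is small enough for this particular data set; a step that is too large would make the iteration diverge, which is one reason step-size selection appears among the optimization topics.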
Optimization algorithms/techniques:
• Sparse optimization
• Iteratively reweighted least squares algorithm (IRLS)
• Gradient descent methods
• Online gradient methods
• Stochastic gradient methods
• Newton method
• Quasi-Newton method (BFGS)
• Limited-memory BFGS
• Coordinate descent
• Alternating direction method of multipliers
• Penalty method, augmented Lagrangian
• Gradient projection method
• Iterative thresholding method (IST)
• Active set method
• Recursive least squares
• Line search, convergence rate, duality, KKT/optimality conditions
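Two entries from this list, gradient descent and the Newton method, can be compared on a one-dimensional problem to show what "convergence rate" refers to. The objective f(x) = exp(x) - 2x, the step size, and the function names are assumptions chosen for illustration; f is strictly convex with its minimizer at x = ln 2.

```python
import math


def grad(x):
    """First derivative of f(x) = exp(x) - 2x."""
    return math.exp(x) - 2.0


def hess(x):
    """Second derivative of f(x) = exp(x) - 2x (always positive)."""
    return math.exp(x)


def gradient_descent_1d(x, step=0.1, iterations=6):
    """Fixed-step gradient descent: uses first-order information only."""
    for _ in range(iterations):
        x -= step * grad(x)
    return x


def newton_1d(x, iterations=6):
    """Newton method: rescales the gradient by the inverse second derivative."""
    for _ in range(iterations):
        x -= grad(x) / hess(x)
    return x


x_star = math.log(2.0)
print("gradient descent error:", abs(gradient_descent_1d(0.0) - x_star))
print("Newton error:          ", abs(newton_1d(0.0) - x_star))
```

With the same number of iterations from the same starting point, the Newton iterate ends up far closer to the minimizer, at the price of needing second-derivative information. Quasi-Newton methods such as BFGS approximate that information instead of computing it.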
Bibliography:
1 For machine learning methods: "Machine Learning: A Probabilistic Perspective", Kevin P. Murphy, MIT Press.
2 For optimization background: "Numerical Optimization", Jorge Nocedal and Stephen J. Wright, 2nd edition, Springer.
3 For optimization techniques in machine learning: "Optimization for Machine Learning", Suvrit Sra, Sebastian Nowozin, and Stephen J. Wright, MIT Press.
Some lectures will be based on these books, but not all of them. Reading the textbooks is not required, but it is recommended. You are not responsible for textbook material that is not covered in lecture.
Optimization and machine learning