Machine Learning is a subject that focuses on computer science.ProgramHow can I automatically improve performance with experience?
Broad Definition of machine learning:
Definition: For a certain type of task T and performance measurement P, if a computer program's performance measured by P on T is self-improved with experience E, we call this computer program learning from experience e.
Example:
Checkers learning problems:
• Task T: checkers
• Performance Standard P: Percentage of opponents defeated in the competition
• Training experience E: peer with yourself
Handwriting Recognition learning problems:
• Task T: Recognize and classify handwritten text in Images
• Performance Standard P: classification accuracy rate
• Training experience E: handwritten text database with known categories
Robot driving learning:
• Task T: driving on a four-lane expressway using visual sensors
• Performance Standard P: average error-free mileage (errors are determined by human supervision)
• Training experience E: A series of images and driving instructions recorded while watching a human drive
Design a Learning System
Checkers learning problems:
• Task T: checkers
• Performance Standard P: Percentage of opponents defeated at World Championships
• Training experience E: peer with yourself
To complete the design of this learning system, you need to select:
1. Exact types of knowledge to be learned
2. Representation of this target knowledge
3. A Learning Mechanism
Select target function
We found that the problem of improving the performance of task t can be simplified to learning a specific target function like choosemove. Therefore, the selection of objective functions is a key design issue.
The choosemove function can be a target function, that is, the next step is to calculate the cursor position.
But it will be difficult to learn it, because the materials we learn are indirect.
It can also be an evaluation function V, which is easier to evaluate scores for a board game.
Selection of evaluation functions:
What is the exact value of the target function V for any game? Of course, any evaluation function that gives a higher score to a better game board is applicable.
However, it is best to define a specific target function v among the many methods that produce the best pair of operators. As you can see, this will make the design of a trainingAlgorithmEasy to use. Therefore, for any game status B in Set B, we define the target function V (B) as follows ):
1. If B is the final winner, then V (B) = 100
2. If B is the final negative result, then V (B) =-100
3. If B is a final sum, then V (B) = 0
4. If B is not the final game, then V (B) = V (B '), where B' is the final game that can be achieved after both parties take the optimal round effect from B.
However, this is inefficient,Is not operational.
In this way, the learning task is simplified to discoveringOperational description of the ideal target function v.
For any given checkerboard status, the function v compute can be calculated using a linear combination of the following checkerboard parameters:(V ^ has many forms, and the more complicated it is to approach the ideal function V)
As a result, the Learning Program sets the number of redskins threatened by sunspots.
V Branch (B) is a linear function.
V release (B) = w0 + w1x1 + w2x2 + w3x3 + w4x4 + w5x5 + w6x6
Summary:
Part of the design of the checkers Program
• Task T: checkers
• Performance Standard P: Percentage of opponents defeated at World Championships
• Training experience E: Testing with yourself
• Target function: V: B → R
• Target Function Representation: V Branch (B) = w0 + w1x1 + w2x2 + w3x3 + w4x4 + w5x5 + w6x6
The first three articles describe the learning task, and the last two articles formulate a design scheme for implementing this learning program. Note that the key role of this design is to simplify the problem of learning the checkers strategy as the coefficient W0 to w6 in the description of the learning objective function.