CS229 Machine Learning 12: Reinforcement Learning Notes

Source: Internet
Author: User

Andrew Ng's machine learning class. Course resources: CS229 courseware; NetEase Open Class video.

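Mathematical model of the problem:

A Markov decision process is a 5-tuple $(S, A, \{P_{sa}\}, \gamma, R)$, whose elements are, respectively: the set of states; the set of actions; the state transition probabilities ($P_{sa}$ is the distribution over next states when action $a$ is taken in state $s$); the discount factor, a constant with $0 \le \gamma < 1$; and the reward function.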
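As a concrete illustration, the 5-tuple can be written down as a small Python container. This is only a sketch: the class name `MDP`, the helper `check_mdp`, and the two-state example `toy` are hypothetical and not part of the course material.

```python
from dataclasses import dataclass


@dataclass
class MDP:
    states: list   # S: the set of states
    actions: list  # A: the set of actions
    P: dict        # P[(s, a)] -> {next_state: probability}, i.e. P_sa
    gamma: float   # discount factor, 0 <= gamma < 1
    R: dict        # R[s] -> reward for state s


def check_mdp(mdp, tol=1e-9):
    """Return True if every P[(s, a)] is a probability distribution
    and gamma lies in [0, 1)."""
    for s in mdp.states:
        for a in mdp.actions:
            dist = mdp.P[(s, a)]
            if any(p < 0 for p in dist.values()):
                return False
            if abs(sum(dist.values()) - 1.0) > tol:
                return False
    return 0.0 <= mdp.gamma < 1.0


# Hypothetical two-state example (not from the lecture).
toy = MDP(
    states=["s0", "s1"],
    actions=["stay", "go"],
    P={
        ("s0", "stay"): {"s0": 1.0},
        ("s0", "go"): {"s0": 0.2, "s1": 0.8},
        ("s1", "stay"): {"s1": 1.0},
        ("s1", "go"): {"s0": 0.8, "s1": 0.2},
    },
    gamma=0.9,
    R={"s0": 0.0, "s1": 1.0},
)
```

Storing each $P_{sa}$ as a dictionary of next states keeps sparse transition models compact; a dense array indexed by state and action would work equally well.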

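Optimization objective:

Choose a policy that maximizes the expected total discounted reward, $E[R(s_0) + \gamma R(s_1) + \gamma^2 R(s_2) + \dots]$. Because the discount factor $\gamma$ is less than 1, later rewards count for less, which encourages the policy to collect rewards as early as possible.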
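To make the effect of the discount factor concrete, the sketch below sums a reward sequence with discounting. The function name `discounted_return` and the sample reward sequences are illustrative assumptions.

```python
def discounted_return(rewards, gamma):
    """Compute r_0 + gamma * r_1 + gamma^2 * r_2 + ... for a finite sequence."""
    total = 0.0
    # Work backwards so each step is a single multiply-add.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# The same reward of 1.0 is worth more when it arrives earlier.
early = discounted_return([1.0, 0.0, 0.0], 0.5)
late = discounted_return([0.0, 0.0, 1.0], 0.5)
```

Here `early` is 1.0 and `late` is 0.25, so a policy that reaches the reward two steps sooner is preferred.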

Optimization function:

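According to the Bellman equation, the value function of a policy $\pi$ satisfies

$$V^{\pi}(s) = R(s) + \gamma \sum_{s' \in S} P_{s\pi(s)}(s') \, V^{\pi}(s')$$

$R(s)$ is the immediate reward received in state $s$, and the summation that follows is the expected discounted return from the subsequent states reached by continuing to follow the policy.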
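For a fixed policy, the Bellman equation (immediate reward plus the discounted expected value of the next state) can be solved by applying it repeatedly as an update rule. The following is a minimal sketch under assumed names: `evaluate_policy`, the two-state model, and the sweep count are all hypothetical.

```python
def evaluate_policy(states, P, R, gamma, policy, sweeps=500):
    """Approximate V^pi by repeatedly applying the Bellman equation as an update."""
    V = {s: 0.0 for s in states}
    for _ in range(sweeps):
        # Synchronous update: the new V is built entirely from the old V.
        V = {
            s: R[s] + gamma * sum(p * V[s2] for s2, p in P[(s, policy[s])].items())
            for s in states
        }
    return V


# Hypothetical two-state model: P[(s, a)] maps next states to probabilities.
states = ["s0", "s1"]
P = {
    ("s0", "stay"): {"s0": 1.0},
    ("s0", "go"): {"s0": 0.2, "s1": 0.8},
    ("s1", "stay"): {"s1": 1.0},
    ("s1", "go"): {"s0": 0.8, "s1": 0.2},
}
R = {"s0": 0.0, "s1": 1.0}

always_stay = {"s0": "stay", "s1": "stay"}
V_stay = evaluate_policy(states, P, R, 0.9, always_stay)
```

Under the "always stay" policy, `s1` collects a reward of 1 forever, so its value approaches 1 / (1 - 0.9) = 10, while `s0` never earns anything and stays at 0.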

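The optimal value function satisfies:

$$V^{*}(s) = R(s) + \max_{a \in A} \gamma \sum_{s' \in S} P_{sa}(s') \, V^{*}(s')$$

The optimal policy in state $s$ is then the action that satisfies the following equation:

$$\pi^{*}(s) = \arg\max_{a \in A} \sum_{s' \in S} P_{sa}(s') \, V^{*}(s')$$

In this way, $V^{*}$ can be computed iteratively: apply the first equation as an update to every state, repeat until the values stop changing (value iteration), and then read off the optimal policy from the second equation.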
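The iterative calculation can be sketched as value iteration followed by greedy policy extraction. The function names, the fixed number of sweeps, and the two-state model below are assumptions made for illustration.

```python
def expected_value(P, V, s, a):
    """Sum over s' of P_sa(s') * V(s')."""
    return sum(p * V[s2] for s2, p in P[(s, a)].items())


def value_iteration(states, actions, P, R, gamma, sweeps=500):
    """Repeatedly apply V(s) := R(s) + gamma * max_a sum_s' P_sa(s') V(s')."""
    V = {s: 0.0 for s in states}
    for _ in range(sweeps):
        V = {
            s: R[s] + gamma * max(expected_value(P, V, s, a) for a in actions)
            for s in states
        }
    return V


def greedy_policy(states, actions, P, V):
    """pi(s) = argmax_a sum_s' P_sa(s') V(s')."""
    return {s: max(actions, key=lambda a: expected_value(P, V, s, a)) for s in states}


# Hypothetical two-state model.
states = ["s0", "s1"]
actions = ["stay", "go"]
P = {
    ("s0", "stay"): {"s0": 1.0},
    ("s0", "go"): {"s0": 0.2, "s1": 0.8},
    ("s1", "stay"): {"s1": 1.0},
    ("s1", "go"): {"s0": 0.8, "s1": 0.2},
}
R = {"s0": 0.0, "s1": 1.0}

V_star = value_iteration(states, actions, P, R, 0.9)
pi_star = greedy_policy(states, actions, P, V_star)
```

In this toy model the greedy policy moves from `s0` toward the rewarding state and then stays there. A fixed sweep count keeps the sketch short; a convergence threshold on the largest change in `V` is a common alternative.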

Solution Method:

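In practice, however, $P_{sa}$ is unknown and has to be estimated from counts. For the robot-movement example used in class, Ng explains that you can let the robot move around and record how many times each action taken in each state led to each next state. The estimate of $P_{sa}(s')$ is then the number of times action $a$ in state $s$ led to $s'$, divided by the number of times action $a$ was taken in state $s$.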
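The counting step can be sketched as follows. The function name `estimate_transitions` and the sample log are hypothetical. The uniform fallback for state-action pairs that were never tried is one way to avoid dividing by zero, and is flagged as an assumption in the code.

```python
from collections import defaultdict


def estimate_transitions(transitions, states, actions):
    """Estimate P_sa(s') from observed (s, a, s') triples by counting."""
    counts = defaultdict(lambda: defaultdict(int))
    for s, a, s2 in transitions:
        counts[(s, a)][s2] += 1

    P = {}
    for s in states:
        for a in actions:
            n = sum(counts[(s, a)].values())
            if n == 0:
                # Assumption: with no data for (s, a), fall back to a uniform guess.
                P[(s, a)] = {s2: 1.0 / len(states) for s2 in states}
            else:
                P[(s, a)] = {s2: c / n for s2, c in counts[(s, a)].items()}
    return P


# Hypothetical log: "go" from s0 reached s1 three times and s0 once.
log = [("s0", "go", "s1")] * 3 + [("s0", "go", "s0")]
P_hat = estimate_transitions(log, ["s0", "s1"], ["stay", "go"])
```

With this log, `P_hat[("s0", "go")]` assigns 0.75 to `s1` and 0.25 to `s0`, and every untried pair gets 0.5 for each of the two states.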

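So the complete reinforcement learning procedure is as follows: initialize the policy randomly, then repeat these steps: execute the current policy for a number of trials, update the estimate of $P_{sa}$ from the accumulated counts, run value iteration with the estimated $P_{sa}$ to obtain a new value function, and update the policy to be greedy with respect to that value function.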
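A minimal sketch of such a learning loop is given below, alternating between acting, re-estimating the transition model from counts, and re-solving for a policy. Everything named here is an assumption for illustration: the two-state simulator `TRUE_P`, the known reward table `REWARD`, the helper names, and the round and step counts. The loop has no explicit exploration, so it is not guaranteed to find the optimal policy.

```python
import random
from collections import defaultdict

STATES = ["s0", "s1"]
ACTIONS = ["stay", "go"]
REWARD = {"s0": 0.0, "s1": 1.0}  # assumed known; only P_sa is learned
GAMMA = 0.9
# Hidden true dynamics, used only by the simulator (hypothetical values).
TRUE_P = {
    ("s0", "stay"): {"s0": 1.0},
    ("s0", "go"): {"s0": 0.2, "s1": 0.8},
    ("s1", "stay"): {"s1": 1.0},
    ("s1", "go"): {"s0": 0.8, "s1": 0.2},
}


def step(rng, s, a):
    """Sample the next state from the true (unknown to the learner) dynamics."""
    dist = TRUE_P[(s, a)]
    return rng.choices(list(dist), weights=list(dist.values()))[0]


def estimate(counts):
    """Turn visit counts into estimated P_sa; uniform where there is no data."""
    P = {}
    for s in STATES:
        for a in ACTIONS:
            n = sum(counts[(s, a)].values())
            if n:
                P[(s, a)] = {s2: c / n for s2, c in counts[(s, a)].items()}
            else:
                P[(s, a)] = {s2: 1.0 / len(STATES) for s2 in STATES}
    return P


def expected_value(P, V, s, a):
    return sum(p * V[s2] for s2, p in P[(s, a)].items())


def solve(P, sweeps=200):
    """Value iteration on the estimated model, then the greedy policy."""
    V = {s: 0.0 for s in STATES}
    for _ in range(sweeps):
        V = {s: REWARD[s] + GAMMA * max(expected_value(P, V, s, a) for a in ACTIONS)
             for s in STATES}
    return {s: max(ACTIONS, key=lambda a: expected_value(P, V, s, a)) for s in STATES}


def learn(rounds=20, steps_per_round=50, seed=0):
    rng = random.Random(seed)
    counts = defaultdict(lambda: defaultdict(int))
    policy = {s: rng.choice(ACTIONS) for s in STATES}  # random initial policy
    P_hat = estimate(counts)
    for _ in range(rounds):
        s = "s0"
        for _ in range(steps_per_round):  # execute the current policy
            a = policy[s]
            s2 = step(rng, s, a)
            counts[(s, a)][s2] += 1
            s = s2
        P_hat = estimate(counts)          # update the estimated P_sa
        policy = solve(P_hat)             # value iteration + greedy policy
    return policy, P_hat


policy, P_hat = learn()
```

Because the learner only ever follows its current greedy policy, state-action pairs it never tries keep the uniform fallback estimate. Adding occasional random actions is a common remedy, but it goes beyond what these notes describe.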
