Stanford University Open Course on Machine Learning: Advice for Applying Machine Learning - Deciding What to Try Next (how to choose the most appropriate method when designing a machine learning system)

Source: Internet
Author: User

When we are developing a machine learning system and want to improve its performance, how do we decide which path to take next?

To illustrate the problem, we will use the example of predicting house prices. Suppose we have already learned the parameters, but when we test the hypothesis function on a new set of houses, we find that its price predictions have very large errors. The question now is: what should we try in order to improve the algorithm?
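To make "a huge error on a new set of houses" concrete, here is a minimal sketch of how that error might be measured. It assumes a linear hypothesis and mean squared error as the metric; neither is specified in the text above, and the parameter values and house data are made up purely for illustration.

```python
def predict(theta, features):
    """Linear hypothesis: theta[0] + theta[1]*x1 + ... + theta[n]*xn."""
    return theta[0] + sum(t * x for t, x in zip(theta[1:], features))


def mean_squared_error(theta, examples):
    """Average of the squared prediction errors over (features, price) pairs."""
    total = 0.0
    for features, price in examples:
        error = predict(theta, features) - price
        total += error ** 2
    return total / len(examples)


# Hypothetical learned parameters: a base price plus a price per square metre.
theta = [50.0, 2.0]

# Made-up data: ([size in square metres], price in thousands).
train_set = [([100.0], 250.0), ([150.0], 350.0)]
test_set = [([80.0], 300.0), ([200.0], 350.0)]

train_error = mean_squared_error(theta, train_set)
test_error = mean_squared_error(theta, test_set)
print("training error:", train_error)
print("test error:", test_error)
```

In this toy data the hypothesis fits the training houses exactly but misses badly on the new ones, which is the situation the example describes.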

In fact, there are many ways we might try to improve the algorithm. One is to use more training examples, for instance by running telephone surveys or door-to-door surveys to obtain more home sales data. Unfortunately, many people spend a great deal of time collecting more training examples, convinced that twice or even ten times as much training data will surely solve the problem. But sometimes getting more training data doesn't actually help, and we will explain why later.

Another option is to try a smaller set of features. If we have features such as x1, x2, x3 and so on, we could take the time to carefully select a small subset of them to prevent overfitting. Or perhaps we need more features: if the current feature set isn't informative enough, we may want to collect more data in the sense of gathering additional features. This, too, can grow into a large project, such as running telephone surveys to learn more about each house or commissioning land surveys to learn more about the land, so it is a complicated undertaking. Here again, we would really like to know how well it will work before we spend a lot of time on it.

We can also try adding polynomial features, such as x1 squared, x2 squared, and the product of x1 and x2. We could spend a lot of time on this approach as well. Finally, we can consider decreasing or increasing the value of the regularization parameter lambda.
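Two of the options above, polynomial features and the regularization parameter lambda, can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, and the cost shown is one common form of L2-regularized squared error with the intercept left unpenalized, which the text itself does not spell out.

```python
def polynomial_features(x1, x2):
    """Expand two raw features with the degree-2 terms named in the text."""
    return [x1, x2, x1 ** 2, x2 ** 2, x1 * x2]


def regularized_cost(theta, examples, lam):
    """Squared-error cost plus an L2 penalty scaled by lam (lambda).

    theta[0] is the intercept and is left out of the penalty (an assumption).
    """
    m = len(examples)
    squared_errors = 0.0
    for features, target in examples:
        prediction = theta[0] + sum(t * x for t, x in zip(theta[1:], features))
        squared_errors += (prediction - target) ** 2
    penalty = lam * sum(t ** 2 for t in theta[1:])
    return (squared_errors + penalty) / (2 * m)


# Made-up example: one house described by two raw features.
expanded = polynomial_features(2.0, 3.0)
examples = [(expanded, 3.0)]

# Hypothetical parameters that fit this single example exactly.
theta = [1.0, 1.0, 0.0, 0.0, 0.0, 0.0]

print(expanded)
print(regularized_cost(theta, examples, 0.0))  # no regularization
print(regularized_cost(theta, examples, 2.0))  # larger lambda adds a penalty
```

With the fit held fixed, raising lambda raises the cost of any nonzero parameter, which is why a larger lambda pushes the learned parameters toward smaller values and a smaller lambda lets them grow.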

Any of the six options listed above can easily grow into a project lasting six months or longer. Unfortunately, most people choose among these methods by gut feeling, picking one more or less at random. Someone says, "Let's get more data," and then spends six months collecting a lot of data; or someone else says, "Let's find more features in the data on these houses." Many people spend at least six months on one of these arbitrary choices, only to discover with regret, six months or more later, that they have gone down a dead end.
