Stanford Open Class: Statistical Learning (Wrong Choices)

Source: Internet
Author: User

4.4 R1 In which of the following problems is case/control sampling LEAST likely to make a positive impact?

A. Predicting a shopper's gender based on the products they buy

B. Finding predictors for a certain type of cancer

C. Predicting if an e-mail is Spam or not Spam

Correct answer: A

Explanation: Case/control sampling is most effective when the prior probabilities of the classes are very unequal. We expect this to be the case for the cancer and spam problems, but not for the gender problem.
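A minimal sketch of why unequal priors matter in practice, assuming a logistic regression has been fitted on a case/control sample in which cases are over-represented. Under that assumption, the slope estimates remain usable and only the intercept needs shifting by the difference in log-odds between the true prior and the sample proportion of cases. The function names and all numbers below are hypothetical and chosen only for illustration.

```python
import math

def log_odds(p):
    """Log-odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))

def correct_intercept(beta0_hat, true_prior, sample_prior):
    """Shift an intercept fitted on a case/control sample back to the population scale.

    beta0_hat    : intercept estimated on the case/control sample
    true_prior   : proportion of cases in the population
    sample_prior : proportion of cases in the case/control sample
    """
    return beta0_hat + log_odds(true_prior) - log_odds(sample_prior)

# Hypothetical numbers: a rare outcome (1% of the population)
# sampled so that half of the fitted data are cases.
beta0_hat = -0.3
corrected = correct_intercept(beta0_hat, true_prior=0.01, sample_prior=0.5)
print(f"fitted intercept: {beta0_hat:.3f}, corrected intercept: {corrected:.3f}")
```

When the sample proportion already equals the true prior, as would be roughly the case for the gender problem, the correction is zero and case/control sampling gains nothing.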

4.5 R1 Suppose that in Ad Clicks (a problem where you try to model whether a user will click on a particular ad) it is well known that the majority of the time an ad is shown it will not be clicked. What is another way of saying that?

A. Ad Clicks has a low prior probability.

B. Ad Clicks has a high prior probability.

C. Ad Clicks has a low density.

D. Ad Clicks has a high density.

Correct answer: A

Explanation: Whether or not an ad gets clicked is a qualitative variable. Thus, it does not have a density. The prior probability of Ad Clicks is low because most ads are not clicked.
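As a small illustration, the prior probability of a class can be estimated as the fraction of observations that fall in that class. The click log below is hypothetical (1 means clicked, 0 means not clicked), not data from the course.

```python
# Hypothetical impression log: 3 clicks out of 100 ad impressions.
clicks = [0] * 97 + [1] * 3

# Estimated prior probability of the "clicked" class.
prior_click = sum(clicks) / len(clicks)
prior_no_click = 1.0 - prior_click

print(f"estimated prior of a click:    {prior_click:.2f}")
print(f"estimated prior of no click:   {prior_no_click:.2f}")
```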

4.6 R1 Which of the following is not a linear function in x:

A. f(x) = a + b^2 x

B. The discriminant function from LDA.

C. \delta_k(x) = x\frac{\mu_k}{\sigma^2} - \frac{\mu_k^2}{2\sigma^2} + \log(\pi_k)

D. \text{logit}(P(y = 1 | x)) where P(y = 1 | x) is as in logistic regression

E. P(y = 1 | x) from logistic regression

Correct answer: E

Explanation: P(y = 1 | x) from logistic regression is not linear because it involves both an exponential function of x and a ratio.
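The contrast between options D and E can be checked numerically. A linear function has zero second difference at equally spaced points, so the sketch below compares the second difference of P(y = 1 | x) with that of its logit. The coefficients b0 and b1 are hypothetical values chosen only for illustration.

```python
import math

# Hypothetical logistic regression coefficients.
b0, b1 = -1.0, 2.0

def probability(x):
    """P(y = 1 | x) under a one-predictor logistic regression model."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

def logit(p):
    """Log-odds transform of a probability p."""
    return math.log(p / (1.0 - p))

def second_difference(f, x, h):
    """f(x + h) - 2 f(x) + f(x - h); this is zero for any linear f."""
    return f(x + h) - 2.0 * f(x) + f(x - h)

prob_curvature = second_difference(probability, 1.0, 1.0)
logit_curvature = second_difference(lambda x: logit(probability(x)), 1.0, 1.0)

print(f"second difference of P(y = 1 | x):        {prob_curvature:.4f}")
print(f"second difference of logit(P(y = 1 | x)): {logit_curvature:.4f}")
```

The logit recovers b0 + b1 x, so its second difference vanishes up to rounding error, while the probability itself curves.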

5.1 R2 What are reasons why test error could be less than training error?

A. By chance, the test set has easier cases than the training set.

B. The model is highly complex, so training error systematically overestimates test error.

C. The model is not very complex, so training error systematically overestimates test error.

Correct answer: A

Explanation: Training error usually underestimates test error when the model is very complex (compared to the training set size), and is a pretty good estimate if the model is not very complex. However, it is always possible that we just get too few hard-to-predict points in the test set, or too many in the training set.
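The role of chance can be seen in a small simulation. The sketch below is an illustrative setup, not taken from the course: it fits a very simple model (the training mean) to Gaussian noise, evaluates it on a small test set, and counts how often the test mean squared error falls below the training error. The sample sizes, trial count, and seed are arbitrary assumptions.

```python
import random

def mse(values, prediction):
    """Mean squared error of a constant prediction over a list of values."""
    return sum((v - prediction) ** 2 for v in values) / len(values)

def fraction_test_below_train(n_train=50, n_test=10, n_trials=1000, seed=0):
    """Fraction of simulated splits in which test MSE is below training MSE."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_trials):
        train = [rng.gauss(0.0, 1.0) for _ in range(n_train)]
        test = [rng.gauss(0.0, 1.0) for _ in range(n_test)]
        fitted_mean = sum(train) / n_train  # the "model" is just the training mean
        if mse(test, fitted_mean) < mse(train, fitted_mean):
            wins += 1
    return wins / n_trials

fraction = fraction_test_below_train()
print(f"test error was below training error in {fraction:.1%} of trials")
```

Even though training error underestimates test error on average, individual small test sets land below the training error in a sizeable share of trials purely by chance.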
