Machine Learning vs. Data Mining / Artificial Intelligence / Statistics

Source: Internet
Author: User
Keywords: machine learning, artificial intelligence, data mining
Lec1-4 Machine learning details
Case: should the bank issue a credit card to a customer?
Data: the customer's information [input]

Expected result: issue a card to the customer, or do not issue a card [output]

Machine learning: learn from data how the bank should decide whether to issue a card, so that later decisions bring the best profit.

Latent pattern: some indicators help determine whether a customer should be issued a card. When the decision is made manually, the customer's information is considered as a whole, so that issuing (or not issuing) the card brings the bank greater revenue.


Notation

input: x∈X (customer information)

output: y∈Y (issue a card / do not issue a card)

The latent pattern that the machine needs to learn (target function): f: X -> Y, the ideal mapping from X to Y.

Starting from the simplest one-dimensional data, suppose the training data is D={(x1, y1), (x2, y2), ..., (xN, yN)}. The true f(x) is unique, but we do not know it. What machine learning does is start from the data D and produce a hypothesis function g(x) such that g(x) ~= y, that is, g(x) is as close as possible to f(x).


We can then assume that g(x) belongs to some family of functions we can think of, such as linear functions or quadratic functions. For a given machine learning algorithm, the type of g(x) is chosen roughly in advance, and training only adjusts the function's parameters to fit the observed data, so that y = g(x) moves steadily closer to the true target function y = f(x). In other words, g(x) tries to guess and imitate f(x) so that its outputs approximate f(x) on the known data.
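As a minimal sketch of "choose the type of g(x), then adjust its parameters", the snippet below fits a linear g(x) = w * x + b to noisy samples by ordinary least squares. The target function f, the noise level, and the helper name fit_line are assumptions made up for illustration; in a real problem f is unknown and only the data D is available.

```python
import random


def fit_line(xs, ys):
    """Least-squares fit of g(x) = w * x + b; returns (w, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w = sxy / sxx
    b = mean_y - w * mean_x
    return w, b


def f(x):
    # Hypothetical target function; the learner never sees this directly.
    return 2.0 * x + 1.0


rng = random.Random(0)
xs = [i / 10 for i in range(50)]
ys = [f(x) + rng.gauss(0, 0.1) for x in xs]  # noisy observations of f

w, b = fit_line(xs, ys)


def g(x):
    # The learned hypothesis: same functional form, fitted parameters.
    return w * x + b


print(f"g(x) = {w:.2f} * x + {b:.2f}")
```

The fitted w and b should land close to the values used inside f, which is what "g approximates f on the known data" means here.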




There are many possibilities for g(x), but as the amount of data grows, the range of plausible choices for g(x) shrinks. Given the data, machine learning algorithm A selects from the hypothesis set H a fitted function g that approximates the target function f. If the machine has learned the latent pattern and its skill has improved, then the closer g is to f, the better.


Possible g(x) in the credit card problem


g is some hk selected from the hypothesis set H; each hk is a candidate for g.

h1: Annual income is greater than 800,000

h2: Debt is greater than 100,000

h3: Has worked for less than two years

...

The hypothesis set may contain both good and bad hypotheses. What machine learning does is use algorithm A to select the best g from the hypothesis set H.


The machine learning model refers to algorithm A together with the hypothesis set H.
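The pairing of algorithm A and hypothesis set H can be sketched in a few lines. The customer records, the direction of each rule (for example, that high debt means "do not issue"), and the names training_error and algorithm_A are all assumptions for illustration, not part of the lecture.

```python
# Each customer x = (annual_income, debt, years_employed).
# Label y: +1 = issue a card, -1 = do not issue. All records are made up.
D = [
    ((900_000, 20_000, 5.0), +1),
    ((300_000, 150_000, 1.0), -1),
    ((850_000, 120_000, 3.0), -1),
    ((400_000, 10_000, 4.0), +1),
    ((1_200_000, 50_000, 0.5), +1),
    ((250_000, 200_000, 6.0), -1),
]

# Hypothesis set H: three simple threshold rules (rule directions are assumed).
H = {
    "h1: income > 800,000": lambda x: +1 if x[0] > 800_000 else -1,
    "h2: debt > 100,000": lambda x: -1 if x[1] > 100_000 else +1,
    "h3: employed < 2 years": lambda x: -1 if x[2] < 2 else +1,
}


def training_error(h, data):
    """Fraction of examples in data that hypothesis h gets wrong."""
    return sum(h(x) != y for x, y in data) / len(data)


def algorithm_A(hypotheses, data):
    """Pick the hypothesis with the lowest error on the training data."""
    name = min(hypotheses, key=lambda k: training_error(hypotheses[k], data))
    return name, hypotheses[name]


name, g = algorithm_A(H, D)
print("selected g:", name, "training error:", training_error(g, D))
```

Here A is a plain exhaustive search over a tiny H; practical algorithms search far larger hypothesis sets, but the roles of D, H, A, and g are the same.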

Complete process


Lec1-5 Machine Learning vs. Data Mining / Artificial Intelligence / Statistics

Machine learning

Use data to find a hypothesis g(x) that is close to the desired target function f(x).

Data mining

Use data to find something interesting. (For example, after a supermarket customer buys one item, are they likely to buy another? That is, find the correlations between goods.)
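The supermarket example can be illustrated by counting how often pairs of goods appear in the same basket. The baskets below are made-up data, and simple pair counting is only one rough way to look for such correlations.

```python
from collections import Counter
from itertools import combinations

# Made-up shopping baskets, one set of items per purchase.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"beer", "chips"},
    {"bread", "butter"},
    {"milk", "chips"},
]

# Count every unordered pair of items bought together.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

for pair, count in pair_counts.most_common():
    print(pair, count)
```

Pairs with high counts are candidates for "interesting" relationships between goods.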

Machine learning vs. data mining: the same or related
If the "interesting thing" is exactly a hypothesis g close to the target function f, as in prediction, then machine learning and data mining are the same.

If the interesting thing is merely related to finding a hypothesis g close to the target function f, then data mining can help machine learning do better, or machine learning can help data mining dig out interesting things.

Slightly different

Traditional data mining also focuses on efficient computation over large amounts of data.

Very close

The two fields are very close, and it is difficult to find researchers who work in only one of them.

Machine learning vs. artificial intelligence: machine learning is one way to realize artificial intelligence
Artificial intelligence aims to have computers do things that appear intelligent, such as playing chess or driving. Prediction is one such intelligent task: finding a g that is very close to the f we want. From this perspective, machine learning is one way to realize artificial intelligence, and there are many other ways.

Chess case

Traditional artificial intelligence: build a game tree and analyze the advantages and disadvantages of each possible next move;

Machine-learning-based artificial intelligence: learn how to play from records of chess players' games, or from playing against itself.

Machine learning vs. statistics: statistics is one way to achieve machine learning
Statistics uses data to make inferences about something unknown, for example the unknown probability that a tossed coin lands heads.

g is the inferred result and f is the thing we do not know. From this perspective, statistics is actually one way to achieve machine learning.
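The coin example maps directly onto the f and g notation: the coin's true heads probability plays the role of f, and the estimate computed from observed tosses plays the role of g. In the sketch below, true_p and the number of tosses are assumed values chosen for illustration; a real experiment would only have the tosses.

```python
import random

rng = random.Random(42)

true_p = 0.3  # hypothetical unknown quantity (the role of f)

# Simulate observed data: True means the toss landed heads.
tosses = [rng.random() < true_p for _ in range(10_000)]

# Inference from data: the sample frequency of heads (the role of g).
p_hat = sum(tosses) / len(tosses)

print(f"estimated probability of heads: {p_hat:.3f}")
```

With more tosses the estimate tends to move closer to the true probability, mirroring how more data narrows down g.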

Slightly different

Many traditional statistical tools are useful for machine learning, but statistics is grounded in mathematics: it writes down assumptions and then uses provable results to state which inferences hold under those assumptions. Traditional statistics is mostly concerned with mathematical inference. Machine learning starts from computation on data, and many of its algorithms pay more attention to how to compute than to provable mathematical results.