A Gentle Guide to Machine Learning

Source: Internet
Author: User
Tags: machine learning, machine learning application, reinforcement learning, supervised learning, unsupervised learning

Machine learning is the most advanced area of artificial intelligence today, and more and more beginners are entering the field. In this article, machine learning and NLP expert Raúl Garreta, co-founder and CEO of MonkeyLearn, summarizes for beginners the important concepts, applications and challenges of machine learning, so that readers can continue exploring the subject on their own.

Machine learning is a branch of artificial intelligence in which algorithms learn from data sets, so that computers can accomplish tasks without being explicitly programmed for them.

Do you see? We can let machines learn how to do things! When I first heard about it, I was very excited. It means we can program computers to learn things by themselves!

The ability to learn is one of the most important aspects of intelligence. Giving this ability to machines should be a big step toward making computers smarter. In fact, machine learning is the most advanced area of artificial intelligence today; it is a fashionable topic, and it is likely to lead to smarter machines.

This article gives beginners a brief introduction to machine learning. I will summarize the important concepts, applications and challenges involved in using it. A formal and detailed description of machine learning is not the purpose of this article; the aim is to introduce some preliminary concepts so that readers can continue exploring on their own.

The true face of machine learning

Ok, not everything is as good as it sounds: machine learning has its limitations. We still can't build intelligent machines like Data in Star Trek or HAL 9000 in 2001: A Space Odyssey. However, there are plenty of real-world applications where machine learning works remarkably well. Here are some of the most common categories of practical machine learning applications:

Image Processing

Image processing problems basically require analyzing an image to extract data or to transform it. Here are some examples:

Image tagging, as on Facebook, where an algorithm automatically detects your face or your friends' faces when they appear in a photo. Basically, the machine learning algorithm learns from the photos you tag manually.

Optical Character Recognition (OCR), where an algorithm converts handwritten or scanned text into a digital version. The algorithm needs to learn to map images of written characters to the corresponding digital letters.

Self-driving cars, where image processing is one of the mechanisms that lets the car drive itself. From each frame captured by the camera, the machine learning algorithm learns where the edge of the road is, whether there is a stop sign, and whether a car is approaching.

Text Analysis

Text analysis is the process of extracting or classifying information from text such as tweets, emails, chats, documents, and more. Here are some popular examples:

Spam filtering, one of the best known and most widely used text classification applications. Spam filters learn to classify a message as spam or not based on its content and subject line.

Sentiment analysis, another application of text classification, where the algorithm must learn to classify an opinion as positive, neutral, or negative based on the feelings expressed by the author.

Information extraction, where the algorithm learns to extract specific pieces of information or data from text, such as addresses, entities, keywords, and so on.

Data mining

Data mining is used to discover patterns or make predictions from data. This definition is a bit generic, but you can think of it as mining useful information from huge database tables. Each row can be a training instance, and each column can be used as a feature. We might be interested in predicting a new column from the remaining columns in the table, or in finding patterns that group the rows. For example:

Anomaly detection: detecting outliers. In credit card fraud detection, for example, you can identify which purchases are abnormal compared with a user's usual shopping pattern.

Association rules: For example, in a supermarket or e-commerce site, you can discover the customer's buying habits by observing which products will be purchased together. This information can be used for marketing purposes.

Grouping: For example, on a SaaS platform, users can be grouped by their behavior and profile data.

Prediction: One variable (a column in the database) is predicted from the remaining variables. For example, you can predict new customers' credit scores by learning from the profiles and credit scores of existing customers.
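To make the anomaly detection example a little more concrete, here is a minimal sketch. It assumes a very simple rule that is not taken from any particular fraud system: a purchase is flagged when it lies more than three standard deviations from the user's usual spending. The amounts, the function names and the threshold are all made up for illustration.

```python
import statistics

def fit_usual_pattern(amounts):
    """Summarize a user's usual purchases as (mean, sample standard deviation)."""
    return statistics.mean(amounts), statistics.stdev(amounts)

def is_anomalous(amount, pattern, threshold=3.0):
    """Flag a purchase that is more than `threshold` deviations from the mean."""
    mean, stdev = pattern
    return abs(amount - mean) / stdev > threshold

# Hypothetical purchase history for one user (amounts in some currency).
history = [20.0, 25.0, 22.0, 19.0, 24.0, 23.0]
pattern = fit_usual_pattern(history)

# New purchases to check against the usual pattern.
new_purchases = [21.0, 500.0, 26.0]
flagged = [amount for amount in new_purchases if is_anomalous(amount, pattern)]
print(flagged)
```

Real systems use many more columns than the amount alone, but the idea is the same: learn what is usual, then measure how far a new row is from it.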

Video games and robots

Video games and robots are a huge area of application for machine learning. Generally we have an agent (a game character or a robot) that must act according to its environment (the virtual environment of the video game, or the real environment for a robot). Machine learning allows the agent to learn to perform tasks, such as moving through an environment while avoiding obstacles or enemies. In these situations, one of the most popular machine learning techniques is reinforcement learning, in which the agent learns to perform a task from the reinforcement it receives from the environment (negative if the agent hits an obstacle, positive if it reaches the target).
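The idea of learning from positive and negative reinforcement can be sketched in a few lines. The example below uses tabular Q-learning, one common reinforcement learning method chosen here only for illustration. The environment is a hypothetical five-cell corridor with an obstacle in cell 0 (reward -1) and the target in cell 4 (reward +1); the learning rate, discount factor and episode count are arbitrary assumptions.

```python
import random

N_CELLS = 5          # hypothetical corridor: obstacle at cell 0, target at cell 4
ACTIONS = (-1, +1)   # step left, step right

def reward(cell):
    """Reinforcement from the environment: negative at the obstacle, positive at the target."""
    if cell == 0:
        return -1.0
    if cell == N_CELLS - 1:
        return 1.0
    return 0.0

def train(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn a table of action values by acting randomly and observing rewards."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_CELLS) for a in ACTIONS}
    for _ in range(episodes):
        state = 2  # every episode starts in the middle of the corridor
        while 0 < state < N_CELLS - 1:
            action = rng.choice(ACTIONS)  # explore uniformly at random
            nxt = state + action
            terminal = nxt in (0, N_CELLS - 1)
            future = 0.0 if terminal else max(q[(nxt, a)] for a in ACTIONS)
            target = reward(nxt) + gamma * future
            # Move the current estimate part of the way toward the observed target.
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = nxt
    return q

q_table = train()
# Greedy policy: in each non-terminal cell, pick the action with the highest value.
policy = {s: max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in range(1, N_CELLS - 1)}
print(policy)
```

After training, the greedy policy steps right (toward the target) in every cell, even though the agent was never told where the target is; it only ever saw the rewards.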

Ok, I now know what machine learning is, but how does it work?

One of the first books on machine learning that I read, about 10 years ago, was Machine Learning by Tom Mitchell. The book was written in 1997, but its general concepts are still useful today.

I like the formal definition of machine learning given in that book:

A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.

For example, an artificial game player that must learn to play chess (task T) can learn by studying previous chess games or by playing against a tutor (experience E). Its performance P can be measured by the percentage of games it wins against a human player.

Let us illustrate with more examples:

Case 1: Given a picture, the system needs to determine whether Barack Obama's face appears in it (more generally, this is similar to Facebook's automatic photo tagging).

Case 2: Given a tweet, the system needs to determine whether it expresses a positive or a negative sentiment.

Case 3: Given some information about a person, the system calculates the probability that the person will repay a credit card loan.

In Case 1, the task is to detect whether Barack Obama's face appears in an image. The experience can be a set of photos, each marked as showing him or not. Performance can be measured by the proportion of images in which the system correctly decides whether Obama's face appears.

In Case 2, the task is to perform sentiment analysis on a tweet. The experience can be a set of tweets and their corresponding sentiments. Performance can be measured by the proportion of new tweets whose sentiment the system classifies correctly.

In Case 3, the task is to assign a credit score. The system can use a set of user profiles and their corresponding credit scores as experience. The squared error (the squared difference between the predicted and expected scores) can be used as the performance metric.
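The two kinds of performance measure P used in these cases can be written down directly. This is a minimal sketch with made-up predictions; the function names are my own, and the squared error is averaged over the examples (mean squared error), which is a common convention rather than something the cases above prescribe.

```python
def accuracy(predicted, expected):
    """Proportion of predictions that match the expected label (Cases 1 and 2)."""
    correct = sum(1 for p, e in zip(predicted, expected) if p == e)
    return correct / len(expected)

def mean_squared_error(predicted, expected):
    """Average squared difference between predicted and expected scores (Case 3)."""
    return sum((p - e) ** 2 for p, e in zip(predicted, expected)) / len(expected)

# Hypothetical sentiment predictions versus the correct labels.
labels_pred = ["positive", "negative", "positive", "neutral"]
labels_true = ["positive", "negative", "negative", "neutral"]
acc = accuracy(labels_pred, labels_true)

# Hypothetical predicted scores versus the expected scores.
scores_pred = [0.5, 0.25, 1.0]
scores_true = [1.0, 0.25, 0.5]
mse = mean_squared_error(scores_pred, scores_true)

print(acc, mse)
```

A higher accuracy is better, while a lower squared error is better; in both cases "learning" means the number moves in the right direction as the system sees more experience.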

For the algorithm to learn to convert the input into the desired output, you must provide training examples (also called training instances), which are the experience E in Mitchell's definition. A training set is a collection of instances that serve as examples; the machine learning algorithm learns from these examples to perform the intended task. Easy to understand, isn't it? It's like showing a child how to throw a ball: you throw it a few times to show how it is done, and by watching those examples the child starts learning to throw the ball himself.

Each training instance is usually represented as a fixed set of attributes or features. Features are the way each instance is described. For example, in Case 1, a picture can be represented by the gray level of each pixel. In Case 2, a tweet can be represented by the words that appear in it. In Case 3, a person can be represented by age, salary, occupation, and the like.
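As a small illustration of Case 2, the sketch below turns tweets into fixed-length feature vectors by counting words, a representation often called a bag of words. The tweets are invented, and splitting on whitespace is a deliberate simplification (real tokenizers also handle punctuation and similar details).

```python
from collections import Counter

def bag_of_words(text):
    """Count how often each lowercase word appears in the text."""
    return Counter(text.lower().split())

def to_vector(counts, vocabulary):
    """Represent word counts as a fixed-length list, one position per vocabulary word."""
    return [counts[word] for word in vocabulary]

# Hypothetical tweets.
tweets = ["I love this phone", "I hate this phone so much"]

# The vocabulary fixes the set of features shared by every instance.
vocabulary = sorted(set(word for t in tweets for word in bag_of_words(t)))
vectors = [to_vector(bag_of_words(t), vocabulary) for t in tweets]

print(vocabulary)
for vector in vectors:
    print(vector)
```

Every tweet now has the same number of features regardless of its length, which is what most learning algorithms expect as input.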

Calculating and selecting reasonable features to represent an instance is one of the most important tasks in the process of using machine learning, which we will discuss later in this article.

Types of machine learning algorithms

In this section we will discuss two broad categories of machine learning algorithms: supervised learning and unsupervised learning. The main differences between the two are the training examples we give the algorithm, the way the algorithm uses them, and the kinds of problems each one solves.

Supervised learning

In supervised learning, the machine learning algorithm can be thought of as a process that converts a particular input into a desired output.

The algorithm needs to learn how to convert every possible input into the correct (expected) output, so each training example consists of a specific input and its expected output.

In the case of the artificial chess player, the input can be a specific board state, and the output can be the best move in that state.

Depending on the type of output, we can divide supervised learning into two subcategories:

Classification

When the output values belong to a discrete, finite set, we have a classification problem. Case 2 can be seen as a classification problem, where the output is one of a finite set of values: positive, negative or neutral. Each training example pairs a tweet with its sentiment label.
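A classification training set for Case 2 might look like the following sketch. The tweets and labels are invented for illustration; the point is only the shape of the data: an input paired with one label from a finite set.

```python
# The finite set of allowed outputs.
LABELS = {"positive", "negative", "neutral"}

# Made-up (input, expected output) pairs, for illustration only.
training_examples = [
    ("I love the new update, great job!", "positive"),
    ("Worst customer service ever.", "negative"),
    ("The store opens at 9 am.", "neutral"),
]

inputs = [text for text, _ in training_examples]
outputs = [label for _, label in training_examples]

# Every expected output must come from the finite label set.
all_valid = all(label in LABELS for label in outputs)
print(all_valid)
```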

Regression

When the output is a continuous value, such as a probability, we have a regression problem. Case 3 is a regression problem because the result is a number between 0 and 1 that represents the probability that a person will repay a debt. Here, each training example pairs a person's profile with that number.

Supervised learning is one of the most popular types of machine learning. Its drawback is that for each training example we need to provide the corresponding correct output, and in most cases that costs a lot of time, effort and money. For example, in sentiment analysis, if we need 10,000 training examples (tweets), we have to tag each tweet with the correct sentiment (positive, negative or neutral). That requires a group of people to read and tag every tweet, which is time-consuming and boring work. This is often the most common bottleneck for machine learning algorithms: collecting correctly labeled training data.
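A regression training set for Case 3 could be sketched as follows. The profiles and target values are invented; what matters is that each expected output is a continuous number rather than a label from a fixed set.

```python
# Made-up (profile, expected output) pairs, for illustration only.
# The output is a number between 0 and 1: the probability of repayment.
training_examples = [
    ({"age": 45, "salary": 85000, "occupation": "engineer"}, 0.95),
    ({"age": 23, "salary": 18000, "occupation": "student"}, 0.40),
    ({"age": 37, "salary": 52000, "occupation": "teacher"}, 0.80),
]

profiles = [profile for profile, _ in training_examples]
targets = [target for _, target in training_examples]

# Every expected output is a continuous value in the interval [0, 1].
in_range = all(0.0 <= target <= 1.0 for target in targets)
print(in_range)
```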

Unsupervised learning

The second type of machine learning algorithm is called unsupervised learning. In this case, the training data consists only of inputs, with no desired outputs attached. A typical use is to find hidden structure or relationships among the training examples. A typical case is clustering, where the algorithm learns to find similar instances or groups of instances (clusters). For example, given a news article, we may want to recommend similar articles. Clustering algorithms such as K-means learn this from the input data alone.
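Here is a minimal sketch of K-means on a handful of 2-D points. It is a toy version under stated assumptions: the initial centroids are simply the first k points (real implementations usually pick them more carefully), and the number of iterations is fixed instead of testing for convergence. The points are invented.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise average of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def k_means(points, k, iterations=10):
    centroids = list(points[:k])  # naive initialization (an assumption of this sketch)
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical unlabeled data: no expected outputs, only inputs.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 8.5), (1.0, 0.5), (8.5, 9.0)]
centroids, clusters = k_means(points, k=2)

for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

Notice that nobody told the algorithm which points belong together; the two groups emerge from the distances between the inputs.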

Machine learning algorithms

Ok, now we get to the math and logic. To convert the input into the desired output, we can use different models. Machine learning is not a single algorithm; you may have heard of support vector machines, naive Bayes, decision trees or deep learning. These are different machine learning algorithms that all try to solve the same problem: learning to convert the input into the correct output.

Those different machine learning algorithms use different paradigms or techniques to perform the learning process and present what they have learned.

Before we explain each algorithm, we need to understand one common principle: machine learning algorithms try to generalize. In other words, they try to explain things with the simplest theory, a principle known as Occam's razor. Machine learning algorithms, regardless of the paradigm they use, try to find the simplest hypothesis (the one that makes the fewest assumptions) that explains most of the training examples.

There are many machine learning algorithms, but let's briefly introduce three popular ones:

Support Vector Machine: This model tries to construct a set of hyperplanes in a high-dimensional space that separate instances of different classes while keeping the largest possible distance to the nearest instances. The concept is intuitive and simple, but the resulting models can be very complex and powerful. In fact, for some domains, support vector machines are among the best machine learning algorithms currently available.
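The quantity a support vector machine tries to maximize, the distance from the separating hyperplane to the nearest instance, can be computed directly. The sketch below does not train an SVM; it only compares two hand-picked candidate hyperplanes on invented 2-D data to show what "largest distance to the nearest instances" means. All names and numbers are illustrative assumptions.

```python
import math

def signed_distance(w, b, x):
    """Signed distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

def margin(w, b, data):
    """Distance to the nearest instance; positive only if every point is on its correct side."""
    return min(label * signed_distance(w, b, x) for x, label in data)

# Hypothetical training data: (point, class) with classes -1 and +1.
data = [((1.0, 1.0), -1), ((2.0, 1.0), -1), ((4.0, 4.0), +1), ((5.0, 4.0), +1)]

# Two hand-picked candidate hyperplanes (w, b); both separate the classes.
candidates = {
    "A": ((1.0, 0.0), -3.0),   # the vertical line x = 3
    "B": ((1.0, 1.0), -5.5),   # the diagonal line x + y = 5.5
}

margins = {name: margin(w, b, data) for name, (w, b) in candidates.items()}
best = max(margins, key=margins.get)
print(margins, best)
```

Both lines classify every point correctly, but the diagonal one leaves more room around the nearest points, so it is the one a maximum-margin method would prefer between these two.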

Probabilistic models: These models usually try to predict the correct response by modeling the problem with a probability distribution. The most popular of these is probably the Naïve Bayes classifier, which builds a classifier from Bayes' theorem and an assumption of independence between features. One advantage of this model is that it is both simple and powerful; another is that it returns not only a prediction but also a degree of certainty about it, which is very useful.
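A Naïve Bayes text classifier is small enough to sketch in full. The version below is a multinomial variant with add-one (Laplace) smoothing, which is one common formulation and my choice for this illustration; the four training messages are invented, and words never seen in training are simply ignored.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """Count classes and per-class word frequencies from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_counts, word_counts, vocabulary

def predict(model, text):
    """Return a probability for each class, treating words as independent given the class."""
    class_counts, word_counts, vocabulary = model
    total = sum(class_counts.values())
    log_scores = {}
    for label in class_counts:
        score = math.log(class_counts[label] / total)  # prior P(class)
        n_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # ignore words never seen in training
            # Smoothed likelihood P(word | class).
            score += math.log((word_counts[label][word] + 1) / (n_words + len(vocabulary)))
        log_scores[label] = score
    # Normalize the scores so they sum to 1 (Bayes' theorem denominator).
    top = max(log_scores.values())
    unnormalized = {label: math.exp(s - top) for label, s in log_scores.items()}
    norm = sum(unnormalized.values())
    return {label: value / norm for label, value in unnormalized.items()}

# Made-up training messages.
examples = [
    ("win free money now", "spam"),
    ("free prize win", "spam"),
    ("meeting at noon", "ham"),
    ("lunch at noon tomorrow", "ham"),
]
model = train(examples)
probs = predict(model, "free money")
print(probs)
```

The output is a probability per class rather than a bare label, which is the "certainty of the predicted value" mentioned above.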

Deep Learning: A newer area of machine learning based on the well-known artificial neural network model. Neural networks work in a connectionist way, trying to mimic (in a very simplified manner) how the brain works. Basically, they consist of a set of interconnected neurons (the basic processing units) organized into layers. In simple terms, deep learning builds new architectures with more layers; through these higher levels of abstraction it not only improves how the algorithms learn but also lets them automatically construct representations of the most important features.
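To show what "neurons organized into layers" means, here is a minimal sketch of a forward pass through a tiny network with one hidden layer. The weights are picked by hand so that the network computes XOR; nothing is learned here, and a real deep learning system would learn the weights from data and use many more layers and neurons.

```python
def relu(value):
    """A common activation function: pass positive values, clip negatives to zero."""
    return max(0.0, value)

def identity(value):
    return value

def dense(inputs, weights, biases, activation):
    """One layer: each neuron takes a weighted sum of the inputs, adds a bias, then activates."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def forward(x):
    # Hidden layer with two neurons (hand-picked weights, for illustration only).
    hidden = dense(x, [[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0], relu)
    # Output layer with one neuron that combines the hidden features.
    output = dense(hidden, [[1.0, -2.0]], [0.0], identity)
    return output[0]

results = [forward([a, b]) for a in (0.0, 1.0) for b in (0.0, 1.0)]
print(results)
```

XOR cannot be computed by a single neuron of this kind, but it becomes easy once a hidden layer builds intermediate features; that is a small-scale version of the abstraction that deeper layers provide.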
