What data skills are needed to get started with machine learning?

Source: Internet
Author: User

In fact, machine learning has been solving a variety of important problems for a long time. For example, years ago people were already using neural networks to scan credit card transactions for fraud, and Google went on to apply the technology to Web search.

But at that time, machine learning was out of reach for the ordinary engineer. To develop a machine learning system, you needed a PhD and a group of like-minded colleagues.

Now machine learning is finally both more powerful and more accessible.

An ordinary software engineer no longer needs to go back to school for a graduate degree in order to build a very good system with machine learning.

Of course, ordinary developers who want to use machine learning well still have some catching up to do, and need to learn some data skills. An InfoWorld article describes techniques and strategies that help developers use machine learning more effectively.

Let the data speak

In good software engineering practice, you can often reason your way to the desired design, write the software, and then test the solution directly and independently.

Sometimes you can even prove mathematically that your software is correct. This is often hard to achieve for practical problems, especially when humans are involved, but if you have a good specification you can still implement a correct solution.

Machine learning is different. You generally do not have a strict specification. Instead, you have data that represents the system's past experience, and you need to build a system that will work in the future.

To test whether the system really works, you need to evaluate its performance in real-world situations. Switching to this "data-heavy, specification-light" style of development can be a difficult adjustment, but it is a key step in building a machine learning system.
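One way to make "past experience versus future performance" concrete is a time-ordered hold-out: fit on older records, then score on newer ones the model has never seen. The sketch below is only an illustration. The `Transaction` fields, the toy threshold "model", and the ten records are all invented for the example and do not come from the article.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    day: int         # when the transaction happened (hypothetical field)
    amount: float    # transaction amount (hypothetical field)
    is_fraud: bool   # label that became known afterwards


def split_by_time(rows, cutoff_day):
    """Rows before the cutoff are the 'past'; the rest stand in for the 'future'."""
    past = [r for r in rows if r.day < cutoff_day]
    future = [r for r in rows if r.day >= cutoff_day]
    return past, future


def accuracy(threshold, rows):
    """Share of rows where 'amount >= threshold' agrees with the fraud label."""
    correct = sum(1 for r in rows if (r.amount >= threshold) == r.is_fraud)
    return correct / len(rows)


def fit_threshold(past):
    """Toy 'model': the amount threshold with the best accuracy on past data."""
    candidates = sorted({r.amount for r in past})
    return max(candidates, key=lambda t: accuracy(t, past))


# Invented data: (day, amount, is_fraud)
rows = [
    Transaction(1, 20.0, False), Transaction(2, 35.0, False),
    Transaction(3, 900.0, True), Transaction(4, 50.0, False),
    Transaction(5, 1200.0, True), Transaction(6, 40.0, False),
    Transaction(7, 700.0, True), Transaction(8, 30.0, False),
    Transaction(9, 950.0, False), Transaction(10, 60.0, False),
]

past, future = split_by_time(rows, cutoff_day=7)
threshold = fit_threshold(past)
print("threshold learned from the past:", threshold)
print("accuracy on the past:  ", accuracy(threshold, past))
print("accuracy on the future:", accuracy(threshold, future))
```

On this made-up data the rule looks perfect on the records it was fitted to and much worse on the later ones, which is exactly why the evaluation has to use data the system has not seen.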

Learn to identify better models

Comparing two numbers is easy. Assuming both are valid values (that is, neither is NaN), you just decide which one is larger and you are done.

Comparing the accuracy of machine learning models is not so simple.

The models you want to compare produce many outputs, and there is usually no single definitive answer. A very basic skill in building a machine learning system is the ability to look at the past decisions of two models and determine which one better fits your problem.

To make this judgment, you need to think about the data as a whole distribution rather than as a single value. This usually also requires that you be good at visualizing data, for example with histograms, scatter plots, and many other related representations.
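As a small illustration of looking at the whole distribution rather than one number, the sketch below prints a crude text histogram of per-example errors for two models. The error values are invented so that both models have the same mean error, and `text_histogram` is a hypothetical helper written for this example, not a library function.

```python
from collections import Counter
from statistics import mean, median


def text_histogram(values, bin_width):
    """Return the lines of a crude text histogram, one line per occupied bin."""
    bins = Counter(int(v // bin_width) for v in values)
    lines = []
    for b in sorted(bins):
        low = b * bin_width
        lines.append(f"{low:5.1f}-{low + bin_width:5.1f} | " + "#" * bins[b])
    return lines


# Invented absolute errors of two models on the same ten held-out examples.
errors_a = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 1.3, 1.0, 0.9, 1.1]
errors_b = [0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.2, 0.1, 4.5, 4.5]

for name, errs in (("A", errors_a), ("B", errors_b)):
    print(f"model {name}: mean={mean(errs):.2f} "
          f"median={median(errs):.2f} max={max(errs):.2f}")
    print("\n".join(text_histogram(errs, bin_width=1.0)))
```

The means are identical, but the histograms are not: model A is mediocre everywhere, while model B is usually very good and occasionally far off. Which one is "better" depends on whether your problem can tolerate rare large misses.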

Remain skeptical of your conclusions

Just as important as judging which model is better is remaining skeptical of your own conclusions.

Is your result just a statistical coincidence that will stop holding as more data arrives? Has the situation changed since your evaluation, so that your earlier decision no longer applies?

Building a machine learning system that is embedded in a product means you always need to make sure the system is still doing the task you started with. This skepticism is a necessary quality when you have to make fuzzy comparisons in a changing reality.
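One common way to ask whether a difference between two models could be a coincidence is a paired bootstrap: resample the evaluation examples many times and see how much the accuracy gap moves. This is a minimal sketch of that idea rather than a method the article prescribes. The function name, its parameters, and the 0/1 correctness lists are assumptions made for illustration.

```python
import random


def bootstrap_diff_interval(correct_a, correct_b,
                            n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for accuracy(A) - accuracy(B).

    correct_a[i] and correct_b[i] are 1 if the model got example i right,
    else 0; both lists must refer to the same examples in the same order.
    """
    if not correct_a or len(correct_a) != len(correct_b):
        raise ValueError("need two non-empty lists of equal length")
    rng = random.Random(seed)
    n = len(correct_a)
    diffs = []
    for _ in range(n_resamples):
        # Resample example indices with replacement, keeping the pairing.
        idx = [rng.randrange(n) for _ in range(n)]
        acc_a = sum(correct_a[i] for i in idx) / n
        acc_b = sum(correct_b[i] for i in idx) / n
        diffs.append(acc_a - acc_b)
    diffs.sort()
    low = diffs[int((alpha / 2) * n_resamples)]
    high = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    return low, high


# Invented results: model A is right on 16 of 20 examples, model B on 14.
correct_a = [1] * 16 + [0] * 4
correct_b = [1] * 14 + [0] * 6
low, high = bootstrap_diff_interval(correct_a, correct_b)
print(f"95% interval for the accuracy gap: [{low:.2f}, {high:.2f}]")
```

If the interval includes zero, the observed gap is not strong evidence that one model is really better. Even a clear gap says nothing about whether conditions have changed since the evaluation, so the check has to be repeated on fresh data.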

Build multiple models to filter

In the software industry there is an old saying that the first version of a system you build is doomed to be thrown away. The point is that only after you have actually built a working system do you understand the problem well enough to build the system properly. So you build one version to gain experience, then apply the lessons learned to designing and building the real system.

For machine learning, this is just as true, if not more so. Building one system for practice is not enough; you should be prepared to build dozens or even hundreds of versions. Some versions may use different learning techniques, or just different parameter settings, while others may completely restate the problem or change the training data.
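The simplest form of "many versions" is a parameter sweep: build one model per setting and score each on held-out data. The sketch below uses a tiny one-dimensional nearest-neighbor classifier purely as a stand-in for whatever learner you actually use. The data points, labels, and function names are invented for the example.

```python
from collections import Counter


def knn_predict(train, x, k):
    """Majority label among the k training points closest to x (1-D toy data)."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def validation_accuracy(train, valid, k):
    """Fraction of held-out examples that the k-neighbor model labels correctly."""
    hits = sum(1 for x, y in valid if knn_predict(train, x, k) == y)
    return hits / len(valid)


def sweep(train, valid, ks):
    """Build one model version per setting and score each on held-out data."""
    return {k: validation_accuracy(train, valid, k) for k in ks}


# Invented training data; the point at 4.0 carries a deliberately noisy label.
train = [(1.0, "low"), (2.0, "low"), (3.0, "low"), (4.0, "high"),
         (5.0, "low"), (7.0, "high"), (8.0, "high"), (9.0, "high")]
valid = [(1.4, "low"), (3.9, "low"), (4.4, "low"), (6.6, "high"), (8.6, "high")]

results = sweep(train, valid, ks=(1, 3, 5))
for k, acc in results.items():
    print(f"k={k}: validation accuracy {acc:.2f}")
best_k = max(results, key=results.get)
print("best setting on this validation set:", best_k)
```

Keeping every version's score, not just the winner's, makes it easier to notice when a later change to the data or the problem statement matters more than any parameter setting.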

For example, you might find that you can train a model on surrogate signals in addition to the signal you actually want to predict. That way, you may have 10 times as much data to train on. Or you can try restating the problem in a different way that makes it easier to solve.

The world is changing rapidly. For example, when you build a model to detect fraud, even a successful system will need to change later, because fraudsters will find your loopholes and change their behavior. You will be forced to take a new approach.

So to succeed, expect to build a whole series of machine learning models that you will throw away. Do not expect one universal model that applies forever.

Do not be afraid to change the problem

The problem you first set out to solve with machine learning is not always the right one, and it may even be the wrong one. As a result, you may end up with models that cannot be trained at all, collected data that is useless for training, or a best-case model whose results have only limited value.

Re-examining the problem can make even a simple model highly valuable.

I once worked on a product recommendation problem where, even with some heavyweight techniques, it was hard to get even a meager gain.

But in fact, the high-value question we should have focused on was when good products came on sale. Once you know that point in time, there are plenty of good products to choose from, and the question of "which product to recommend" is effectively solved.

Redefine the problem to make the whole project easier to solve.

"Start small"

It is valuable to apply your initial system to just a few simple cases or to a single sub-problem. This lets you focus on gaining expertise in the problem domain and on winning support from your colleagues as you build your model.

"Start big"

Make sure you have enough training data. In fact, if possible, collect more data than you expect to need.

Expertise is still important

In machine learning, figuring out how a model makes its decisions or predictions is one thing; more important is figuring out what the real problem is.

In this regard, if you already have a lot of domain expertise, you are more likely to ask the right questions, so that machine learning can be put to work in a viable product. Domain knowledge is critical for deciding where careful examination is required.

Programming ability is still important

There are many tools that let you build a machine learning model by simple drag and drop. In fact, however, much of the work of building a machine learning system has nothing to do with machine learning or models; it lies in collecting data and in building systems that can use the model's output.

Therefore, having good programming skills is particularly important.

Although code that handles data differs somewhat in style from person to person, it is not hard to understand each other's work. Solid development skills are therefore very useful in many machine learning problems.

Now there are many tools and emerging technologies that enable almost all software engineers to develop a machine learning system for interesting problems. Basic program development skills will be useful in this build process, but you need to focus on the data when using them.

The best way to master these new skills is to start building something interesting from now on.

Source: Phoenix Technology
