What standing do Kaggle competitions have in the field of machine learning?

Source: Internet
Author: User
Tags: xgboost
What kind of people generally win prizes on Kaggle? Can you learn something from it? Is a Kaggle award a bright spot for job seekers or for graduate school applicants?

Reply content:

A year or two after the Netflix Prize, this type of competition began to catch on. The first people to get good results were top experts such as Xiangliang, who was known as a member of the second-place Netflix Prize team.

In the past few years, as competitions have multiplied, winning solutions are shared everywhere and the routines have become more and more familiar. Whatever the competition, throw LR + GBDT + FM + NN at it, then ensemble, and you will usually get a good result.
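The "several models, then ensemble" routine described above can be sketched as a weighted average of each model's predicted scores. This is only a minimal illustration: the function name `blend`, the weights, and the prediction values are made up for the example, and real solutions often use more elaborate schemes such as stacking.

```python
from typing import Dict, List


def blend(predictions: Dict[str, List[float]], weights: Dict[str, float]) -> List[float]:
    """Weighted average of per-model predictions (one list of scores per model)."""
    if not predictions:
        raise ValueError("need at least one model")
    total = sum(weights[name] for name in predictions)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_rows = len(next(iter(predictions.values())))
    if any(len(scores) != n_rows for scores in predictions.values()):
        raise ValueError("all models must score the same number of rows")
    blended = [0.0] * n_rows
    for name, scores in predictions.items():
        w = weights[name] / total  # normalize so the weights sum to 1
        for i, s in enumerate(scores):
            blended[i] += w * s
    return blended


# Hypothetical outputs of four already-trained models on three rows.
preds = {
    "lr": [0.2, 0.6, 0.9],
    "gbdt": [0.1, 0.7, 0.8],
    "fm": [0.3, 0.5, 0.7],
    "nn": [0.2, 0.6, 1.0],
}
weights = {"lr": 1.0, "gbdt": 2.0, "fm": 1.0, "nn": 1.0}  # illustrative weights only
print(blend(preds, weights))
```

In practice the weights would be chosen on held-out data rather than by hand.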

A ranking on Kaggle does not mean much by itself. What shows your true level is demonstrating, in the course of a competition, the ability to analyze and solve problems, and especially proposing a solution of your own. For example, the Apex Laboratory team at Shanghai Jiao Tong University participated in KDD Cup 11 after developing SVDFeature, and some of the tree-related techniques from their first-place KDD Cup 12 entry became the foundation of Tianqi's ICML paper and of XGBoost. Conversely, if you look through the solution write-ups shared from recent Kaggle competitions, most of them just follow a fixed process and contain nothing new.

In addition, everyone now competes in teams. Only a few members really play a decisive role, while some who were merely along for the ride like to brag about the team's result as their own, so be careful with such people.

Competing is different from doing research. Many participants do not need to understand the details of the models and algorithms; running a few open-source packages can get good results. Doing *good* research requires a deeper understanding of the model and the application. So a prize is a bright spot in a job search, but not necessarily for graduate school applicants.

The winners are people with real skills. Kaggle is a good place to practice on a few problems and to prove your hands-on understanding of data science. This is almost my standard practice when hiring now:

    1. If you have participated in a Kaggle contest, I will read your resume.

    2. With one top-10% finish, I will give you a phone interview.

    3. With two or more top-10% finishes, I will give you an on-site interview.

    4. With one top-10 finish, we'll be chatting and laughing together.

Good luck to the original poster. Kaggle may have little influence in the mainstream research community, but it is still useful in industry. A particularly outstanding result is very convincing, as @lau Phunter's answer mentions.

If you want a good result in a Kaggle competition, you cannot avoid running a lot of experiments: parameter selection, model selection, feature engineering, and so on. To run these experiments efficiently, you need good experimental design and solid coding skills to design and build the pipeline. This is a real test of all-round ability, and it is exactly the kind of talent industry needs.
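One common way to make parameter selection systematic is a grid search scored by k-fold cross-validation. The sketch below is a toy illustration under stated assumptions: the model is a one-dimensional ridge regression without intercept, and the helper names (`k_fold_indices`, `fit_ridge_1d`, `cv_mse`, `grid_search`) and the data are invented for the example, not taken from any competition.

```python
from typing import List, Sequence, Tuple


def k_fold_indices(n: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split range(n) into k (train, validation) index pairs without shuffling."""
    if not 2 <= k <= n:
        raise ValueError("need 2 <= k <= n")
    folds = [list(range(i, n, k)) for i in range(k)]
    return [
        ([j for f in folds[:i] + folds[i + 1:] for j in f], folds[i])
        for i in range(k)
    ]


def fit_ridge_1d(xs: Sequence[float], ys: Sequence[float], lam: float) -> float:
    """Closed-form ridge fit of y ~ w * x (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def cv_mse(xs: Sequence[float], ys: Sequence[float], lam: float, k: int = 3) -> float:
    """Mean validation MSE of the ridge model over k folds."""
    errors = []
    for train, valid in k_fold_indices(len(xs), k):
        w = fit_ridge_1d([xs[i] for i in train], [ys[i] for i in train], lam)
        errors.append(sum((ys[i] - w * xs[i]) ** 2 for i in valid) / len(valid))
    return sum(errors) / len(errors)


def grid_search(xs: Sequence[float], ys: Sequence[float],
                grid: Sequence[float], k: int = 3) -> Tuple[float, float]:
    """Return (best_lambda, best_cv_mse) over the candidate grid."""
    best_score, best_lam = min((cv_mse(xs, ys, lam, k), lam) for lam in grid)
    return best_lam, best_score


# Made-up data, roughly y = 2x with a little noise.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
print(grid_search(xs, ys, grid=[0.0, 0.1, 1.0, 10.0]))
```

The same loop structure extends to model selection and feature choices: anything that can be scored on held-out folds can go into the grid.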

Even so, a Kaggle competition is far more streamlined than real-world machine learning work. Where does our time usually go?
* Determining what problem needs to be solved
* Clarifying the metric to optimize
* Collecting the right data
* Cleaning the data
* Running all kinds of experiments
* Asking other teams to run A/B tests with us
* Integrating the machine learning pipeline into the pipelines of other products
* Pitching our models on various occasions as being really useful ...

So there is not much time left for actually running experiments. Still, a good Kaggle result at least shows that you can run experiments systematically, which is a big bright spot and a strong signal. It is probably of little use for graduate school applications. I have never used it when applying for jobs, so I can't say how it works there. As for whether you can learn something, it depends on the competition and on how much effort you put in.

Some competitions have data so simple that you can download it, run XGBoost, and land in the top 10%. But if you try something new, you can still gain something.

Other competitions have messier data: processing it takes time, there is more room for feature engineering, and sometimes you need to write rules or a custom loss function. You can learn a lot from these competitions.
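As an illustration of what "writing a loss function" can involve: gradient-boosting libraries that accept a custom objective (XGBoost is one) generally ask for the first and second derivatives of the loss with respect to the raw score, one pair per row. The sketch below computes them in plain Python for a class-weighted log loss. The function name and the `pos_weight` parameter are hypothetical choices for this example, and the code is not wired to any particular library's callback signature.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def weighted_logloss_grad_hess(
    raw_scores: Sequence[float],
    labels: Sequence[int],
    pos_weight: float = 1.0,
) -> Tuple[List[float], List[float]]:
    """Per-row gradient and Hessian of weighted log loss w.r.t. the raw score.

    loss_i = -w_i * [y_i * log(p_i) + (1 - y_i) * log(1 - p_i)],
    where p_i = sigmoid(score_i) and w_i is pos_weight for positives, 1 otherwise.
    """
    grads, hesses = [], []
    for s, y in zip(raw_scores, labels):
        p = sigmoid(s)
        w = pos_weight if y == 1 else 1.0
        grads.append(w * (p - y))          # d loss / d score
        hesses.append(w * p * (1.0 - p))   # d^2 loss / d score^2
    return grads, hesses


# Made-up raw scores and labels; positives are weighted twice as heavily.
grads, hesses = weighted_logloss_grad_hess([0.0, 0.0, 2.0], [1, 0, 1], pos_weight=2.0)
print(grads, hesses)
```

Raising `pos_weight` makes missed positives cost more, which is one simple way to adapt the objective to a competition's metric.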

Winning is not easy. In simple competitions there are too many participants and the methods are too homogeneous; the time goes into parameter tuning and ensembling, and winning a prize takes good luck. In complex competitions the methods vary and parameter tuning alone does not work; top teams spend their time identifying the key aspects of the problem, and whoever captures more of them scores higher. This takes a lot of time and thought, and is more exciting.

All kinds of people take part. Competitions with prize money attract many "pros" from large companies, while those with low prize money attract mostly entry-level players. A good result helps when looking for a job at a company that lacks professional machine learning talent. For graduate school applications it is less useful than publishing a paper at a top venue.

I feel it has no real status; it is a handy tool for switching careers to data scientist.

This kind of hands-on competition is very good. We are running a big data competition modeled on Kaggle, and you are welcome to take part.
Prize money: 1w (10,000). Time Hack: Find the program to create Time Master Portable Cloud Calendar Products Big Data Mining & Online Programming Contest

One suggestion: do a crappy PhD, or get a job. Just don't get a master's, unless it's funded or in the US.

It's not difficult to get onto the leaderboard. Reaching the upper ranks is fairly easy, but the top few places are hard. It is all routine: practice plus some simple thinking is enough. There's no difference between a bottle opener and an excavator.
