Data Mining Engineer Interview Guide

Source: Internet
Author: User

From: http://xccds1977.blogspot.com/2012/03/blog-post_14.html

Link: http://www.discoverycorpsinc.com/interviewing-data-miners-and-m/

Data mining is a distinctive field, and general-purpose recruitment interviews may not suit it. When hiring a qualified data mining engineer, a company generally looks at three things:

  • Is the candidate smart? Being smart means being able to frame a problem out of complicated information and solve it in the right way. Smart people also learn from failure.
  • Can the candidate stay focused on a project? Being focused means being able to complete a project, independently or with others, even under difficult conditions.
  • Can the candidate work with a team? Teamwork requires good communication skills: the concepts, problems, models, and conclusions involved in the work must be communicated accurately among team members before everyone understands them.

To judge whether a candidate has the potential to be a data mining engineer, plan an interview of about one hour that follows these five steps:


1. Overview
Like small talk at the start of a conversation, the introduction is meant to put the candidate at ease. You can first introduce the company and then answer the candidate's questions. If a question is complex, defer the answer to the end of the interview.

2. Data Mining Projects
This is the most important and most time-consuming phase of the interview. Ask the candidate about the data mining project he worked on most recently and how he handled it. Questions include:

  • How does he describe the project at the outset?
  • How long did the project last?
  • What were the key issues in the project?
  • How were the problems solved?
  • Which stage of the data mining project was the most difficult?
  • Which stage was the most interesting?
  • What was the customer like, in his view?
  • How did the other team members perform?
  • What did he learn from the experience?

In this phase, do not ask only "what" questions; ask plenty of "why" questions as well. An excellent data mining engineer must be able to face customers and clearly present and defend his ideas.

3. Data Mining Process
Next, examine the candidate's understanding of the workflow. It is a good sign if he brings up the Cross-Industry Standard Process for Data Mining (CRISP-DM). Candidates often disagree with such standards. Looking at a problem from a different angle can be innovative, but innovation still needs to rest on a solid process standard, because the standard ensures that nothing major is overlooked.

If needed, draw a flowchart on a whiteboard and ask the candidate to comment on which parts of the work are most important or call for the most reflection. Because a model is rarely finished in one pass, it is normal to refine the problem and rebuild the model repeatedly.

You can also dig deeper into a particular step of the mining process, for example by asking how the candidate avoids over-fitting, how he selects from a large number of candidate variables, and how he evaluates or compares model performance.
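As an illustration of the kind of answer an interviewer might look for on over-fitting and model comparison, here is a minimal sketch of k-fold cross-validation. It is not part of the original guide: the synthetic data, the two toy models, and every function name are assumptions made for the example. The idea it shows is that a model which memorises its training data can score perfectly on that data while doing worse than a simpler model on held-out data.

```python
import random


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def make_data(n, seed):
    """Synthetic data (an assumption): y = 2x + 1 plus Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0, 2.0) for x in xs]
    return xs, ys


def fit_linear(xs, ys):
    """Ordinary least squares for y = a*x + b (the simple model)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = my - a * mx
    return lambda x: a * x + b


def fit_nearest(xs, ys):
    """1-nearest-neighbour: memorises every training point, noise included."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def mse(model, xs, ys):
    """Mean squared error of a fitted model on the given points."""
    return mean((model(x) - y) ** 2 for x, y in zip(xs, ys))


def cross_validate(fit, xs, ys, k=5):
    """Return (mean training MSE, mean held-out MSE) over k folds."""
    n = len(xs)
    train_scores, test_scores = [], []
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # every k-th point is held out
        tr_x = [xs[i] for i in range(n) if i not in test_idx]
        tr_y = [ys[i] for i in range(n) if i not in test_idx]
        te_x = [xs[i] for i in sorted(test_idx)]
        te_y = [ys[i] for i in sorted(test_idx)]
        model = fit(tr_x, tr_y)
        train_scores.append(mse(model, tr_x, tr_y))
        test_scores.append(mse(model, te_x, te_y))
    return mean(train_scores), mean(test_scores)


xs, ys = make_data(200, seed=0)
lin_train, lin_test = cross_validate(fit_linear, xs, ys)
nn_train, nn_test = cross_validate(fit_nearest, xs, ys)
print(f"linear   train MSE={lin_train:.2f}  held-out MSE={lin_test:.2f}")
print(f"1-NN     train MSE={nn_train:.2f}  held-out MSE={nn_test:.2f}")
```

A large gap between training error and held-out error is the usual warning sign of over-fitting, and comparing held-out scores rather than training scores is one common way to choose between models. A candidate may reasonably prefer other techniques, such as a separate validation set or regularisation.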

4. Solve the Problem
Interviews at software companies generally include a "coding test", and interviews for data mining engineers should include an equivalent. One option is to hand the candidate a flawed analysis report. Let him study the report, explain what its conclusions mean, point out its problems or gaps, and propose ways to improve or fix them.

5. Conclusion
In the final stage of the interview, answer the candidate's remaining questions and present the company's competitive position in the industry and what the role could mean for the candidate's career. Once the interview is over, file the interview notes immediately.

Interviewing is hard work, but it is also an opportunity to communicate and learn. Through interviews we can find out what problems other people face and how they solve them.

Note: This article is an authorized excerpt from a longer document by Discovery. As far as I understand, Discovery mainly uses SPSS Modeler for data mining and Tableau for graphical presentation.
