1. How data analysis and data mining are related, and how they differ
What they share: both work with data.
How they differ: data analysis leans toward statistics, visualization, and writing reports, and it needs strong communication skills. Data mining leans toward algorithms and modeling, and it needs a very solid coding foundation; expect to write code, lots of it = =.
2. How to get started
Search Baidu for "How to become a data analyst" or "How to become a data mining engineer". If your English is good, look on Quora. If you still don't know, read the introductory material.
3. Which books to choose
The introductory material will recommend books. Read the electronic version if there is one, and buy the paper book if there isn't; it doesn't cost much.
4. Which languages to use
Data analysis: Excel is required, R is the foundation, and Python is the next step up. Leave SAS and MATLAB to people with deep pockets.
Data mining: Python is a must and Java/C/C++ is the foundation. Hadoop/MapReduce/Spark can wait at first, because not every company has that much data.
5. Do I need to learn math?
Data analysis: statistics and probability theory.
Data mining: calculus/mathematical analysis, numerical analysis, linear algebra, convex optimization, operations research (these are the basics); digital signal processing, pattern recognition, matrix theory (advanced).
6. Should I go to graduate school?
In general, only fresh graduates looking for a job find that academic credentials matter much, because they have nothing else to show their ability. Once you have worked for a while (two years or more), your ability counts for far more than the school you attended, and credentials stop mattering. If you do want to go, I suggest a graduate degree in computational mathematics, probability theory, pattern recognition, or computer science, and aim to publish papers (high-quality ones); otherwise the degree makes no difference when you apply. Of course, some companies may filter on credentials in the initial screening. That is normal. If you really want to get into one of them, work for a few years and go in through experienced hiring. If your credentials are weak, you cannot blame others for that.
After a few years of work, if you feel you have hit a bottleneck, you can go back to school. There is nothing wrong with that, and by then you will probably know better what you need.
7. Which company to choose
Core position at a big company >> core position at a mid-size company > peripheral position at a big company > core position at a strong start-up > peripheral position at a mid-size company > peripheral position at a sketchy small company.
Reasons:
1. Large companies have more data and more talented people. Landing a core position on a core project is the best choice. (BAT, NetEase Youdao, Microsoft, etc.)
2. Mid-size companies develop fast and offer more opportunities; the pressure is high and you grow quickly. (Meituan, Didi, 58)
3. Choose start-ups carefully. If you have a start-up offer, check three things: first, whether they are short of money; second, whether the project is profitable; third, whether the team has a strong technical atmosphere. Well funded and profitable but with a weak technical atmosphere: you can go, but it does not suit people who want technical depth. Well funded with a strong technical atmosphere but not yet profitable: worth considering, but make sure you understand the profit model. Profitable with a strong technical atmosphere but short of money right now: worth considering; try to become a core member, because once financing comes through the payoff can be big. Short of money, unprofitable, and weak technically: forget it, it is not worth the pain.
If you do not know how to choose, look at just two things: 1. data size; 2. technical atmosphere. Money can be earned later; the technical atmosphere matters most.
8. How to interview
1. Be honest 2. Be honest 3. Show your potential
9. What if I have no project experience?
Nobody expects much project experience from a fresh graduate. Undergraduates can talk about their graduation thesis, their participation or awards in math modeling/ACM/Alibaba competitions, and any internships. Graduate students can talk about lab projects: what they were responsible for, what results they achieved, and what papers they published. Don't exaggerate; tell the truth.
10.
Training courses are not out of the question, but I do not recommend them. If you lack even the ability to search and teach yourself, then even if you get into the field you will suffer. Not to mention the high tuition fees and the faked resumes.
11. Data analysis or data mining?
If your coding is strong, go straight into algorithms. If it is weak, start with data analysis and take it slowly; there is no rush. When I graduated I knew only MATLAB. Then my team lead walked me through R in two weeks and Python in one month, and I learned Java, Hadoop, and Spark in my spare time. It was step by step; don't try to get fat in one bite.
12. What are the prospects?
My salary is now 5 times what I earned as an intern; you tell me. If you are only after money, go into sales. If you like pure mathematics, go do research. If you like finding interesting things in data and putting them to use, then do data analysis/data mining.
13. What is a typical day like?
Get to the office, run data, check results, tune parameters, run data, check results, read papers, change code, tune parameters, run data, check results ...
14. Any recommended sites?
Google (it was blocked, came back one night, and was blocked again right away; if it does not work now, buy a VPN; on Mac I recommend Shadowsocks).
Stack Overflow: the bug-fixing wonder tool.
GitHub: open source rocks!
Quora: gathering material and learning from others' experience.
Recruitment sites of all kinds: when I don't feel like studying I browse these, see how far I still am from my ideal job, and get fired up again.
15. Any recommended IDE?
Sublime Text + SecureCRT/iTerm is enough (that is on Mac; on Windows, install Notepad++, plus Linux). You can also download an IDE for whatever language you use and try debugging with it. I have no specific recommendation; use whichever feels handy.
16. What computer to use?
If you have money, go straight to a server. With less money, buy a high-spec machine. With no money, buy anything you can type code on. Upgrade to something good once you are working and have money.
17.
How to submit resumes
Campus recruiting has online applications. I did not go through campus recruiting, so I am not clear on the process. If you missed campus recruiting, it is best to apply through vertical sites such as Lagou and Zhoubotong. I do not much recommend 58/Ganji/Zhilian: the jobs on those sites are mainly low-end, and the chance of running into an unreliable company is higher.
18. How to tell the actual work from the job requirements
A simple method:
Any data analysis posting whose requirements say you must know Excel, PPT, and the like is statistician work!
Any data analysis posting whose requirements say you must know GA, PV, and UV analysis is operations department work!
Any data mining posting whose requirements list only Hadoop, Spark, and ETL is data warehouse work!
Judge the rest yourself. Data mining has several kinds of positions: ad CTR prediction, machine learning, recommender systems, natural language processing, and so on. Choose for yourself. When you are just starting out, any of them is worth a try.
In short, getting in is easy and going deep is hard. Weak math can be learned, but it will limit your development; weak coding can be worked on, but it too will limit your career. So to those who say "I think my math is bad, my coding is weak, R is hard, I can't read English websites, will studying more make me lose my hair, will I never find a girlfriend, blah blah": suit yourself, your career is in your own hands.
Finally, I would like to thank my former teachers Peng, Gao, Li Haixiong, and the others. I really want to go back and see you!!!!!!!
The above is every question I could think of that a newcomer might ask, after six months on the job, plus a summary of some pitfalls I stepped in myself. It is not comprehensive and not well organized; make do with it.
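The keyword rule of thumb from item 18 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `guess_role`, the role labels, and the idea of counting keyword hits are all made up here; only the keywords themselves (Excel/PPT, GA/PV/UV, Hadoop/Spark/ETL) come from the post. It also simplifies the third rule, which applies when a posting lists *only* those tools.

```python
import re

# Keywords are taken from item 18; the role labels are hypothetical wording.
ROLE_KEYWORDS = {
    "statistician work": ("excel", "ppt"),
    "operations department work": ("ga", "pv", "uv"),
    "data warehouse work": ("hadoop", "spark", "etl"),
}


def guess_role(requirements: str) -> str:
    """Guess what a posting really involves from its requirements text.

    Counts whole-word keyword hits per role and returns the role with the
    most hits, or a fallback when no keyword matches.
    """
    # Split into lowercase alphabetic tokens so "GA" matches only as a word,
    # not as part of "engage" or "gap".
    tokens = set(re.findall(r"[a-z]+", requirements.lower()))
    best_role = "other: read the posting yourself"
    best_hits = 0
    for role, keywords in ROLE_KEYWORDS.items():
        hits = sum(1 for keyword in keywords if keyword in tokens)
        if hits > best_hits:
            best_role, best_hits = role, hits
    return best_role


if __name__ == "__main__":
    print(guess_role("Proficient in Excel and PPT"))
    print(guess_role("Experience with Hadoop, Spark, ETL"))
    print(guess_role("CTR prediction, machine learning"))
```

Treat the result as a first guess only; a posting that mixes keywords from several groups still needs a careful read.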
If you have questions beyond the ones above, ask me, just not during working hours~ (although outside work I have no time either: playing with the cat, reading, scrolling on my phone, then sleep = =). Don't think I am great; I consider myself entry level right now. The colleagues around me who do deep learning write code at lightning speed, and I get thoroughly outclassed.
Reposted: entry-level tips for data analysis/data mining beginners