Is Java necessary for data mining engineers?

Source: Internet
Author: User
I'm in a statistics department, specializing in data mining. I have been implementing algorithms in Python, and my introductory textbook, "Machine Learning in Action", also uses Python. But I recently noticed that job postings for data mining engineers generally ask for Java, and the NPC Data Mining Center also recommends that students learn Java. Do data mining engineers need Java in addition to mastering Python?

Reply content:

Python is my main language. I also use Scala to write Spark ML programs, and on Hadoop I use Pig Latin plus UDFs for some batch processing. I used C++ and MATLAB when I was a student and have forgotten most of both by now. After I started working I also got interested in the front end, so I can use HTML, CSS, and JavaScript as well (and I played with Node.js for a while). Then I heard that Ruby is more elegant than Python, had just read "Future of Code", and tried Ruby out. I really liked its mix-in approach to multiple inheritance, so now I also write multiple inheritance in Python the mix-in way, and it feels good.
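A minimal sketch of the mix-in style mentioned above, written in Python. All class names here (`JsonMixin`, `ReprMixin`, `UserProfile`) are hypothetical and chosen only for illustration; the reply does not show the author's actual code.

```python
import json


class JsonMixin:
    """Adds JSON serialization to any class with plain instance attributes."""

    def to_json(self):
        return json.dumps(vars(self), sort_keys=True)


class ReprMixin:
    """Adds a readable repr built from the instance attributes."""

    def __repr__(self):
        fields = ", ".join(f"{k}={v!r}" for k, v in sorted(vars(self).items()))
        return f"{type(self).__name__}({fields})"


class UserProfile(ReprMixin, JsonMixin):
    """Hypothetical record type that composes both mix-ins."""

    def __init__(self, user_id, tags):
        self.user_id = user_id
        self.tags = tags


profile = UserProfile(42, ["sports", "travel"])
print(repr(profile))
print(profile.to_json())
```

Each mix-in carries one small, stateless behavior and defines no `__init__`, so combining several of them does not run into the usual conflicts of multiple inheritance.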

I am responsible for the architecture design of the company's big data platform, the design and development of the tracking and statistics system for RTB ad delivery and mobile SDK performance, and the development of a large-scale user profiling system. These days I discuss Hadoop architecture with the company's other Hadoop engineers, the Angular.js and React.js frameworks with front-end programmers, and back-end frameworks such as Tornado, Flask, Tomcat, and Play with back-end colleagues. Learning broadly has let me become not just a data mining engineer but one of the company's core technical staff. I no longer think of myself as a data mining engineer, so I changed my title to programmer. I don't think skills divide into necessary and unnecessary; what matters is that you enjoy it.

PS: I majored in control and worked on robots during my studies; my main research direction now is natural language processing (I want to do fancy artificial intelligence!)

The reason to get familiar with Java is so that you can more easily build a complete set of Hadoop-related infrastructure, understand how it works internally, and handle the various operations tasks that cannot be avoided. With that in place, most of the actual statistics and recommendation work can be done in Python instead.
Similarly, Spark-based application development does not necessarily require familiarity with Scala.
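As one illustration of batch work on Hadoop staying in Python: Hadoop Streaming runs any executable that reads lines from standard input and writes tab-separated key/value lines to standard output, with the reducer receiving its input sorted by key. The sketch below is a word count in that style. The function names (`map_lines`, `reduce_pairs`, `run_locally`) are hypothetical, and the shuffle step is only simulated with an in-memory sort, so this runs on a laptop without a cluster.

```python
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one 'word<TAB>1' line per word."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reduce_pairs(sorted_lines):
    """Reducer: sum the counts per key; input must already be sorted by key."""
    parsed = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for key, group in groupby(parsed, key=lambda kv: kv[0]):
        yield f"{key}\t{sum(int(count) for _, count in group)}"


def run_locally(lines):
    """Simulate map -> sort (standing in for the shuffle) -> reduce in memory."""
    return list(reduce_pairs(sorted(map_lines(lines))))


if __name__ == "__main__":
    for out in run_locally(["Java Python", "python hadoop"]):
        print(out)
```

On a real cluster the mapper and reducer would be two separate scripts reading `sys.stdin`, and Hadoop itself would do the sorting between them; the logic inside each function stays the same.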
In practice, though, I usually hope people can get familiar with the underlying implementations of Hadoop and Spark, because that makes them much stronger at solving problems. A language is just a tool, a tool, a tool!!!

I'm from the DMC, but I'm only a novice myself, still working hard while watching the experts from behind.

The guidance our advisor gave us is that people who come to data mining from statistics still need the corresponding computing fundamentals, and at the very least must master one language. For those of us without strong programming ability, several teachers felt that Python is easier to get started with than Java. For those with strong programming ability, there is of course no harm in learning Java; after all, Hadoop is implemented in Java.

I have no data mining internship experience and don't know how industry sees this, but I feel that @ Ji Road's answer means all roads lead to Rome. Of course, one person's pace may not suit another, so the original poster may want to consult strong advisors and experts. As a fellow novice interested in data mining, let's encourage each other!

Because most Apache projects are written in Java. It isn't strictly necessary, though: data mining covers a wide range of work, and you don't need to cover every aspect.