Three entry points into big data

Source: Internet
Author: User

The popularity of big data makes many people want to move in this direction and work on data mining or data analysis. But where should you start, and how can you quickly pick up useful knowledge and skills? I think there are three entry points, and you can choose the order in which to tackle them based on your own background.

1. Machine learning. Data mining relies mainly on machine learning algorithms. Machine learning has become popular in recent years thanks to progress in deep learning and applications such as self-driving cars. It is a deep discipline, however, and few schools offer a dedicated course on it. Master's students who have studied optimization will find it easier to learn; undergraduates first need a solid foundation in probability theory and mathematical statistics. I have read many books on the subject and found them painful, and I increasingly ask myself whether going that deep really matters. In my opinion, unless you are pursuing a PhD, there is no need to go very deep into machine learning or deep learning. Algorithms are important, but programmers do not have to drill them the way ACM contest competitors do. We learn machine learning in order to use it, and the basic algorithms have already been developed. What we most need to know is how to apply them, and only a few algorithms are needed; I only really learned them by using them several times. So I strongly recommend learning by applying them to real problems. Pick some data that interests you and see whether you can find anything useful in it, which also gives a sense of accomplishment. I recommend one book here: Machine Learning: practical case analysis. I also recommend learning a new language, R. If you do not want to learn it, C or Python will also do. (R is not suited to ultra-large-scale data.) Finally, I do not think this part has to be learned first, and you do not need to be familiar with every algorithm. Start by mastering one or two of them.
2. Hadoop. Hadoop is practically synonymous with big data because it provides a platform for processing ultra-large datasets; what we can get out of the data after processing is a separate question. Although Hadoop is only a piece of software, its underlying principles are complicated. We need to know how it splits big data across several computers and how MapReduce works, and then how to use it. I strongly recommend installing Hadoop yourself (configure the cluster and set up the virtual machines yourself) and writing a small program on it for practice. Hadoop also has many add-on services, each with its own function and each quite complicated. Hive and HBase are the most important, and you need to know how they work and how they are used. Because this part is mostly hands-on, it is not so boring to learn, so I think it is worth spending more time here to get familiar with the principles and methods and with the Linux environment. The language, of course, is Java.
3. Databases. Big data is still data, and it cannot be separated from databases. Many people lack a database foundation, so this part is also essential. We need to understand the characteristics of the various kinds of databases and be able to write SQL fluently. Even if big data were not popular, database technology would still be very important.
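To make the MapReduce idea from point 2 more concrete, here is a minimal single-machine sketch of a word count in Python. It only illustrates the map, shuffle, and reduce steps. It is not Hadoop's API, which is Java and runs across a cluster. The function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical, and each string in `chunks` merely stands in for a block of data that a real cluster would store on a different machine.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map step: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle_phase(mapped_pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: combine the values of each key into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(chunks):
    """Run map over every chunk, then shuffle and reduce the combined output."""
    mapped = []
    for chunk in chunks:  # on a cluster, each chunk would be mapped on its own node
        mapped.extend(map_phase(chunk))
    return reduce_phase(shuffle_phase(mapped))


if __name__ == "__main__":
    chunks = ["big data is big", "data mining needs data"]
    print(word_count(chunks))
```

Because the map step looks at one chunk at a time and the reduce step looks at one key at a time, both can in principle be spread over many machines, which is the property a framework like Hadoop exploits.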

I think that if you cover all three of these points, you will be a well-rounded big data professional and should have no trouble finding a good job. However, the methods and skills of data mining are only one side of it. The other side is awareness: how keen a sense you have for industry and business, and whether, after mining the information, you can think it through and turn it into an insight that directly benefits the company or even society. I therefore recommend paying regular attention to development trends in the Internet and other industries. Well-rounded people are the real talents, and a big data professional can be much more than an ordinary programmer.

Once I have prepared, I will write some blog posts on big data. I am very happy to share what I know with you.
