Three entry points into big data

Source: Internet
Author: User

The popularity of big data has made many people want to move in this direction, for example into data mining or data analysis. But where should you start? How can you quickly pick up practical knowledge and skills? I think there are three entry points, and you can choose the order based on your own strengths.

1. Machine learning. Data mining relies mainly on machine learning algorithms. In recent years machine learning has become popular thanks to the development of deep learning and to applications such as self-driving cars. However, it is a very deep discipline, and not many schools offer a dedicated course on it. Master's students who have studied optimization will find it easier; undergraduates who want to learn it well need a solid foundation in probability theory and mathematical statistics. I have read a lot of books on it and found the process painful. But is it really that important to master? In my opinion, unless you are pursuing a doctorate, you do not need to go very deep into machine learning or deep learning. Algorithms in general are important too, yet programmers do not need to drill them the way ACM contest competitors do. We learn machine learning in order to use it, and the main algorithms are already largely mature. What we most need to know is how to use them, and only a handful of algorithms matter in practice; I only really learned them by using them several times. So I strongly recommend learning by applying them to real problems. Following your own interests, find some data and see whether you can extract anything useful from it; this also gives you a sense of accomplishment. Here I recommend a book: Machine Learning: practical case analysis. I also recommend learning a new language, R. If you do not want to learn it, C or Python will also work. (R is not suited to ultra-large-scale data.) Finally, I do not think you have to start with this part, and you do not need to know every algorithm. Master one or two of them first.
2. Hadoop. In practice, Hadoop is basically synonymous with big data. It provides a platform for processing very large data sets, whatever the processing is and whatever we hope to get out of it. Although Hadoop is only a piece of software, the principles behind it are quite complicated. We need to know how it splits big data across several computers and how MapReduce works, and then how to work with it. I strongly recommend installing Hadoop yourself (configure a cluster, carving out the virtual machines yourself) and writing a small program on it for practice. Another feature of Hadoop is its many add-on services, each with its own function and each quite complicated. Hive and HBase are especially important; you need to know both how they work and how to use them. Because this part is mostly hands-on and not so dry to learn, I think it deserves more of your time: master the principles and the methods, and get familiar with the Linux environment along the way. The language, of course, is Java.
3. Databases. After all, big data is still data, and it cannot be separated from databases. Many people lack a database foundation, so this part is also indispensable. You need to understand the characteristics of the various kinds of databases and be able to write SQL fluently. Even if big data were not popular, database technology would still be very important.
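
To make the MapReduce idea from point 2 more concrete, here is a minimal single-process sketch in Python. It is not Hadoop's actual Java API; it only imitates the three stages on a word count: a map step that emits (word, 1) pairs for each chunk of input, a shuffle step that groups the pairs by word, and a reduce step that sums each group. The function names map_phase, shuffle, reduce_phase and word_count, and the sample input, are assumptions made for illustration. On a real cluster the chunks would be mapped on different machines.

```python
from collections import defaultdict


def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one chunk of input."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group all emitted values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(chunks):
    """Run the three stages in order over a list of text chunks."""
    pairs = []
    for chunk in chunks:  # hypothetical stand-in for per-machine map tasks
        pairs.extend(map_phase(chunk))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample_chunks = ["big data is data", "data needs a database"]
    print(word_count(sample_chunks))
```

Writing the same word count as a real Hadoop job on your own small cluster is a reasonable first practice program, since the three stages keep the same shape.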

I think that if you cover all three points above, you will be a well-rounded big data professional and should be able to find a good job easily. However, methods and skills are only one side of data mining. The other side is awareness: how keen your sense is for industry and business, and whether, after mining the information, you can think it through and turn it into a viewpoint that directly benefits the company or even society. So I recommend keeping a regular eye on development trends in the Internet and other industries. Well-rounded people are the real talent, and big data professionals are not just programmers.

Once I have prepared, I will write some blog posts on big data. I am very happy to share what I know with you along the way.
