My Opinion of Big Data

Source: Internet
Author: User

Big data is a buzzword right now, so I am joining in the fun. These days people seem almost embarrassed to say "data" without putting "big" in front of it, and size really is one characteristic of big data. Data has always existed, so why has it been getting bigger lately? Advances in information technology, hardware, and networking have made it easy to acquire, store, and process massive amounts of data, so the data has grown. But "big" is only one feature of today's data, and tools such as MapReduce, Hadoop, and Spark exist to cope with it. People who cannot say a word without mentioning Hadoop and similar tools do not really understand data analysis. After all, we did data analysis before big data too; back then we used sampling.
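To make the point about sampling concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the dataset is synthetic, and the helper name `sample_mean` and the sizes are made up for this example. It estimates the mean of a large list from a simple random sample instead of scanning every record.

```python
import random
import statistics

def sample_mean(values, sample_size, seed=0):
    """Estimate the mean of `values` from a simple random sample (hypothetical helper)."""
    rng = random.Random(seed)
    sample = rng.sample(values, sample_size)  # sampling without replacement
    return statistics.fmean(sample)

# Synthetic stand-in for a "large" dataset: 200,000 made-up order amounts.
data_rng = random.Random(42)
population = [data_rng.uniform(0, 100) for _ in range(200_000)]

full_mean = statistics.fmean(population)     # uses every record
estimate = sample_mean(population, 10_000)   # uses 5% of the records

print(f"full mean: {full_mean:.2f}, sample estimate: {estimate:.2f}")
```

With a sample of 10,000 the estimate should land close to the full mean, which is why sampling was a workable approach before tools for processing all the data became common.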


Data analysis requires three kinds of knowledge: IT skills, mathematics, and domain knowledge. IT skills include the new tools mentioned above as well as older tools such as databases and SQL. I do not think Hadoop, MapReduce, and the like are the most critical skill. Mathematical knowledge includes probability theory and mathematical statistics, linear algebra, and other branches of mathematics. I think this is relatively more important: a data scientist can get by without Hadoop, MapReduce, and similar tools, but must know the mathematics. In data analysis the data itself is not the most important thing; the question we want the data to answer matters more. Domain knowledge is what lets us raise these questions. Analyzing product data in e-commerce, proteins and genes in bioinformatics, or behavior in behavioral economics each requires different domain knowledge. So a big data analytics team needs members who cover all three competencies.


Data analysis covers description (descriptive statistics), inference (statistical inference), application, and several other aspects. Description is relatively simple; inference, prediction, and application are the hard parts. So when someone claims to be a big data expert, it depends on which of these levels they actually work at.
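The gap between description and inference can be sketched in a few lines of Python. This is an illustration on synthetic data, not a recipe: the function names `describe` and `mean_confidence_interval` are hypothetical, and the interval uses a normal approximation with z = 1.96, which is an assumption that is reasonable only for fairly large samples.

```python
import math
import random
import statistics

def describe(sample):
    """Descriptive statistics: summarize the sample we actually have."""
    return {
        "n": len(sample),
        "mean": statistics.fmean(sample),
        "stdev": statistics.stdev(sample),
    }

def mean_confidence_interval(sample, z=1.96):
    """Inference: approximate 95% interval for the unseen population mean."""
    mean = statistics.fmean(sample)
    half_width = z * statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - half_width, mean + half_width

# Synthetic sample: 400 draws from a normal distribution (made-up parameters).
rng = random.Random(7)
sample = [rng.gauss(50, 10) for _ in range(400)]

summary = describe(sample)
low, high = mean_confidence_interval(sample)
print(summary)
print(f"approx. 95% CI for the population mean: ({low:.2f}, {high:.2f})")
```

The description only restates what is in the sample. The inference step makes a claim about data we have not seen, and it is only as good as its assumptions, which is part of why it is the harder level.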

I think the following three sentences are useful for people who are working on data analysis.

(1) Correlation does not imply causation.

(2) Insight is more important than tools.

(3) The question is more important than the data.
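The first sentence can be illustrated with a small simulation. The sketch below is a toy example on synthetic data: the variables `x`, `y`, and the hidden factor `z` are invented for illustration, and `pearson` is a hand-written helper. Both `x` and `y` depend on `z` but not on each other, yet they come out strongly correlated.

```python
import math
import random
from statistics import fmean

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = fmean(xs), fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

rng = random.Random(1)
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]        # hidden common cause
x = [zi + rng.gauss(0, 0.5) for zi in z]       # x depends only on z
y = [zi + rng.gauss(0, 0.5) for zi in z]       # y depends only on z

raw = pearson(x, y)

# Remove z's contribution. This works here only because we built the data
# and know exactly how z enters x and y; real data needs a fitted model.
x_adj = [xi - zi for xi, zi in zip(x, z)]
y_adj = [yi - zi for yi, zi in zip(y, z)]
adjusted = pearson(x_adj, y_adj)

print(f"raw correlation: {raw:.2f}, after removing z: {adjusted:.2f}")
```

The raw correlation is high even though neither variable influences the other; once the common factor is removed, almost nothing is left. Spotting which hidden factor to look for is where insight and domain knowledge come in, which ties back to sentences (2) and (3).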


I read a few articles from the public accounts CSDN big data, data guests, and intimacy (qinmishu.org), and took an introductory open course on data science from Johns Hopkins University. The views above are what I summed up from them as a layman in data analysis, sketching a big-picture map for myself instead of getting tangled up in a specific tool from the start.





