Common misconceptions about big data

Source: Internet
Author: User
Keywords: big data, misunderstood

I often hear entrepreneurs say that their company produces or records a great deal of data every day and that, although they have not yet figured out how to use it, they are saving all of it. They often claim that this data will greatly improve their products and services, as if data were the company's savior. I won't argue here about whether that is right or wrong; instead, I want to explain two common misconceptions about big data:

First, data is not equal to information

People often use "data" and "information" as synonyms. In fact, data refers to raw data points (numbers, text, pictures, video, and so on), whereas information is tied to content: it has to be informative. More data does not necessarily mean more information, let alone a proportional increase in information. Let's look at two simple examples:

Backups. Many people now back up their hard drives regularly. This needs little explanation: each backup creates a new set of data, but the information does not increase.

Information on multiple social networking sites. Many of us are active on several social networking sites. The more sites we use, the more data we get, and the more information too, but not proportionately. This is not only because we forward friends' tweets (or content from other social networking sites) to each other, but also because many posts are very similar in content even when the wording differs.
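The two examples above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the backup contents and the two posts are made-up sample data, hashing stands in for "is this exactly the same content", and word-level Jaccard similarity is just one simple, assumed way to measure how alike two differently worded posts are.

```python
import hashlib


def unique_payloads(blobs):
    """Return the set of distinct SHA-256 digests among the given byte strings."""
    return {hashlib.sha256(b).hexdigest() for b in blobs}


def jaccard(a, b):
    """Word-level Jaccard similarity between two strings (0.0 to 1.0)."""
    words_a = set(a.lower().split())
    words_b = set(b.lower().split())
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


# Hypothetical data: three backups of a file that never changed.
snapshot = b"customer_id,order_total\n1,20\n2,35\n"
backups = [snapshot, snapshot, snapshot]

total_bytes = sum(len(b) for b in backups)  # the data grows with every backup
distinct = unique_payloads(backups)         # the distinct content does not

print("bytes stored:", total_bytes)
print("distinct contents:", len(distinct))

# Hypothetical posts: different wording, nearly the same content.
post_a = "big data is not the same as information"
post_b = "big data is not the same thing as information"
print("similarity:", round(jaccard(post_a, post_b), 2))
```

The stored bytes triple while the number of distinct contents stays at one, and the two posts score close to 1.0 even though they are not identical strings. Real deduplication would need something more robust than word overlap, but the gap between data volume and information is the same.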

Second, information is not equal to wisdom (insight)

Suppose we have now removed all the duplicated parts of the data and consolidated identical items, so that what remains is all information. Will it be useful to us? Not necessarily. To be converted into wisdom, information should meet at least three criteria:

Decipherable. This may be a problem specific to the big data era: more and more companies produce a lot of data every day without having figured out how to use it, so they store it in unstructured form for the time being. Unstructured data cannot necessarily be deciphered. For example, suppose you recorded three time intervals for a customer on your site (3 seconds, 2 seconds, 17 seconds) but forgot to note what these three intervals represent. The data is information (it is not repetitive), but it cannot be deciphered, so it cannot become wisdom.

Relevance. The importance of relevance needs little explanation: information that is not relevant to the question at hand is nothing more than noise.

Novelty. This is similar to the social networking example above, but novelty often cannot be judged from the data and information we already have in hand. For example, an e-commerce company analyzes one set of data/information and finds that customers are willing to pay 10 yuan more for same-day delivery. If it then reaches the same conclusion from another, completely independent set of data/information, the latter is not novel. Unfortunately, most of the time we can judge novelty only after processing a large amount of data and information.
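As a rough illustration, the three criteria above can be written as a filter. Everything here is an assumption made for the sketch: the `Finding` class, its fields, the topic labels, and the sample findings are hypothetical, and in practice each check (especially novelty) takes real analysis rather than a simple lookup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    statement: str       # what the piece of information says
    meaning_known: bool  # do we know what the underlying values represent?
    topics: frozenset    # subjects the finding relates to (hypothetical labels)


def passes_criteria(finding, question_topics, known_statements):
    """Check the three criteria: decipherable, relevant, novel."""
    decipherable = finding.meaning_known
    relevant = bool(finding.topics & question_topics)
    novel = finding.statement not in known_statements
    return decipherable and relevant and novel


# Hypothetical findings, loosely based on the examples in the text.
intervals = Finding("intervals of 3, 2 and 17 seconds", False,
                    frozenset({"pricing"}))
delivery = Finding("customers pay 10 yuan more for same-day delivery", True,
                   frozenset({"pricing", "delivery"}))
unrelated = Finding("the office moved to a new floor", True,
                    frozenset({"facilities"}))

question_topics = frozenset({"pricing"})
known = set()

first_pass = [f for f in (intervals, delivery, unrelated)
              if passes_criteria(f, question_topics, known)]
print([f.statement for f in first_pass])

# The same conclusion arrives again from an independent data set:
known.add(delivery.statement)
second_pass = [f for f in (delivery,)
               if passes_criteria(f, question_topics, known)]
print([f.statement for f in second_pass])
```

Only the delivery finding survives the first pass: the unlabeled intervals fail on decipherability and the office move fails on relevance. Once that conclusion is known, the same finding from a second data set is filtered out as not novel.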

The point of all this is that we actually have far less useful data than we think; big data itself is a gimmick. Nowadays an average start-up can produce more than 1 GB of data a day, and a slightly larger company produces data on the terabyte scale every day. But before spending money on big data analysis, we need to be aware that data does not equal information or wisdom.

(Responsible editor: Schpeppen)
