With Alibaba's listing, Ma Yun (Jack Ma) became China's richest man, and the Cat's Eye forum has seen an endless stream of posts about Alibaba's big data. Most of them use sensational language, claiming that Alibaba's overseas listing leaks big data, endangers national security, and causes incalculable losses to China's economy.
Many Cat's Eye members have concluded from common sense and logic that Alibaba's "big data" does not affect national security. But because they know little about what big data actually is, they do not know where to start when commenting.
This article tries to describe big data in the simplest possible language, so that everyone can gain a basic understanding of it. It then gives a worked application: using big data statistics to analyze how the lawsuit filed by the netizen Sima 3 Ji affected Han Han.
Statistics is the use of statistical methods to analyze probabilities and trends.
Because traditional methods cannot record every individual in detail, much economic and social data can only be collected through sample surveys.
Take television ratings, for example. Stations could not obtain viewing data from every household, so they had to sample.
In the internet age, a network service provider does not need to run sample surveys. Instead it builds a huge database, records the behavior of every user, and uses those records as its data foundation. That is big data. Extracting, organizing, and analyzing that data in different ways is big data analysis.
The simplest example is any stock software, which is built on big data: every relevant figure for each stock since its listing is accurately recorded, with nothing omitted.
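The difference between a sample survey and recording every user can be shown with a small sketch. The household data below is made up for illustration (1 means a household is watching a programme, 0 means it is not), and the function names are hypothetical, not taken from any real ratings system.

```python
import random

def rating_from_sample(households, sample_size, seed=0):
    """Estimate the share of households watching, using a random sample only."""
    rng = random.Random(seed)
    sample = rng.sample(households, sample_size)
    return sum(sample) / sample_size

def rating_from_full_log(households):
    """Compute the exact share when every household's behavior is recorded."""
    return sum(households) / len(households)

# Hypothetical population: 2,000 households watching, 8,000 not watching.
households = [1] * 2_000 + [0] * 8_000

exact = rating_from_full_log(households)        # uses all 10,000 records
estimate = rating_from_sample(households, 500)  # uses only 500 records
print(f"exact={exact:.3f} estimate={estimate:.3f}")
```

The sampled figure is only an approximation that changes with the sample drawn, while the full record gives the exact figure. That is the practical difference between the old ratings survey and a provider that logs everything.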
Will Alibaba's big data affect national security?
Personally, I think not, and in any case big data cannot be hidden. The reasons are as follows:
1. Alibaba's big data consists of the purchase records for each product, which only indicate that product's sales trend.
2. Each company's big data is a core asset of that company. Even if a U.S. investor or the U.S. State Department wanted Alibaba's big data, it would need a court's approval, and even then the company has the right to refuse. The U.S. State Department has repeatedly asked Apple for user information on counter-terrorism grounds, and Apple has refused.
3. Big data cannot be hidden in the first place. Alibaba's sales figures, for example, are displayed plainly on every product page. With the most basic networking and programming skills, anyone can write software that, backed by a large cluster of servers, releases countless crawlers to extract and organize the information on every page and so obtain the big data.
For example, if you wanted Cat's Eye's big data, a 20 Mbps fiber broadband connection could mirror the entire forum's data in about two days.
4. If the Chinese government believes the United States has collected its big data, it can do the same in reverse: as described above, it can deploy state-level crawlers to extract data from U.S. sites such as Amazon and Facebook.
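The extraction step that point 3 describes can be sketched as follows. This is only an illustration under stated assumptions: the page markup, the `sales` class name, and the item names are all hypothetical, and the pages are hard-coded strings rather than fetched over the network. It uses only Python's standard `html.parser` module.

```python
from html.parser import HTMLParser

class SalesParser(HTMLParser):
    """Collect the number inside every <span class="sales"> element."""

    def __init__(self):
        super().__init__()
        self._in_sales = False
        self.sales = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == "span" and ("class", "sales") in attrs:
            self._in_sales = True

    def handle_data(self, data):
        text = data.strip()
        if self._in_sales and text.isdigit():
            self.sales.append(int(text))

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_sales = False

def extract_sales(markup):
    """Return all sales figures found in one page of HTML."""
    parser = SalesParser()
    parser.feed(markup)
    return parser.sales

# Hypothetical product pages; a real crawler would download these instead.
pages = {
    "item-1": '<div><h1>Kettle</h1><span class="sales">120</span></div>',
    "item-2": '<div><h1>Lamp</h1><span class="sales">45</span></div>',
}

sales_by_item = {item: extract_sales(markup)[0] for item, markup in pages.items()}
print(sales_by_item)  # {'item-1': 120, 'item-2': 45}
```

Run over every product page instead of two sample strings, the same loop would yield the sales table that the post calls big data. The crawling itself (downloading pages and following links) is left out here.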
Next, let's use big data to analyze how Sima 3 Ji's lawsuit against Han Han has affected Han Han.
As the red circle shows, on October 9 media coverage went from flat calm to a small spike. Are Han Han's detractors (the "Han Hei") feeling a little excited?
But the chart above only covers September 12 to October 11. Let's switch charts and look at the big data for the last six months.
This chart shows how much Sima 3 Ji's lawsuit against Han Han influenced the media.
Red circle 1 is the media coverage during the publicity period for Han Han's film "Hou Hui Wu Qi" (The Continent).
Red circle 3 is the media coverage of Sima 3 Ji's lawsuit.
Yet red circle 2 shows stronger media coverage than red circle 3.
Red circle 2 is the media coverage on September 11. What happened to Han Han that day?
A glance at the following picture explains it:
So Sima 3 Ji's lawsuit had less impact than Han Han's wife having a baby, haha!
The above is an introduction to big data and big data analysis, along with an example application.
Big data is a good thing. With a little casual data mining, we can push our understanding of things beyond the limits of our own field of view, or at least avoid appearing very ignorant.
For example, many of Han Han's detractors think Sima 3 Ji's lawsuit will deal Han Han a fatal blow.
But big data tells us that our perception is confined to Cat's Eye, whereas big data mines the entire internet, and most media on the internet care more about the "nation's father-in-law" and his wife having a baby.
Similarly, big data can help in making decisions. Let me try to demonstrate:
For example, many die-hard detractors imagine that public opinion about Han Han's alleged ghostwriting is overwhelming and a heavy blow to him. In fact, mining the data for "Han Han ghostwriting" and for "Han Han" shows that attention to "Han Han ghostwriting" (the blue line) is always a flat line hugging the x-axis. In other words, the volume of voices questioning Han Han over ghostwriting is basically constant, neither rising nor falling. It also shows that the tireless "exposés" from this group have neither decreased nor increased, while attention to Han Han himself rises and falls with his activities.
Take the point where attention to Han Han was lowest: the attention index for "Han Han" was 5720 and the index for "Han Han ghostwriting" was 132, a ratio of 2.308%.
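The ratio quoted above is a single division and can be checked in a couple of lines. The two index values come from the post; the function name is only illustrative.

```python
def attention_share(topic_index, baseline_index):
    """Return the topic's attention index as a fraction of the baseline index."""
    return topic_index / baseline_index

han_han_index = 5720       # lowest attention index for "Han Han"
ghostwriting_index = 132   # attention index for "Han Han ghostwriting"

ratio = attention_share(ghostwriting_index, han_han_index)
print(f"{ratio:.3%}")  # prints 2.308%
```

Because the comparison uses Han Han's lowest attention index, 2.308% is the largest share the ghostwriting topic reaches. On days when attention to Han Han is higher, the share is smaller still.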
If you were Han Han, or Han Han's management agency, you would reach the following conclusions:
1. Attention to the ghostwriting allegations is only about 2% of the attention paid to Han Han.
2. The people who question Han Han are resolute and persistent, but their doubts have not spread.
You would then make the following judgments and decisions:
1. You cannot change the people who question Han Han. They will keep insisting on the ghostwriting claim in the future.
2. That group will neither grow nor shrink.
3. There is no need to try to change them, because the cost of trying is out of proportion to the benefit.
4. The best approach is to ignore them, because they account for only about 2%, which is not many anti-fans for any star.
Original link: http://club.kdnet.net/dispbbs.asp?id=10423842&boardid=1