We are entering the era of "big data". According to IDC's statistics, the total amount of global data reached 1.8 ZB in 2011 (1 ZB is roughly the information that would be stored if each of China's 1.3 billion people had a computer with 1,000 GB of capacity), and this figure is still doubling every two years. At that rate, the world is expected to hold 35 ZB of data by 2020, nearly 20 times as much. The development of big data has benefited from the following: (1) Moore's Law. Moore's Law holds that microprocessor performance doubles every 18 months, or equivalently that the price falls by half. In essence, hardware performance rises rapidly while hardware costs fall, which provides the hardware and cost basis for storing big data. (2) Cloud computing and pervasive computing. Cloud computing improves the efficiency of big data processing and effectively reduces processing costs. (3) Social networks. The development of social networks has greatly extended the dimensions and volume of data.
The Internet and big data are twins born of the development of information technology. The Internet can easily and accurately record user-related data, leading the world into an era of data explosion. Its growth has greatly accelerated the accumulation of data and produced large volumes of semi-structured and unstructured data, which in turn has driven the development of big data analysis technology. At the same time, the Internet is one of the fields where data analysis is most widely applied: search engines are built on big data analysis algorithms, and precision marketing to Internet users likewise relies on big data analysis.
Data is an important asset for a website. It can be said that whoever better masters and uses data will hold the future of the Internet. As a result, some large Internet companies have invested heavily in data, both in people and in resources. Alibaba, for example, has raised its data strategy to stand alongside its platform strategy (Tmall, Taobao) and its financial strategy (Alipay). As Internet-related data accumulates, analyzing and mining it correctly and effectively has become a major challenge for Internet companies. At the same time, applying data touches on user privacy, and how to use such data properly and reasonably is a question that still needs to be clarified. These problems will be discussed in more detail later.
Zhou Hongmei, Renmin University of China, Ph.D. in electronic commerce and Master of Statistics, has worked at CNNIC and in consulting, has served as a consulting vice president and chief statistician, and has studied e-commerce in depth.