"Data has penetrated every industry and business function today, becoming an important factor of production." The mining and application of massive data heralds a new wave of productivity growth and consumer surplus. McKinsey, the world-renowned consulting firm, was the first to announce the arrival of the big data age.
Indeed, this is the age of big data. On Wall Street, in the south of Manhattan in New York, investment banks trade stocks based on public sentiment; hedge funds gauge companies' product sales from customer reviews on shopping sites; and during the recently concluded World Cup in Brazil, Google's cloud computing platform analyzed a large volume of team data and successfully predicted the winner of every match in the round of 16.
Big data is another major disruptive technological revolution in the IT industry, following cloud computing and the Internet of Things. The cloud provides the place and the channel for storing and accessing data assets, while the data itself is the truly valuable asset. Whether it is transaction information inside the enterprise or commodity logistics information in the Internet world, its volume and real-time demands will far exceed the carrying capacity of existing enterprise IT architecture and infrastructure.
For a long time, big data remained at the conceptual level; the development of the Internet and mobile devices has made the concept a reality. What does big data mean? How can these data assets be activated to promote business innovation and profit growth? These are the core questions of big data.
The value of big data
Taken literally, big data must first of all be "big" enough.
A set of figures titled "A Day on the Internet" tells us that in a single day the entire content of the Internet could fill 168 million DVDs, and that 294 billion emails are sent (equivalent to two years of paper letters in the United States). The number of phones sold is 378,000, more than the 371,000 babies born in the world each day. As of 2012, the volume of data has grown rapidly, jumping from the TB level (1024 GB = 1 TB) to the PB (1024 TB = 1 PB), EB (1024 PB = 1 EB) and even ZB (1024 EB = 1 ZB) levels. IBM's research says that 90% of all the data human civilization has obtained was generated in the past two years. By 2020, the world will produce 44 times as much data as today. No wonder Amazon's former chief scientist Andreas Weigend said, "Data is the new oil."
Gege, chairman of No. 1 Store, said in a media interview that the value of big data is realized in four stages. Raw data is primitive and fragmented, and no pattern is visible on the surface; once filtered and organized, it becomes information; relevant information, integrated and effectively presented, becomes knowledge; and a deep understanding of knowledge that grasps the essence of things and can be applied by analogy becomes wisdom. Data is therefore the source, the cornerstone of decision making and value creation.
For different industries, big data has its own significance and value.
In the Internet industry, big data refers to the user online-behavior data that Internet companies generate and accumulate in their day-to-day operations. Platform companies, represented by giants such as Baidu, Alibaba and Tencent, have brought together large numbers of users and businesses and formed dynamic ecosystems. Their big data applications are no longer limited to the enterprise itself but are gradually becoming the blood that nourishes the whole ecosystem.
Cheping, chairman of Alibaba's Data Committee, has laid bare Alibaba's big data strategy. "In the data-operation phase, data has value and you use it consciously, but you do not pay much attention to it. Once you find that data has merged with strategy, you realize that you have to collect and manage it deliberately," he said. "If you compare Alibaba's big data to ingredients, then cooking the ingredients yourself and handing them to other chefs place different demands on the ingredients."
Gao Zhao, vice president of the mobile operations R&D center at Easy Media, said in an interview with a China News reporter that big data means "data that brings immediate purchases and returns", that is, data that directly prompts users to buy.
"There is no doubt that Internet companies attach great importance to data, but they differ slightly in the level at which they use big data. These enterprises are close to end consumers; they are themselves data producers and hold huge amounts of user consumption data. They also have the capacity to process and mine data," said Gao.
One side of the coin is Internet companies' skillful use of big data; the other is that, in the context of big data, traditional industries are being thoroughly disrupted by Internet companies.
Traditionally, financial systems are built on databases, and many financial business systems, such as BI and information analysis, have been built on top of them. According to the Coase theorem, however, direct financial transactions based on big data may eliminate the intermediary value of financial institutions. The assumption is that transaction costs fall sharply once the Internet fully connects financial markets and transactions take place directly between the supply and demand sides.
Some experts believe that big data can improve the operational efficiency of financial institutions and reduce costs. If the unstructured data of Internet banking and online insurance is moved onto a big data platform, the platform can give financial institutions comprehensive analysis and integration across both historical data and newly added data.
"The importance of big data to financial institutions is self-evident," said Deng Jianpeng, a law professor at the Central University of Nationalities and director of the Journal of Internet Finance. Using business and customer data can greatly help financial institutions find quality borrowers and identify risks.
Transforming big data
In the big data age, it is said that data will become a fundamental resource in the economy, like land, oil and capital. Data scientist is thought to be the hottest job of the next 10 years.
In fact, while you are still using social platforms such as Twitter to express feelings or discuss issues, Wall Street's money-making experts are mining the Internet's "data riches" and have profited handsomely by using them to anticipate market trends. Petabytes of data are being effectively transformed and used, and their value renewed.
Alibaba, with its vast data ecosystem, has stored more than 100 PB of processed data, equal to 104,857,600 GB, equivalent to 40,000 Seattle Central Libraries or 58 billion books. For Ali Finance, this database is its core asset.
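The conversion quoted above can be checked with a few lines of Python. This is only an illustrative sketch: the `convert` helper and the `UNITS` list are hypothetical names introduced here, and the code assumes the binary convention the article itself uses (each unit is 1024 times the previous one).

```python
# Units in ascending order; each is 1024 times the one before it,
# following the convention stated in the article (1024 GB = 1 TB, ...).
UNITS = ["GB", "TB", "PB", "EB", "ZB"]


def convert(value, src, dst):
    """Convert a data volume from unit `src` to unit `dst` (hypothetical helper)."""
    # Positive when moving to a smaller unit, negative when moving to a larger one.
    steps = UNITS.index(src) - UNITS.index(dst)
    return value * 1024 ** steps


# 100 PB expressed in GB: 100 * 1024 * 1024
print(convert(100, "PB", "GB"))  # 104857600
```

Moving two unit steps down (PB to TB to GB) multiplies by 1024 twice, which reproduces the 104,857,600 GB figure.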
Building on the massive enterprise data it collects, Ali gathers structured data such as platform certification and registration information, historical transactions and credit records, together with unstructured data such as user comments, and also brings in externally collected data such as electricity usage and bank credit, in order to decide accurately whether to lend and how much.
For Taobao sellers, Ali combines data such as monthly turnover, shipping address, mobile phone number, home address and gender as one of the dimensions of its credit evaluation. Through quantitative analysis of sellers, and with the help of data products such as "Amoy data", "Data Cube" and "Poly Stone Tower", Alibaba accurately converts and uses the platform's massive data.
Gao Zhao believes that companies' use of big data is essentially about data helping brands gain precise insight into consumers. Enterprises can derive new data from their original data in order to improve product performance and upgrade their products.
At the same time, the process of transforming big data is also the process of building a big data ecosystem. The ecosystem consists of data producers, data recorders, data processors and analysts, and even users.
"The upstream and downstream of the big data ecosystem form an industrial chain worth hundreds of billions," said Gao. Intelligent lighting systems that reach every corner of a city can act as collectors of the city's big data, just as a smart wristband can provide its user with accurate health data.
The vast amount of data, while creating an Internet ecosystem, is also blurring the boundary between the Internet and finance; financial innovation born of data mining is profoundly changing how traditional financial institutions operate. For finance, a data-intensive industry, how to make decisions by mining and analyzing data has become an important question.
Deng Jianpeng said that banks hold only narrow customer data and lack much else, such as a user's monthly water, electricity and gas bills, train and air travel, and online shopping footprint. If banks can broaden the scope of their data and develop customer data in every dimension, it will help them find more quality borrowers and better identify risks.
In fact, with the arrival of the big data age, banks have also begun to act. Banks such as Minsheng, CITIC and Everbright have launched supply chain finance services, moving from "offline manual processing" to "online integration of multiple systems". The specific approach is to integrate and connect the various processes online and to build a channel linking commerce, financial services and logistics services, so that financing is available online; at the same time, the fragmented information of banks, core enterprises, upstream and downstream enterprises, and logistics partners is integrated and shared, so that supply chain management and services are clearly visible.
Banks have also begun to venture into Internet platforms: China Construction Bank's Shanrong Commerce and ICBC's e-commerce platform are both aimed at big data.
Like Internet enterprises, traditional financial institutions will eventually form their own big data ecosystems. Deng Jianpeng said that traditional financial institutions hold data of their own, but relying entirely on their own systems to mine and transform it is very costly, so working with the Internet giants is a good approach. Banks can use Internet companies' data to offer a variety of services and ultimately build a win-win ecosystem.
Points of "attention" for big data
The more data is collected, the more variables there are, and the more "noise" comes with it. In the ocean of big data, a considerable portion is useless. Some data is temporarily useless to an enterprise; some will never be useful. Big data is a mix of good and bad, so how can the value of data be better distinguished?
Lin, CEO of eBay Greater China, said that data which now appears useless may become digestible as technology progresses over the next two to three days, so for now it can only be stored.
Baoliming, CTO of big data vendor Teradata, said that seemingly useless data should not be ignored, because it also contains value; more precisely, it is data of low value density. The enterprise simply has not yet found how that value shows up, so the data can be kept on low-cost storage servers. For example, people make habitual spelling mistakes when using a search engine. Such erroneous data looks meaningless, but collecting it in bulk reveals users' habits and patterns.
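The search-engine example can be illustrated with a small Python sketch. It is not a description of how Teradata or any search engine actually works: the vocabulary, the sample query log and the `misspelling_patterns` function are all hypothetical, and fuzzy matching with the standard library's `difflib` is simply one convenient way to pair a misspelled query with its likely intended word.

```python
from collections import Counter
from difflib import get_close_matches

# Hypothetical list of correctly spelled search terms.
VOCABULARY = ["weather", "restaurant", "calendar"]


def misspelling_patterns(queries, vocabulary=VOCABULARY, cutoff=0.8):
    """Count how often each (misspelling, intended word) pair occurs in a query log."""
    known = set(vocabulary)
    patterns = Counter()
    for query in queries:
        word = query.strip().lower()
        if not word or word in known:
            continue  # correctly spelled or empty queries carry no misspelling signal
        # Closest vocabulary entry whose similarity ratio is at least `cutoff`, if any.
        match = get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
        if match:
            patterns[(word, match[0])] += 1
    return patterns


# Hypothetical query log; "xyz" matches nothing and is ignored.
log = ["wether", "weather", "restaraunt", "wether",
       "calender", "restaraunt", "wether", "xyz"]

for (typo, intended), count in misspelling_patterns(log).most_common():
    print(f"{typo} -> {intended}: {count}")
```

Each individual misspelling is worthless on its own, but the aggregated counts show which errors recur, which is the kind of habit and pattern the paragraph describes.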
"There is no universal rule for big data. Every enterprise needs to develop its own recipe to help it digest data better," said Gao.
Gao Zhao said that every business needs different data. For example, the automotive industry needs a small amount of data, but each piece of data is highly valuable, and the data grows more valuable over time. By contrast, buyers of fast-moving consumer goods tend to purchase repeatedly, so the big data systems of the fast-moving consumer goods industry are at the level of millions. Because industries differ, the cycle and dimensions of data mining differ as well.
Some experts point out that although data is real, it can be biased; different analysis methods yield different interpretations, so data may not be completely objective. Correct information can be obtained only by handling data carefully and in the right way.
If useless data brings noise, then security is the sword of Damocles hanging over big data.
Because big data can be used to predict people's states and behavior, big data that is not handled properly can do great harm to user privacy. Social network research shows that user attributes can be inferred from group characteristics. For example, analyzing a user's Twitter posts can reveal the user's consumption habits and favorite teams.
Some experts say that user privacy should be protected, for example through data encryption, so that only those who need to know can understand, reach or access the data.
Privacy intrusion affects not only individual users but also enterprises. The Heartbleed vulnerability, the Ctrip incident and similar events show that hackers use big data analysis to launch more precise attacks on enterprises. Gao Zhao said that at the big data level, the most important thing for an enterprise is to build its own data platform, which only insiders with the appropriate permissions can access and use.
Some experts say that reliable data storage, secure mining and analysis, and strict operation and supervision are the safeguards of enterprise security in the big data age, and that coordination across the security industry chain has become an inevitable trend. Information security needs unified, coordinated control by government departments, and enterprises in the industry chain should open up their security data and technical capabilities.
"Big data security is an eternal topic, and it is important to reduce security risks through technical means," Deng Jianpeng said.
Today, the information storm brought by big data is changing our lives, our work and our thinking, and has set off a major transformation of our time. Viktor Mayer-Schönberger, hailed as "the first person in big data business applications", points out that the biggest shift of the big data age is to give up the pursuit of causation and focus instead on correlation.
Indeed, big data has brought unprecedented quantifiable dimensions to our lives and has become a source of new inventions and services, and more changes are on the way.