As big data technology matures, big data is being used ever more widely in commerce, and cases of big data being shared, integrated, exchanged, and traded are increasing. This article discusses some problems of big data transactions and the need to build a big data exchange. We believe that establishing a big data exchange is an urgent market demand.
Currently, the following kinds of companies and institutions typically hold big data:
Large brick-and-mortar retailers or e-commerce companies, such as the chain stores Walmart and Sears, or Amazon and Alibaba. Most of these companies have a large customer base, long-term purchase records, and customer payment histories. They are most interested in customers' shopping preferences and consumption habits. Their current big data applications include recommending related products and launching new products and services.
Large service companies, such as banks and telecommunications providers. Such companies also hold historical records of some aspect of their customers' consumption: a bank may have a customer's account income and expenditure information, and a telecom company may have a customer's phone or network usage history. These companies are generally interested in introducing new products and services in their industry, in finding potential customers, and in reducing business risk, for example through recommendation systems.
Large manufacturing enterprises, such as the Ford Motor Company. Because of their large customer bases, such companies can often apply big data technology when launching new products and services.
Large internet service companies, such as Google, Baidu, and Yahoo. Because of their dominant positions in their service sectors, such companies accumulate massive amounts of information on user behavior in the virtual online world. They can mine this information for many valuable applications, products, and services through induction and machine learning. At present, the company that makes the best use of big data is Google: its advertising system AdSense is built on big data technology. Google can also use big data to make predictions, for example about influenza outbreaks or political events. It has further launched big data applications such as self-driving cars, and Google Glass, which combines big data collection with applications.
Large social networking sites such as Facebook and Twitter, professional networks such as LinkedIn, and other active forums. Users around the world generate a great deal of content on social networking sites every day. Facebook alone has to process more than 500 TB of social information each day. This data is currently used by a large number of individual developers and technology companies to build business service recommendations and new products.
Public data from government departments and scientific research institutions, such as weather, traffic, roads, geology, the environment, and progress in scientific research. In particular, the U.S. federal government has proposed opening up federal data to the public, including data relevant to autonomous driving and intelligent traffic monitoring systems.
In addition to the big data held by these commercial institutions, national agencies hold a wealth of sensitive information concerning national security. This article discusses commercial applications only, so it does not cover the use or exchange of that kind of big data.
A data expert who worked at Teradata has said that many commercial companies store only about 15% of their data themselves, while the remaining 85% is stored by external companies or on external websites. In the big data era, technology makes integrating a company's internal big data with external data all the more important.
At present, some commercial organizations no longer limit their big data applications to analyzing the data they own; they also need big data from other sources.
Example 1: Some financial firms, such as banks, want to use their customers' social information and integrate it with the information the firm already holds, in order to launch more new products and offer a better customer experience.
Example 2: A customer of a medical insurance company travels to a foreign city and posts this on Weibo. The insurer, having obtained the customer's permission to access this information from the social media platform (Weibo), takes the customer's particular physical condition into account and immediately sends a message advising the customer to avoid certain local foods.
Example 3: A company that operates a hotel chain, in addition to the room occupancy and other figures it collects from its own websites and locations, hopes to obtain other major tourism data, such as visitor numbers at tourist attractions, customer numbers and rental volumes at car rental companies, and how these levels change. Such data bears strongly on hotel pricing, business expectations, and so on.
Example 4: A start-up combines public information about urban traffic (government information) with real-time traffic conditions uploaded by its user group (user-generated or social information produced by connected devices). From this it predicts traffic routes and estimated arrival times, so as to provide better service for drivers in the city.
Integrating and interacting with external big data is the future trend for commercial companies. Many companies abroad have already begun offering technology and services in this area, such as Alteryx, QlikView, Tableau, and Factual.
For big data published by governments or scientific research departments, commercial companies can integrate and analyze items such as population surveys, GDP statistics, and real estate information (which is public in the United States). Many big data technology companies have also been investing in this area, such as Factual, Infochimps, and Socrata.
According to Gartner, by 2017 about two-thirds of big data integration projects will involve external data from beyond the corporate firewall.
There are at least the following modes of big data interaction for commercial companies:
Mode one: Two or more commercial companies that work in different service industries and hold different aspects of customer information, but whose industries are strongly related, integrate and exchange information to create new value for one or all parties.
Mode two: A commercial company integrates personal information about its customers from social networking sites, expecting to find new sources of business growth or to deliver better customer service.
Mode three: A commercial company integrates and interacts with public information from government departments at big data scale, giving rise to new business models and new lines of business, or improving customer service.
Mode four: In the future, new kinds of external data integration will create value. For example, a commercial company may integrate a large amount of weakly related external data; once the total reaches a certain scale, it can still yield information of great value to the company's own business.
In almost all of these kinds of big data interaction between commercial companies, integrating data between two companies helps only one side's business, or the value it brings to the two sides is unequal, as with social media information used by mass-merchandising companies. Purchasing big data is therefore much more likely than a simple swap or exchange of data. How to guide and standardize big data transactions, and how to provide trading methods and tools, has become an important issue for the relevant authorities and big data technology companies to study.
We believe that a number of problems must be addressed before big data transactions can be carried out, such as:
How to guide more enterprises to open up their big data? Big data applications need more enterprises to open up data from their own industries and fields: the more market participants there are, the greater the choice in the market and the more value can be discovered. Our government should encourage more companies to open up their big data. Big data reaches its greatest value when enterprises interact and trade with each other more.
How to protect rights and privacy in big data? Big data is often an aggregation of personal information. Our country has clear regulations and guidance on protecting the privacy of personal information, and large enterprises pay particular attention to protecting their users' privacy. Big data traded between enterprises must comply with national laws and protect personal privacy and important information. The big data offered on the market should therefore be processed further, either by hiding sensitive personal information or by providing only information aggregated by categories such as region, age group, and income. The relevant authorities may formulate regulations on big data transactions and guide market participants to give special protection and handling to national security information, personal privacy, and trade secrets when they provide big data.
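As a rough illustration of the processing described above, the following sketch drops direct identifiers and releases only figures aggregated by region, age band, and income band. The field names, the band widths, and the `min_group_size` threshold (which suppresses groups too small to hide an individual) are assumptions made for this example; they are not requirements drawn from any specific regulation.

```python
from collections import defaultdict

# Hypothetical raw customer records; a real seller's schema will differ.
RECORDS = [
    {"name": "A. Chen", "phone": "555-0101", "region": "North", "age": 34, "income": 52000, "spend": 120.0},
    {"name": "B. Wang", "phone": "555-0102", "region": "North", "age": 38, "income": 60000, "spend": 80.0},
    {"name": "C. Li", "phone": "555-0103", "region": "South", "age": 45, "income": 30000, "spend": 50.0},
    {"name": "D. Zhao", "phone": "555-0104", "region": "North", "age": 31, "income": 70000, "spend": 100.0},
]


def age_band(age):
    """Map an exact age to a 10-year band, e.g. 34 -> '30-39'."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"


def income_band(income, width=25000):
    """Map an exact income to a band of the given width."""
    low = (income // width) * width
    return f"{low}-{low + width - 1}"


def aggregate(records, min_group_size=2):
    """Return per-group statistics only; names and phone numbers never leave this function."""
    groups = defaultdict(list)
    for r in records:
        key = (r["region"], age_band(r["age"]), income_band(r["income"]))
        groups[key].append(r["spend"])

    result = []
    for (region, age, income), spends in sorted(groups.items()):
        if len(spends) < min_group_size:
            continue  # assumption: suppress groups too small to hide an individual
        result.append({
            "region": region,
            "age_band": age,
            "income_band": income,
            "customers": len(spends),
            "avg_spend": round(sum(spends) / len(spends), 2),
        })
    return result


released = aggregate(RECORDS)
```

With the sample records, only the North / 30-39 / 50000-74999 group is released; the single South customer is suppressed because the group is smaller than the threshold.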
How to open up government information more effectively? Governments are releasing more public information, setting up open big data platforms, and making better use of big data to serve society and create value for it. Our government has also had some bright spots in this regard.
How to find valuable external data? Commercial companies are interested only in external data related to their own business, so finding strongly or weakly related external data is an important problem. Big data start-ups can contribute by providing tools and building open APIs. Cloud computing platforms can also provide APIs for big data. We believe that industry players such as the government or big data technology companies should create basic tools for data processing, classification, and analysis, to serve and assist commercial companies that are looking to integrate and apply external big data.
How to measure the quantity and quality of big data? In general, the more data a big data package contains on a given subject, the longer the period it spans, and the broader the population or service area it covers, the higher its value. But the same big data may have different value for different potential buyers. For example, the customer purchase records of an e-commerce site differ enormously in value for a large general retailer and for a small company that sells a single product. How to grade the quantity and quality of big data products is a problem that big data trading must solve.
How to regulate the repeated use of big data products? A big data package may be valuable to several different external businesses, sometimes without any conflict of interest among them. In theory, one big data product may be sold several times. Does a big data transaction transfer the right of use? Can the product be sold again, and can the buyer resell it? Can it be sold to a buyer's competitor? Questions like these should be made clear and stipulated.
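One way to make such questions concrete is to record the agreed terms of each sale and check them before any later sale. The sketch below is only an illustration: the `Sale` record, its fields, and the two rules (industry exclusivity and a buyer's right to resell) are hypothetical simplifications, not terms taken from any actual exchange.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sale:
    """Hypothetical record of the terms agreed in one sale of a data product."""
    buyer: str
    buyer_industry: str
    may_resell: bool              # may this buyer sell the data on?
    exclusive_in_industry: bool   # did the seller promise not to sell to this buyer's competitors?


def can_sell_to(history, new_buyer_industry):
    """The seller may sell again unless an earlier sale granted exclusivity in that industry."""
    return not any(
        s.exclusive_in_industry and s.buyer_industry == new_buyer_industry
        for s in history
    )


def can_resell(history, buyer):
    """A buyer may resell only if at least one of its purchases granted that right."""
    return any(s.buyer == buyer and s.may_resell for s in history)


# Example history for one data product (company names are made up).
history = [
    Sale("HotelCo", "hospitality", may_resell=False, exclusive_in_industry=True),
    Sale("BankCo", "finance", may_resell=True, exclusive_in_industry=False),
]

print(can_sell_to(history, "retail"))       # no exclusivity applies to retail
print(can_sell_to(history, "hospitality"))  # blocked by HotelCo's exclusivity
print(can_resell(history, "BankCo"))
```

Keeping the terms as data rather than in free-text contracts would let a trading platform check each new sale automatically, under whatever rules the parties and regulators actually agree on.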
How to build a technology platform for interacting with big data products, with open and unified APIs? Because of the sheer volume of the data, the variety of its formats, and many other characteristics, transferring big data directly is usually very difficult or unrealistic, so buyers often need to access big data products through an API. Establishing a unified API and building a technology platform for big data interaction are both great challenges.
In addition, big data products are closer to raw commodities: market participants each bring their own goods to a market to trade, much like stalls in a marketplace. Because of these characteristics, a standardized and convenient trading venue is all the more necessary.
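To make the idea of a unified API more tangible, here is a minimal sketch in which every data product exposes the same two operations, so that a buyer can inspect and query the data without receiving a full copy. The interface name `DataProduct`, its methods, and the in-memory implementation are assumptions for illustration only; a real platform would sit in front of the sellers' own storage systems.

```python
from abc import ABC, abstractmethod
from collections import Counter


class DataProduct(ABC):
    """Hypothetical uniform interface that every seller's product would implement."""

    @abstractmethod
    def describe(self):
        """Return metadata a buyer can inspect before purchase."""

    @abstractmethod
    def count_by(self, column, **filters):
        """Return counts per value of `column` among rows matching the filters."""


class InMemoryProduct(DataProduct):
    """Toy implementation backed by a list of dicts."""

    def __init__(self, name, rows):
        self._name = name
        self._rows = list(rows)

    def describe(self):
        fields = sorted({key for row in self._rows for key in row})
        return {"name": self._name, "rows": len(self._rows), "fields": fields}

    def count_by(self, column, **filters):
        matching = (
            row for row in self._rows
            if all(row.get(k) == v for k, v in filters.items())
        )
        return dict(Counter(row.get(column) for row in matching))


# Example: a made-up traffic data product queried through the common interface.
product = InMemoryProduct("city-traffic", [
    {"city": "A", "hour": 8, "status": "jam"},
    {"city": "A", "hour": 9, "status": "clear"},
    {"city": "B", "hour": 8, "status": "jam"},
])
summary = product.describe()
jams_by_city = product.count_by("city", status="jam")
```

Because buyers code against the abstract interface only, different sellers could back it with different storage without the buyer's integration changing.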
A transaction in big data products may include the following steps:
The seller preprocesses its big data to ensure that the data offered for trade complies with the relevant national laws and regulations;
The seller describes its big data package and its past trading history, including descriptions of the industries of previous buyers;
The buyer searches the big data trading platform for big data products that would help its business;
The buyer and seller reach an agreement on how the data will be used, how it will be transferred, whether it may be sold again (with limits on time, competitors, and so on), whether a third-party technology company will be commissioned to analyze the data, and similar matters;
The buyer pays the transaction amount and the big data product is transferred to the buyer;
The buyer analyzes or applies the big data product to realize its value.
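The steps above can be sketched as a simple state machine that only allows a transaction to move forward one step at a time. The stage names and the `Transaction` class are hypothetical labels chosen for this illustration; an actual trading platform would attach checks and records of its own to each step.

```python
from enum import Enum


class Stage(Enum):
    """The six steps listed above, in order (names are illustrative)."""
    PREPROCESSED = 1   # seller has made the data legally compliant
    LISTED = 2         # seller has described the package and its trading history
    MATCHED = 3        # buyer has found the product on the platform
    TERMS_AGREED = 4   # use, transfer, resale, and third-party analysis agreed
    SETTLED = 5        # payment made and data transferred
    IN_USE = 6         # buyer analyzes or applies the data


class Transaction:
    def __init__(self):
        self.stage = None   # nothing has happened yet
        self.log = []       # (stage name, note) pairs, in order

    def advance(self, stage, note=""):
        """Move to `stage` only if it is the next step in the sequence."""
        expected = 1 if self.stage is None else self.stage.value + 1
        if stage.value != expected:
            raise ValueError(f"cannot move to {stage.name}: expected step {expected}")
        self.stage = stage
        self.log.append((stage.name, note))


# Walk one transaction through every step in order.
deal = Transaction()
for step in Stage:
    deal.advance(step)
```

Trying to skip ahead, for example settling payment before terms are agreed, raises a `ValueError`, which reflects the point that compliance and agreement come before transfer.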
We believe that, because big data products are so distinctive, an established big data exchange can serve as an authoritative body that ensures the security of big data transactions while providing tools and assistance to market participants.
The main roles of a big data exchange:
A big data exchange can reinforce the application of national laws to big data products, in particular by ensuring that buyers and sellers comply with national laws on privacy, national security, trade secrets, and so on, thereby protecting consumers' information security and other rights and interests;
A big data exchange can guide the standardization of big data products, and guide how big data is quantified and priced;
A big data exchange should establish a certification system to ensure the authenticity and value of big data products;
A big data exchange should provide technical assistance to market participants and help them find suitable counterparties;
A big data exchange should and can provide legal protection for the transfer and use of big data;
A big data exchange should and can provide technical data-security support for the transfer and use of big data;
A big data exchange should ensure the transfer and security of funds;
A big data exchange can also offer big data futures, that is, trade in big data that will be generated in future periods.
Since market participants are mostly commercial companies, big data transactions resemble physical commodity trading more than stock trading. As trading develops and the number of market participants grows, the variety of big data products becomes richer, which attracts still more participants.
Participants in big data transactions include at least the following categories:
Original sellers: providers of big data on a particular subject, typically companies that have accumulated data while delivering services in their industry;
End buyers: commercial service companies that need information about a related industry and buy big data to upgrade their own services or products;
Big data investors: participants who discover or recognize the value of a big data product can buy it first and then sell it on to end buyers who need it;
Processors: because big data products are technically demanding, big data technology companies may first buy raw data, process and integrate it, and then sell it to end buyers.
A market participant may hold several trading identities at once, acting both as a big data provider and as a big data consumer. Transactions among the various participants make the big data exchange market more active, increase its liquidity, and attract more big data products to be listed and traded.
To sum up, although establishing a big data exchange raises a series of technical, legal, and procedural problems, we believe these can be solved step by step. We believe that establishing a big data exchange is both necessary and feasible, and that it is an urgent market demand.
(Responsible editor: The good of the Legacy)