In 2012, if any concept was hotter than cloud computing, it was big data. Big data applications are widespread, especially with the rise of the Internet, the mobile Internet, and various sensor networks, and they best reflect the character of the present era. How can an enterprise find value in data analysis to guide its day-to-day business decisions? Big data offers an answer. Recently, Cus Yu, director of the Taiwan Jingcheng Group Cloud Center and of the ETU brand, told reporters that big data is still at an early stage of development, that Asia as a whole lags behind the United States, and that even the United States has been talking about big data for only four or five years.
Dissecting big data
In China, talk about big data mostly stays at the level of data collection, storage, and processing, and real cases of data analysis and insight are still lacking. Yu said that for the big data market to enter a more robust growth period, it must cross over from a project market to a solution market. He is optimistic that this crossing could happen in 2014, or perhaps later. Across Asia, the number of projects that put real big data computing to work is still very small.
To dissect the value and prospects of big data, we first need a clear understanding of its definition and scope. At present, the structured data handled by enterprise management software accounts for only 15% of all the data in a traditional enterprise. The remaining 85% is semi-structured and unstructured data that comes from various information activities, e-commerce, the Internet of Things, or social networks outside the enterprise. Analyzing this data further requires a big data processing platform.
In general, three "V"s describe big data more precisely. The first V is volume: the data must be large beyond a certain threshold. The second V is velocity: only real-time processing can deliver the benefit of analyzing the latest data. The third V is variety: the data comes in multiple formats or structures. Together the three Vs form three axes, and each use case can be drawn as an ellipse across them. Different vendors may draw ellipses of different shapes, but all of them fall within the scope of big data analysis and processing.
However, Yu believes the industry currently holds several misunderstandings about big data. First, big data is not a storage technology: the data must be stored and processed at the same time. Second, big data originated on the Internet but is not only for the Internet; it has great application value in any industry. Third, big data is not just BI: traditional BI is good at structured data and poor at handling semi-structured and unstructured data. Without getting past these myths, it is easy to get lost in big data.
Hadoop All-in-one Machine
Because most industries and enterprises are still unclear about what they need from big data, it is difficult to popularize big data quickly by landing applications one project at a time. The ETU Appliance, an integrated big data machine, gives enterprises a single solution spanning software, hardware, and data analysis and processing, and its standardized product form can help drive big data adoption. For this reason, in the middle of this year the Jingcheng Group Cloud Center officially launched ETU-brand big data products in Asia, offering big data solutions for a range of application scenarios.
As a one-stop Hadoop product, the ETU Appliance is called an "appliance" because it is an all-in-one machine: neither purely hardware nor purely software, but a highly optimized combination of the two. Users do not need deep Hadoop expertise to deploy it quickly. Compute and storage are integrated in one unit, and 100 nodes can be deployed within 10 minutes. This greatly shortens the time an enterprise needs to put big data into use.
The smallest ETU Appliance cluster consists of one master node and two worker nodes. Data and tasks run on the worker nodes, while the master node schedules resource allocation across the whole cluster. When data volume grows beyond what the current architecture and capacity can handle, the enterprise simply adds worker nodes. The cluster can be expanded directly while it keeps running, without downtime, up to a scale of around 2,000 machines. Compared with the integrated big data products from Oracle and IBM on the market, ETU is more flexible and can be customized, whereas Oracle's products lack flexibility.
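The master/worker split described above can be illustrated with a small Python sketch. It is not ETU's or Hadoop's actual scheduler: the class names `Master` and `Worker`, the node names, and the least-loaded assignment policy are all assumptions made for illustration. Only the minimum topology (one master, two workers) and the idea of adding workers to a running cluster come from the article.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    """Hypothetical worker node: holds the tasks assigned to it."""
    name: str
    tasks: list = field(default_factory=list)


class Master:
    """Hypothetical master node: schedules tasks but never runs them itself."""

    MIN_WORKERS = 2  # smallest cluster in the article: 1 master + 2 workers

    def __init__(self, worker_names):
        if len(worker_names) < self.MIN_WORKERS:
            raise ValueError("need at least two worker nodes")
        self.workers = [Worker(name) for name in worker_names]

    def add_worker(self, name):
        # Scale out: the new node joins while existing assignments stay put.
        self.workers.append(Worker(name))

    def submit(self, task):
        # Assumed policy: hand the task to the least-loaded worker
        # (ties go to the worker that joined first).
        target = min(self.workers, key=lambda w: len(w.tasks))
        target.tasks.append(task)
        return target.name


cluster = Master(["worker-1", "worker-2"])
first = [cluster.submit(f"job-{i}") for i in range(4)]
cluster.add_worker("worker-3")          # expand without stopping the cluster
next_target = cluster.submit("job-4")   # lands on the new, empty node
```

In this toy model the first four jobs alternate between the two original workers, and the job submitted after the expansion goes to the newly added node, which is the behavior the paragraph describes at a much larger scale.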
Big data applications
Because e-commerce companies have a pressing need for big data analysis, and their data is concentrated in unstructured form, e-commerce is the field where demand for big data applications is most obvious. ETU has accordingly released a precision recommendation system, ETU Recommender. Built on ETU's integrated big data technology and the advantages of distributed cloud computing, it collects large volumes of user behavior logs and produces personalized recommendations for different users. The whole pipeline, from data collection and analysis to presentation of the result set, is fully automated and needs no involvement from marketing staff: recommendations are based entirely on users' real browsing and purchasing behavior. Because big data systems scale horizontally, e-commerce customers can also expand the system's computing and storage capacity at any time as their traffic grows.
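The article does not say which algorithm ETU Recommender uses. As a minimal sketch of the general idea (recommendations derived only from logged browsing and purchase behavior), the Python example below uses simple item co-occurrence counting. The log entries, user and item names, and the functions `build_cooccurrence` and `recommend` are hypothetical and are not part of any ETU product.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical behavior log: (user, item) pairs from browsing or purchases.
LOG = [
    ("u1", "phone"), ("u1", "case"), ("u1", "charger"),
    ("u2", "phone"), ("u2", "case"),
    ("u3", "phone"), ("u3", "charger"),
    ("u4", "case"),
]


def build_cooccurrence(log):
    """Count how often two items appear in the same user's history."""
    items_by_user = defaultdict(set)
    for user, item in log:
        items_by_user[user].add(item)
    co = defaultdict(Counter)
    for items in items_by_user.values():
        for a, b in combinations(sorted(items), 2):
            co[a][b] += 1
            co[b][a] += 1
    return items_by_user, co


def recommend(user, items_by_user, co, top_n=3):
    """Score unseen items by co-occurrence with items the user already touched."""
    seen = items_by_user.get(user, set())
    scores = Counter()
    for item in seen:
        for other, count in co[item].items():
            if other not in seen:
                scores[other] += count
    # Sort by score, then by name, so ties are broken deterministically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


items_by_user, co = build_cooccurrence(LOG)
print(recommend("u4", items_by_user, co))
```

With this toy log, user `u4` has only looked at a case, so the sketch recommends the phone first and the charger second, because those are the items other users most often combined with a case. A production system would run the same kind of counting as a distributed job over far larger logs.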
It is reported that this product is already in use in e-commerce in China. Cus Yu told reporters that, beyond e-commerce, telecom operators can now use big data processing for a great deal of work. For example, they want to know where users on the 3G mobile network, whether on a mobile phone or an iPad tablet, have actually gone and what information they have viewed. This lets operators build follow-up value-added services. Another use is optimizing network or telecommunications equipment: the equipment is expensive, so optimizing it to some degree can save operators money. These are all places where big data can be put to use.
In addition, the financial industry may use big data for risk management: a bank can analyze credit data with big data processing and set each user's credit limit accordingly. Regulatory compliance is another example. Regulators in different countries have different supervision requirements, and the material to meet them can be drawn from big data. When a certain risk arises, a user who needs to show that no rule was violated must be able to extract the evidence. Big data also has a wide range of applications in medicine, manufacturing, and other industries, but the field is still at an early stage of development, and enterprises have not yet begun investing in big data applications on a large scale.
At present, big data technologies and solutions are multiplying, but the response on the customer side seems lukewarm. What is holding back the adoption of big data? Yu believes three main factors explain the situation. First, the industry lacks successful reference cases to follow. Second, enterprises are unfamiliar with big data IT technology. Unlike Internet companies, which are bold in trying new technology, enterprises usually take a few years to verify and test whether a technology can eventually be commercialized, and big data is still in that early stage. Third, there is a talent gap: the big data talent ecosystem is not yet mature.
(Responsible editor: The good of the Legacy)