The wave of big data that began with analysis at a handful of Web service providers is now spreading to ordinary businesses. Even where conditions are not yet ideal, companies must make full use of big data in order to stay competitive and keep their business running normally. This article introduces five things that enterprise IT leaders need to master about big data.
▲ Screen 1: Hortonworks Web site <http://hortonworks.com/>
First, circumstances now require enterprises to grasp big data.
Big data analysis was pioneered by US Web service providers such as Google, Yahoo, and Twitter, because these companies needed to extract maximum value from the information generated by their users. Judging from current trends, however, even ordinary enterprises will one day have to make good use of big data in order to maintain competitiveness and normal business operations.
Jo Maitland, head of GigaOM Research in the US, points out that some American companies, despite their small size, hold a great deal of data; hedge funds are one example. In addition, McKinsey & Company, the US consultancy, recently predicted that over the next few years a wide range of industries, including public institutions, health care, retail, and manufacturing, will be able to earn corresponding financial returns by analyzing big data.
Eric Baldeschwieler of Hortonworks, the Hadoop distribution vendor spun out of US Yahoo, says a wave of recognition is now building that mastering big data is an essential undertaking for businesses. The trend is universal and applies to customers in many fields, because collecting and analyzing transaction information lets enterprises better understand consumer trends. Beyond new product and service development, these data can also be used to solve emerging problems as quickly as possible.
Second, information and data useful to businesses are everywhere.
Some people may feel that they have no important data at hand, but the day will come when they do. Baldeschwieler points out that the big data you want is simply "formed by collecting scattered data."
For example, the operation log files on a server can become big data. A server records which sections and pages each visitor browses; by tracking that data, you can learn what customers need. Parsing action logs is itself a long-established practice, but under the new circumstances it will be carried out at a higher level, and its analytical precision will improve further.
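As a minimal sketch of the kind of log parsing described above, the following assumes a simplified Apache-style access-log format; the regex, field names, and sample lines are invented for illustration and real log formats vary.

```python
import re
from collections import Counter

# Hypothetical Apache-style access-log line layout; adjust the
# pattern to match your server's actual log format.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3})'
)

def top_pages(lines, n=3):
    """Count successful (2xx) page views per URL path."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m and m.group("status").startswith("2"):
            counts[m.group("path")] += 1
    return counts.most_common(n)

sample = [
    '1.2.3.4 - - [10/Oct/2023:13:55:36 +0000] "GET /products HTTP/1.1" 200',
    '1.2.3.5 - - [10/Oct/2023:13:55:37 +0000] "GET /products HTTP/1.1" 200',
    '1.2.3.6 - - [10/Oct/2023:13:55:38 +0000] "GET /cart HTTP/1.1" 200',
    '1.2.3.7 - - [10/Oct/2023:13:55:39 +0000] "GET /missing HTTP/1.1" 404',
]
print(top_pages(sample))  # → [('/products', 2), ('/cart', 1)]
```

At scale the same counting logic would run as a distributed job rather than an in-memory loop, but the idea of turning raw visit records into a picture of customer interest is the same.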
At the same time, data taken from sensors will also become big data. In recent years some analysts have begun to talk about the "Internet of Things": low-cost, network-connected sensors that continuously emit data, making it possible to understand how physical goods flow and are consumed. One can imagine such data coming from vehicles, bridges, and beverage vending machines. Kevin Dallas of Microsoft points out that the real value of these networked devices lies in having them collect data, and then analyzing that data to improve business efficiency.
Third, acquire new expertise in big data.
"The most important thing when introducing a big data analysis system is to recruit professionals who are proficient with data-analysis tools," says James Kobielus, an analyst at Forrester Research in the US.
Big data is inseparable from modeling the real-world data behind it. For this reason, Kobielus points out, companies must focus their efforts on data science. What this field needs are professionals in statistics, data mining, and text mining, along with specialists in related disciplines such as psychology. In other words, even analysts who are familiar with business intelligence tools may not have these skills.
There is, of course, currently a shortage of professionals with these skills. It is reported that by 2018 the United States will be short 140,000 to 190,000 professionals with deep analytical skills. In addition, there will be a shortage of roughly 1.5 million managers and analysts capable of using big data analysis to make effective decisions.
Another essential skill is the ability to manage the large amount of hardware needed to store and organize the data. Kobielus also points out that managing 100 servers is a different matter from managing 10, and recommends hiring computer-administration staff from local universities and research institutes.
Fourth, there is no need to organize big data in advance.
▲ Screen 2: MapR Web site <http://www.mapr.com/>
A CIO who has mastered business intelligence with an enterprise data warehouse (EDW) is accustomed to drawing up a rigorous, detailed design before anything else, but big data does not demand this. Its rule is the opposite: collect the data first, and only afterwards consider how to use and exploit it.
With data destined for a business intelligence database, the schema must be designed before the data is collected. Jack Norris, a director at MapR in the US, notes that this means you have to know your target in advance, and that the design work exists to keep data from being so generalized on the way in that detail is lost. If your original idea changes later, it is too late to go back and analyze detail that was never kept.
Norris further points out that a big data repository can be understood as a kind of raw storehouse from which data can be taken for analysis whenever needed. Many companies do not even know what they are looking for, and only begin to work that out after the data has been collected.
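The "collect first, decide later" approach described above is often called schema-on-read. The sketch below illustrates the idea under invented field names (`ts`, `user`, `action`, `amount`): raw events are kept verbatim as JSON lines, and structure is imposed only at query time, so questions nobody anticipated at collection time can still be answered.

```python
import json

# Raw events stored exactly as they arrived; no upfront schema.
# All field names and values here are invented for illustration.
raw_events = [
    '{"ts": "2013-05-01T10:00:00", "user": "a", "action": "view", "page": "/products"}',
    '{"ts": "2013-05-01T10:00:05", "user": "b", "action": "buy", "page": "/cart", "amount": 20.0}',
    '{"ts": "2013-05-01T10:00:09", "user": "a", "action": "buy", "page": "/cart", "amount": 5.0}',
]

def query(events, **filters):
    """Parse and filter the raw events only when a question is asked."""
    for line in events:
        record = json.loads(line)
        if all(record.get(k) == v for k, v in filters.items()):
            yield record

# A question formulated long after collection: total purchase amount.
total = sum(r["amount"] for r in query(raw_events, action="buy"))
print(total)  # 25.0
```

Contrast this with a schema-on-write warehouse: had the loader kept only page-view counts, the `amount` field would have been discarded and this question could never be asked.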
Fifth, big data is not synonymous with Hadoop.
Many people equate big data with the Hadoop data-analysis platform. Kobielus, however, argues that while Hadoop is certainly the software technology most companies prioritize in their budgeting and recruitment, it is entirely possible for a company to end up using a different product.
LexisNexis, the large US legal-information company, recently opened its HPCC analysis platform to the public, and LexisNexis is known to be very good at big data analysis. In addition, the US company MarkLogic connects its own unstructured-data database, MarkLogic Server, to big data workloads. Splunk, which searches and analyzes machine-generated data such as server log files, is also very popular at present. Curt Monash of the US firm Monash Research notes that Splunk can make use of any data taken from server log files.