Big data is set to become this year's cloud. This is an inevitable result: over time, the enterprise produces more and more data sets, including customer purchase preference trends, site visits and browsing habits, customer review data, and so on. So how can you pull so much data together into a comprehensive form? Traditional business intelligence (BI) tools (relational databases and desktop math packages) are out of their depth when faced with the volume of data a business now generates. Fortunately, the data analysis industry also has development tools and frameworks that enable data researchers and analysts to mine large data sets and cope with the information load.
For larger companies, processing massive amounts of data is nothing new. Twitter and LinkedIn, for example, are already well-known users of big data. Each of the two companies has built a distinct competitive advantage by tapping into its large data warehouses to identify trends. So what about the midsize enterprise CIO? Fortunately, there are tools within reach that allow you, or more specifically your business analysts, to support big data processing without undue strain.
One of these tools is the free, Java-based Apache Hadoop programming framework. The framework has gained significant traction in the big data market over the past year to year and a half. Industry experts and users around the world call Hadoop the de facto standard for data mining. Given that Apache Hadoop 1.0 was released only late in 2011, and given how other existing big data products have fared, it is indeed surprising that Hadoop has won such recognition. Hadoop is so popular that Hortonworks CEO Eric Baldeschwieler predicts it will handle more than half of the world's data by 2017. Chances are that Hadoop will find its way into your organization in some form in the coming year.
Hadoop is primarily intended for developers. Its core framework, MapReduce, lets programmers process large amounts of data across a distributed cluster of computers. The disadvantage is that it is a very heavyweight product. Hadoop also separates the technical staff who work directly with the data warehouse from the people who consume and interpret the data.
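To make the MapReduce idea more concrete, the sketch below imitates its three stages (map, shuffle, reduce) inside a single Python process. It is an illustration only and does not use Hadoop or its Java API; the log format and the names run_job, page_visit_mapper, and count_reducer are assumptions invented for this example. On a real cluster, the map and reduce steps would run in parallel on many machines, and the framework would handle the grouping in between.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record and collect the (key, value) pairs."""
    pairs = []
    for record in records:
        pairs.extend(mapper(record))
    return pairs


def shuffle_phase(pairs):
    """Group values by key, the step a framework performs between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Collapse each key's list of values into a single result."""
    return {key: reducer(key, values) for key, values in groups.items()}


def run_job(records, mapper, reducer):
    """Chain the three stages; everything here runs in one process."""
    return reduce_phase(shuffle_phase(map_phase(records, mapper)), reducer)


def page_visit_mapper(log_line):
    # Hypothetical log format (an assumption): "<customer_id> <page>"
    customer_id, page = log_line.split()
    yield page, 1


def count_reducer(page, counts):
    # Sum the 1s emitted by the mapper to get the visit count for this page.
    return sum(counts)


if __name__ == "__main__":
    sample_log = ["c1 /home", "c2 /home", "c1 /cart"]
    print(run_job(sample_log, page_visit_mapper, count_reducer))
```

The programmer writes only the mapper and the reducer; the grouping and distribution in between is what a framework such as Hadoop takes over, which is also part of what makes it a heavyweight product.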
Given the budget constraints a midsize enterprise CIO faces, here are some suggestions for overcoming the challenges of big data:
Don't overlook the trend. Big data is not going away, and neither the ability to analyze and transform it nor the broader trends in data analysis can be ignored. Take the time to understand the functionality and structure of Hadoop and other big data products. Think about how the data you already have could bring improvements to your company.
Find room in the budget for qualified data scientists. These people are the percussion section of your BI symphony. Qualified data scientists are very scarce in the market. Even at the Hadoop World conference last November, training was a major topic. Use whatever flexibility your training budget allows to hire the best people and keep their data analysis skills top-notch.
Understand the storage implications of large data sets. Big data really means mining huge amounts of data from multiple locations and multiple databases at near-real-time speeds, without being hampered by structural barriers. This complicates the way storage works in your infrastructure. Could cloud storage be more flexible and agile for this data? Work with your data mining strategy team to prioritize the types and quantities of storage needed to make use of Hadoop's processing power.
Get your toolset ready for Hadoop. Learn about Microsoft's entry into this field, and experiment with the Hadoop-Excel and Hadoop-SQL Server integrations to see what kind of results you can deliver. Also take a look at IBM's tools to see which is better suited to your existing investment in desktop and end-user software.
The contest for big data has already begun. You may already have fallen behind on the changes in data mining. CIOs who ignore data analysis are actually risking their careers. However, CIOs who have jumped into the big data field and extracted key insights will have the world in their hands.
(Responsible editor: The good of the Legacy)