Microsoft enterprise-wide data analysis strategy: integrating Hadoop

Source: Internet
Author: User

A few months ago, Microsoft announced HDInsight, its own Hadoop distribution for big data management, analytics, and mining. The reporter contacted Val Fontama, senior product marketing manager for SQL Server, to learn more about Microsoft's approach to enterprise-class data.

On the growth of dataset sizes in the enterprise:

The ocean of data keeps growing. One forecast holds that the volume of business information doubles every year. Gartner, for example, finds that the amount of information worldwide grows by at least 59% per year, and that about 85% of it is "unstructured" data such as video clips, RFID tags, and web logs. Unstructured data is not easily handled by traditional data management systems. In addition, in many scenarios customers find that their data growth rate accelerates as they begin collecting new data in real time.
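The two growth figures above are not the same claim: doubling every year means 100% annual growth, while 59% annual growth doubles the volume roughly every year and a half. The short sketch below works through that arithmetic, assuming a constant compound growth rate (a simplification, since real growth is unlikely to be constant). The function names are illustrative only.

```python
import math


def doubling_time_years(annual_growth_rate: float) -> float:
    """Years needed to double at a constant compound annual growth rate.

    annual_growth_rate is a fraction, e.g. 0.59 for 59% per year.
    """
    if annual_growth_rate <= 0:
        raise ValueError("growth rate must be positive")
    return math.log(2) / math.log(1 + annual_growth_rate)


def growth_factor(annual_growth_rate: float, years: float) -> float:
    """Total multiplication of the data volume after the given number of years."""
    return (1 + annual_growth_rate) ** years


if __name__ == "__main__":
    rate = 0.59  # the Gartner lower bound quoted in the article
    print(f"Doubling time at 59%/year: {doubling_time_years(rate):.2f} years")
    print(f"Growth over 5 years: {growth_factor(rate, 5):.1f}x")
```

Under this assumption, even the lower-bound figure implies about a tenfold increase in data volume over five years.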

Customers will need a modern data platform that can keep pace with the growth of the business and of the data it collects. For global companies, big data creates many business opportunities to find new, actionable insights in collected data, whether structured or unstructured. Ultimately, the biggest promise of big data is smarter decisions driven by data, and intelligent decision-making requires insights gathered from all types of data.

HDInsight, Microsoft's big data solution:

Microsoft hopes to promote Hadoop adoption by providing portability, superior performance, security, and ease of deployment through a Hadoop distribution that supports Windows Server and Windows Azure. Microsoft will also strengthen Hadoop security by integrating Active Directory with HDInsight, which lets IT departments apply the same consistent security policy to all IT assets, including Hadoop clusters.

In addition, through integration with System Center, HDInsight simplifies Hadoop management and lets IT departments manage Hadoop clusters, SQL Server databases, and applications from a single console.

Hadoop on the Windows platform integrates with Microsoft's business intelligence (BI) tools such as Excel, Power View, and PowerPivot, making it easy to analyze large volumes of business information and create unique, differentiated business value.

To ensure full compatibility with Apache Hadoop, HDInsight, Microsoft's Hadoop distribution, is based on the Hortonworks Data Platform (HDP). As a result, customers can move their MapReduce jobs from their Windows servers to the cloud, or even to an Apache Hadoop distribution running on Linux. No other vendor yet offers this capability. In addition, these features are available on both Windows Server and Azure, so customers can easily extract actionable insights from data using familiar tools such as Excel, PowerPivot for Excel, and Power View.
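For readers unfamiliar with what a MapReduce job actually does, the following is a minimal in-process sketch of the map, shuffle, and reduce phases, counting HTTP status codes in web logs. It is not Hadoop or HDInsight code and uses no Hadoop API; the log format (`<ip> <path> <status>`) and all function names are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_status(line: str) -> Iterator[Tuple[str, int]]:
    """Map phase: emit (status_code, 1) for each well-formed log line.

    Assumes a hypothetical space-separated format: '<ip> <path> <status>'.
    Malformed lines are skipped.
    """
    parts = line.split()
    if len(parts) == 3:
        yield parts[2], 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle phase: group all emitted values by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce phase: sum the counts collected for one key."""
    return key, sum(values)


def run_job(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in a single process."""
    mapped = (pair for line in lines for pair in map_status(line))
    grouped = shuffle(mapped)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = [
        "10.0.0.1 /index.html 200",
        "10.0.0.2 /missing 404",
        "10.0.0.3 /index.html 200",
    ]
    print(run_job(sample))
```

Because the map and reduce functions depend only on their inputs, the same job logic can in principle run on any compatible cluster, which is the portability the paragraph above describes.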

How SQL Server fits into this solution:

One of the most important differences between SQL Server 2012 and SQL Server 2008 in helping businesses handle large datasets is compatibility with Hadoop. Hadoop lets users process large amounts of structured and unstructured data and draw insights from it quickly, and because Hadoop is open source, it costs less. SQL Server 2012's Hadoop compatibility was developed by Microsoft in collaboration with Hortonworks. Microsoft recently announced that Microsoft HDInsight Server and Windows Azure HDInsight Service are available in preview, letting users apply the Hadoop connectors developed by Microsoft to get the most insight from their data. By connecting SQL Server to Hadoop through the Hive ODBC driver, customers can now analyze all types of data, including unstructured data, in SQL Server 2012 with Microsoft BI tools such as PowerPivot and Power View. In addition, with the new Data Quality Services in SQL Server 2012, customers can improve data quality by converting raw data into reliable, consistent data suitable for modeling.
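As a rough illustration of what a client does over a Hive ODBC connection, the sketch below builds a HiveQL aggregation query and, optionally, runs it through an ODBC data source. Everything specific here is an assumption rather than something the article states: the table name `weblogs`, its `status` column, the DSN name `HiveDSN`, and the use of the third-party `pyodbc` package, which needs a Hive ODBC driver and a configured DSN to work.

```python
def build_hive_query(table: str, limit: int = 100) -> str:
    """Build a HiveQL query that counts rows per status code.

    The table and its 'status' column are hypothetical. The table name is
    validated because it is interpolated directly into the query text.
    """
    if not table.replace("_", "").isalnum():
        raise ValueError(f"invalid table name: {table!r}")
    if limit <= 0:
        raise ValueError("limit must be positive")
    return (
        f"SELECT status, COUNT(*) AS hits FROM {table} "
        f"GROUP BY status LIMIT {limit}"
    )


def fetch_rows(dsn: str, query: str) -> list:
    """Run a query through an ODBC data source and return all rows.

    Requires the third-party 'pyodbc' package plus an installed Hive ODBC
    driver and a DSN configured for it (both assumed, not shown here).
    """
    import pyodbc  # imported lazily so the query builder works without it

    # Hive does not support transactions, so autocommit is enabled.
    connection = pyodbc.connect(f"DSN={dsn}", autocommit=True)
    try:
        cursor = connection.cursor()
        cursor.execute(query)
        return cursor.fetchall()
    finally:
        connection.close()


if __name__ == "__main__":
    # Only the query is printed; calling fetch_rows("HiveDSN", query)
    # would require a real, configured data source.
    print(build_hive_query("weblogs", limit=10))
```

BI tools such as PowerPivot would use an ODBC data source in a broadly similar way, although the article does not describe their internals.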

Microsoft recently announced some new features in Office 2013 and described how developers should use them to build applications and services that process data. It is not surprising that Microsoft itself is using them to provide big data services in Excel:

Excel is one of the main client tools for big data analysis on the Microsoft platform. In Excel 2013, our main tools are the data modeling tool PowerPivot and the data visualization tool Power View. Both are built in, so no additional downloads are required. This enables users at all levels to perform self-service BI analysis in the familiar Excel interface.

With the Hive add-in for Excel, our HDInsight service integrates easily with the BI tools in Office 2013, so users can analyze massive amounts of structured or unstructured data with familiar tools.

In addition to Excel, Microsoft offers other tools for working with big data. BI professionals can use BI Development Studio to design OLAP cubes or to design scalable PowerPivot models in SQL Server Analysis Services. Developers can continue to use Visual Studio to develop and test MapReduce programs written in .NET. Finally, IT operators can manage Hadoop clusters on HDInsight using the System Center tools they already use.

Overall, Microsoft's strategy seems to be to give customers the easiest path to big data by extending existing tools, such as SQL Server and Office, so that they can handle new data types seamlessly, allowing companies to build on existing investments as they take on new business.
