The industry has divergent views on the concept of big data. One of the most notable definitions comes from the research firm Gartner: big data refers to data sets whose collection, management, and processing within an acceptable time exceed the capabilities of common hardware environments and software tools. Big data is not simply a matter of data volume; velocity, complexity, and variety are also key characteristics.
Big data often comes from new data sources, where unstructured data is the absolute mainstay. Unstructured data is data that does not fit conveniently into the two-dimensional logical tables of a database, including all forms of Office documents, text, pictures, XML, HTML, various reports, images, and audio/video information. An IDC report points out that global data volumes are doubling every 18 months and that the amount of data produced globally is up to 40 EB a year (1 EB = 1000 PB). This explosive growth comes largely from unstructured data.
As big data research deepens, the role of unstructured data is becoming increasingly clear. According to a joint study by Capgemini Consulting and the Economist Intelligence Unit, 58% of executives rely on unstructured data analysis to make business decisions. However, unstructured data has long exceeded the storage and processing limits of traditional databases, and many vendors now treat it as a separate technical challenge.
To help businesses cope with growing volumes of unstructured data, Red Hat, the world's largest open source technology vendor, has launched an open source storage software solution for unstructured data: Red Hat Storage Server 2.0, also known as Red Hat Storage 2.0.
Red Hat Storage 2.0: A Big Data Management Tool
Red Hat is a world-renowned open source solution provider that uses a community-driven approach to deliver reliable, high-performance cloud, virtualization, storage, Linux, and middleware technologies. As the first open source vendor to exceed 1 billion dollars in revenue, Red Hat believes the open source business model has unlimited potential. Jim Whitehurst, president and chief executive of Red Hat, said during a visit to China last year that Red Hat would break 3 billion dollars in sales within the next five years.
Red Hat's most popular product is Red Hat Enterprise Linux, the world's most widely used Linux product and the one that built Red Hat's influence. However, Red Hat's development is not limited to Linux; its product strategy changes constantly to follow trends in IT. In recent years, as the cloud computing industry has matured, Red Hat has adopted a product strategy that treats cloud computing as its breakthrough point and virtualization as its focus for building a hybrid cloud ecosystem. With 2013 being hailed as the first year of big data, Red Hat's product strategy also appears to be tilting toward big data. Whitehurst forecasts that over the next 20 years big data will become a mainstream technology and will change the core value of many enterprises.
Storage 2.0 is Red Hat's point of entry into big data. Red Hat Storage 2.0 is a scale-out open source storage software solution aimed mainly at managing massive amounts of unstructured data. Red Hat describes it as the industry's first file storage solution that integrates easily with object storage, scaling out effectively to keep up with the explosive growth of unstructured data. It can be deployed on-premises or in private cloud, public cloud, or hybrid cloud environments to optimize storage-intensive enterprise workloads.
This open source storage software comes from Red Hat's October 2011 acquisition of Gluster, an open source software start-up focused on scale-out storage. Gluster developed the GlusterFS open source file system and the Gluster storage platform software stack as its core technologies, which provide storage management and access for big data. GlusterFS is a scalable open source clustered file system that gives customers a global namespace, a distributed front end, and scalability up to hundreds of petabytes.
GlusterFS is similar to HDFS in Hadoop, but its biggest advantage over HDFS is that it scales out network-attached storage by using its own elastic hash algorithm to locate data, rather than relying on metadata lookups. Metadata is data that describes other data, and in some cases handling it can be the cause of HDFS failures or an impediment to linear scalability. This feature of GlusterFS greatly speeds up data addressing and access, and removes bottlenecks found in other big data systems, such as single points of failure, the overhead of data redundancy, and limits on scaling.
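The idea of locating a file by computing a hash instead of consulting a metadata service can be sketched in a few lines of Python. This is only an illustration of the general technique, not the actual GlusterFS algorithm: the hash function, the equal-sized ranges, and the brick names (`server1:/brick` and so on) are all assumptions made for the example.

```python
import hashlib
from bisect import bisect_right

HASH_SPACE = 2**32  # size of the (assumed) 32-bit hash space


def hash_name(name: str) -> int:
    """Map a file name to a 32-bit integer (illustrative hash choice)."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def range_starts(brick_count: int) -> list[int]:
    """Split the hash space into equal contiguous ranges, one per brick."""
    return [(i * HASH_SPACE) // brick_count for i in range(brick_count)]


def locate(name: str, bricks: list[str]) -> str:
    """Return the brick that owns the file, computed from the name alone."""
    starts = range_starts(len(bricks))
    # The owning range is the last one whose start is <= the hash value.
    index = bisect_right(starts, hash_name(name)) - 1
    return bricks[index]


# Hypothetical storage bricks, named only for this sketch.
bricks = ["server1:/brick", "server2:/brick", "server3:/brick", "server4:/brick"]
print(locate("reports/2013-q1.pdf", bricks))
```

Because every client can run the same computation, no central lookup table has to be queried or kept available, which is the property the paragraph above attributes to GlusterFS.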
In addition, Red Hat Storage 2.0 is compatible with Apache Hadoop: Storage 2.0 provides big data storage management and access, while Hadoop provides the processing framework. GlusterFS can either be integrated with Hadoop HDFS or used as an alternative to HDFS to achieve faster file access. The Red Hat Storage Hadoop plug-in gives enterprises a new storage option, providing enterprise-class storage features while preserving Hadoop API compatibility and local data access.
Red Hat's Big Data Solution: Open Source to the Fullest
Red Hat Storage 2.0 provides users with a high-performance, scalable solution for storage management and data access at big data scale. In addition to storage, Red Hat's big data solution also includes Linux, JBoss Middleware, enterprise virtualization, and other product families, and uses the open hybrid cloud model to meet enterprise big data requirements. Specifically, it includes the following:
• Red Hat Enterprise Linux: As Red Hat's flagship product, Enterprise Linux is positioned as the best platform for managing big data. Because Red Hat Enterprise Linux is good at using distributed systems to address the key requirements of big data, users can build Red Hat Storage on Enterprise Linux systems for cost-effective, highly scalable, high-availability configurations. Developers can also build secure, reliable, and easily scalable big data applications on Red Hat Enterprise Linux, helping to turn data into business value.
• Red Hat Enterprise Virtualization: Red Hat Enterprise Virtualization (RHEV) is a complete virtualization management solution for server and desktop virtualization, and is the first mature, fully open source enterprise virtualization platform. Compared with proprietary virtualization vendors, RHEV offers a genuinely strategic virtualization alternative to businesses looking for better TCO, faster ROI, a shorter payback period, and avoidance of vendor lock-in. Combining enterprise virtualization with storage lets users access shared storage pools managed by Red Hat Storage more securely, while reducing operational costs, increasing scalability and availability, and improving performance.
• Red Hat Open Hybrid Cloud: The open hybrid cloud is Red Hat's cloud computing product strategy, and it enables easy migration of big data workloads between public and private clouds. Cloud computing and big data are closely related. Cloud computing provides a good platform for big data storage and processing, since it can mobilize large amounts of resources in a short time. Big data processing will in turn bring more applications to the cloud and promote the development of the cloud computing market.
• Red Hat JBoss Middleware: Red Hat JBoss Middleware is an open source platform for service-oriented architecture (SOA). It provides strong technical support for creating and deploying new big data applications, and can interact and integrate with big data technologies such as Hadoop and MongoDB to help enterprises seize the opportunities and meet the challenges of big data.
Red Hat Storage, combined with Enterprise Linux, Enterprise Virtualization, JBoss Middleware, and the open hybrid cloud, forms a complete big data ecosystem that provides users with flexible, secure big data solutions for both current and future enterprise needs.
Summary
Overall, the defining feature of Red Hat's products is that they are open source, and Red Hat takes open source as far as it can go. Open source is the soul of big data, and its advantages give Red Hat's big data solutions great potential. As Red Hat develops further in the big data field, the ecosystem built around Red Hat Storage is expected to provide a one-stop big data solution. By then, Red Hat's cloud computing and big data product strategies will advance side by side, complementing each other and together building an open source platform for technology innovation.
(Responsible editor: The good of the Legacy)