Hortonworks has proposed a new Hadoop object storage environment, Ozone, to extend HDFS from a file system into a more complex enterprise storage tier.
Some members of the Hadoop community today proposed adding a new object storage environment to Hadoop, which would enable Hadoop to store data in the same way as cloud storage services such as Amazon S3, Microsoft Azure, and OpenStack Swift.
As more and more companies adopt Apache Hadoop, Hadoop clusters have become "data lakes" for all kinds of enterprise data, Hortonworks, a Hadoop distributor, said in a Tuesday blog post. Many of the data types suited to big data analytics are a good fit for HDFS, but in some industry use cases HDFS is a poor fit, which calls for extending the storage dimension of Hadoop. For example, object storage and key-value storage need the same reliability, consistency, and availability as HDFS, but have different requirements for semantics, APIs, and scalability. Hadoop's storage system therefore needs to become more general-purpose to accommodate these new storage requirements.
Data types and data sources relevant to big data analytics in different industries (source: Hortonworks)
Hortonworks' answer is Ozone, a new object storage environment for Hadoop that extends HDFS from a file system into a more complex enterprise storage tier. (Editor's note: Although Hadoop already supports third-party object stores, such as Amazon S3 in the cloud and OpenStack Swift in the data center, native object storage in Hadoop is still valuable to developers who want to use Hadoop as a storage tier for future applications.)
The HDFS architecture has always separated metadata management from data storage, putting them in two distinct layers. File data is stored in a storage layer made up of thousands of storage servers (data nodes), and metadata is stored in the file metadata layer, a much smaller set of servers (name nodes). This separation lets applications read and write data directly against the storage disks, which gives them high throughput and room to scale.
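The two-layer split described above can be illustrated with a toy in-memory model. This is only a sketch and not the real HDFS API: the `NameNode` and `DataNode` classes, the `write_file` and `read_file` helpers, and the 4-byte block size are all assumptions made for illustration.

```python
# Toy model of the metadata/storage split. NOT the real HDFS API.
from dataclasses import dataclass, field


@dataclass
class DataNode:
    """Storage-layer server: holds raw block bytes keyed by block ID."""
    blocks: dict = field(default_factory=dict)


@dataclass
class NameNode:
    """Metadata-layer server: maps a file path to (node index, block ID) pairs."""
    files: dict = field(default_factory=dict)


def write_file(name_node, data_nodes, path, data, block_size=4):
    """Split data into blocks, spread them over data nodes, record locations."""
    locations = []
    for offset in range(0, len(data), block_size):
        block_number = offset // block_size
        block_id = f"{path}#{block_number}"
        node_index = block_number % len(data_nodes)  # simple round-robin placement
        data_nodes[node_index].blocks[block_id] = data[offset:offset + block_size]
        locations.append((node_index, block_id))
    name_node.files[path] = locations


def read_file(name_node, data_nodes, path):
    """One metadata lookup, then block reads go straight to the data nodes."""
    locations = name_node.files[path]
    return b"".join(data_nodes[n].blocks[b] for n, b in locations)


name_node = NameNode()
data_nodes = [DataNode() for _ in range(3)]
write_file(name_node, data_nodes, "/logs/a.txt", b"hello hadoop")
assert read_file(name_node, data_nodes, "/logs/a.txt") == b"hello hadoop"
```

The point of the sketch is that the name node is consulted only once per file for block locations; the bulk data never passes through it, so throughput can grow with the number of data nodes.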
Ozone lets the HDFS block storage layer also hold data that is not file-like, so the HDFS block architecture will be able to store key-value pairs and objects as well. Like the HDFS namespace metadata, Ozone's metadata system is built on the block storage layer, but Ozone metadata is distributed dynamically so that it can support a large number of buckets. (pictured above)
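As a rough illustration of the idea, the sketch below layers a bucket/key namespace on top of a block store that knows nothing about files or objects. It is a toy model under stated assumptions, not Ozone's actual design or API: the `BlockLayer` and `BucketNamespace` classes, their method names, and the 4-byte block size are hypothetical.

```python
# Toy model of an object namespace on a shared block layer. NOT the Ozone API.
import itertools


class BlockLayer:
    """Shared block storage: knows only block IDs and bytes."""

    def __init__(self):
        self._blocks = {}
        self._ids = itertools.count()

    def put(self, data):
        block_id = next(self._ids)
        self._blocks[block_id] = data
        return block_id

    def get(self, block_id):
        return self._blocks[block_id]


class BucketNamespace:
    """Hypothetical object metadata: bucket -> key -> list of block IDs."""

    def __init__(self, block_layer, block_size=4):
        self.block_layer = block_layer
        self.block_size = block_size
        self.buckets = {}

    def put_object(self, bucket, key, data):
        # Cut the object into fixed-size chunks and store each as a block.
        chunks = [data[i:i + self.block_size]
                  for i in range(0, len(data), self.block_size)]
        block_ids = [self.block_layer.put(chunk) for chunk in chunks]
        self.buckets.setdefault(bucket, {})[key] = block_ids

    def get_object(self, bucket, key):
        block_ids = self.buckets[bucket][key]
        return b"".join(self.block_layer.get(i) for i in block_ids)


layer = BlockLayer()
store = BucketNamespace(layer)
store.put_object("photos", "cat.jpg", b"not really a jpeg")
assert store.get_object("photos", "cat.jpg") == b"not really a jpeg"
```

Because the block layer is unaware of what sits above it, a file namespace and a bucket namespace could in principle share the same blocks-and-replication machinery, which is the reuse the article describes.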
Hortonworks believes that HDFS will naturally evolve into a complete enterprise data storage system, and Ozone will be developed as open source within the Apache project (HDFS-7240).
Hortonworks has set the following goals for Ozone:
Scalable support for trillions of data objects.
Support for a wide range of object sizes, from a few KB to tens of MB.
Reliability, consistency, and availability no lower than those of HDFS.
A data block layer built on HDFS.
A REST-based API for accessing and manipulating data.
Support for data replication between data centers to achieve higher availability.
(Responsible editor: Mengyishan)