HDFS (Hadoop Distributed File System) is a core sub-project of the Hadoop project and the foundation of data storage management in distributed computing. HDFS is a good distributed file system with many advantages, but it also has shortcomings: it is not suited to low-latency data access, it cannot efficiently store large numbers of small files, and it does not support multiple users writing to or modifying a file.
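The small-files limitation comes down to arithmetic on metadata held in memory by the master node. The sketch below is a rough estimate only. It assumes the commonly cited rule of thumb of about 150 bytes of master-node memory per file or block object and a 128 MiB block size; both figures and all function names are illustrative assumptions, not measurements.

```python
# Rough estimate of master-node (NameNode) memory used by file metadata.
# BYTES_PER_OBJECT and BLOCK_SIZE are assumed rule-of-thumb values.
BYTES_PER_OBJECT = 150
MIB = 1024 * 1024
BLOCK_SIZE = 128 * MIB


def blocks_needed(file_size: int, block_size: int = BLOCK_SIZE) -> int:
    """Number of blocks a file occupies (ceiling division, at least one)."""
    return max(1, -(-file_size // block_size))


def metadata_bytes(num_files: int, file_size: int) -> int:
    """Each file costs one file object plus one object per block."""
    objects = num_files * (1 + blocks_needed(file_size))
    return objects * BYTES_PER_OBJECT


TOTAL = 10 * 1024**4  # the same 10 TiB of data in both cases

# Stored as 1 MiB files versus 1 GiB files.
small = metadata_bytes(TOTAL // MIB, MIB)
large = metadata_bytes(TOTAL // (1024 * MIB), 1024 * MIB)

print(f"1 MiB files: {small / MIB:.0f} MiB of metadata")
print(f"1 GiB files: {large / MIB:.0f} MiB of metadata")
```

Under these assumptions the same volume of data needs a few hundred times more metadata memory when it is split into small files, which is why file count rather than raw capacity tends to be the limit.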
Since its inception at the Apache Software Foundation, HDFS has steadily improved in performance and availability. To be fair, it may be adequate for pilot projects, unconventional projects, and less demanding environments. Some Hadoop users, however, have strict requirements for performance, availability, and enterprise features, and they have concerns about the direct-attached storage (DAS) architecture, especially since older versions of Hadoop lack a highly available master node (NameNode). For such users, the following eight products are candidates for replacing HDFS.
1. Cassandra (DataStax)
Cassandra is not a full file system but an open source NoSQL key-value store. It gives Web applications that rely on fast data access an alternative to HDFS. Put simply, the DataStax offering integrates Hadoop with Cassandra, so that Web applications can quickly access data processed by Hadoop, while Hadoop gets fast access to data flowing into Cassandra.
2. Ceph
Ceph is an open source, multi-pronged storage system. Because it includes a high-performance parallel file system, some consider it a possible successor to HDFS in Hadoop environments, and researchers have been exploring that use since 2010.
3. Cleversafe: Dispersed Storage Network
Cleversafe announced on Monday that it will combine Hadoop's parallel processing technology with its own dispersed storage network. The rationale is that by distributing all metadata across the cluster (rather than relying on a single master node) and by not relying on replication, the system is, Cleversafe says, faster, more reliable, and more scalable than HDFS.
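To see why a storage system might avoid relying on replication, it helps to compare raw-capacity overhead. The sketch below assumes a generic k-of-n dispersal (erasure-coding) scheme, in which data is cut into n slices and any k of them are enough to rebuild it. The parameters (3 copies, 10-of-16) and the function names are hypothetical illustrations, not Cleversafe's actual configuration.

```python
def replication_overhead(copies: int) -> float:
    """Raw bytes stored per byte of user data with full replication."""
    if copies < 1:
        raise ValueError("need at least one copy")
    return float(copies)


def dispersal_overhead(total_slices: int, threshold: int) -> float:
    """Raw bytes stored per byte of user data with k-of-n dispersal."""
    if not 0 < threshold <= total_slices:
        raise ValueError("threshold must be between 1 and total_slices")
    return total_slices / threshold


def replication_tolerance(copies: int) -> int:
    """Copies that can be lost while the data stays readable."""
    return copies - 1


def dispersal_tolerance(total_slices: int, threshold: int) -> int:
    """Slices that can be lost while the data stays readable."""
    return total_slices - threshold


# Hypothetical comparison: 3 full copies versus a 10-of-16 dispersal.
print(replication_overhead(3), replication_tolerance(3))
print(dispersal_overhead(16, 10), dispersal_tolerance(16, 10))
```

With these assumed parameters, dispersal stores less raw data per user byte while tolerating more lost pieces. The trade-off is extra computation to encode and rebuild data.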
4. GPFS (IBM)
IBM has long sold its parallel file system, GPFS, to users with high performance requirements, including operators of the world's fastest supercomputers. It launched a version of GPFS for Hadoop in 2010 and claims that the shared-nothing cluster version of GPFS is much faster than HDFS, because it runs at the kernel level rather than on top of the operating system as HDFS does.
5. Isilon (EMC)
EMC has been offering its own Hadoop distribution for a year, but in January 2012 it introduced a new enterprise-class alternative to HDFS: Isilon's OneFS file system. Because Isilon can speak the NFS, CIFS, and HDFS protocols, a single Isilon NAS system can ingest, process, and analyze data.
6. Lustre
In a 2011 note, Xyratex, an HPC storage provider, wrote that Lustre-based clusters are faster and cheaper than HDFS-based clusters.
7. MapR file system
The MapR file system is already well known in the industry. MapR claims that its file system is 2 to 5 times faster than HDFS (and has even claimed 20 times), and it also offers the mirroring, snapshot, and high-performance features that business users like.
8. NetApp Open Solution for Hadoop
NetApp redesigned the physical Hadoop architecture, placing HDFS on a disk array to make Hadoop jobs faster, more reliable, and more secure.
Original link: http://www.leiphone.com/122712-keats-hadoop-isnt-perfect-8-ways-to-replace-hdfs.html