Introduction
Apache Gora is an open-source ORM framework that provides an in-memory data model and persistence for big data. Gora currently supports column stores, key-value stores, document stores, and relational databases, and it also supports analyzing big data with Apache Hadoop.
Characteristics
Although there are many good ORM frameworks for relational databases, data-model-based frameworks such as JDO still fall short in some areas, for example in storing and persisting column-oriented data models. Gora fills this gap: it gives users an easy way to model big data in memory and persist it, and it supports Hadoop for analyzing that data.
Gora is a representation and persistence framework for big data with the following characteristics:
Data persistence: Persists objects to column stores such as HBase, Cassandra, and Hypertable; key-value stores such as Voldemort and Redis; SQL databases such as MySQL and HSQLDB; and flat files in HDFS.
Data access: Provides easy access to data through a Java API.
Indexing: Objects can be persisted to Lucene or Solr and queried through the Gora API.
Analysis: Data can be analyzed with Apache Pig, Hive, and Cascading.
MapReduce support: Native support for Hadoop's MapReduce framework, which is already used in Nutch 2.0.
In short, Gora covers data persistence, indexing, and analysis, and integrates with tools such as Pig, Lucene, and Hive.
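To make the "one API, many backends" idea in the list above concrete, here is a minimal sketch in plain Java. It is not Gora's actual API: the names SimpleStore, InMemoryStore, and WebPage are hypothetical and exist only for this illustration. The sketch assumes that application code talks to a small put/get/delete/query contract while the backend behind it can be swapped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Predicate;

/** Hypothetical store contract: callers see keys and objects, never the backend. */
interface SimpleStore<K, V> {
    void put(K key, V value);
    V get(K key);                       // returns null when the key is absent
    boolean delete(K key);              // true if something was removed
    List<V> query(Predicate<V> filter); // matching values in key order
}

/** Hypothetical in-memory backend; a real one would map these calls to HBase, Cassandra, SQL, and so on. */
final class InMemoryStore<K extends Comparable<K>, V> implements SimpleStore<K, V> {
    // TreeMap keeps rows sorted by key, loosely mimicking a sorted column store.
    private final Map<K, V> rows = new TreeMap<>();

    @Override public void put(K key, V value) { rows.put(key, value); }

    @Override public V get(K key) { return rows.get(key); }

    @Override public boolean delete(K key) { return rows.remove(key) != null; }

    @Override public List<V> query(Predicate<V> filter) {
        List<V> out = new ArrayList<>();
        for (V value : rows.values()) {
            if (filter.test(value)) {
                out.add(value);
            }
        }
        return out;
    }
}

/** Hypothetical persistent object, standing in for a generated data bean. */
final class WebPage {
    final String url;
    final String title;

    WebPage(String url, String title) {
        this.url = url;
        this.title = title;
    }
}

class StoreSketch {
    public static void main(String[] args) {
        // Application code depends only on the interface, so the backend can change.
        SimpleStore<String, WebPage> store = new InMemoryStore<>();
        store.put("http://gora.apache.org", new WebPage("http://gora.apache.org", "Apache Gora"));

        WebPage page = store.get("http://gora.apache.org");
        System.out.println(page.title);
    }
}
```

In this sketch, replacing InMemoryStore with another implementation of SimpleStore would leave the application code in main unchanged, which is the property the feature list describes.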
For more information, see http://gora.apache.org
Disadvantages
At present, apart from Nutch 2, Gora does not appear to be used in other open-source projects.
The Nutch 2 series extends its storage layer through Gora, with the option of storing data in HBase, Accumulo, Cassandra, MySQL, DataFileAvroStore, or AvroStore, but some of these backends are immature.
Gora itself still needs improvement. For readers who want maximum performance: Nutch 2.x is not yet stable, so Nutch 1.x is the recommended choice. It relies on HDFS and on MapReduce's data locality and natural parallelism, and it can be tuned to run very fast.
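As an illustration of how a backend is chosen, Gora reads a gora.properties file from the classpath. The fragment below is a minimal sketch that assumes the HBase backend and that the gora-hbase module is available. Other backends use their own store classes and usually need further backend-specific settings.

```properties
# gora.properties (illustrative fragment; assumes the gora-hbase module is on the classpath)
# Default datastore class used when the application does not name one explicitly.
gora.datastore.default=org.apache.gora.hbase.store.HBaseStore
```

Nutch 2.x also has its own configuration for the store class, so check the documentation for the version in use.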