First, download the mongo-hadoop adapter:
git clone https://github.com/mongodb/mongo-hadoop.git
cd mongo-hadoop
git checkout release-1.0
In the mongo-hadoop directory, open build.sbt and change hadoopRelease in ThisBuild to the following:
hadoopRelease in ThisBuild := "0.20"
Then run ./sbt package (for more about SBT, see https://github.com/harrah/xsbt/wiki).
The build downloads its dependencies over the network, so if you are behind a restrictive firewall you may need a proxy or VPN for this step.
When the build completes, a jar file named mongo-hadoop-core_0.20.205.0-1.0.1.jar is generated under the core/target directory. Next, download the MongoDB Java driver:
wget --no-check-certificate https://github.com/downloads/mongodb/mongo-java-driver/mongo-2.7.3.jar
With both jars in hand, you can start developing mongo-hadoop programs.
To run the built-in example, first import the sample data into MongoDB with the following command:
./sbt load-sample-data
Then create a new project, for example Treasury, and copy the source files and resource files from mongo-hadoop/examples/treasury_yield into it.
Then edit the MongoDB URIs in mongo-treasury_yield.xml so that they point to the input collection and to the collection where the results should be stored:
<property>
  <!-- If you are reading from mongo, the URI -->
  <name>mongo.input.uri</name>
  <value>mongodb://127.0.0.1/mongo_hadoop.yield_historical.in</value>
</property>
<property>
  <!-- If you are writing to mongo, the URI -->
  <name>mongo.output.uri</name>
  <value>mongodb://127.0.0.1/mongo_hadoop.yield_historical.out</value>
</property>
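Because a stray space or a mismatched tag in this file is easy to miss, it can help to check the edited XML before running the job. The following is a minimal sketch that uses only the JDK's DOM parser, not Hadoop's own loader. The class name ConfigXmlCheck and the method readProperties are hypothetical, and the sample string assumes the usual Hadoop layout in which the property elements sit inside a configuration root element.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: reads name/value pairs from Hadoop-style config XML.
class ConfigXmlCheck {

    // Parses the XML text and returns the properties in document order.
    // Throws if the XML is not well-formed.
    static Map<String, String> readProperties(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Map<String, String> props = new LinkedHashMap<>();
        NodeList nodes = doc.getElementsByTagName("property");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element property = (Element) nodes.item(i);
            // Trim so that accidental whitespace around names and values is visible as a mismatch.
            String name = property.getElementsByTagName("name").item(0).getTextContent().trim();
            String value = property.getElementsByTagName("value").item(0).getTextContent().trim();
            props.put(name, value);
        }
        return props;
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample content mirroring the two properties shown above.
        String xml =
            "<configuration>"
          + "<property><name>mongo.input.uri</name>"
          + "<value>mongodb://127.0.0.1/mongo_hadoop.yield_historical.in</value></property>"
          + "<property><name>mongo.output.uri</name>"
          + "<value>mongodb://127.0.0.1/mongo_hadoop.yield_historical.out</value></property>"
          + "</configuration>";
        for (Map.Entry<String, String> e : readProperties(xml).entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

If the parser throws, or if either mongo.input.uri or mongo.output.uri is missing from the printed pairs, the file most likely needs another look before the job is packaged.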
Modify TreasuryYieldXMLConfig.java as follows:
Configuration.addDefaultResource("resources/mongo-treasury_yield.xml");
Configuration.addDefaultResource("resources/mongo-defaults.xml");
Then package the project into a jar file.
Run hadoop jar treasury.jar com.mongodb.hadoop.treasury.TreasuryYieldXMLConfig to start the Hadoop job. The results are written to MongoDB, in the collection named by mongo.output.uri.
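To find the results, it helps to see how a URI such as mongodb://127.0.0.1/mongo_hadoop.yield_historical.out breaks down: the part after the slash is database.collection, split at the first dot, so the database is mongo_hadoop and the collection is yield_historical.out. The sketch below illustrates that split with plain string handling. The class MongoUriParts is hypothetical and is not part of the driver or the adapter, and it assumes the simple host/database.collection form used in this article (no credentials, ports list, or options).

```java
// Hypothetical helper illustrating how a mongodb:// URI of the simple form
// "mongodb://host/database.collection" maps to its parts.
class MongoUriParts {
    final String host;
    final String database;
    final String collection;

    MongoUriParts(String host, String database, String collection) {
        this.host = host;
        this.database = database;
        this.collection = collection;
    }

    static MongoUriParts parse(String uri) {
        String prefix = "mongodb://";
        if (!uri.startsWith(prefix)) {
            throw new IllegalArgumentException("not a mongodb:// URI: " + uri);
        }
        String rest = uri.substring(prefix.length());
        int slash = rest.indexOf('/');
        if (slash < 0) {
            throw new IllegalArgumentException("missing /database.collection: " + uri);
        }
        String host = rest.substring(0, slash);
        String namespace = rest.substring(slash + 1);
        // Split at the FIRST dot only: collection names may themselves contain dots.
        int dot = namespace.indexOf('.');
        if (dot < 0) {
            throw new IllegalArgumentException("missing collection name: " + uri);
        }
        return new MongoUriParts(host, namespace.substring(0, dot), namespace.substring(dot + 1));
    }

    public static void main(String[] args) {
        MongoUriParts out = parse("mongodb://127.0.0.1/mongo_hadoop.yield_historical.out");
        // prints: 127.0.0.1 mongo_hadoop yield_historical.out
        System.out.println(out.host + " " + out.database + " " + out.collection);
    }
}
```

With the URIs used here, the job output would therefore be expected in the yield_historical.out collection of the mongo_hadoop database on 127.0.0.1.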