How to Use Hadoop MapReduce to Implement Remote Sensing Product Algorithms of Different Complexity
The MapReduce model can be used in three modes: single-Reduce, multi-Reduce, and no-Reduce (map-only). For index product production algorithms of different complexity, choose the MapReduce computing mode that fits the algorithm.
1) Product production algorithms with low complexity
For the production algorithms of low-complexity remote sensing products, a single MapReduce job is generally enough. In this case, select the multi-Reduce mode or the no-Reduce mode, depending on how many input files the algorithm needs.
When the index product algorithm takes only one input file (for example, producing a global environmental monitoring index product requires only one Land Level 2 product file in HDF format), select the no-Reduce mode. The Map stage implements the core algorithm of the index product, and its output is the final product. The specific computing process is as follows:
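As an illustration of the no-Reduce case, here is a minimal Python sketch that simulates a map-only job locally. It is not the author's original implementation. It assumes the HDF file has already been unpacked into text records of the form "row,col,red,nir" (a hypothetical layout), and it uses the commonly cited formula for the Global Environmental Monitoring Index (GEMI). The function names are made up for this example. On a real cluster, a job becomes map-only when the number of reduce tasks is set to 0 (for example `mapreduce.job.reduces=0`).

```python
def gemi(red, nir):
    """GEMI from red and near-infrared reflectance (both expected in 0..1, red < 1)."""
    eta = (2.0 * (nir**2 - red**2) + 1.5 * nir + 0.5 * red) / (nir + red + 0.5)
    return eta * (1.0 - 0.25 * eta) - (red - 0.125) / (1.0 - red)


def mapper(line):
    """Map one pixel record "row,col,red,nir" to "row,col<TAB>gemi".

    The record layout is an assumption made for this sketch.
    """
    row, col, red, nir = line.strip().split(",")
    value = gemi(float(red), float(nir))
    return f"{row},{col}\t{value:.4f}"


def run_map_only(lines):
    """Simulate a no-Reduce job: there is no shuffle or reduce step,
    so whatever the mapper emits is the final output."""
    return [mapper(line) for line in lines if line.strip()]


if __name__ == "__main__":
    sample = ["0,0,0.10,0.50", "0,1,0.20,0.40"]
    for record in run_map_only(sample):
        print(record)
```

Because each pixel is processed independently, no grouping by key is needed, which is why the Reduce stage can be dropped entirely.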
When the index product algorithm takes multiple input files (for example, producing a grassland drought index product requires several different products, such as surface reflectance, surface temperature, and rainfall), select the multi-Reduce mode. The Map stage organizes the input data, and the Reduce stage implements the core algorithm of the index product. The specific computing process is as follows:
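The multi-Reduce case can be sketched in the same way. The following Python simulation is an illustration under assumptions, not the author's original flow: each input product is assumed to be a list of "pixel_id,value" text records, the reducer count of 3 is arbitrary, and the formula in `reducer` is a made-up placeholder rather than a real drought index. The point it shows is the division of labor: the mapper only tags and keys the data, a hash partitioner sends all values for one pixel to the same reducer, and the reducer computes the index.

```python
import zlib
from collections import defaultdict

NUM_REDUCERS = 3  # assumed reducer count for this sketch


def mapper(product, line):
    """Tag a "pixel_id,value" record with its source product; the key is the pixel id."""
    pixel_id, value = line.strip().split(",")
    return pixel_id, (product, float(value))


def partition(key, num_reducers):
    """Stable hash partitioner: the same key always goes to the same reducer."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def reducer(pixel_id, tagged_values):
    """Combine the products for one pixel into a single index value.

    The formula below is a made-up placeholder, NOT a real drought index.
    """
    v = dict(tagged_values)
    if not {"reflectance", "temperature", "rainfall"} <= v.keys():
        return None  # skip pixels that lack one of the inputs
    index = v["reflectance"] * v["temperature"] / (1.0 + v["rainfall"])
    return pixel_id, round(index, 4)


def run_job(inputs, num_reducers=NUM_REDUCERS):
    """inputs: {product_name: [lines]}. Returns one output list per reducer."""
    # Map + shuffle: group values by key inside each reducer's partition.
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for product, lines in inputs.items():
        for line in lines:
            key, tagged = mapper(product, line)
            partitions[partition(key, num_reducers)][key].append(tagged)
    # Reduce: each partition is independent (and would run in parallel on a cluster).
    outputs = []
    for part in partitions:
        results = [reducer(k, vals) for k, vals in sorted(part.items())]
        outputs.append([r for r in results if r is not None])
    return outputs
```

Each reducer writes its own output list, which corresponds to one output file per reducer in a real job; pixels are spread across reducers by key, so the core algorithm runs in parallel.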
2) Product production algorithms with high complexity
For the production algorithms of highly complex remote sensing products, a single MapReduce job often cannot meet the production requirements, and several MapReduce jobs are needed to produce the product. In this case, you can use the Oozie workflow engine to control the workflow across these jobs and resolve the dependencies between them. For an introduction to Oozie and an installation tutorial, refer to another blog post.
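To make the dependency problem concrete, here is a small Python sketch of what a workflow engine does at its core: run each job only after the jobs it depends on have succeeded. This does not use Oozie or its API (Oozie describes workflows in its own XML definition files). The job names and the `run_job` callback are hypothetical, and the sketch relies on the standard-library `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical job names; each job maps to the set of jobs it depends on.
DEPENDENCIES = {
    "preprocess": set(),
    "cloud_mask": {"preprocess"},
    "compute_index": {"preprocess", "cloud_mask"},
    "mosaic": {"compute_index"},
}


def run_workflow(dependencies, run_job):
    """Run jobs in dependency order and stop at the first failure.

    run_job is a caller-supplied function (name -> bool) that would
    submit one MapReduce job and report whether it succeeded.
    """
    order = list(TopologicalSorter(dependencies).static_order())
    finished = []
    for name in order:
        if not run_job(name):
            raise RuntimeError(f"job {name!r} failed; downstream jobs skipped")
        finished.append(name)
    return finished


if __name__ == "__main__":
    # Pretend every job succeeds and print the execution order.
    print(run_workflow(DEPENDENCIES, lambda name: True))
```

A real workflow engine adds what this sketch leaves out, such as retries, scheduling, and running independent jobs in parallel.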