Parallel MapReduce PLS algorithm and its application in spectral analysis
Yang Huihua, Du Lingling, Lee Dexterity, Tang Tianbiao, Guo, Liang Jonglin, yiming, Luo Guoan
The partial least squares (PLS) algorithm is widely used for spectral modeling. For massive spectral data sets, however, modeling and optimization on a single computer take a very long time. Based on the MapReduce programming model, this paper proposes a parallel MapReduce PLS regression algorithm consisting of two stages: parallel data normalization and parallel principal component extraction. A Hadoop cloud computing cluster is built from multiple commodity computers, and the algorithm is verified experimentally using near-infrared spectral processing as an example.
Keywords: parallel partial least squares; near-infrared spectroscopy; MapReduce; parallel computing; Hadoop; cloud computing
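
The two stages named in the abstract can be illustrated with a minimal single-process sketch. It is not the paper's Hadoop implementation: the function names (map_stats, reduce_stats, map_xty, first_weight), the use of the sample standard deviation, and the choice of a PLS1-style first weight vector w = X^T y / ||X^T y|| are all assumptions made for illustration. The point is only that both stages reduce to sums over row chunks, which is what makes them fit the MapReduce model.

```python
from functools import reduce
from math import sqrt

def map_stats(chunk):
    """Map: per-chunk row count, column sums, and column sums of squares."""
    n_cols = len(chunk[0])
    sums = [sum(row[j] for row in chunk) for j in range(n_cols)]
    sq_sums = [sum(row[j] ** 2 for row in chunk) for j in range(n_cols)]
    return len(chunk), sums, sq_sums

def reduce_stats(a, b):
    """Reduce: merge two partial statistics element-wise."""
    return (a[0] + b[0],
            [x + y for x, y in zip(a[1], b[1])],
            [x + y for x, y in zip(a[2], b[2])])

def column_mean_std(chunks):
    """Global column means and sample standard deviations (assumed n - 1)."""
    n, sums, sq_sums = reduce(reduce_stats, map(map_stats, chunks))
    means = [s / n for s in sums]
    # max(..., 0.0) guards against tiny negative values from rounding.
    stds = [sqrt(max(q - n * m * m, 0.0) / (n - 1))
            for q, m in zip(sq_sums, means)]
    return means, stds

def normalize_chunk(chunk, means, stds):
    """Second map pass: center and scale every row of one chunk."""
    return [[(v - m) / s if s > 0 else 0.0
             for v, m, s in zip(row, means, stds)]
            for row in chunk]

def map_xty(pair):
    """Map: this chunk's contribution to X^T y."""
    x_chunk, y_chunk = pair
    n_cols = len(x_chunk[0])
    return [sum(row[j] * y for row, y in zip(x_chunk, y_chunk))
            for j in range(n_cols)]

def first_weight(x_chunks, y_chunks):
    """Reduce the partial X^T y vectors and scale to unit length."""
    partials = map(map_xty, zip(x_chunks, y_chunks))
    xty = reduce(lambda a, b: [p + q for p, q in zip(a, b)], partials)
    norm = sqrt(sum(v * v for v in xty))
    return [v / norm for v in xty]

# Toy data: three "spectra" with two "wavelengths", split into two chunks.
x_chunks = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0]]]
y_chunks = [[-1.0, 0.0], [1.0]]  # response values, already centered

means, stds = column_mean_std(x_chunks)
x_norm = [normalize_chunk(c, means, stds) for c in x_chunks]
w = first_weight(x_norm, y_chunks)
```

On a real cluster each chunk would be an input split handled by a separate mapper, and only the small partial statistics would be sent to the reducer. Later components would repeat the same pattern on deflated data.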