Sometimes a slow program is not the algorithm's fault but the fault of how a library is used; or the library itself is fine and your own code is simply not efficient enough. After countless painful waits, I decided to compare these so-called efficient linear algebra libraries, Armadillo, Eigen, and OpenCV (whose goal is computer vision, but which also provides a wealth of algebraic computing capabilities), and see how they perform. Others have done similar comparisons, such as "OpenCV vs. Armadillo vs. Eigen on Linux revisited", which benchmarks these libraries on various matrix operations and summarizes the results thoroughly. That alone is not enough in the computer vision field, however; consider, for example, computing a similarity measure. There are many methods for this; here we look at the most basic one, SAD (Sum of Absolute Differences): subtract two matrices (or vectors), take the absolute value element-wise, and sum the result. The computation looks trivial, but the comparison results surprised me. The code first.

[cpp]
// PerformanceTest.h
#pragma warning(disable: 4344)
#define EIGEN_NO_DEBUG
#define NDEBUG
#include <emmintrin.h>
#include <opencv.hpp>
#include <vector>
#include <iostream>
#include <armadillo>
#include <Eigen/Dense>
#include "Timer.h"
using namespace std;

[cpp]
// PerformanceTest.cpp
#include "PerformanceTest.h"

int main(void)
{
    Timer timer;            // timer
    double elapsedTime;     // elapsed time in milliseconds
    double res;             // SAD value
    int i;                  // loop variable
    float bnd = 1e5;        // loop count

    // Armadillo (matrix contents are uninitialized; only timing matters here)
    arma::mat armA(4, 1);
    arma::mat armB(4, 1);
    timer.start();
    for (i = 0; i < bnd; ++i)
    {
        res = arma::accu(arma::abs(armA - armB));
        // element-wise version, used in the second comparison below:
        // res = 0;
        // for (int idx = 0; idx < 4; ++idx)
        // {
        //     res += abs(armA(idx, 0) - armB(idx, 0));
        // }
    }
    elapsedTime = timer.getElapsedTimeInMilliSec();
    cout << "arma time: " << elapsedTime << " ms" << endl;

    // Eigen
    Eigen::Vector4d eiA;
    Eigen::Vector4d eiB;
    timer.start();
    for (i = 0; i < bnd; ++i)
    {
        res = (eiA - eiB).cwiseAbs().sum();
        // res = 0;
        // for (int idx = 0; idx < 4; ++idx)
        // {
        //     res += abs(eiA(idx, 0) - eiB(idx, 0));
        // }
    }
    elapsedTime = timer.getElapsedTimeInMilliSec();
    cout << "eigen time: " << elapsedTime << " ms" << endl;

    // OpenCV
    cv::Mat ocvA(4, 1, CV_64F);
    cv::Mat ocvB(4, 1, CV_64F);
    timer.start();
    for (i = 0; i < bnd; ++i)
    {
        res = cv::sum(cv::abs(ocvA - ocvB))[0];
        // res = 0;
        // for (int idx = 0; idx < 4; ++idx)
        // {
        //     res += abs(ocvA.at<double>(idx, 0) - ocvB.at<double>(idx, 0));
        // }
    }
    elapsedTime = timer.getElapsedTimeInMilliSec();
    cout << "opencv time: " << elapsedTime << " ms" << endl;

    // raw pointer operation (16-byte-aligned buffers)
    double* a = (double*)_mm_malloc(4 * sizeof(double), 16);
    double* b = (double*)_mm_malloc(4 * sizeof(double), 16);
    int len = ocvA.rows;
    timer.start();
    for (i = 0; i < bnd; ++i)
    {
        res = 0;
        for (int idx = 0; idx < len; ++idx)
        {
            res += abs(a[idx] - b[idx]);
        }
    }
    elapsedTime = timer.getElapsedTimeInMilliSec();
    cout << "array operation: " << elapsedTime << " ms" << endl;

    // release resources
    _mm_free(a);
    _mm_free(b);
    return 0;
}

The timing uses a cross-platform, high-precision timer class (the downloadable High Resolution Timer). Built in release mode, the code above produces:

[plain]
arma time: 0.87827 ms
eigen time: 0.13641 ms
opencv time: 179.599 ms
array operation: 0.135591 ms

As the numbers show, Eigen's time is essentially the same as the raw array loop's, Armadillo takes around six to seven times as long, and OpenCV is painful to look at; I don't know what OpenCV is doing internally, but the gap is enormous. Next I ran a second comparison, computing the SAD inside each loop element by element, in the same style as the array version (the commented-out loops above).
The results are as follows:

[plain]
arma time: 0.145423 ms
eigen time: 0.134772 ms
opencv time: 0.134362 ms
array operation: 0.139278 ms

This time the four are basically equivalent. From these comparisons we can draw two conclusions:
1. Although these libraries may be highly efficient at heavyweight operations such as matrix multiplication, lightweight operations on small data can carry significant per-call overhead.
2. Accessing elements through the libraries' accessors (e.g. the templated at<double>() or operator()) is no less efficient than raw array indexing; the performance is basically equivalent.