Blas library Learning

Source: Internet
Author: User

First, the naming rules:

There are three levels of blas operations,

 

Level 1
Vector operations, e.g. Y =/alpha x + yVector operations
Level 2
Matrix-vector operations, e.g. Y =/Alpha a x +/beta yMatrix and vector operations
Level 3
Matrix-matrix operations, e.g. C =/Alpha a B + COperations on Matrices and Matrices

Each routine has a name which specifies the operation, the type of matrices involved and their precisions. Some of the most common operations and their names are given below,

 

 

 

Dot
Scalar Product, X ^ t y
Axpy
Vector Sum, /Alpha x + y
MV
Matrix-vector product, A x
SV
Matrix-vector solve, Inv (a) x
Mm
Matrix-matrix product, A B
Sm
Matrix-matrix solve, Inv (a) B

 

The type of matrices are,

 

Ge
General
GB
General band
Sy
Symmetric
Sb
Symmetric band
SP
Symmetric packed
He
Hermitian
HB
Hermitian band
HP
Hermitian packed
Tr
Triangular
TB
Triangular band
TP
Triangular packed

Each operation is defined for four precisions,

 

S
Single real
D
Double real
C
Single Complex
Z
Double Complex

Thus, for example, the name sgemm stands for "Single-precision general matrix-matrix multiply" and zgemm stands for "double-precision complex matrix-matrix multiply ".

 

Therefore, for example, the function named sgemm indicates "Single-precision normal matrix multiplication", and zgemm indicates "double-precision complex matrix multiplication ".

 

More, can refer to: http://www.netlib.org/blas/blasqr.pdf

 

The following example is displayed on the csdn Forum. You can use gotoblas2 to make a successful call.

 

Program Description: The following matrix. the C file calls C code, Blas Level 1 function (ddot), Blas Level 2 function (dgemv), And Blas Level 3 function (dgemm) to complete matrix calculation: yours_multiply is the c source code, it directly relies on the compiler to generate optimization code. Ddot_multiply and dgemv_multiply use gotoblas2 to call partial matrix operations. Dgemm_multiply directly calls the matrix computing function of gotoblas2.

// Simple minded matrix multiply <br/> # include <stdio. h> <br/> # include <time. h> <br/> # include <stdlib. h> <br/> # include <Common. h> // gotoblas2 <br/> # include <cblas. h> // gotoblas2 </P> <p> void print_arr (int n, char * Name, double * array); <br/> void init_arr (int n, double * A); <br/> void dgemm_multiply (double * a, double * B, double * C, int N); <br/> void dgemv_multiply (double *, double * B, double * C, int N); <B R/> void ddot_multiply (double * a, double * B, double * C, int N); <br/> void yours_multiply (double * a, double * B, double * C, int N); </P> <p> int main (INT argc, char * argv []) <br/>{< br/> clock_t start, stop; <br/> int I, j; <br/> int N; <br/> double * A; <br/> double * B; <br/> double * C; <br/> If (argc <2) <br/> {<br/> printf ("Enter matrix size N = "); <br/> // please enter small number first to ensure that t He <br/> // multiplication is correct! And then you may enter <br/> // a "Reasonably" large number say like 500 or even 1000 <br/> scanf ("% d", & N ); <br/>}< br/> else <br/> {<br/> N = atoi (argv [1]); <br/>}</P> <p> A = (double *) malloc (sizeof (double) * n); <br/> B = (double *) malloc (sizeof (double) * n); <br/> C = (double *) malloc (sizeof (double) * n ); </P> <p> init_arr (n, a); <br/> init_arr (n, B); </P> <p> Start = clock (); <br/> yours_multiply (A, B, C, N); <br/> stop = clock (); <br/> printf ("roll_your_own_multiply (). elapsed time = % G seconds/N ", <br/> (double) (stop-start)/clocks_per_sec ); <br/> // print simple test case of data to be sure multiplication is correct <br/> If (n <7) <br/>{< br/> print_arr (n, "A", a); <br/> print_arr (n, "B", B ); <br/> print_arr (n, "C", c); <br/>}< br/> free (a); <br/> free (B ); <br/> free (c); </P> <p> // ddot m Ultiply <br/> A = (double *) malloc (sizeof (double) * n); <br/> B = (double *) malloc (sizeof (double) * n); <br/> C = (double *) malloc (sizeof (double) * n); <br/> init_arr (n, ); <br/> init_arr (n, B); </P> <p> Start = clock (); <br/> ddot_multiply (A, B, C, N ); <br/> stop = clock (); </P> <p> printf ("ddot_multiply (). elapsed time = % G seconds/N ", <br/> (double) (stop-start)/clocks_per_sec); <br/> // print simpl E test case of data to be sure multiplication is correct <br/> If (n <7) <br/>{< br/> print_arr (n, "", a); <br/> print_arr (n, "B", B); <br/> print_arr (n, "C", C ); <br/>}< br/> free (a); <br/> free (B); <br/> free (C ); </P> <p> // dgemv multiply <br/> // reallcoate to force cash to be flushed <br/> A = (double *) malloc (sizeof (double) * n); <br/> B = (double *) malloc (sizeof (double) * n); <br/> C = (double *) m Alloc (sizeof (double) * n); <br/> init_arr (n, a); <br/> init_arr (n, B ); </P> <p> Start = clock (); <br/> dgemv_multiply (A, B, C, N); <br/> stop = clock (); </P> <p> printf ("dgemv_multiply (). elapsed time = % G seconds/N ", <br/> (double) (stop-start)/clocks_per_sec ); <br/> // print simple test case of data to be sure multiplication is correct <br/> If (n <7) <br/>{< br/> print_arr (n, "A", a); <br/> Print _ Arr (n, "B", B); <br/> print_arr (n, "C", c); <br/>}< br/> free (); <br/> free (B); <br/> free (C ); </P> <p> // dgemm multiply <br/> // reallocate to force cash to be flushed <br/> A = (double *) malloc (sizeof (double) * n); <br/> B = (double *) malloc (sizeof (double) * n); <br/> C = (double *) malloc (sizeof (double) * n); <br/> init_arr (n, a); <br/> init_arr (n, B ); </P> <p> Start = clock (); <br/> dgemm_multiply (a, B, c, N); <br/> stop = clock (); </P> <p> printf ("dgemm_multiply (). elapsed time = % G seconds/N ", <br/> (double) (stop-start)/clocks_per_sec ); <br/> // print simple test case of data to be sure multiplication is correct <br/> If (n <7) <br/>{< br/> print_arr (n, "A", a); <br/> print_arr (n, "B", B ); <br/> print_arr (n, "C", c); <br/>}</P> <p> free (); <br/> free (B); <br/> free (c); </P> <p> return 0; <br/>}</ P> <p> // brute force way of matrix multiply <br/> void yours_multiply (double * a, double * B, double * C, int N) <br/> {<br/> int I, j, k; <br/> for (I = 0; I <n * n; I ++) <br/> C [I] = 0; <br/> for (I = 0; I <n; I ++) <br/> {<br/> for (j = 0; j <n; j ++) <br/> {<br/> for (k = 0; k <n; k ++) <br/> {<br/> C [N * I + J] + = A [n * I + k] * B [N * k + J]; <br/>}</P> <p> // The ddot way to matrix multiply <br/> void DDO T_multiply (double * a, double * B, double * C, int N) <br/>{< br/> int I, j; <br/> int incx = 1; <br/> int incy = N; <br/> for (I = 0; I <n; I ++) <br/> {<br/> for (j = 0; j <n; j ++) <br/> {<br/> C [N * I + J] = cblas_ddot (n, & A [n * I], incx, & B [J], incy); <br/>}< br/> // dgemv way of matrix multiply <br/> void dgemv_multiply (double *, double * B, double * C, int N) <br/>{< br/> int I; <br/> double alpha = 1.0, Beta = 0 .; <br/> int incx = 1; <br/> int incy = N; <br/> for (I = 0; I <n; I ++) <br/>{< br/> cblas_dgemv (cblasrowmajor, cblasnotrans, N, N, Alpha, A, N, & B [I], n, beta, & C [I], n); </P> <p >}< br/>}</P> <p> // dgemm way. the prefered way, especially for large matrices <br/> void dgemm_multiply (double * a, double * B, double * C, int N) <br/>{< br/> int I; <br/> double alpha = 1.0, Beta = 0 .; <br/> int incx = 1; <Br/> int incy = N; <br/> cblas_dgemm (cblasrowmajor, cblasnotrans, cblasnotrans, N, Alpha, B, n, A, N, beta, c, N); <br/>}</P> <p> // initialize array with random data <br/> void init_arr (int n, double *) <br/> {<br/> int I, j; <br/> for (I = 0; I <n; I ++) <br/> {<br/> for (j = 0; j <n; j ++) <br/> {<br/> A [I * n + J] = (I + J + 1) % 10; // keep all entries less than 10. pleasing to the eye! <Br/>}</P> <p> // print array to STD out <br/> void print_arr (int n, char * Name, double * array) <br/>{< br/> int I, j; <br/> printf ("/n % s/n", name ); <br/> for (I = 0; I <n; I ++) <br/> {<br/> for (j = 0; j <N; j ++) <br/>{< br/> printf ("% G/T", array [N * I + J]); <br/>}< br/> printf ("/N"); <br/>}< br/>

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.