"Parallel Computing" using MPI for distributed memory Programming (I.)


Following the introduction in the preparation section of this Parallel Computing series, we know that MPI (Message Passing Interface) implements parallelism at the process level: processes cooperate by passing messages to one another. MPI is not a new programming language; it is a library of functions that can be called from C, C++, and Fortran programs. These functions are mainly concerned with communication between processes. MPI implementations exist for both Windows and Linux; this article uses Windows 10 as the demonstration development environment.

1. Building the MPI development environment with Windows 10 + VS2015
    • Download MPI for Windows

To support MPI on ordinary personal computers, Microsoft provides its own MPI implementation for Windows. If you want the reference MPI implementation instead, go directly to www.mpich.org and download the version that matches your system.

Because I experiment on my personal notebook, I use the Microsoft HPC Pack R2 MS-MPI Redistributable Package with Service Pack 4 - Chinese (Simplified): http://www.microsoft.com/zh-cn/download/details.aspx?id=14737.

    • Installing MPI

My computer is 64-bit, so I installed mpi_x64.msi, which installs by default to C:\Program Files\Microsoft HPC Pack R2. To make it convenient to run and debug code later, it is best to set the environment variables: in the user variable Path, append C:\Program Files\Microsoft HPC Pack R2\Bin\.
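For example, a quick way to try this out from a command prompt (a minimal sketch only; the install directory may differ on your machine, and a permanent change is normally made through the System Properties dialog instead):

rem Append the MS-MPI Bin directory to PATH for the current cmd session only
set PATH=%PATH%;C:\Program Files\Microsoft HPC Pack R2\Bin\

rem Check that mpiexec can now be found
where mpiexec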

    • Configuring VS2015

Configure the project directories, that is, add the MPI include and lib paths.

Add the linker dependencies.
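As a rough sketch of these two steps (the sub-directory names below are assumptions based on a typical HPC Pack R2 SDK layout; adjust them to whatever your installation actually contains):

Project Properties -> VC++ Directories -> Include Directories: add C:\Program Files\Microsoft HPC Pack R2\Inc
Project Properties -> VC++ Directories -> Library Directories: add C:\Program Files\Microsoft HPC Pack R2\Lib\amd64
Project Properties -> Linker -> Input -> Additional Dependencies: add msmpi.lib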

    • Compile

Almost everyone's first program is some form of "Hello, world", so I also wrote such a test sample:

#include"mpi.h"#include<stdio.h>intMainintargcChar*argv[]) {    intrank, numproces; intNamelen; CharProcessor_name[mpi_max_processor_name]; Mpi_init (&AMP;ARGC, &argv); Mpi_comm_rank (Mpi_comm_world,&rank);//Get Process NumberMpi_comm_size (Mpi_comm_world, &numproces);//number of processes returning the communication childmpi_get_processor_name (Processor_name,&Namelen); fprintf (stderr,"Hello world! process%d of%d on%s\n", Rank, numproces, processor_name);    Mpi_finalize (); return 0;}

In the code above, the #include "mpi.h" header on the first line must be present. Compile the program under VS2015 to generate the EXE file (it is placed in the Debug folder), open a cmd window, change into the Debug directory, and type: mpiexec -n 4 TestForMPI.exe. The option -n 4 means that 4 processes are used for the parallel computation, and the results are as follows:
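The original screenshot of the run is not reproduced here; the output looks roughly like the following (a sketch only: the ordering of the lines varies from run to run, and the host name will be your own machine's):

Hello world! process 2 of 4 on MY-LAPTOP
Hello world! process 0 of 4 on MY-LAPTOP
Hello world! process 3 of 4 on MY-LAPTOP
Hello world! process 1 of 4 on MY-LAPTOP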

2. Theoretical knowledge

The example above gives a first impression of parallel computation with MPI, but it does not yet show how to actually write an MPI parallel program; for that we need some theoretical knowledge. The example uses several functions that may not be clear to MPI beginners, so let us begin with them.

MPI_Init: tells the MPI system to perform all necessary initialization. It is written at the very beginning, before the MPI parallel computation starts. Its signature is:

int MPI_Init(
    int    *argc_p,
    char ***argv_p
);

The parameters argc_p and argv_p are pointers to the argc and argv parameters of the main function. To understand this, recall how main takes its arguments: C allows main to take two parameters, customarily written argc and argv, so the function header can be written as main(argc, argv). argc (the first parameter) must be an integer, and argv (the second parameter) must be an array of pointers to strings. argc holds the number of arguments on the command line (note: the program name itself also counts as one), and its value is assigned automatically by the system from the number of arguments actually typed. For example, given the command line C:\>E24 BASIC dbase FORTRAN, the file name E24 is itself an argument, so there are 4 arguments in total and argc receives the value 4. argv is an array of string pointers whose elements hold the addresses of the individual command-line strings (the arguments are handled as strings).
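A tiny standalone illustration of this, unrelated to MPI (the program name is just an example):

#include <stdio.h>

/* Echo the command-line arguments: argv[0] is the program name itself. */
int main(int argc, char *argv[])
{
    int i;
    printf("argc = %d\n", argc);
    for (i = 0; i < argc; i++)
        printf("argv[%d] = %s\n", i, argv[i]);
    return 0;
}

Running it as E24 BASIC dbase FORTRAN would print argc = 4 followed by the four strings.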

In MPI_Init, however, it is not mandatory to pass these two parameters; when they are not needed, both can be set to NULL.
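In other words, either form of the call is valid (a minimal sketch; a program uses one or the other, not both):

MPI_Init(&argc, &argv);   /* forward the command-line arguments to the MPI system */
MPI_Init(NULL, NULL);     /* initialize without passing the command line */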

Communicator: MPI_COMM_WORLD represents a set of processes that can send messages to each other.

MPI_Comm_rank: the function used to get the rank (process number) of the calling process within the communicator.

MPI_Comm_size: the function used to obtain the number of processes in the communicator.

The signatures of these two functions are as follows:

int MPIAPI MPI_Comm_rank(
    __in MPI_Comm comm,
    int *rank
);

int MPIAPI MPI_Comm_size(
    __in MPI_Comm comm,
    int *size
);

Their first argument is the communicator passed in as a parameter; the second argument is an output parameter that returns, respectively, the rank of the calling process and the number of processes in that communicator.

MPI_Finalize: tells the MPI system that we are finished using MPI. It is always placed at the very end of the parallel-computing block, and no MPI calls may appear after it.

The above only describes the basic structure of an MPI parallel computation and does not really involve communication between processes. For genuine parallelism, processes need to communicate. The following introduces two interprocess communication functions, MPI_Send and MPI_Recv, used for sending and receiving messages respectively.

MPI_Send: blocking message send. Its signature is:

int MPI_Send(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm)

The parameter buf is the send buffer; count is the number of data items to send; datatype is the data type being sent; dest is the destination (process rank) of the message, an integer in the range 0 to np-1 (np is the number of processes in the communicator comm) or MPI_PROC_NULL; tag is a message label, an integer between 0 and MPI_TAG_UB; and comm is the communicator.

MPI_Recv: blocking message receive. Its signature is:

int MPI_Recv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status *status)

The parameter buf is the receive buffer; count is the number of data items, an upper bound on the length of the data received (the actual length can be obtained by calling MPI_Get_count); datatype is the data type to receive; source is the origin (process rank) of the message, an integer in the range 0 to np-1 (np is the number of processes in the communicator comm), or MPI_ANY_SOURCE, or MPI_PROC_NULL; tag is a message label, an integer between 0 and MPI_TAG_UB; comm is the communicator; and status returns the receive status.

MPI_Status: describes the completed message transfer. The structure has several fields whose exact meanings can be looked up in the reference manual; the most important ones are:

typedef struct {
    /* ...... */
    int MPI_SOURCE;   /* message source (rank) */
    int MPI_TAG;      /* message tag */
    int MPI_ERROR;    /* error code */
    /* ...... */
} MPI_Status;
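As a small illustration of how these pieces fit together, here is a fragment (a sketch only: it assumes the usual MPI_Init/MPI_Comm_rank setup and at least 2 running processes): process 1 sends an array of doubles to process 0, which receives it with a generous count and then inspects the status.

double data[10];
double recvbuf[100];          /* deliberately larger than the message */
MPI_Status status;
int received, i;

if (rank == 1) {
    for (i = 0; i < 10; i++) data[i] = i;                    /* something to send */
    MPI_Send(data, 10, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);    /* 10 doubles to process 0, tag 0 */
} else if (rank == 0) {
    /* count = 100 is only an upper bound on what may arrive */
    MPI_Recv(recvbuf, 100, MPI_DOUBLE, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &status);
    MPI_Get_count(&status, MPI_DOUBLE, &received);            /* actual number received: 10 */
    printf("got %d doubles from process %d\n", received, status.MPI_SOURCE);
}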
3. Example

Having introduced the most basic interprocess communication functions, we can now write a more complex and more meaningful program: one that computes a definite integral using the trapezoidal rule.

The basic idea of the trapezoidal rule is to divide the interval on the x-axis into n sub-intervals of equal length, approximate the area under the curve over each sub-interval by a trapezoid, and then sum the areas of these trapezoids.

Assume the endpoints of one sub-interval are x_i and x_{i+1}, so the length of every sub-interval is h = x_{i+1} - x_i. The area of the trapezoid over that sub-interval is then

    h * [f(x_i) + f(x_{i+1})] / 2

Since the n sub-intervals are of equal length, h = (b - a)/n, the interior points are x_i = a + i*h, and the boundaries are x_0 = a and x_n = b. The sum of the areas of all the trapezoids over the region is

    h * [f(x_0) + f(x_1)]/2 + h * [f(x_1) + f(x_2)]/2 + ... + h * [f(x_{n-1}) + f(x_n)]/2

which can be rearranged into

    h * [ f(x_0)/2 + f(x_1) + f(x_2) + ... + f(x_{n-1}) + f(x_n)/2 ]

Therefore, the serial program code can be written like this:

/* serial trapezoidal rule */
h = (b - a) / n;
approx = (f(a) + f(b)) / 2.0;
for (i = 1; i <= n - 1; i++) {
    x_i = a + i * h;
    approx += f(x_i);
}
approx = h * approx;

By analyzing the serial program we can identify two kinds of tasks in this example: the first computes the area of a single trapezoidal region, and the second sums those areas.

Suppose f(x) = x^3, and we divide the interval [0, 3] into 1024 sub-intervals to compute the integral (the exact value is 3^4/4 = 20.25, which gives us something to check against).

#include"mpi.h"#include<stdio.h>#include<cmath>DoubleTrap (DoubleLEFT_ENDPT,DoubleRIGHT_ENDPT,DoubleTrap_count,DoubleBase_len);DoubleFDoublex);intMainintargcChar*argv[]) {    intMy_rank =0, COMM_SZ =0, n =1024x768, Local_n =0; DoubleA =0.0, B =3.0, h =0, local_a =0, Local_b =0; DoubleLocal_int =0, Total_int =0; intsource; Mpi_init (&AMP;ARGC, &argv); Mpi_comm_rank (Mpi_comm_world,&My_rank); Mpi_comm_size (Mpi_comm_world,&COMM_SZ); H= (b-a)/n;/*h is the same for all processes*/Local_n= N/COMM_SZ;/*So is the number of trapezoids*/local_a= A + my_rank*local_n*h; Local_b= Local_a + local_n*h; Local_int=Trap (local_a, Local_b, Local_n, h); if(My_rank! =0) {mpi_send (&local_int,1, Mpi_double,0,0, Mpi_comm_world); }    Else{total_int=Local_int;  for(Source =1; SOURCE < COMM_SZ; source++) {MPI_RECV (&local_int,1, mpi_double, Source,0, Mpi_comm_world, Mpi_status_ignore); Total_int+=Local_int; }    }    if(My_rank = =0) {printf ("with n =%d trapezoids, our estimate\n", N); printf ("Of the integral from%f to%f =%.15e\n", A, b, total_int);    } mpi_finalize (); return 0;}//integral functions of sub-regionsDoubleTrap (DoubleLEFT_ENDPT,DoubleRIGHT_ENDPT,DoubleTrap_count,DoubleBase_len) {    DoubleEstimate =0, x =0; inti; Estimate= (f (LEFT_ENDPT) + f (RIGHT_ENDPT))/2.0;  for(i =1; I <= Trap_count-1; i++) {x= LEFT_ENDPT +Base_len; Estimate+=f (x); } estimate= estimate*Base_len; returnestimate;}//Mathematical FunctionsDoubleFDoublex) {    returnPOW (x,3);}

Running the code above produces a result like the following.
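For example, with 4 processes (a sketch only; the original screenshot is not reproduced here, and the trailing digits will vary slightly with the process count and floating-point rounding):

mpiexec -n 4 TestForMPI.exe
With n = 1024 trapezoids, our estimate
of the integral from 0.000000 to 3.000000 = 2.025001931190491e+001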

What this program does is divide the 1024 sub-intervals evenly among however many processes are requested on the console (for example 100). Each process solves its own sub-task; when they finish, processes 1 to 99 send their partial results with MPI_Send, and process 0 uses MPI_Recv to receive them, summing the results of every process to obtain the integral over the whole interval.

The communication pattern of this parallel computation is simple: each of processes 1 to comm_sz-1 sends one message (its partial sum) to process 0, which receives them one by one.

At this point we can already use the MPI_Send and MPI_Recv functions to write a simple parallel program. But notice that the final summation is still done entirely by process 0; to improve performance we need to go further and use collective communication, which the next chapter will explain in depth.
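As a preview of that idea (a sketch only, not part of this chapter's program): the whole send/receive loop above could be replaced by a single collective call such as MPI_Reduce, letting the MPI library combine the partial sums for us.

/* every process contributes local_int; the sum arrives in total_int on process 0 */
MPI_Reduce(&local_int, &total_int, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);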

"Parallel Computing" using MPI for distributed memory Programming (I.)

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.