Buffer size problem with MPI_Send

Source: Internet
Author: User

I have recently been busy porting a program, a job that sounded as simple as it gets: take an MPI program that had already been tuned at the company and move it to the big system in GZ. I assumed it would work in one step, but instead I ran into one problem after another. First there were compilation issues (mentioned in my first blog post), which I resolved, and then the program failed at runtime. The error message was "Fatal error in MPI: other MPI error", and the process that died was the sending process, which left me completely baffled.

I then tried every debugging approach I could think of. First GDB, but to debug the MPI program I had to drop the -O3 optimization option, which made it run painfully slowly, and so on. So I fell back on the most primitive technique, printing debug output, only to be surprised that information which should have been printed was missing or appeared out of order, which was thoroughly confusing. After asking Xin I learned that the output may be sitting in an unflushed buffer, so adding an explicit flush solved it (syntax: cout << "output content" << ... << flush). With the help of that output I gradually narrowed the error down to a statement that sends a large file with MPI_Send: small pieces of information such as the file name and size were received normally, but sending the file itself failed, so I suspected a buffer size problem. The buffer for the file name and other metadata was set to 100 bytes, while the file itself, being much larger, was sent in a loop with the send buffer set to 1MB. That setting had worked fine when running at the company, and to reduce communication overhead I had even wanted to enlarge it to, say, 10MB or more. But on the big system it failed. With the help of the maintenance staff there I experimented with the send buffer size and finally found that sizes below 64KB ran normally, while sizes of 64KB or more produced the error above; presumably this comes down to differences between specific MPI implementations.
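A minimal sketch of what that chunked send loop might have looked like, assuming the file is read and sent piece by piece with MPI_Send; the names (CHUNK_SIZE, send_file) and the tag value are illustrative, not from the original code:

    #include <mpi.h>
    #include <cstdio>
    #include <iostream>
    #include <vector>

    // Chunk size for the file transfer; on the big system anything >= 64*1024
    // triggered "Fatal error in MPI: other MPI error", so stay below that here.
    const int CHUNK_SIZE = 32 * 1024;

    void send_file(const char *path, int dest, MPI_Comm comm)
    {
        std::FILE *fp = std::fopen(path, "rb");
        if (fp == NULL) return;

        std::vector<char> buf(CHUNK_SIZE);          // dynamically allocated buffer
        std::size_t n;
        while ((n = std::fread(buf.data(), 1, buf.size(), fp)) > 0) {
            // flush so the trace shows up immediately, as described above
            std::cout << "sending " << n << " bytes\n" << std::flush;
            MPI_Send(buf.data(), (int)n, MPI_CHAR, dest, 1, comm);
        }
        MPI_Send(buf.data(), 0, MPI_CHAR, dest, 1, comm);   // zero-length chunk = end of file
        std::fclose(fp);
    }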

So I tried switching from standard mode to MPI's buffered send mode, in which the user supplies the buffer instead of relying on the system buffer, so in practice there is no size limit. The rough procedure is: allocate a large buffer, attach it to MPI with MPI_Buffer_attach, and then call MPI_Bsend. MPI_Bsend copies the contents of the send buffer into the user-supplied buffer and returns immediately so the next statement can execute, while MPI transmits the data to the receiver from the user's buffer in the background. But since I send in a loop, execution keeps going, and the previous message may not have been delivered yet when the next MPI_Bsend copies new data into the user buffer, which could cause the first message to be lost. So after each MPI_Bsend I call MPI_Buffer_detach to drain the user's buffer; detaching blocks, i.e. it waits until the buffered content has actually been sent, which guarantees data integrity. Note that although MPI_Buffer_attach and MPI_Buffer_detach take similar-looking parameters, the first parameter of the former is the buffer's starting address (to tell MPI where the user-supplied buffer is) and the second is the buffer size, whereas the first parameter of the latter is the address of a pointer (effectively void **) and the second is the address of the size variable, the purpose being that MPI writes the buffer address and size back into them. (See the attached code.) After the change I tested both at the company and on GZ: at the company it runs normally, but on GZ it still could not break the 64KB limit, and I had to ask the administrator to adjust that limit. In any case, the cause of the problem was finally found.
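Below is a sketch of the buffered-mode pattern just described, under the assumption that one chunk is sent per attach/detach cycle; the function name bsend_chunk and the tag are made up for illustration:

    #include <mpi.h>
    #include <cstdlib>

    // Buffered-mode send of one chunk: attach a user buffer, MPI_Bsend, then
    // detach (which blocks until the buffered data has been delivered).
    void bsend_chunk(const char *data, int count, int dest, MPI_Comm comm)
    {
        // the attached buffer needs room for the message plus MPI's bookkeeping
        int bufsize = count + MPI_BSEND_OVERHEAD;
        char *attach_buf = (char *)std::malloc(bufsize);

        // MPI_Buffer_attach: first argument is the buffer's starting address,
        // second is its size in bytes
        MPI_Buffer_attach(attach_buf, bufsize);

        // copies `data` into attach_buf and returns; MPI sends it in the background
        MPI_Bsend((void *)data, count, MPI_CHAR, dest, 2, comm);

        // MPI_Buffer_detach: first argument is the address of a pointer (void **),
        // second is the address of an int; MPI writes the buffer's address and
        // size back into them after all buffered messages have been sent
        char *detached = NULL;
        int detached_size = 0;
        MPI_Buffer_detach(&detached, &detached_size);

        std::free(detached);    // detached now points back at attach_buf
    }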

I also ran into an embarrassingly basic problem, which I will record here as a reminder to myself: before adjusting the send buffer size, I found that setting it to 10MB caused a segmentation fault. Completely puzzled, I scurried off to ask Xin again, and it turned out I had been declaring the array with a fixed size. Such a fixed-size (stack) array cannot be too large, generally no more than 1~2MB; large arrays must be allocated dynamically!
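A tiny illustration of that pitfall, assuming a roughly 10MB buffer as in the text; the function names are hypothetical:

    #include <vector>

    // A large fixed-size array declared inside a function lives on the stack,
    // which is typically only a few MB deep.
    void crashes()
    {
        char buf[10 * 1024 * 1024];   // ~10MB on the stack: likely segmentation fault
        buf[0] = 0;
    }

    void works()
    {
        std::vector<char> buf(10 * 1024 * 1024);   // heap allocation: no stack limit
        buf[0] = 0;
    }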

