Parfor parallel programming in MATLAB

Source: Internet
Author: User
  • Loops are usually the most compute-intensive part of a program, so parallelizing a loop or optimizing the code in its body is the most common way to speed a program up.
  • MATLAB provides the parfor keyword to facilitate parallel computing on multi-core machines or clusters.
Use of the parfor keyword
  • A loop written with the for keyword runs serially. Changing the keyword to parfor lets multiple workers run the iterations in parallel.
  • parfor splits the N iterations into M mutually independent parts and hands each part to a worker for execution.
  • The result of the loop must not depend on the order in which the N iterations are executed.
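The switch from for to parfor described above can be sketched as follows. This is a minimal illustration, not taken from the original post; the variable names serialOut and parallelOut are assumptions chosen for the example. Each iteration touches only its own index, so the order of execution does not matter.

```matlab
% Serial baseline: square each index with an ordinary for loop.
n = 8;
serialOut = zeros(1, n);
for k = 1:n
    serialOut(k) = k^2;
end

% Same body with parfor: iterations may run in any order on any worker,
% so each one depends only on its own index k.
parallelOut = zeros(1, n);
parfor k = 1:n
    parallelOut(k) = k^2;
end

% Both loops should produce the same vector.
disp(isequal(serialOut, parallelOut));
```

If no worker pool is available, the parfor loop still runs, just without any speedup.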
Variable types in parfor: reduction variables
  • In general, the operations performed in each iteration of a parfor loop must be independent of one another. The exception is a variable that several iterations update with the same simple accumulating operation. Such a variable is called a reduction variable. In the code below, a is a reduction variable.
    a = 0;
    parfor i = 1:1000
      a = a + i;
    end
  • The supported reduction operations are + - * .* & | [,] [;] {,} {;} min max union intersect.
  • Within one parfor loop, every update of a given reduction variable must use the same reduction operator, and the variable must always appear in the same position relative to that operator (for example, always a = a + expr, never mixed with a = expr + a).
  • The reduction expression must be associative and commutative. The operators * [] {} are not commutative, so MATLAB handles them specially to make sure the result is still correct.
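The accumulation rules above can be sketched with three variables that are each updated by a single operator throughout the loop. This is an illustrative sketch only; the names total, largest and collected, and the expression mod(7 * k, 101), are assumptions made for the example.

```matlab
total = 0;        % accumulated with +
largest = -Inf;   % accumulated with max
collected = [];   % accumulated with horizontal concatenation [,]

parfor k = 1:100
    % Each variable is updated with one operator only, and always
    % appears in the same position relative to that operator.
    total = total + k;
    largest = max(largest, mod(7 * k, 101));
    collected = [collected, k];
end

fprintf('total = %d, largest = %d, numel(collected) = %d\n', ...
    total, largest, numel(collected));
```

Mixing operators on the same variable inside one loop, for example total = total + k in one statement and total = total * k in another, is rejected by MATLAB.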
Sliced variables
  • A parfor loop often needs to read or write a matrix defined outside the loop, at a position that depends on the loop variable. Handled naively, this means sending a large amount of data to every worker.
  • If MATLAB recognizes the matrix as a sliced variable, it sends each worker only the segments that worker needs, which makes the transfer more efficient.
  • The size of a sliced matrix cannot be changed inside parfor. For MATLAB to recognize the variable correctly, every iteration must access it with the same index expression. If the loop body accesses both a(i) and a(i+1), a is not recognized as a sliced variable.
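The indexing rule above can be sketched as follows. The first loop uses one index form per matrix. The second shows one possible workaround, an assumption of this example rather than something from the original post, for a loop that needs a neighboring element: build a shifted copy before the loop so that every matrix is still indexed by the plain loop variable.

```matlab
n = 6;
data = (1:n) * 10;     % read as data(k) only
out = zeros(1, n);     % written as out(k) only

parfor k = 1:n
    out(k) = data(k) + 1;
end

% Needing data(k) and data(k+1) in the same body would break the
% single-index rule. A shifted copy prepared before the loop keeps both
% inputs indexed by k alone.
shifted = [data(2:end), 0];   % shifted(k) holds data(k+1), padded with 0
diffs = zeros(1, n);
parfor k = 1:n
    diffs(k) = shifted(k) - data(k);
end

disp(out);
disp(diffs);
```

The shifted copy costs extra memory on the client, so it only pays off when the matrix is large enough for transfer time to matter.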
Loop variable
  • The i in the preceding example is the loop variable. It identifies the current iteration.
Broadcast variables
  • A broadcast variable is assigned before the parfor loop and is only read inside it.
Temporary variables
  • A temporary variable is assigned inside the loop body. Its scope is limited to parfor and it no longer exists once the loop ends. A variable with the same name defined before parfor is not affected.
Example: telling the variable types apart
  • In the following example, tmp inside parfor is a temporary variable. After parfor completes, tmp still has the value 5, because the outer variable is not affected by the temporary one.
  • broadcast is a broadcast variable. Its value is the same in every iteration.
  • reduced is a reduction variable. MATLAB computes partial results on each worker and sends them back to the main process to be combined.
  • sliced is a sliced variable, which makes the data transfer more efficient.
  • i is the loop variable.
    tmp = 5;
    broadcast = 1;
    reduced = 0;
    sliced = ones(1, 10);
    parfor i = 1:10
      tmp = i;
      reduced = reduced + i + broadcast;
      sliced(i) = sliced(i) * i;
    end
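To make the effect of these classifications visible, the sketch below runs the same loop body once with for and once with parfor and records the final values. The names serialTmp, parTmp, parReduced and parSliced are introduced here for illustration only.

```matlab
% Serial run: tmp is an ordinary variable and keeps its last value.
tmp = 5;
for i = 1:10
    tmp = i;
end
serialTmp = tmp;   % last value assigned in the loop

% parfor run of the full example.
tmp = 5;
broadcast = 1;
reduced = 0;
sliced = ones(1, 10);
parfor i = 1:10
    tmp = i;                            % temporary: local to the iteration
    reduced = reduced + i + broadcast;  % reduction with +
    sliced(i) = sliced(i) * i;          % sliced: indexed by i only
end
parTmp = tmp;          % outer tmp should be untouched, as described above
parReduced = reduced;  % sum(1:10) plus ten times broadcast
parSliced = sliced;    % each element multiplied by its own index

fprintf('serial tmp = %d, parfor tmp = %d, reduced = %d\n', ...
    serialTmp, parTmp, parReduced);
```

The difference in tmp is the main thing to watch for when converting an existing for loop: code that relies on a value left over from the last iteration stops working under parfor.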
Worker Configuration
  • Workers must be configured before the program runs. Otherwise the parfor loop runs like an ordinary for loop and nothing is executed in parallel.
Standalone Configuration
  • The parallel computing pool of the local machine is opened and closed with the matlabpool command. Newer MATLAB releases replace matlabpool with parpool.
  • matlabpool n opens n workers.
  • matlabpool open configname opens a pool according to the named configuration. The default configuration is local.
  • matlabpool close shuts the workers down.
  • The same settings can be managed from the menu under Parallel -> Manage Cluster Profile.
  • Choosing n: on a machine with c CPU cores, n can usually be set to c. On a remote server, c-1 is a reasonable choice. For compute-intensive programs hyperthreading brings almost no improvement, so set n to the number of physical cores rather than the number of hardware threads.
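For releases where matlabpool is no longer available, the open and close steps above can be sketched with parpool. This is a hedged sketch that assumes Parallel Computing Toolbox is installed and that the default cluster profile is a local one; the variable names cluster, nWorkers and pool are chosen for the example.

```matlab
% Only touch the pool if parpool is on the path (toolbox installed).
if ~isempty(which('parpool'))
    cluster = parcluster();          % default cluster profile
    nWorkers = cluster.NumWorkers;   % worker limit stored in that profile

    pool = gcp('nocreate');          % current pool, or empty if none is open
    if isempty(pool)
        pool = parpool(cluster, nWorkers);   % plays the role of "matlabpool n"
    end
    fprintf('Pool is running with %d workers\n', pool.NumWorkers);

    delete(pool);                    % plays the role of "matlabpool close"
end
```

To follow the c-1 advice on a shared server, pass a smaller number than cluster.NumWorkers as the second argument to parpool.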
Notes
  • Ideally the number of iterations N is divisible by the number of workers M. Otherwise some workers are assigned more iterations than others, the remaining workers sit idle for part of the run, and the degree of parallelism drops.
  • A parallel run transfers data between the main process and the workers. Watch for the performance loss caused by moving large amounts of data. This applies especially to broadcast variables: if one is large, try to turn it into a sliced variable.
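One way to turn a broadcast matrix into a sliced one is to rearrange it on the client before the loop so that each iteration can index it with the loop variable directly. The sketch below is an assumption-laden illustration: big stands in for a large matrix, and order is a hypothetical lookup of which column each iteration needs.

```matlab
nCols = 4;
big = reshape(1:16, 4, 4);   % stand-in for a large matrix
order = [3 1 4 2];           % column needed by each iteration

% Version A: big is indexed through order(k), not by k itself, so the
% whole matrix is sent to every worker as a broadcast variable.
sumsA = zeros(1, nCols);
parfor k = 1:nCols
    sumsA(k) = sum(big(:, order(k)));
end

% Version B: reorder once before the loop. reordered(:, k) is indexed by
% the loop variable alone, so it can be treated as a sliced variable.
reordered = big(:, order);
sumsB = zeros(1, nCols);
parfor k = 1:nCols
    sumsB(k) = sum(reordered(:, k));
end

disp(isequal(sumsA, sumsB));
```

Both versions compute the same result. Version B trades one extra copy on the client for smaller transfers to each worker.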

  
  

For more information, see focustc. The blog address is http://blog.csdn.net/caozhk.
