Introduction to distributed development on Intel MIC and various limitations in offload mode

Source: Internet
Author: User

There are two programming modes available for distributed development on an Intel MIC cluster:

1) Offload mode: This mode is similar to the GPGPU programming model. The highly parallel parts of the code are transferred to the local MIC coprocessor, and the rest of the code still runs on the CPU. The MIC is only responsible for local computation; all distributed communication must be done on the CPU.

2) Symmetric mode: The program is compiled into two binaries, one that runs on the MIC and one that runs on the CPU. Logically, this mode lets the MIC take part in distributed communication itself, although physically the messages still pass through the CPU. The biggest difficulty when programming in this mode is load balancing.


After a few days of experimenting, I found the following restrictions in offload mode:

1) Because the CPU and the MIC have different memory address spaces, the only data that can be copied at offload are value types and one-dimensional arrays of value types; data containing reference types cannot be copied. Of course, this is not surprising for any architecture that does not share memory.

2) Complex data types, such as iostream and smart pointers, cannot be used. In practice, it is better to just write plain C.

3) Virtual functions are not supported, because the virtual table cannot be constructed in the offload region. This means object-oriented inheritance and polymorphism are off the table.

4) Global variables in the CPU code cannot be used in the offload region unless they are marked with the special target attribute.

5) MPI code is not supported, because offload itself only supports local computation and does not support distributed communication.

6) If an exception is thrown in the offload region, it must be caught within the offload region; it cannot be expected to propagate into the CPU code.


In addition, if you only need communication between the coprocessors within a single machine, that is, intra-node communication, you can also use a protocol called SCIF. It is lower-level than MPI and its usage resembles socket programming. Because it did not fit my use case, I did not study it in depth.


It is worth mentioning that distributed computing clusters built purely from MICs, with no CPUs, should appear in the near future. If you want to develop for them in advance and do not need to use CPU resources, symmetric mode is actually a very good choice. When using symmetric mode, I was pleasantly surprised to find that it is possible to run only the MIC binary, and that a MIC node gets a rank just like a CPU node does, which supports the distributed scenario perfectly. This way, apart from the Intel compiler not supporting some of the latest C++ syntax, the original distributed CPU code does not need to change at all, which feels great! This is probably the MIC's biggest advantage over the alternatives: its code portability is far better than CUDA's.
