SMP, NUMA and MPP: three system architectures



Time: 2012-08-30 11:30:04 | Chenjunlu's Blog (original): http://www.chenjunlu.com/2012/08/parallel-computer-memory-architectures/

In terms of the memory architecture of parallel computing systems, current commercial servers can be broadly divided into three categories: the symmetric multiprocessor architecture (SMP: Symmetric Multi-Processor), the non-uniform memory access architecture (NUMA: Non-Uniform Memory Access), and the massively parallel processing architecture (MPP: Massively Parallel Processing). Their characteristics are described as follows:


1. SMP (Symmetric Multi-Processor)


The symmetric multiprocessor architecture, shown in the figure below, refers to a server in which multiple CPUs work symmetrically, with no primary/secondary relationship among them. All CPUs share the same physical memory, and the time required for any CPU to access any address in memory is the same; for this reason SMP is also known as a uniform memory access architecture (UMA: Uniform Memory Access).

[Figure: SMP architecture — multiple CPUs sharing one memory over a common bus]
Ways to scale an SMP server include adding memory, using faster CPUs, adding more CPUs, expanding I/O (number of slots and buses), and adding more external devices (usually disk storage). The defining feature of an SMP server is sharing: all resources in the system (CPU, memory, I/O, etc.) are shared. This is precisely what causes the main problem of SMP servers: very limited scalability. Every shared link can become a bottleneck when scaling an SMP server, and memory is the most constrained. Because every CPU must access the same memory over the same memory bus, memory access conflicts rise rapidly as the number of CPUs increases, wasting CPU cycles and greatly reducing effective CPU performance. Experience shows that CPU utilization in an SMP server is best with 2 to 4 CPUs.
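The article gives no numbers behind the 2-to-4-CPU observation; a toy model (my own illustration, with made-up bandwidth figures, not from the original post) shows the shape of the problem: once the CPUs' combined memory demand exceeds the fixed capacity of the shared bus, each added CPU only dilutes everyone's share.

```python
# Toy model of a shared memory bus in an SMP server.
# bus_bandwidth and per_cpu_demand are illustrative units, not measurements.

def smp_cpu_utilization(n_cpus, bus_bandwidth=10.0, per_cpu_demand=3.0):
    """Fraction of each CPU's memory demand the shared bus can actually serve."""
    fair_share = bus_bandwidth / n_cpus      # bandwidth per CPU under contention
    return min(1.0, fair_share / per_cpu_demand)

for n in (1, 2, 4, 8, 16):
    print(f"{n:2d} CPUs -> utilization {smp_cpu_utilization(n):.2f}")
```

With these assumed numbers, utilization stays at 1.0 for 1-2 CPUs, then falls off steeply (about 0.42 at 8 CPUs, 0.21 at 16), which is the qualitative behavior the paragraph describes.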


2. NUMA (Non-Uniform Memory Access)

NUMA is one result of the effort to work around the limited scalability of SMP, as people began exploring how to effectively scale up and build large systems. With NUMA technology, dozens of CPUs (even hundreds) can be combined in one server. Its CPU module structure is shown in the figure below. The basic characteristic of a NUMA server is that it has multiple CPU modules; each module consists of several CPUs (e.g., four) and has its own local memory, I/O slots, and so on.

[Figure: NUMA CPU module structure — CPU modules with local memory connected by an interconnect]
Because its nodes can connect and exchange information through an interconnect module (such as a Crossbar Switch), every CPU can access the entire system's memory (this is an important difference between NUMA and MPP systems). Obviously, accessing local memory is much faster than accessing remote memory (the memory of other nodes in the system), which is the origin of the name "non-uniform memory access". Because of this characteristic, to get good system performance, applications should be developed to minimize information exchange between different CPU modules. With NUMA technology, the scaling problem of the original SMP systems can be addressed, and hundreds of CPUs can be supported in one physical server. Typical NUMA servers include the HP Superdome, Sun 15K, IBM pSeries 690, and others.



However, NUMA also has drawbacks. Because the latency of accessing remote memory far exceeds that of local memory, system performance does not scale linearly as the number of CPUs increases. When HP released the Superdome server, it published relative performance values against other UNIX servers: the 64-way Superdome (NUMA architecture) scored 20, while the 8-way N4000 (shared-bus SMP architecture) scored 6.3. In other words, 8 times the number of CPUs bought only about 3 times the performance (20 / 6.3 ≈ 3.2).


3. MPP (Massively Parallel Processing)


Unlike NUMA, MPP provides another way to scale a system: multiple SMP servers are connected through a node interconnect network and work together on the same task, appearing to the user as a single server system. Its basic characteristic is that multiple SMP servers (each called a node) are connected through a node interconnect network, and each node accesses only its own local resources (memory, storage, etc.). It is a completely share-nothing structure and therefore offers the best scalability: in theory its expansion is unlimited, and current technology can interconnect 512 nodes and thousands of CPUs.


[Figure: MPP architecture — multiple SMP nodes connected by a node interconnect network]
At present, the industry has no standard for the node interconnect network; for example, NCR's BYNET and IBM's SP Switch use different internal mechanisms. However, the node interconnect network is used only internally by the MPP server and is transparent to the user. In an MPP system, each SMP node can also run its own operating system, database, and so on. Unlike NUMA, however, there is no remote memory access at all: the CPUs within one node cannot access the memory of another node. Information exchange between nodes is carried out through the node interconnect network, a process generally called data redistribution. An MPP server does require a complex mechanism to schedule and balance the load and parallel processing across nodes; currently, MPP-based servers tend to hide this complexity behind system-level software such as the database.
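Data redistribution is typically driven by hashing a key so that each row has exactly one owning node. The sketch below is my own simplified illustration of that idea (not code from any MPP product); real systems use stable hash functions and stream rows over the interconnect rather than building lists in memory.

```python
# Simplified sketch of hash-based data redistribution across share-nothing
# nodes: each row is routed to the node that owns its key's hash bucket.

def redistribute(rows, key, n_nodes):
    """Assign each row to a node by hashing its partition key."""
    buckets = [[] for _ in range(n_nodes)]
    for row in rows:
        node = hash(row[key]) % n_nodes   # owning node for this key
        buckets[node].append(row)
    return buckets

rows = [{"id": i, "val": i * i} for i in range(10)]
parts = redistribute(rows, "id", 4)
# Every row lands on exactly one node, so per-node work can proceed locally.
assert sum(len(p) for p in parts) == len(rows)
```

Because rows with equal keys always hash to the same node, operations such as joins and aggregations on that key can then run entirely within each node, with no remote memory access.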



For example, my first job was technical support for HP Neoview, HP's large data warehouse appliance, which is based on the MPP architecture. It consists of multiple Segments, each Segment containing 8 blades, 16 nodes, and 32 CPUs. Each node can access only its own memory, and all communication between nodes goes over the Fabric network. When developing an application on this product, no matter how many Segments or nodes the back-end server consists of, the developer faces a single data warehouse product and need not consider how to schedule the load across them.


4. The differences between NUMA and MPP


Architecturally, NUMA and MPP have much in common: both are made up of multiple nodes, each node has its own CPUs, memory, and I/O, and nodes exchange information through some interconnect mechanism. So where do they differ? First, the node interconnect mechanism is different. NUMA's interconnect is implemented within a single physical server: when a CPU needs to access remote memory, it must wait, which is the main reason a NUMA server cannot achieve linear performance scaling as CPUs are added. MPP's interconnect, by contrast, connects different SMP servers through I/O; each node accesses only its local memory and storage, and information exchange between nodes proceeds in parallel with each node's own processing, so MPP performance can scale linearly as nodes are added. Second, the memory access mechanism is different. Inside a NUMA server, any CPU can access the entire system's memory, but remote access performs much worse than local access, so applications should avoid remote memory access as much as possible. In an MPP server, each node accesses only its local memory, and remote memory access does not exist.


5. Server selection for data warehouses


Which kind of server is better suited to the data warehouse environment? The answer must start from the load characteristics of the data warehouse environment itself. As is well known, a typical data warehouse environment involves large amounts of complex processing and comprehensive analysis, requiring very high I/O throughput from the system and a storage subsystem that supplies matching I/O bandwidth. A typical OLTP system, by contrast, centers on online transaction processing: each transaction touches little data, and the system must deliver high transaction throughput, handling as many transactions as possible per unit of time. Clearly, the load characteristics of these two environments are completely different.



Looking at the NUMA architecture: it can integrate many CPUs in one physical server, giving the system high transaction throughput, and because remote memory access latency is far longer than local memory access latency, data interaction between different CPU modules should be minimized. Clearly, NUMA is better suited to an OLTP transaction processing environment; when used in a data warehouse environment, the large amount of complex data processing inevitably causes heavy interaction between modules, and CPU utilization drops sharply.



In contrast, the MPP server architecture is stronger at parallel processing and better suited to complex analytical processing environments. Of course, it needs a relational database system that supports MPP technology to hide the complexity of load balancing and scheduling across nodes. In addition, this parallel processing capability depends heavily on the node interconnect network: an MPP server suited to the data warehouse environment should have outstanding I/O performance in its node interconnect so that the performance of the whole system can be realized. But this is not absolute; performance depends on many factors. For example, the Exadata I introduced in my last blog post does not use the MPP architecture, yet its performance is excellent. So one cannot judge performance from a single aspect of the server; the current trend is to optimize server performance from many angles, including the software level.

