Pentium III Processor Streaming SIMD Extensions (SSE) Instructions (3)

Source: Internet
Author: User

Introduction:

The release of the Intel Pentium III processor brought many new features to programmers, who can use them to build better products for users. The Pentium III and Pentium III Xeon add several features over the Pentium II and Pentium II Xeon, including a processor serial number (a unique processor ID) and the new SSE instruction set, which lets them run suitable code faster. SSE extends the instruction set in much the same way that MMX extended the classic Pentium.

1. Data swizzling

The speedup from the Pentium III's SSE instructions comes at a cost, because SSE instructions can only operate on the new 128-bit data type that SSE defines. If your application uses its own data format, it must convert the data to this new type before executing SSE instructions, and convert the results back once the operation is done.

The operation of converting one data format to another is called "data swizzling".

This conversion takes time and consumes processor core cycles. If an application converts data formats frequently, a significant share of core cycles is wasted, so the cost of these conversions deserves close attention.
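The convert, operate, convert-back pattern described above can be sketched as follows. This is only an illustration under stated assumptions: the application format is assumed to be `double`, and `packed4`, `to_packed`, `from_packed` and `scale4` are hypothetical names, not part of any SSE API. The plain loop in the middle stands in for a single packed SSE multiply.

```cpp
#include <cstddef>

// Four packed single-precision floats: the 128-bit shape that SSE operates on.
struct alignas(16) packed4 {
    float v[4];
};

// Conversion in: narrow four doubles (the assumed application format) to floats.
packed4 to_packed(const double* src) {
    packed4 p;
    for (std::size_t i = 0; i < 4; ++i) p.v[i] = static_cast<float>(src[i]);
    return p;
}

// Conversion out: widen the four results back to the application format.
void from_packed(const packed4& p, double* dst) {
    for (std::size_t i = 0; i < 4; ++i) dst[i] = static_cast<double>(p.v[i]);
}

// Scales four values in place. Only the middle line is useful work;
// the first and last lines are conversion overhead.
void scale4(double* data, float factor) {
    packed4 p = to_packed(data);
    for (float& f : p.v) f *= factor;  // stand-in for one packed multiply
    from_packed(p, data);
}
```

If the data were kept in the packed format between operations, the two conversion steps would be paid once rather than around every operation.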

1.1 Data organization

Typically, a 3D application stores each vertex in a dedicated data structure. To represent many vertices, the application uses an array of this structure, known as an array of structures (AoS). A typical vertex holds X, Y and Z coordinates. The following code defines a data structure for a 3D vertex and an array of that structure for a large number of vertices, as shown in Figure 9.

struct point {
    float x, y, z;
};
...
point dataset[...];

Figure 9: Array of structures (AoS)
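To make the layout concrete, here is a minimal sketch of a scalar pass over such an array. The helper names `translate_aos` and `x_stride_in_floats` are hypothetical, and `std::vector` is used in place of the fixed-size `dataset` array whose length the text leaves open.

```cpp
#include <cstddef>
#include <vector>

struct point {
    float x, y, z;
};

// Scalar pass over an array of structures: each coordinate of each
// vertex is updated individually, one value at a time.
void translate_aos(std::vector<point>& pts, float dx, float dy, float dz) {
    for (point& p : pts) {
        p.x += dx;
        p.y += dy;
        p.z += dz;
    }
}

// Number of floats between the x coordinates of neighbouring vertices.
// In this layout the x values are interleaved with y and z in memory.
std::size_t x_stride_in_floats() {
    return sizeof(point) / sizeof(float);
}
```

Because consecutive x values are not adjacent in memory, four of them cannot be fetched with one contiguous 128-bit load.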

The advantage of SSE is that it can process several vertices at the same time. To benefit from it, the application must be able to access the same coordinate of several vertices easily (for example, 4 floating-point numbers holding the x coordinates of 4 vertices). This is achievable: instead of grouping the x, y and z values of each vertex together, the application rearranges the data into three separate arrays, or creates a structure of arrays in which each array holds the values of one coordinate. This layout is called a structure of arrays (SoA). (As I understand it: without SSE, the three coordinates of a vertex are grouped into one structure and processed one value at a time. With SSE, the coordinates of all points are gathered into 3 arrays, and 4 values from one array are processed simultaneously.)
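As a rough sketch of processing four x coordinates at once, the function below adds an offset to an array that holds only x values. `translate_x` and the `SKETCH_HAVE_SSE` macro are hypothetical names; the intrinsics `_mm_set1_ps`, `_mm_loadu_ps`, `_mm_add_ps` and `_mm_storeu_ps` come from `<xmmintrin.h>`. The preprocessor guard is an assumption added so that the code still compiles, as a scalar loop, on targets without SSE.

```cpp
#include <cstddef>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define SKETCH_HAVE_SSE 1
#endif

// Adds dx to each of the n x coordinates stored contiguously in xs.
void translate_x(float* xs, std::size_t n, float dx) {
    std::size_t i = 0;
#ifdef SKETCH_HAVE_SSE
    const __m128 vdx = _mm_set1_ps(dx);      // dx replicated into 4 lanes
    for (; i + 4 <= n; i += 4) {
        __m128 v = _mm_loadu_ps(xs + i);     // load 4 x coordinates
        v = _mm_add_ps(v, vdx);              // 4 additions in one instruction
        _mm_storeu_ps(xs + i, v);            // store 4 results
    }
#endif
    for (; i < n; ++i) xs[i] += dx;          // remaining elements, one by one
}
```

The unaligned load and store are used here so that the sketch makes no assumption about how the array was allocated; 16-byte-aligned data could use `_mm_load_ps` and `_mm_store_ps` instead.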

The following code defines such a structure of arrays, which is illustrated in Figure 10.

struct point {
    float *x, *y, *z;
};

Figure 10: Structure of arrays (SoA)
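Converting between the layouts of Figure 9 and Figure 10 is itself a data swizzle. The sketch below shows one straightforward way to do it, under some assumptions: `point_arrays` is a hypothetical variant of the structure above that owns its storage through `std::vector` instead of raw `float*` pointers, and `to_soa` and `to_aos` are illustrative names.

```cpp
#include <cstddef>
#include <vector>

struct point {
    float x, y, z;  // one vertex, as in Figure 9
};

// Hypothetical owning variant of the structure of arrays in Figure 10.
struct point_arrays {
    std::vector<float> x, y, z;
};

// Swizzle in: split each vertex into the three coordinate arrays.
point_arrays to_soa(const std::vector<point>& in) {
    point_arrays out;
    out.x.reserve(in.size());
    out.y.reserve(in.size());
    out.z.reserve(in.size());
    for (const point& p : in) {
        out.x.push_back(p.x);
        out.y.push_back(p.y);
        out.z.push_back(p.z);
    }
    return out;
}

// Swizzle out: reassemble vertices from the three coordinate arrays.
std::vector<point> to_aos(const point_arrays& in) {
    std::vector<point> out(in.x.size());
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] = point{in.x[i], in.y[i], in.z[i]};
    }
    return out;
}
```

Each conversion touches every coordinate once, which is the overhead the data swizzling section warns about; keeping the data in the structure-of-arrays form across many operations avoids repeating it.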
