Use Intel software development tools to unlock optimal program performance on Intel Architecture (IA)

Source: Internet
Author: User
This is the last article in a series on using Intel software development tools to improve software performance. The first two articles covered how to use the Intel compiler to improve the performance of compiled code and how to use Intel VTune to find performance bottlenecks. This article describes how to use highly optimized function libraries to improve both development efficiency and code performance. Because the functions in these libraries have been specially optimized for Intel processors, developers only need to call their interfaces. Like standing on the shoulders of giants, developers no longer have to hand-write processor-specific code, which saves considerable time and effort, improves development efficiency, and brings products to market faster.
To meet different developer needs, Intel provides three highly optimized function libraries: Intel Integrated Performance Primitives (Intel IPP) for signal processing and multimedia software, Intel Graphics Performance Primitives (Intel GPP) for 3D graphics, and the Intel Math Kernel Library (Intel MKL) for mathematical, engineering, and scientific computing. Let's begin our optimization journey and look at what these libraries offer and how they are used.

Intel Integrated Performance Primitives (IPP)
This is a cross-platform signal-processing and multimedia software library that provides a low-level layer abstracting the underlying processor. Applications can therefore transparently use the latest enhancements of Intel architectures, such as MMX(TM) technology, Streaming SIMD Extensions (SSE), Streaming SIMD Extensions 2 (SSE2), the Itanium architecture, and the Intel XScale microarchitecture. Intel IPP provides a wide range of multimedia functionality: audio, video, and speech codecs; image processing; signal processing; mathematical support routines; and computer vision. Many standard codecs can be built with Intel IPP, including MP3 and AAC audio; H.263, MPEG-1, MPEG-2, and MPEG-4 video; JPEG and MPEG-4 images; and G.723.1 and G.729 speech.
Intel IPP contains functions for vector and image processing, color conversion, filtering, splitting, thresholding, and transforms, as well as arithmetic, statistical, geometric, and morphological operations. For each function, Intel IPP supports multiple data types and layouts while keeping the number of data structures to a minimum. This gives developers a wide range of options when designing and optimizing applications, without having to write assembly code.
The benefits of IPP optimization are immediate. Recent tests on a Pentium 4 processor-based platform show that the Intel IPP library outperforms equivalent compiled C code.
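To make the idea of a "primitive" concrete, here is a minimal plain-Python sketch of what a thresholding operation does to a vector and to a 2D image. It is not the Intel IPP API: the function names `threshold_lt` and `threshold_image_lt` and the list-of-rows image layout are assumptions made for illustration only.

```python
# Plain-Python sketch of a "threshold" primitive; this is NOT the Intel IPP API.
# Function names and the list-of-rows image layout are illustrative assumptions.

def threshold_lt(src, level, value):
    """Replace every sample below `level` with `value`; keep the others."""
    return [value if s < level else s for s in src]


def threshold_image_lt(image, level, value):
    """Apply the same rule row by row to a 2D image stored as a list of rows."""
    return [threshold_lt(row, level, value) for row in image]


pixels = [[12, 200, 90], [255, 3, 128]]
result = threshold_image_lt(pixels, 100, 0)
print(result)
```

An optimized library would provide the same operation for several data types and memory layouts, implemented with SIMD instructions instead of a Python loop.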

Intel Graphics Performance Primitives (GPP)
This library provides a rich set of powerful 3D graphics functions optimized for Intel Personal Internet Client Architecture (Intel PCA) application processors based on Intel XScale technology. Intel GPP covers a wide range of 3D graphics functions, including data type conversion, arithmetic, trigonometry, vector, matrix, and rasterization functions. Because the functions in the Intel GPP library are optimized for the Intel XScale microarchitecture, developers do not have to write low-level assembly code to get the most out of Intel processors. This "ready-to-use" solution helps reduce development costs and shorten time to market. Although the Intel GPP library is not a complete graphics engine, it provides a basic set of prebuilt building blocks that can be used to create a 3D engine tailored to the Intel XScale microarchitecture. These primitives cover basic functionality at the engine-component level and leave great flexibility in engine architecture and implementation.
In addition, Intel GPP addresses many common problems in 3D software rendering systems, such as the lack of integer division support, the lack of dedicated floating-point hardware, limited system memory, and the small display area of handheld and mobile devices. Intel GPP is compatible with many popular embedded operating systems, such as Microsoft Pocket PC* 2002 running on Intel PXA250 processors. At the same time, the primitives are very low-level and have been designed to avoid dependencies on the host operating system, so Intel GPP makes applications easier to port without sacrificing performance. Intel GPP can also be used to optimize game applications for handheld and mobile devices.
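A common way to work around missing floating-point hardware is fixed-point arithmetic, where fractional values are stored as scaled integers. The sketch below assumes a 16.16 format (16 integer bits, 16 fractional bits); the article does not say which format Intel GPP uses, and the helper names are hypothetical.

```python
# Fixed-point sketch, assuming a 16.16 format (an assumption, not a documented
# Intel GPP detail). All arithmetic after conversion uses integers only.

FRAC_BITS = 16
ONE = 1 << FRAC_BITS  # the value 1.0 in 16.16 fixed point


def to_fixed(x):
    """Convert a float to 16.16 fixed point (done once, e.g. at load time)."""
    return int(round(x * ONE))


def to_float(f):
    """Convert back to float, for display or debugging only."""
    return f / ONE


def fx_add(a, b):
    """Addition needs no rescaling: both operands share the same scale."""
    return a + b


def fx_mul(a, b):
    """The raw product carries 32 fractional bits; shift back down to 16.
    On 32-bit hardware the intermediate product needs a 64-bit register."""
    return (a * b) >> FRAC_BITS


product = fx_mul(to_fixed(1.5), to_fixed(2.25))
print(to_float(product))
```

Python integers never overflow, so the sketch hides the range limits that a real 32-bit implementation has to manage.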

Intel Math Kernel Library (MKL)
This library consists of highly optimized functions for mathematical, engineering, scientific, and financial applications that demand high performance on Intel platforms. Its functional areas include linear algebra (LAPACK and BLAS), discrete Fourier transforms (DFT), vector transcendental functions (Vector Math Library, VML), and the Vector Statistical Library (VSL).
New features in the Intel Math Kernel Library:
• Workloads of applications and simulations on Intel processor-based desktops, workstations, and servers keep growing. To meet these demands, Intel MKL continues to improve performance and add more numerical functions.
• The FFT functionality of Intel MKL is further extended with one-dimensional and multi-dimensional DFT routines (up to seven dimensions) whose transform lengths are not limited to powers of 2 and which support mixed radix.
• The Vector Statistical Library provides high-performance pseudo-random number generators that are hand-coded, tuned, and vectorized. These random number generator subroutines provide both basic continuous and discrete distributions.
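The idea of one basic generator feeding several continuous and discrete distributions can be sketched as follows. This uses Python's standard `random` module, not Intel MKL's VSL; the `Stream` class and its method names are hypothetical and chosen only for illustration.

```python
# Sketch of "one basic generator, many distributions" using the Python standard
# library. `Stream` and its methods are hypothetical names, not the VSL API.
import math
import random


class Stream:
    def __init__(self, seed):
        # Basic generator: produces uniform floats in [0, 1).
        self._rng = random.Random(seed)

    def uniform(self, a, b, n):
        """Continuous uniform distribution on [a, b)."""
        return [a + (b - a) * self._rng.random() for _ in range(n)]

    def exponential(self, rate, n):
        """Continuous exponential distribution via inverse-transform sampling."""
        return [-math.log(1.0 - self._rng.random()) / rate for _ in range(n)]

    def bernoulli(self, p, n):
        """Discrete distribution: 1 with probability p, otherwise 0."""
        return [1 if self._rng.random() < p else 0 for _ in range(n)]


stream = Stream(seed=42)
samples = stream.uniform(2.0, 5.0, 100)
flips = stream.bernoulli(0.3, 20)
print(min(samples), max(samples), sum(flips))
```

Filling a whole vector per call, rather than returning one number at a time, is what lets an optimized library vectorize the generator.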
The performance benefits of Intel MKL optimization show up in the following areas:
DFT Function Optimization
The figure shows the performance of the two-dimensional DFT function on single-precision complex data across a range of matrix sizes, measured in millions of floating-point operations per second (MFLOPS). Intel MKL's built-in threading can take advantage of multiple processors. The figure illustrates the strong performance of Intel MKL's optimized DFT functions and how performance scales as the number of threads increases.
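As a reference for what is being measured, the sketch below computes a two-dimensional DFT by applying a direct one-dimensional DFT to every row and then to every column, and converts an operation count and a run time into MFLOPS. It is a slow O(n^2) textbook version that works for any length, not Intel MKL's algorithm; the function names are assumptions, and the operation count passed to `mflops` is left to the caller because the article does not state which count its figure uses.

```python
# Reference sketch: direct DFT (any length, including non-powers of 2) and a
# row-column 2D DFT. This is NOT how Intel MKL computes it; names are illustrative.
import cmath


def dft(x):
    """Direct 1D DFT: X[j] = sum_k x[k] * exp(-2*pi*i*j*k/n)."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]


def dft2(matrix):
    """2D DFT: transform every row, then every column of the result."""
    rows = [dft(row) for row in matrix]
    cols = [dft(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back to row-major


def mflops(flop_count, seconds):
    """Millions of floating-point operations per second."""
    return flop_count / seconds / 1e6


spectrum = dft2([[1, 1, 1], [1, 1, 1]])  # 2 x 3: neither dimension needs to be a power of 2
print(abs(spectrum[0][0]))
```

For a constant input all the energy lands in the zero-frequency term, which makes a convenient sanity check for any DFT implementation.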
Linear Algebra and DGEMM
Matrix multiplication is a cubic operation: when the matrix dimension doubles, the amount of computation grows to 8 times the original. Double-precision general matrix multiplication (DGEMM) is a core operation in dense linear algebra. In many applications that rely heavily on solving large systems of equations, the performance of a well-written solver depends directly on the performance of DGEMM.
On a multiprocessor system, Intel MKL uses the other available processors to speed up the computation. As the number of processors increases, performance scales almost linearly.
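The cubic growth is easy to verify with a naive triple-loop multiplication that counts its multiply-add steps. This is only a sketch of the arithmetic: the BLAS DGEMM routine also applies scaling factors (C = alpha*A*B + beta*C), which are omitted here, and the function name `matmul_counted` is made up for this example.

```python
# Naive matrix multiplication that counts multiply-add steps.
# A sketch of the arithmetic only; not the optimized DGEMM algorithm.

def matmul_counted(a, b):
    n, inner, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    ops = 0
    for i in range(n):
        for j in range(m):
            acc = 0.0
            for p in range(inner):
                acc += a[i][p] * b[p][j]
                ops += 1  # one multiply-add per innermost iteration
            c[i][j] = acc
    return c, ops


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


_, ops_small = matmul_counted(identity(4), identity(4))
_, ops_large = matmul_counted(identity(8), identity(8))
print(ops_small, ops_large, ops_large // ops_small)
```

For n x n matrices the count is exactly n^3, so going from 4 to 8 multiplies the work by 8.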
Intel MKL optimization lets even a desktop deliver mainframe-class power. All of Intel MKL's functional areas (LAPACK, BLAS, DFT, VML, and VSL) achieve excellent performance. Many functions in Intel MKL are threaded and perform very well on symmetric multiprocessing (SMP) systems. This brings the benefits of parallel execution: without any additional work, an application gains performance as the number of threads increases.
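The kind of work splitting a threaded library performs internally can be sketched by dividing the rows of a matrix among worker threads. This only illustrates the decomposition: in standard CPython, pure-Python threads do not run arithmetic in parallel, so no speedup should be expected from this code, and the function names are hypothetical.

```python
# Sketch of row-wise work splitting across threads. Illustrates the decomposition
# only; CPython threads will not speed up this pure-Python arithmetic.
from concurrent.futures import ThreadPoolExecutor


def multiply_rows(a_rows, b):
    """Multiply a block of rows of A by the full matrix B."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a_rows]


def parallel_matmul(a, b, workers=2):
    """Split A into contiguous row blocks, one block per task, then reassemble."""
    chunk = max(1, (len(a) + workers - 1) // workers)
    blocks = [a[i:i + chunk] for i in range(0, len(a), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(multiply_rows, blocks, [b] * len(blocks))
        return [row for part in parts for row in part]


a = [[1, 2], [3, 4], [5, 6]]
b = [[1, 0], [0, 1]]
print(parallel_matmul(a, b, workers=2))
```

Each block of rows is independent of the others, which is why this kind of workload can scale well with the number of processors.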

Summary
These three function libraries focus on performance and compatibility to help developers improve program performance. You can use the advanced features of the processor without having to write code for a specific processor. In addition, their application programming interfaces (APIs) are available across many platforms, so multimedia application developers can achieve cross-platform compatibility more easily and reduce development costs. All three libraries are available in both Windows and Linux versions.

You can get a free trial version of Intel software development tools:
Http://www.xlsoft.com/cn/products/intel/index.htm
