This series describes common optimization methods for C program code, divided into three articles: I/O, memory, and algorithms. I had originally wanted to include MMX here as well, but since that content does not fit the title very well, I decided to publish it under separate titles: an article on MMX technical details, and an article on two applications of MMX in H.263 video compression.
III. Algorithms
In the previous article, we talked about optimizing memory operations. This article mainly covers some common algorithm-level optimizations. There is a lot to cover, so the content may be a little disorganized; my apologies.
I. Starting with the small things:
Let's talk about some tips first:
① Writing N / 2 as N >> 1 is a common trick, but note that the two are not completely equivalent! If N = 3, then N / 2 = 1 and N >> 1 = 1. However, if N = -3, then N / 2 = -1 but N >> 1 = -2. For positive numbers both round down, but for negative numbers the results differ. (In JPEG 2000, the integer YUV-to-RGB conversion must use >> in place of division.)
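The difference can be checked with a minimal sketch. The function names below are illustrative and not from the original article. Note that in C, right-shifting a negative signed integer is implementation-defined; most compilers use an arithmetic shift, which is what the article's -2 result assumes.

```c
/* Halve by division: C99 and later truncate toward zero. */
static int half_div(int n) { return n / 2; }

/* Halve by shift: for negative n the result is implementation-defined;
   most compilers use an arithmetic shift, which rounds toward -infinity. */
static int half_shift(int n) { return n >> 1; }

/* Portable floor(n / 2) that does not rely on shift behavior:
   subtract 1 when n is negative and odd. */
static int half_floor(int n) { return n / 2 - (n % 2 < 0); }
```

With N = -3, half_div gives -1 while half_floor gives -2; half_shift matches half_floor on compilers that shift arithmetically.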
② Write A = A + 1 as A++, and A = A + B as A += B. (In VB one would generally write A = A + 1.)
③ Merge multiple operations. For example, A[i++] accesses A[i] and then increments i. From the assembly point of view this is indeed an optimization: if A[i] and i++ are written separately, the variable i may be read and written twice (depending on the compiler's optimization ability), whereas with A[i++] the variable i is certain to be read and written only once. There is a catch, however: you must be careful when merging operations into a condition, such as the following (the zero-block test in the IDCT transform, Chen-Wang algorithm):
if (!((x1 = (blk[8*4] << 8)) | (x2 = blk[8*6]) | (x3 = blk[8*2]) | (x4 = blk[8*1]) | (x5 = blk[8*7]) | (x6 = blk[8*5]) | (x7 = blk[8*3])))
The assignments are folded into the condition, but if the condition turns out to be true, none of those assignments is actually needed. In other words, when the condition is true, some junk statements have been executed. This is a flaw in the H.263 source code. Although these junk statements make the zero-block computation take 30% longer, the IDCT accounts for only 1% of the total time and zero blocks make up only 30% to 70% of all blocks, so the performance loss does not matter. (This is the conclusion I reached when I rewrote the source code in assembly.) It also shows that optimization should focus on the most time-consuming parts of a program; optimizing code that takes little time has little practical value.
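One way to avoid the wasted assignments is to test the coefficients first and load them into working variables only on the slow path. The following is a sketch under assumptions: the name column_is_dc_only, the short element type, and the row-major 8x8 layout are illustrative and are not taken from the H.263 source. A good optimizing compiler may well generate similar code for both forms.

```c
/* Returns 1 when the seven AC coefficients of one column of an 8x8 block
   (row-major, stride 8) are all zero, so only the DC term blk[0] matters.
   Nothing is stored; the values are only OR-ed together and tested. */
static int column_is_dc_only(const short *blk)
{
    return (blk[8*1] | blk[8*2] | blk[8*3] | blk[8*4] |
            blk[8*5] | blk[8*6] | blk[8*7]) == 0;
}
```

A caller would take the shortcut when column_is_dc_only(blk) is true, and otherwise load x1 through x7 and run the full transform.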
II. Trading memory for speed:
You cannot have everything in this world, and programming is no different. In most cases, speed is gained at the expense of memory (or of some other quality, such as compression ratio). One of the most common techniques for speeding up a program is to replace computation with a table lookup. For example, JPEG uses Huffman code tables, and in YUV-to-RGB conversion the originally complex computation is reduced to table lookups. Memory is spent, but the speed gain is significant, so the trade is well worth it. The same idea appears in database queries, where hot data is stored to speed up lookups. Here is a simple example (made up on the spot, haha): suppose the program frequently (it must be frequently!) needs factorials from 1000! to 2000!. We can compute these values once in advance and keep them in an array A (1001 entries, to cover both endpoints); later, when 1200! is needed, we simply look up A[1200 - 1000].
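The lookup idea can be sketched in C as follows. Since 1000! does not fit in any built-in integer type, this sketch uses the range 10! to 20! (the largest factorial that fits in 64 bits) as a stand-in; the names FACT_LO, FACT_HI, fact_table_init and fact_lookup are illustrative.

```c
#include <stdint.h>

#define FACT_LO 10   /* smallest n stored in the table */
#define FACT_HI 20   /* largest n stored; 20! still fits in uint64_t */

static uint64_t fact_table[FACT_HI - FACT_LO + 1];

/* Fill the table once, at program start. */
static void fact_table_init(void)
{
    uint64_t f = 1;
    for (int n = 2; n <= FACT_HI; n++) {
        f *= (uint64_t)n;
        if (n >= FACT_LO)
            fact_table[n - FACT_LO] = f;
    }
}

/* Later requests cost one subtraction and one array read.
   The caller must keep n within FACT_LO..FACT_HI. */
static uint64_t fact_lookup(int n)
{
    return fact_table[n - FACT_LO];
}
```

After calling fact_table_init() once, fact_lookup(12) plays the same role as A[1200 - 1000] in the text.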
III. Gathering small pieces into a whole
Scattered memory allocation is slow, so creating a large number of small objects takes a lot of time, and optimizing this is sometimes very effective. For example, the problem with the linked list I mentioned in the previous article is that it allocates a large amount of scattered memory. Let me start with a VB program. (I used to write small programs for other people in VB, mainly because programming in VB is faster than in VC; a program can be written in half a day.) When adding rows one at a time to an MSFlexGrid (a table control), I found that the refresh was very slow. So instead I added 100 rows at a time, and only when the data outgrew them did I add another 100. In this way we "gather the pieces into a whole", and the refresh became n times faster than before! This idea has many applications. For example, a program can reserve a certain amount of space when it starts and carve later small allocations out of that space, which keeps memory fragmentation low while also speeding up the program.
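The same batching idea in C might look like the sketch below: a buffer that grows 100 rows at a time instead of one row per append. The RowBuffer type, its field names, and the reallocs counter are hypothetical; the counter is included only to show how few allocations occur.

```c
#include <stdlib.h>

#define ROW_CHUNK 100   /* rows added per growth step, as in the VB example */

typedef struct {
    int    *rows;       /* storage for the row values */
    size_t  count;      /* rows in use */
    size_t  capacity;   /* rows allocated */
    size_t  reallocs;   /* number of times the buffer has grown */
} RowBuffer;

/* Append one row, growing the buffer by ROW_CHUNK rows only when full.
   Returns 0 on success, -1 if memory could not be allocated. */
static int rowbuf_append(RowBuffer *rb, int value)
{
    if (rb->count == rb->capacity) {
        size_t new_cap = rb->capacity + ROW_CHUNK;
        int *p = realloc(rb->rows, new_cap * sizeof *p);
        if (p == NULL)
            return -1;
        rb->rows = p;
        rb->capacity = new_cap;
        rb->reallocs++;
    }
    rb->rows[rb->count++] = value;
    return 0;
}

/* Release the storage and reset the buffer to its empty state. */
static void rowbuf_free(RowBuffer *rb)
{
    free(rb->rows);
    rb->rows = NULL;
    rb->count = rb->capacity = 0;
}
```

Starting from a zero-initialized RowBuffer, appending 250 rows triggers only 3 allocations rather than 250.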
IV. In conditional or case statements, put the most likely case first
The effect of this optimization is small. If you think of it, use it; if you don't, never mind.
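As a small illustration, the sketch below orders an if chain by expected frequency. The workload (mostly lowercase English text in an ASCII-compatible character set) and the function name classify are assumptions made for the example.

```c
/* Classify one byte of mostly-lowercase text, testing the likeliest case first. */
static int classify(unsigned char c)
{
    if (c >= 'a' && c <= 'z') return 0;   /* most frequent: lowercase letter */
    if (c == ' ')             return 1;   /* next: space */
    if (c >= 'A' && c <= 'Z') return 2;   /* rarer: uppercase letter */
    return 3;                             /* rarest: everything else */
}
```

For typical input, most calls return after the first comparison; a compiler may reorder or convert such chains on its own, which is one reason the gain is usually small.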
V. For the sake of readability, do not do by hand what the compiler can do, and skip optimizations with no noticeable effect:
This is very important. Whether a program is good or bad is judged first by its readability, portability and reusability, and only then by its performance. So if the compiler can do an optimization for us, there is no need to write code that nobody can understand. For example, write A = 52 (end) - 16 (start); this way, anyone reading the program can see what A means. There is no need to write A = 36, because the compiler will do the calculation for us.
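In C the same point can be made with named constants; the names below are illustrative. Compilers evaluate constant expressions like this at compile time (constant folding), so the readable form costs nothing at run time.

```c
enum { RANGE_START = 16, RANGE_END = 52 };

/* Reads as "end minus start"; the compiler folds it to the constant 36. */
static int range_length(void)
{
    return RANGE_END - RANGE_START;
}
```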
VI. Analyze each case on its own terms:
Analyzing each specific situation on its own terms is always the right approach. Without specific analysis, you cannot apply solutions flexibly. Next I will talk about how to analyze, that is, how to find where a program spends its time. (Starting with the simplest method, let me first introduce one function, GetTickCount(). Call it once at the beginning and once at the end; the difference between the two return values is the time the program took, in units of 1 ms.)
① For a function you suspect is time-consuming, run it twice, or comment out the statements inside it (making sure the program still runs), and see how much longer (or shorter) the program takes. This method is simple but inaccurate.
② Measure the time with GetTickCount(). Note that GetTickCount() is only accurate at the millisecond level; in general, measurements under 10 ms are not reliable.
③ Use the functions QueryPerformanceCounter(&counter) and QueryPerformanceFrequency(&frequency). The first returns a high-resolution tick count; dividing the difference between two counts by the frequency gives the elapsed time in seconds. If you need this level of accuracy, it is also advisable to give the process the highest priority so that it is not preempted.
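Method ③ can be sketched as below. On Windows it uses the two functions named in the text; on other systems it falls back to the POSIX clock_gettime call as a stand-in, which is an assumption of this sketch and not part of the original article. The names now_seconds, time_call and busy_work are illustrative.

```c
#ifdef _WIN32
#include <windows.h>
/* Windows: the high-resolution counter named in the text. */
static double now_seconds(void)
{
    LARGE_INTEGER counter, frequency;
    QueryPerformanceFrequency(&frequency);
    QueryPerformanceCounter(&counter);
    return (double)counter.QuadPart / (double)frequency.QuadPart;
}
#else
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif
#include <time.h>
/* Elsewhere: POSIX monotonic clock as a stand-in (assumption, not from the article). */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}
#endif

/* Hypothetical workload to be measured. */
static void busy_work(void)
{
    volatile unsigned long sum = 0;
    for (unsigned long i = 0; i < 1000000UL; i++)
        sum += i;
}

/* Run fn once and return the elapsed wall-clock time in seconds. */
static double time_call(void (*fn)(void))
{
    double start = now_seconds();
    fn();
    return now_seconds() - start;
}
```

Calling time_call(busy_work) returns the elapsed seconds for one run; repeating the measurement several times and taking the minimum or median reduces the influence of scheduling noise.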
Finally, let me describe a program I once worked on. I have forgotten what exactly the program was for, but it contained a function with a large loop, and the processing inside the loop was time-consuming. As a result, the program ran quickly at first and then became slower and slower. When I traced the variables in the program, I found that early on the inner loop exited after only a few iterations, while later it needed more and more iterations. Once the cause of the slowdown is known, the right remedy can be applied. My solution was that each loop no longer starts from the beginning; instead it starts from the place where the previous loop exited and searches to the left and to the right (the next exit point may be smaller than the previous one, so the positions before it must be covered too). With this change the program became very fast. In practice, we need to find the real reason a program is slow in order to optimize it effectively.
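The original program is not shown, so the following is only one plausible reading of the fix: a linear search that starts at the position of the previous hit and widens outward in both directions. The function name find_near and its parameters are hypothetical.

```c
#include <stddef.h>

/* Search a[0..n-1] for target, starting at index hint (for example, the
   position found by the previous call) and widening one step to the right
   and one step to the left on each iteration.
   Returns the index of a match, or -1 if target is not present. */
static ptrdiff_t find_near(const int *a, size_t n, size_t hint, int target)
{
    if (n == 0)
        return -1;
    if (hint >= n)
        hint = n - 1;
    for (size_t d = 0; d < n; d++) {
        if (hint + d < n && a[hint + d] == target)
            return (ptrdiff_t)(hint + d);
        if (d <= hint && a[hint - d] == target)
            return (ptrdiff_t)(hint - d);
    }
    return -1;
}
```

When successive targets lie close to each other, each call finishes after a few steps no matter how far into the array the data has advanced; the worst case is still a full scan.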