We often need to determine whether a value lies within a certain range. When processing an image, for example, a value may fall beyond a certain boundary range: do I set the pixel to zero, or do I only set the vertex?
In this kind of processing we may need to perform saturation, with code like the following:
if (val > 0 && val < 256) {
    // do something
} else {
    // do something
}
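The saturation step mentioned above could look roughly like the sketch below. It assumes 8-bit pixels whose valid values are 0 to 255 inclusive; the function name `saturateToByte` is hypothetical and does not come from the original code.

```cpp
#include <cstdint>

// Hypothetical helper: clamp an intermediate result to the 8-bit pixel range.
// Assumption: valid pixel values are 0..255 inclusive.
std::uint8_t saturateToByte(int val)
{
    if (val < 0) {
        return 0;      // below the range: saturate to the minimum
    }
    if (val > 255) {
        return 255;    // above the range: saturate to the maximum
    }
    return static_cast<std::uint8_t>(val);  // already in range, keep as is
}
```

With this sketch, an intermediate result such as -5 would become 0 and 300 would become 255, while in-range values pass through unchanged.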
So I wrote the following function to discuss our problem.
bool isRangAt1(int val, int min, int max)
{
    if (val > min && val < max) {
        return true;
    }
    return false;
}
This function is very simple. It determines whether a value lies within the open interval (min, max).
Of course, I did not add any checking here that min < max actually holds.
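If the caller cannot guarantee min < max, one possible way to handle it is to swap the bounds before comparing. The sketch below assumes that a reversed pair should be treated as the same interval; `isRangAtChecked` is a hypothetical name, and rejecting such input instead would be an equally reasonable choice.

```cpp
#include <utility>

// Hypothetical variant of isRangAt1 that tolerates min > max by swapping
// the bounds first. It still tests the open interval (min, max).
bool isRangAtChecked(int val, int min, int max)
{
    if (min > max) {
        std::swap(min, max);   // normalise so that min <= max
    }
    return val > min && val < max;
}
```

The extra comparison costs a little, so this form probably only makes sense where the bounds come from untrusted input.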
Is there any other way to implement this code? Of course. At the very least, we can replace the if statement with the conditional (ternary) operator, as shown below:
bool isRangAt2(int val, int min, int max)
{
    return val <= min ? false : val < max ? true : false;
}
I think the second implementation is better, judging by the instructions seen in the disassembly, although it may be less readable. There is also another implementation:
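bool isRangAt3(int val, int min, int max)
{
    // Assumes min < max. The second comparison is >= so that val == max
    // is excluded, matching the open interval (min, max) used above.
    return (val > min) ^ (val >= max);
}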
The three functions above do the same job, but readability decreases from one to the next while efficiency should increase.
The third one in particular is implemented with comparisons and an XOR operation, which removes the jump of the if statement and should improve efficiency.
This is also a common optimization strategy at the assembly level. I did think of this method at first as well; what I show here comes from a release build.
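Before comparing speed, it is worth checking that the three forms really return the same results. The sketch below is a brute-force comparison over a small range, assuming min < max; the `check` namespace and the names `viaIf`, `viaTernary`, `viaXor` and `allAgree` are hypothetical and used only for illustration. Note that the XOR form here uses `val >= max` as its second comparison, which is what makes it agree with the other two when val equals max.

```cpp
namespace check {

// Same logic as the if-based version: open interval (min, max).
bool viaIf(int val, int min, int max)
{
    if (val > min && val < max) {
        return true;
    }
    return false;
}

// Same logic written with the conditional operator.
bool viaTernary(int val, int min, int max)
{
    return val <= min ? false : val < max ? true : false;
}

// Branch-free form: exactly one of the two comparisons is true
// only when min < val < max (assuming min < max).
bool viaXor(int val, int min, int max)
{
    return (val > min) ^ (val >= max);
}

// Returns true if all three forms agree for every val from min - 2
// to max + 2. Intended for small test ranges only, so the arithmetic
// on the bounds cannot overflow.
bool allAgree(int min, int max)
{
    for (int val = min - 2; val <= max + 2; ++val) {
        const bool expected = viaIf(val, min, max);
        if (viaTernary(val, min, max) != expected ||
            viaXor(val, min, max) != expected) {
            return false;
        }
    }
    return true;
}

} // namespace check
```

Calling `check::allAgree(0, 256)` covers the pixel range used earlier, including both boundary values where an off-by-one in the XOR form would show up.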
Below is the assembly generated for the first two methods (you can install a cross-compilation toolchain on Ubuntu; this output is for PowerPC). You can compare them:
    .file "range.cpp"
    .section ".toc","aw"
    .section ".text"
    .align 2
    .globl isRangAt1
    .section ".opd","aw"
    .align 2
isRangAt1:
    .long .isRangAt1,[email protected]
    .size isRangAt1,.-isRangAt1
    .previous
    .type .isRangAt1,@function
    .globl .isRangAt1
.isRangAt1:
.LFB11:
    cmpw 7,3,4
    li 9,1
    li 0,0
    cmpw 6,3,5
    ble 7,.L4
    bge 6,.L8
.L5:
    mr 0,9
.L4:
    extsw 3,0
    blr
.L8:
    li 9,0
    b .L5
.LFE11:
    .size .isRangAt1,.-.isRangAt1
    .globl __gxx_personality_v0
    .align 2
    .globl isRangAt2
    .section ".opd","aw"
    .align 2
isRangAt2:
    .long .isRangAt2,[email protected]
    .size isRangAt2,.-isRangAt2
    .previous
    .type .isRangAt2,@function
    .globl .isRangAt2
.isRangAt2:
.LFB12:
    cmpw 7,3,4
    li 9,1
    li 0,0
    cmpw 6,3,5
    ble 7,.L12
    ble 6,.L15
.L13:
    mr 0,9
.L12:
    extsw 3,0
    blr
.L15:
    li 9,0
    b .L13
.LFE12:
    .size .isRangAt2,.-.isRangAt2
    .ident "GCC: (GNU) 4.1.1 (SDK420, $Rev: 3547 $)"
Comparing the two, we find that method 1 is almost the same as method 2. Note that this assembly is the output of my gcc -O3 release build, and some unneeded directives may have been removed.
Even so, I often still find the second method more concise than the first. I will not provide the assembly for the third method for now; readers can compile and disassemble it themselves, remembering to use -O3.
If you find any mistakes, you are welcome to point them out. Please credit the source when sharing. Thank you!
Determine whether a value is in a certain range for assembly Optimization