Speeding up large-image processing in WPF: pointer operations and parallel loops, dozens of times faster

Source: Internet
Author: User

I have been using GDI+ with pointers for image processing in WinForms. This time I made up my mind to move all of it to WPF (mainly because display layout is more convenient).
The test image is a large 2512*3307 image, about 8.3 million pixels.
The class library is the WPF version of WriteableBitmapEx.
The functions are extension methods I wrote myself. They only use the environment that WriteableBitmapEx provides; I was too lazy to write everything from scratch.
 
1. Standard Int32 array-style traversal (Release build)
0.28 s

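// Needs: using System.Windows.Media.Imaging; plus the WriteableBitmapEx library.
// Extension methods must live in a static class; compile with /unsafe.
public static unsafe void TestGray1(this WriteableBitmap bmp)
{
    using (var context = bmp.GetBitmapContext())
    {
        int height = context.Height;
        int width = context.Width;
        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
            {
                // Index-based access; context.Width and context.Pixels are read on every pixel.
                int pos = y * context.Width + x;
                var c = context.Pixels[pos];
                var r = (byte)(c >> 16);
                var g = (byte)(c >> 8);
                var b = (byte)(c);

                // Integer grayscale: the weights sum to 128, so >> 7 divides by 128.
                var gray = ((r * 38 + g * 75 + b * 15) >> 7);

                // Repack as opaque gray (alpha = 255).
                var color = (255 << 24) | (gray << 16) | (gray << 8) | gray;
                context.Pixels[pos] = color;
            }
        }
    }
}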
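The arithmetic inside the loop can be checked in isolation. The following is a minimal sketch, not part of the original code: the class GrayMath and the helpers ToGray and Pack are hypothetical names introduced for illustration. The weights 38, 75 and 15 sum to 128, so the right shift by 7 is a division by 128, and 38/128, 75/128 and 15/128 approximate the common luma weights 0.299, 0.587 and 0.114.

```csharp
public static class GrayMath
{
    // Hypothetical helper: the integer grayscale formula from the loop above.
    // 38 + 75 + 15 = 128, so ">> 7" divides the weighted sum by 128.
    public static int ToGray(byte r, byte g, byte b)
    {
        return (r * 38 + g * 75 + b * 15) >> 7;
    }

    // Hypothetical helper: packs one gray level into a 32-bit pixel
    // laid out as alpha, red, green, blue from the high byte down,
    // with alpha fixed at 255 (fully opaque).
    public static int Pack(int gray)
    {
        return (255 << 24) | (gray << 16) | (gray << 8) | gray;
    }
}
```

For example, GrayMath.ToGray(255, 255, 255) gives 255, and pure red (255, 0, 0) gives 75.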

 

2. Standard Int32 pointer traversal (Release build)

0.04 s


public static unsafe void TestGray2(this WriteableBitmap bmp)
{
    using (var context = bmp.GetBitmapContext())
    {
        // Read the pixel pointer once, then walk it linearly.
        var ptr = context.Pixels;

        int height = context.Height;
        int width = context.Width;
        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
            {
                var c = *ptr;
                var r = (byte)(c >> 16);
                var g = (byte)(c >> 8);
                var b = (byte)(c);

                var gray = ((r * 38 + g * 75 + b * 15) >> 7);

                var color = (255 << 24) | (gray << 16) | (gray << 8) | gray;
                *ptr = color;

                ptr++;
            }
        }
    }
}

 

3. Color struct pointer traversal

0.02 s

This should be close to the speed limit (apart from the parallel approach described later); I have no further ideas for speeding up the processing.

In addition, this method is the most intuitive and the easiest to understand, which makes it easy to maintain later.


 


// Needs: using System.Runtime.InteropServices;
[StructLayout(LayoutKind.Sequential)]
public struct PixelColor
{
    public byte Blue;
    public byte Green;
    public byte Red;
    public byte Alpha;
}
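The field order Blue, Green, Red, Alpha matters because the struct is laid over the same four bytes as the Int32 pixel. The sketch below shows that correspondence without unsafe code, assuming a little-endian machine (x86/x64). PixelOverlay and PixelOverlayDemo are hypothetical names used only for this illustration and are not part of the original code.

```csharp
using System.Runtime.InteropServices;

// Hypothetical type for illustration: the four byte fields share
// storage with one int, like a C union.
[StructLayout(LayoutKind.Explicit)]
public struct PixelOverlay
{
    [FieldOffset(0)] public int Packed;
    [FieldOffset(0)] public byte Blue;   // lowest byte on little-endian
    [FieldOffset(1)] public byte Green;
    [FieldOffset(2)] public byte Red;
    [FieldOffset(3)] public byte Alpha;  // highest byte on little-endian
}

public static class PixelOverlayDemo
{
    // Stores a packed pixel value and returns the overlay so the
    // individual channels can be read back through the byte fields.
    public static PixelOverlay FromPacked(int packed)
    {
        var p = new PixelOverlay();
        p.Packed = packed;
        return p;
    }
}
```

On such a machine, a packed value of 0xFF102030 reads back as Alpha 0xFF, Red 0x10, Green 0x20 and Blue 0x30, which is the same split that the shifts by 16 and 8 perform in the first two tests.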

 


public static unsafe void TestGray3(this WriteableBitmap bmp)
{
    using (var context = bmp.GetBitmapContext())
    {
        // Reinterpret each 32-bit pixel as a PixelColor struct.
        var ptr = (PixelColor*)context.Pixels;

        int height = context.Height;
        int width = context.Width;
        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
            {
                var c = *ptr;
                var gray = ((c.Red * 38 + c.Green * 75 + c.Blue * 15) >> 7);
                // Alpha is left untouched.
                ptr->Green = ptr->Red = ptr->Blue = (byte)gray;

                ptr++;
            }
        }
    }
}


 

4. For comparison, I also measured pointer-based image processing with GDI+.

0.06 s

 

 


// ColorType is not shown in the original post; it is assumed here to be a
// 4-byte struct with byte fields B, G, R, A in that order.
public static unsafe Bitmap ToGray(Bitmap img)
{
    var rect = new System.Drawing.Rectangle(0, 0, img.Width, img.Height);

    // ReadWrite rather than the original ReadOnly, because the pixels are modified in place.
    var data = img.LockBits(rect, System.Drawing.Imaging.ImageLockMode.ReadWrite, System.Drawing.Imaging.PixelFormat.Format32bppArgb);
    var ptr = (ColorType*)data.Scan0.ToPointer();

    // Unused in the original; kept because it was part of the timed code.
    var bytes = new Int32[img.Width * img.Height];

    var height = img.Height;
    var width = img.Width;
    for (int y = 0; y < height; y++)
    {
        for (int x = 0; x < width; x++)
        {
            var color = *ptr;
            var gray = ((color.R * 38 + color.G * 75 + color.B * 15) >> 7);

            ptr->R = ptr->G = ptr->B = (byte)gray;

            ptr++;
        }
    }

    img.UnlockBits(data);

    return img;
}

 

5. Now for the most important part. I had always been puzzled by Parallel.For: why did it take several times longer than a normal loop? I studied it carefully today and found that I had been using it incorrectly.

0.01 s (Release build)

This was measured on a laptop i5 CPU. A desktop i7 is more powerful and would probably cut the time by about half again.

 

 

This mainly uses the loop parallelization methods of Microsoft's Task Parallel Library.

Note: the default parallel loop is very slow when the loop body is small. In that case you must use Partitioner to split the range into chunks and loop over each chunk inside the body. This is described on MSDN, and it is the key.

 


// Needs: using System.Collections.Concurrent; using System.Threading.Tasks;
public static unsafe void TestGray5(this WriteableBitmap bmp)
{
    using (var context = bmp.GetBitmapContext())
    {
        int height = context.Height;
        int width = context.Width;

        // Each h is a range of rows: Item1 inclusive, Item2 exclusive.
        Parallel.ForEach(Partitioner.Create(0, height), (h) =>
        {
            var ptr = (PixelColor*)context.Pixels;
            // Jump to the first pixel of this chunk's first row.
            ptr += h.Item1 * width;

            for (int y = h.Item1; y < h.Item2; y++)
            {
                for (int x = 0; x < width; x++)
                {
                    var c = *ptr;
                    var gray = ((c.Red * 38 + c.Green * 75 + c.Blue * 15) >> 7);
                    ptr->Green = ptr->Red = ptr->Blue = (byte)gray;

                    ptr++;
                }
            }
        });
    }
}
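The difference between the default parallel loop and the Partitioner form described in the note can be sketched on a plain managed array, without WPF or unsafe code. This is an illustrative sketch only: PartitionDemo, AddOnePerElement and AddOneByRange are hypothetical names, and no timings are claimed for it.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class PartitionDemo
{
    // Default form: the delegate is invoked once for every index,
    // so a tiny body is dominated by call overhead.
    public static void AddOnePerElement(int[] data)
    {
        Parallel.For(0, data.Length, i => { data[i] += 1; });
    }

    // Partitioner form: the delegate is invoked once per chunk and
    // runs an ordinary tight loop over [Item1, Item2) inside it.
    // Note: Partitioner.Create(0, n) requires n > 0.
    public static void AddOneByRange(int[] data)
    {
        Parallel.ForEach(Partitioner.Create(0, data.Length), range =>
        {
            for (int i = range.Item1; i < range.Item2; i++)
            {
                data[i] += 1;
            }
        });
    }
}
```

Both methods produce the same result; the second is the shape that TestGray5 uses, with rows taking the place of array indices.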


 

Takeaways

1. Avoid accessing properties or calling functions inside a loop; doing so can slow the computation down several times over.

A property is essentially a function, and it is best not to call functions in the loop body. If you really need the logic there, inline it by hand: C# has no inline keyword, so copy the code in. Anything for speed.

2. Pointer increment seems to be close to 10 times faster than direct array-style access (0.28 s versus 0.04 s in the tests above, about 7 times).

I suspect the cause is either cache hits or the fact that the array access itself is wrapped in a property, which is equivalent to calling a function.

3. The TPL (Task Parallel Library) is really easy to use. It seems Microsoft had already thought about parallelizing loops over large amounts of data back in 2009, but I had been using it the wrong way and found it very slow.
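Point 1 can be illustrated by reading properties once into locals before the loop. The sketch below uses a hypothetical Frame class as a stand-in for any object that exposes its data through properties; it is not WriteableBitmapEx code. The JIT may inline simple getters, so whether this helps in a particular case is something to measure rather than assume.

```csharp
// Hypothetical stand-in for an object that exposes data through properties.
public sealed class Frame
{
    private readonly int[] pixels;

    public Frame(int width, int height)
    {
        Width = width;
        Height = height;
        pixels = new int[width * height];
    }

    public int Width { get; private set; }
    public int Height { get; private set; }
    public int[] Pixels { get { return pixels; } }
}

public static class HoistDemo
{
    // Property getters are evaluated on every iteration.
    public static long SumWithProperties(Frame f)
    {
        long sum = 0;
        for (int y = 0; y < f.Height; y++)
            for (int x = 0; x < f.Width; x++)
                sum += f.Pixels[y * f.Width + x];
        return sum;
    }

    // Getters are read once into locals; the loop touches only locals.
    public static long SumWithLocals(Frame f)
    {
        int width = f.Width;
        int height = f.Height;
        int[] pixels = f.Pixels;
        long sum = 0;
        for (int y = 0; y < height; y++)
            for (int x = 0; x < width; x++)
                sum += pixels[y * width + x];
        return sum;
    }
}
```

The second form mirrors what the tests above do with height, width and the pixel pointer before entering the loop.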

 

The weather is so good that I feel comfortable after writing the code.

 

Excerpted from bitter bear
