Designing a Port Communication Server (IOCP Socket Server), Part 3: Don't Trust the API (Another Singly-Linked-List Algorithm)


Copyright 2009 GuestCode (Lu Yigui). All Rights Reserved.

QQ: 48092788. Source-code blog: http://blog.csdn.net/guestcode

The title may seem far-fetched; it only came about because I ran into this problem while optimizing performance, so I used it as the title. Let me get straight to the subject of this article: another algorithm for the singly linked list. I once developed an ARM OS kernel. Although it was simple, operating-system mechanisms are much the same everywhere: to achieve thread synchronization, the operating system must go through interrupts (application-level synchronization generally goes through a software interrupt), and the cost of entering and leaving the kernel is quite high (in fact the operating system's time-slicing mechanism, the clock interrupt, is also expensive, but that is unavoidable).

Sometimes, the more you know, the more factors you weigh, and the easier it is to reach a wrong conclusion. As mentioned above, thread synchronization costs a significant amount of time (compared with not synchronizing). When I released a memory-management free function, some experts told me the release work could be handled by a dedicated thread: inside the free function, just send that thread a message, for example with PostThreadMessage, and thereby avoid critical sections entirely. At first it did seem simple, and it avoids the blocking caused by the free function doing too much work. But then I put a few questions to him: "When will that thread read the message? Every millisecond? Every 10 milliseconds? Will the latency cause out-of-memory errors?" He then recommended SetEvent, using it to notify the thread promptly to handle the release. It all sounded "very reasonable".

Instead of adopting this immediately, I wrote a few code fragments to test it (for a senior programmer there is no need to test whether the synchronization mechanisms themselves work; I only wanted to know which approach is more efficient):

Case 1:

dwTickCount = GetTickCount();
for (i = 0; i < 10000000; i++)
{
    EnterCriticalSection(&CsSection);
    // Handle the release here
    LeaveCriticalSection(&CsSection);
}
dwTickCount = GetTickCount() - dwTickCount;

Case 2:

dwTickCount = GetTickCount();
for (i = 0; i < 10000000; i++)
{
    EnterCriticalSection(&CsSection);
    // Put the released address in the processing queue
    LeaveCriticalSection(&CsSection);
    // Tell the processing thread to get to work
    SetEvent(hEvent);
}
dwTickCount = GetTickCount() - dwTickCount;

Case 3:

dwTickCount = GetTickCount();
for (i = 0; i < 10000000; i++)
{
    PostThreadMessage(dwThreadId, WM_USER, 0, 0);
    // Tell the processing thread to get to work
    SetEvent(hEvent);
}
dwTickCount = GetTickCount() - dwTickCount;

MSG msg;
while (PeekMessage(&msg, 0, 0, 0, PM_REMOVE));

In each of the three cases, two threads interact with each other. The test results were:

1. dwTickCount = 4281

2. dwTickCount = 17563

3. dwTickCount = 37297

Although my machine's motherboard has been repaired several times and it is extremely slow, in the same environment the gap is this wide, so cases 2 and 3 can be ruled out (unless, of course, the release work in case 1 costs far more than the difference between case 1 and cases 2 or 3; and in case 3 the kernel also spends considerable extra time delivering that many messages). Better to keep the release work inside the free function itself (for the source code, see the previous article).

This small case illustrates a problem. During development we sometimes trust an API too much, assuming that because the code we see is just one API line, it must be more efficient than explicitly using a critical section (I have made this mistake myself; it comes from lazy habits of thought while coding). In a multithreaded environment some APIs must themselves synchronize internally through mechanisms like critical sections, among them SendMessage, PostMessage, GetMessage, and PeekMessage.

We all hope the system will provide ever more efficient APIs to meet our demand for server performance. For example, since Windows provides the completion port mechanism, might the system offer an even more efficient route?! Some people want to operate directly on physical memory; some even ask whether working at the driver layer would be faster. I have had such bold ideas myself. But under present conditions there is little hope of getting better APIs. Rather than wait, it is better to optimize our own algorithms. Next I will introduce another algorithm to improve the efficiency of a singly linked list. (Being out of touch with the literature, I do not know whether this algorithm has already been published.)

To improve efficiency we use memory pools and connection pools, avoiding the fragmentation caused by frequently requesting memory from and returning it to the system (which is itself inefficient). The approach is good, but a memory pool or connection pool is usually synchronized with a critical section: one critical-section variable serves one data area. Under high concurrency that synchronization becomes a source of blocking. Can we raise efficiency and reduce the chance of blocking?

What if we used two critical-section variables to synchronize a single linked list; would the probability of blocking be halved? The method is as follows: the linked list works first-in, first-out, taking nodes from the head and appending returned nodes at the tail. One critical section is responsible for synchronizing the head of the list and the other for the tail, on the premise that the list is never empty.

Based on the above assumption, here is the optimized algorithm:

PGIO_BUF GioDt_AllocGBuf(void)
/* Description: allocate a memory block GBuf to the business layer.
** Input:  none
** Output: address of the memory block GioData */
{
    // List head
    EnterCriticalSection(&GioDataPoolHeadSection);
    // Ensure the list always keeps at least one node; pGioDataPoolHead is never NULL once initialized.
    // If the designer has covered normal and abnormal cases so that memory can never run out,
    // the following check is redundant; I have been that bold in earlier designs.
    if (pGioDataPoolHead->pNext)  // costs no more than the usual algorithm's if (pGioDataPoolHead)
    {
        PGIO_BUF Result;

        Result = (PGIO_BUF)pGioDataPoolHead;
        pGioDataPoolHead = pGioDataPoolHead->pNext;
        dwGioDataPoolUsedCount++;

        LeaveCriticalSection(&GioDataPoolHeadSection);
        // Why return (char *)Result + sizeof(GIO_DATA_INFO)?
        // That is another kind of optimization, to be introduced later.
        return ((PGIO_BUF)((char *)Result + sizeof(GIO_DATA_INFO)));
    } else
    {
        LeaveCriticalSection(&GioDataPoolHeadSection);
        return (NULL);
    }
}

 

void GioDt_FreeGBuf(PGIO_BUF pGioBuf)
/* Description: the business layer returns a memory block GBuf.
** Input:  address of the memory block GioData
** Output: none */
{
    // List tail
    EnterCriticalSection(&GioDataPoolTailSection);
    pGioBuf = (PGIO_BUF)((char *)pGioBuf - sizeof(GIO_DATA_INFO));
    ((PGIO_DATA)pGioBuf)->pNext = NULL;
    pGioDataPoolTail->pNext = (PGIO_DATA)pGioBuf;
    pGioDataPoolTail = (PGIO_DATA)pGioBuf;
    dwGioDataPoolUsedCount--;
    LeaveCriticalSection(&GioDataPoolTailSection);
}

The following is the usual algorithm (a single critical section; note that it is in fact last-in, first-out):

PGIO_BUF GioDt_AllocGBuf(void)
/* Description: allocate a memory block GioBuf to the business layer.
** Input:  none
** Output: address of the memory block GioData */
{
    PGIO_BUF Result;
    // Single list
    EnterCriticalSection(&GioDataPoolSection);
    Result = (PGIO_BUF)pGioDataPoolHead;
    if (pGioDataPoolHead)
    {
        pGioDataPoolHead = pGioDataPoolHead->pNext;
        dwGioDataPoolUsedCount++;
    }
    LeaveCriticalSection(&GioDataPoolSection);

    if (Result == NULL)   // pool exhausted
        return (NULL);
    return ((PGIO_BUF)((char *)Result + sizeof(GIO_DATA_INFO)));
}

 

void GioDt_FreeGBuf(PGIO_BUF pGioBuf)
/* Description: the business layer returns a memory block GioBuf.
** Input:  address of the memory block GioData
** Output: none */
{
    // Single list
    EnterCriticalSection(&GioDataPoolSection);
    pGioBuf = (PGIO_BUF)((char *)pGioBuf - sizeof(GIO_DATA_INFO));
    ((PGIO_DATA)pGioBuf)->pNext = pGioDataPoolHead;
    pGioDataPoolHead = (PGIO_DATA)pGioBuf;
    dwGioDataPoolUsedCount--;
    LeaveCriticalSection(&GioDataPoolSection);
}

Careful readers will wonder why the optimized algorithm is written like this:

if (pGioDataPoolHead->pNext)
{
    PGIO_BUF Result;
    ...
    LeaveCriticalSection(&GioDataPoolHeadSection);
    return ((PGIO_BUF)((char *)Result + sizeof(GIO_DATA_INFO)));
} else
{
    LeaveCriticalSection(&GioDataPoolHeadSection);
    return (NULL);
}

instead of like this (which would be smaller and more concise):

PGIO_BUF Result;
if (pGioDataPoolHead->pNext)
{
    ...
} else
    Result = NULL;
LeaveCriticalSection(&GioDataPoolHeadSection);
return (Result);

The question can only be answered by reading the compiled code: on the successful path, the former executes one or two fewer assembly instructions (for now we are discussing only algorithmic efficiency; code-level efficiency will be discussed later).

The singly-linked-list algorithm above is only a personal view; I hope the experts will offer guidance to make the algorithm more efficient still.

 

This article is from a CSDN blog. When reprinting, please indicate the source: http://blog.csdn.net/GuestCode/archive/2009/08/29/4496366.aspx
