Download the code: Net0307.exe (133 KB)
I have noticed a trend in my programming of late, and that trend has inspired this month's topic. Recently, I have done a fair amount of Win32 interop in my Microsoft .NET Framework-based applications. I am not saying that my applications are full of custom interop code, but from time to time I run into a minor but nagging inadequacy in the .NET Framework class library that can be quickly worked around with a call into the Windows API.
Thinking about it, a feature limitation in the .NET Framework 1.0 or 1.1 class library that Windows does not share should not come as a surprise. After all, 32-bit Windows (in all its versions) is a mature operating system that has served customers for more than 10 years. The .NET Framework, in contrast, is a newcomer.
As more and more developers move production applications to managed code, it seems only natural that developers will dip into the underlying operating system more often for some critical piece of functionality, at least for the time being.
Fortunately, the interop features of the common language runtime (CLR), called Platform Invoke (P/Invoke), are very complete. In this column, I will focus on the practical use of P/Invoke to call Windows API functions. P/Invoke is used as a noun when it refers to this interop feature of the CLR, and as a verb when it refers to using the feature. I am not going to cover COM interop directly because, paradoxically, it is both more accessible and more complex than P/Invoke, which makes it less straightforward as a column topic.
Enter P/Invoke
I'll start with a simple P/Invoke example. Let's look at how to call the Win32 MessageBeep function, whose unmanaged declaration is shown in the following code:
BOOL MessageBeep(
  UINT uType   // beep type
);
To call MessageBeep, add the following code to a class or struct definition in C#:
[DllImport("user32.dll")]
static extern Boolean MessageBeep(UInt32 beepType);
Surprisingly, this code is all that managed code needs in order to call the unmanaged MessageBeep API. It is not a method call, but an extern method definition. (It is also as close to a straight port from C as C# allows, so it is a helpful starting point for introducing some concepts.) A call from managed code might look like this:
MessageBeep(0);
Notice that the MessageBeep method is declared as static. This is a requirement for P/Invoke methods because the Windows API has no consistent notion of an instance. Next, notice that the method is marked as extern. This tells the compiler that the method is implemented by a function exported from a DLL, so no method body is required.
Speaking of missing method bodies, did you notice that the MessageBeep declaration does not have one? Unlike most managed methods, whose algorithms consist of intermediate language (IL) instructions, a P/Invoke method is nothing more than metadata that the just-in-time (JIT) compiler uses to wire managed code to an unmanaged DLL function at run time. An important piece of information needed for this wiring is the name of the DLL that exports the unmanaged function. This information is provided by the DllImport custom attribute that precedes the MessageBeep method declaration. In this case, you can see that the MessageBeep API is exported by user32.dll in Windows.
So far I have covered all but two topics related to calling MessageBeep. As a review, the declaration looked like the following snippet:
[DllImport("user32.dll")]
static extern Boolean MessageBeep(UInt32 beepType);
The final two topics are data marshaling and the actual method call from managed code to the unmanaged function. The unmanaged MessageBeep function can be called by any managed code that finds the extern MessageBeep declaration in scope. The call is made like a call to any other static method. What it shares with every other managed method call is the need for data marshaling.
One of the rules of C# is that its call syntax can only access CLR data types, such as System.UInt32 and System.Boolean. C# is expressly unaware of the C-based data types used in the Windows API (such as UINT and BOOL), which are just typedefs of C language types. So while the Windows API function MessageBeep is documented as follows:
BOOL MessageBeep(UINT uType)
the extern method must be defined using CLR types, as you saw in the preceding code snippet. The need to use CLR types that differ from, but are compatible with, the underlying API function types is one of the more difficult aspects of P/Invoke. I will therefore devote a whole section to data marshaling later in this column.
Style
Making P/Invoke calls to the Windows API is easy in C#. And if the class library refuses to make your application beep, you should by all means call Windows to do the job, right?
Right. But style matters, a lot! Generally, if the class library offers a way to achieve your goal, it is preferable to use that API rather than call unmanaged code directly, because the CLR types and Win32 differ significantly in style. I can sum up my advice on this matter in one sentence: when you P/Invoke, do not subject your application logic directly to any extern methods or artifacts thereof. If you follow this little rule, you will save yourself a great deal of trouble in the long run.
The code in Figure 1 shows the minimum additional code for the MessageBeep extern method I have been discussing. Figure 1 contains no earth-shaking changes, just some general improvements over a bare extern method that make life a little easier. Starting from the top, you will notice an entire type named Sound that is devoted to MessageBeep. If I needed to add support for playing waveforms using the Windows API function PlaySound, I could reuse the Sound type. However, I am not bothered by a type that exposes a single public static method. This is application code, after all. Notice also that Sound is sealed and defines an empty private constructor. These are just details that keep a user from mistakenly deriving from Sound or creating an instance of it.
The next feature of the code in Figure 1 is that the actual extern method where P/Invoke occurs is a private method of Sound. This method is exposed only indirectly by the public MessageBeep method, which takes a parameter of type BeepTypes. This extra level of indirection is a critical detail that provides the following benefits. First, should a managed way of beeping be introduced in the class library in the future, you can retrofit the public MessageBeep method to use the managed API without changing the rest of the code in your application.
The second benefit of the wrapper method is this: when you make P/Invoke calls, you waive the freedom from access violations and other low-level catastrophes that the CLR normally provides. A buffer method protects the rest of your application from access violations and similar problems (even if it does nothing but pass parameters through). The buffer method localizes any potential bugs introduced by the P/Invoke call.
The third and final benefit of hiding a private extern method behind a public wrapper is that it gives you an opportunity to add some minimal CLR style to the method. For example, in Figure 1 I convert a Boolean failure returned by the Windows API function into a more CLR-like exception. I also define an enumerated type named BeepTypes whose members correspond to the #define values used with this Windows API. Because C# does not support C-style #define constants, a managed enum type can be used to keep magic numbers from spreading throughout your application code.
This last benefit of the wrapper method is admittedly minor for a simple Windows API function such as MessageBeep. However, as you begin to call more complex unmanaged functions, you will find that manually translating the Windows API style into a more CLR-friendly approach pays off more and more. The more you plan to reuse your interop functionality throughout the application, the more thought you should put into the design of the wrapper. Meanwhile, I see nothing wrong with non-object-oriented static wrapper methods that take CLR-friendly parameters.
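Figure 1 is not reproduced in this text, so the following is only a minimal sketch of the wrapper pattern described above, not necessarily the original listing. The BeepTypes member names are illustrative; their values are assumed to be the MB_* constants that MessageBeep accepts.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

// Managed names for the #define values used with MessageBeep.
// The member names are illustrative, not taken from the original figure.
public enum BeepTypes : uint
{
    Simple          = 0xFFFFFFFF, // simple beep
    Ok              = 0x00000000, // MB_OK
    IconHand        = 0x00000010, // MB_ICONHAND
    IconQuestion    = 0x00000020, // MB_ICONQUESTION
    IconExclamation = 0x00000030, // MB_ICONEXCLAMATION
    IconAsterisk    = 0x00000040  // MB_ICONASTERISK
}

public sealed class Sound
{
    // Private constructor: nobody should create an instance of Sound.
    private Sound() {}

    // Public, CLR-style wrapper: takes an enum, reports failure as an exception.
    public static void MessageBeep(BeepTypes type)
    {
        if (!MessageBeep((UInt32)type))
        {
            throw new Win32Exception(Marshal.GetLastWin32Error());
        }
    }

    // Private extern method: the only place where P/Invoke occurs.
    [DllImport("user32.dll", SetLastError = true)]
    private static extern Boolean MessageBeep(UInt32 beepType);
}
```

Application code would then call Sound.MessageBeep(BeepTypes.IconExclamation) and never see the extern method or the raw numeric values.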
The DllImport Attribute
Now it is time to dig a little deeper. The DllImportAttribute type plays an important role in P/Invoke calls from managed code. Its primary role is to tell the CLR which DLL exports the function you want to call. The name of the DLL is passed to DllImportAttribute as its single constructor parameter.
If you are not sure which DLL defines the Windows API function you want to use, the Platform SDK documentation is your best resource. Near the end of the topic text for a Windows API function, the SDK documentation specifies the .lib file that a C application must link to in order to use the function. In almost all cases, the .lib file has the same name as the system DLL that defines the function. For example, if the function requires a C application to link to kernel32.lib, the function is defined in kernel32.dll. In the Platform SDK documentation topic for MessageBeep, you will see near the end that the library file is user32.lib; this indicates that MessageBeep is exported from user32.dll.
Optional DllImportAttribute Properties
In addition to naming the host DLL, DllImportAttribute also has several optional properties, four of which are particularly interesting: EntryPoint, CharSet, SetLastError, and CallingConvention.
EntryPoint: Set this property to give the entry point name of the exported DLL function when you do not want your extern managed method to have the same name as the DLL export. This is especially useful when you define two extern methods that call the same unmanaged function. In addition, in Windows you can bind to exported DLL functions by their ordinal values. If you need to do this, an EntryPoint value such as "#1" or "#129" indicates the ordinal value of the unmanaged function in the DLL rather than a function name.
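As a rough illustration of EntryPoint, the sketch below binds two differently named extern methods to the same MessageBeep export. The class and method names are made up for this example, and "example.dll" with ordinal 1 is purely hypothetical; it only shows the "#n" syntax and would fail if called.

```csharp
using System;
using System.Runtime.InteropServices;

public sealed class BeepInterop
{
    private BeepInterop() {}

    // The managed name differs from the DLL export, so EntryPoint names the export.
    [DllImport("user32.dll", EntryPoint = "MessageBeep", SetLastError = true)]
    public static extern Boolean PlaySystemSound(UInt32 beepType);

    // A second extern method bound to the same unmanaged function,
    // this time exposing the parameter as a signed integer.
    [DllImport("user32.dll", EntryPoint = "MessageBeep", SetLastError = true)]
    public static extern Boolean PlaySystemSoundSigned(Int32 beepType);

    // Binding by ordinal. HYPOTHETICAL: "example.dll" and ordinal 1 do not
    // refer to a real library; they only demonstrate the "#n" form.
    [DllImport("example.dll", EntryPoint = "#1")]
    public static extern void HypotheticalOrdinalFunction();
}
```

Because a P/Invoke declaration is only metadata, the hypothetical declaration compiles and does no harm until something actually calls it.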
CharSet: When it comes to character sets, not all versions of Windows are created equal. The Windows 9x family of products lacks significant Unicode support, while the Windows NT and Windows CE families use Unicode natively. The CLR, which runs on all of these operating systems, uses Unicode for the internal representation of String and Char data. But don't worry: when Windows 9x API functions are called, the CLR automatically makes the necessary conversion from Unicode to ANSI.
If your DLL function does not deal with text in any way, you can ignore the CharSet property of DllImportAttribute. However, when Char or String data is part of the equation, you should set the CharSet property to CharSet.Auto. This lets the CLR use the appropriate character set for the host OS. If you do not set the CharSet property explicitly, it defaults to CharSet.Ansi. This default is unfortunate because it negatively affects the performance of text parameter marshaling for interop calls made on Windows 2000, Windows XP, and Windows NT.
The only time you should explicitly select CharSet.Ansi or CharSet.Unicode rather than CharSet.Auto is when you are explicitly naming an exported function that is specific to one of the two flavors of Win32 OS. The ReadDirectoryChangesW API function is an example. It exists only in Windows NT-based operating systems and supports only Unicode. In this case, you should explicitly use CharSet.Unicode.
Sometimes it is not obvious whether a Windows API function has a character set affinity. A surefire way to find out is to check the C-language header file for the function in the Platform SDK. (If you are not sure which header file to look in, the Platform SDK documentation lists the header file for each API function.) If you find that the API function is really defined as a macro that maps to a function name ending in A or W, then the character set matters for the function you are trying to call. One Windows API function that you may be surprised to learn has A and W versions is the GetMessage API declared in WinUser.h.
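To make the CharSet choices concrete, here is a small sketch using the Win32 MessageBox function, which (like GetMessage) is a macro over A and W exports. MessageBox is my own choice of example and the class name is illustrative. ExactSpelling is a DllImportAttribute field that stops the CLR from probing for A or W suffixed names.

```csharp
using System;
using System.Runtime.InteropServices;

public sealed class MessageBoxInterop
{
    private MessageBoxInterop() {}

    // CharSet.Auto: the CLR binds to MessageBoxA or MessageBoxW as
    // appropriate for the host OS and marshals the strings to match.
    [DllImport("user32.dll", CharSet = CharSet.Auto, SetLastError = true)]
    public static extern Int32 MessageBox(
        IntPtr owner, String text, String caption, UInt32 type);

    // Explicitly naming the Unicode-only export, so CharSet.Unicode is
    // stated explicitly and no name probing is wanted.
    [DllImport("user32.dll", CharSet = CharSet.Unicode, ExactSpelling = true,
        SetLastError = true)]
    public static extern Int32 MessageBoxW(
        IntPtr owner, String text, String caption, UInt32 type);
}
```

A call such as MessageBoxInterop.MessageBox(IntPtr.Zero, "Hello", "Interop", 0) would display a simple message box with an OK button on Windows.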
SetLastError: Error handling is an important but frequently neglected aspect of programming. When you P/Invoke, you face an additional challenge: dealing with the differences between Windows API error handling and exceptions in managed code. I have a few suggestions for you.
If you are using P/Invoke to call a Windows API function for which you would use GetLastError to find extended error information, set the SetLastError property to true in the DllImportAttribute of your extern method. This applies to the majority of such extern methods.
This causes the CLR to cache the error set by the API function after each call to the extern method. Then, in your wrapper method, you can retrieve the cached error value by calling the Marshal.GetLastWin32Error method defined on the System.Runtime.InteropServices.Marshal type in the class library. My advice is to check for the error values you expect from the API function and throw a sensible exception for those values. For all other failures (including failures you simply did not expect), throw the Win32Exception defined in the System.ComponentModel namespace, passing it the value returned by Marshal.GetLastWin32Error. If you look back at the code in Figure 1, you will see that I took this approach in the public wrapper of the extern MessageBeep method.
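The advice above might be sketched as follows, using the Win32 DeleteFile function as a stand-in. DeleteFile, the class name, and the choice of which errors count as "expected" are all assumptions made for this illustration; the two error codes are the standard ERROR_FILE_NOT_FOUND and ERROR_ACCESS_DENIED values.

```csharp
using System;
using System.ComponentModel;
using System.IO;
using System.Runtime.InteropServices;

public sealed class NativeFile
{
    private const Int32 ErrorFileNotFound = 2; // ERROR_FILE_NOT_FOUND
    private const Int32 ErrorAccessDenied = 5; // ERROR_ACCESS_DENIED

    private NativeFile() {}

    public static void Delete(String path)
    {
        if (!DeleteFile(path))
        {
            // Read the cached error right away, before another
            // P/Invoke call can replace it.
            Int32 error = Marshal.GetLastWin32Error();
            throw CreateException(error, path);
        }
    }

    // Expected errors become specific exceptions; everything else
    // becomes a Win32Exception carrying the raw error code.
    public static Exception CreateException(Int32 error, String path)
    {
        switch (error)
        {
            case ErrorFileNotFound:
                return new FileNotFoundException("File not found.", path);
            case ErrorAccessDenied:
                return new UnauthorizedAccessException("Access denied: " + path);
            default:
                return new Win32Exception(error);
        }
    }

    [DllImport("kernel32.dll", CharSet = CharSet.Auto, SetLastError = true)]
    private static extern Boolean DeleteFile(String path);
}
```

Keeping the error-to-exception mapping in its own method makes it easy to extend as you learn which failures your application actually encounters.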
CallingConvention: The last, and probably least important, DllImportAttribute property I will cover here is CallingConvention. This property lets you tell the CLR which function calling convention should be used for parameters on the stack. The default value, CallingConvention.Winapi, is your best bet and works in most cases. However, if a call does not work, you can check the declaring header file in the Platform SDK to see whether the API function you are calling is an odd one that deviates from the standard calling convention.
In general, the calling convention of a native function (such as a Windows API function or a C-runtime DLL function) describes how parameters are pushed onto and cleaned off the thread's stack. Most Windows API functions push the last parameter of the function onto the stack first, and the called function is then responsible for cleaning up the stack. In contrast, many C-runtime DLL functions use the cdecl convention, which pushes parameters in the same right-to-left order but leaves stack cleanup to the caller.
Fortunately, to make P/Invoke calls work, you only need to get the plumbing to understand the calling convention. Generally, the default value CallingConvention.Winapi is the best choice. Then, for C-runtime DLL functions and a very few other functions, you may need to change the convention to CallingConvention.Cdecl.
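As a small, hedged example of the Cdecl case, the sketch below declares the C runtime's abs function. It assumes the function is exported from msvcrt.dll, the C runtime DLL that ships with Windows; the class and managed method names are illustrative.

```csharp
using System;
using System.Runtime.InteropServices;

public sealed class CRuntime
{
    private CRuntime() {}

    // C-runtime functions use the cdecl convention (the caller cleans up
    // the stack), so the default Winapi convention is overridden here.
    [DllImport("msvcrt.dll", EntryPoint = "abs",
        CallingConvention = CallingConvention.Cdecl)]
    public static extern Int32 Abs(Int32 value);
}
```

On Windows, CRuntime.Abs(-5) would return 5. The managed Math.Abs method is of course the better choice in real code; the point here is only the CallingConvention setting.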
Data Marshaling
Data marshaling is a challenging aspect of P/Invoke. When passing data between managed and unmanaged code, the CLR follows a number of rules that few developers encounter often enough to memorize. Unless you are a class library developer, you usually do not need to master the details. Still, to get the most out of P/Invoke on the CLR, even application developers who use interop only occasionally need to understand some fundamentals of data marshaling.
In the rest of this month's column, I will discuss the data marshaling of simple numeric and string data. I will start with the basics of marshaling numeric data and then work up through simple pointer marshaling and string marshaling.
Marshaling Numeric and Logical Scalars
Most of the Windows OS is written in C. As a result, the data types used by the Windows API are either C types or C types that have been relabeled through a type definition or macro definition. Let's look at data marshaling without pointers first. To keep things simple, I will focus on numbers and Boolean values.
When passing parameters by value to a Windows API function, you need to know the answers to the following questions:
• Is the data fundamentally integral or floating-point?
• If the data is integral, is it signed or unsigned?
• If the data is integral, what is its size in bits?
• If the data is floating-point, is it single-precision or double-precision?
Sometimes the answers are obvious, and sometimes they are not. The Windows API redefines the basic C data types in a variety of ways. Figure 2 lists some common C and Win32 data types along with their specifications, as well as a common language runtime type with a matching specification.
As a rule, your code will work as long as you select a CLR type whose specification matches that of the Win32 type of the parameter. There are some special cases, however. For example, the BOOL type defined in the Windows API is a signed 32-bit integer, yet BOOL is used to indicate a Boolean value of true or false. Although you could get away with marshaling a BOOL parameter as a System.Int32 value, you get a more appropriate mapping if you use the System.Boolean type. The mapping of character types is similar to that of BOOL, in that a specific CLR type (System.Char) carries the meaning of a character.
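One way to check a candidate CLR type against a Win32 type's specification is to ask the marshaler how many bytes the type occupies on the unmanaged side. The helper below is only an illustrative sketch (the class and method names are invented); it relies on Marshal.SizeOf, which reports the marshaled size rather than the managed size.

```csharp
using System;
using System.Runtime.InteropServices;

public sealed class MarshalSizes
{
    private MarshalSizes() {}

    // Returns the number of bytes the type occupies once marshaled
    // to unmanaged code.
    public static Int32 UnmanagedSize(Type type)
    {
        return Marshal.SizeOf(type);
    }
}
```

For example, UnmanagedSize(typeof(UInt32)) reports 4 bytes, matching a DWORD, and UnmanagedSize(typeof(Boolean)) also reports 4 bytes, which reflects the default marshaling of System.Boolean as a 32-bit Win32 BOOL.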
With this information in hand, it may be helpful to step through an example. Sticking with the beep theme, let's try the low-level Beep function in kernel32.dll, which makes a beep through the computer's speaker. The Platform SDK documentation for this function is in the Beep topic. The native API is documented as follows:
BOOL Beep(
  DWORD dwFreq,      // frequency
  DWORD dwDuration   // duration in milliseconds
);
In terms of parameter marshaling, your job is to work out which CLR data types are compatible with the DWORD and BOOL data types used by the Beep API function. Reviewing the chart in Figure 2, you will see that DWORD is a 32-bit unsigned integer, just like the CLR type System.UInt32. This means that you can use UInt32 values for the two parameters to Beep. The BOOL return value is an interesting case because the chart tells us that, in Win32, BOOL is a 32-bit signed integer. You could therefore use a System.Int32 value as the return value from Beep. However, the CLR also defines the System.Boolean type as having the semantic meaning of a Boolean value, so that should be used instead. By default, the CLR marshals a System.Boolean value as a 32-bit signed integer. The extern method definition shown here is the resulting P/Invoke method for Beep:
[DllImport("kernel32.dll", SetLastError=true)]
static extern Boolean Beep(UInt32 frequency, UInt32 duration);
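To illustrate the two possible mappings for the BOOL return value, the sketch below declares Beep twice: once with the recommended Boolean return type and once with a raw Int32. The class name and the BeepRaw method name are illustrative. The [return: MarshalAs(UnmanagedType.Bool)] attribute merely spells out what the marshaler does by default.

```csharp
using System;
using System.Runtime.InteropServices;

public sealed class Speaker
{
    private Speaker() {}

    // Recommended mapping: BOOL becomes System.Boolean. The MarshalAs
    // attribute restates the default (a 4-byte Win32 BOOL).
    [DllImport("kernel32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    public static extern Boolean Beep(UInt32 frequency, UInt32 duration);

    // Also workable, but less expressive: BOOL as the 32-bit signed
    // integer it really is. Nonzero means success.
    [DllImport("kernel32.dll", EntryPoint = "Beep", SetLastError = true)]
    public static extern Int32 BeepRaw(UInt32 frequency, UInt32 duration);
}
```

On Windows, Speaker.Beep(440, 250) would request a 440 Hz tone for a quarter of a second. With BeepRaw, the caller has to remember to compare the result against zero, which is exactly the kind of Win32 detail the Boolean mapping hides.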
Parameters That Are Pointers
Many Windows API functions take a pointer as one or more of their parameters. Pointers increase the complexity of data marshaling because they add a level of indirection. Without pointers, you pass data by value on the thread's stack. With pointers, you pass data by reference by pushing the memory address of the data onto the thread's stack. The function then accesses the data indirectly through the memory address. Managed code offers several ways to express this extra level of indirection.
In C#, if a method parameter is defined as ref or out, the data is passed by reference rather than by value. This is true even when you are not using interop and are simply calling from one managed method to another. For example, if a System.Int32 parameter is passed by ref, the address of the data is passed on the thread's stack rather than the integer value itself. Here is an example of a method defined to receive an integer by reference:
void FlipInt32(ref Int32 num) {
   num = -num;
}
Here, the FlipInt32 method takes the address of an Int32 value, accesses the data, negates it, and assigns the negated value to the original variable. In the following code, the FlipInt32 method changes the value of the calling program's variable x from 10 to -10:
Int32 x = 10;
FlipInt32(ref x);
You can reuse this ability in managed code to pass pointers to unmanaged code. For example, the FileEncryptionStatus API function returns the encryption status of a file as a 32-bit unsigned bit mask. The API is documented as follows:
BOOL FileEncryptionStatus(
  LPCTSTR lpFileName,  // file name
  LPDWORD lpStatus     // encryption status
);
Notice that the function does not return the status through its return value; instead, it returns a Boolean value indicating whether the call succeeded. On success, the actual status value is returned through the second parameter. This works by having the calling program pass the function a pointer to a DWORD variable, and the API function fills the pointed-to memory location with the status value. The following code snippet shows a possible extern method definition for calling the unmanaged FileEncryptionStatus function:
[DllImport("advapi32.dll", CharSet=CharSet.Auto)]
static extern Boolean FileEncryptionStatus(String filename, out UInt32 status);
This definition uses the out keyword to indicate a by-ref parameter for the UInt32 status value. I could have chosen the ref keyword here as well; in fact, both produce the same machine code at run time. The out keyword is simply a specialization of a by-ref parameter that tells the C# compiler that the data is passed only out of the called function. In contrast, if the ref keyword is used, the compiler assumes that data may flow both into and out of the called function.
Another nice aspect of out and ref parameters in managed code is that the variable whose address is passed as the by-ref parameter can be a local variable on the thread's stack, a field of a class or structure, or an element of an array of the appropriate data type. This flexibility on the caller's side makes by-ref parameters a good starting point for marshaling buffer pointers as well as pointers to single numeric values. Only when I find that ref or out parameters do not meet my needs do I consider marshaling a pointer as a more complex CLR type (such as a class or array object).
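The caller-side flexibility of by-ref parameters can be sketched with a wrapper around FileEncryptionStatus. The class and method names are illustrative, and I have added SetLastError = true on the assumption that the function reports failures through GetLastError. One method passes the address of a local variable, the other the address of an array element.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

public sealed class Encryption
{
    private Encryption() {}

    public static UInt32 GetStatus(String fileName)
    {
        UInt32 status; // a local variable on the stack receives the DWORD
        if (!FileEncryptionStatus(fileName, out status))
        {
            throw new Win32Exception(Marshal.GetLastWin32Error());
        }
        return status;
    }

    public static UInt32[] GetStatuses(String[] fileNames)
    {
        UInt32[] statuses = new UInt32[fileNames.Length];
        for (Int32 i = 0; i < fileNames.Length; i++)
        {
            // Here the by-ref argument is an element of an array.
            if (!FileEncryptionStatus(fileNames[i], out statuses[i]))
            {
                throw new Win32Exception(Marshal.GetLastWin32Error());
            }
        }
        return statuses;
    }

    [DllImport("advapi32.dll", CharSet = CharSet.Auto, SetLastError = true)]
    private static extern Boolean FileEncryptionStatus(
        String fileName, out UInt32 status);
}
```

In both cases the extern declaration is identical; only the call site decides where the DWORD ends up.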
If you are not familiar with C syntax or with calling Windows API functions, it can sometimes be difficult to know whether a method parameter calls for a pointer. A common indicator is a parameter type that begins with the letter P or the letters LP, such as LPDWORD or PINT. In these two examples, LP and P indicate that the parameter is a pointer, and the data types pointed to are DWORD and INT, respectively. In some cases, however, an API function parameter is declared as a pointer directly, using the asterisk (*) of C syntax. The following code snippet shows an example:
void TakesAPointer(DWORD* pNum);
You can see that the sole parameter of the preceding function is a pointer to a DWORD variable.
When marshaling pointers through P/Invoke, ref and out are used only with value types in managed code. You can tell that a parameter is a value type when its CLR type is defined using the struct keyword. Out and ref are used to marshal pointers to these data types because a value type variable normally is the object or data itself, and managed code has no reference to a value type. In contrast, the ref and out keywords are not necessary when marshaling reference type objects, because the variable already is a reference to the object.
If you are not familiar with the difference between reference types and value types, you can find more information in the .NET column of MSDN Magazine. Most CLR types are reference types. However, with the exception of System.String and System.Object, all primitive types (such as System.Int32 and System.Boolean) are value types.
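The same by-ref technique applies to value types you define yourself with the struct keyword. As a hedged sketch that goes slightly beyond the simple numeric cases covered in this column, the code below marshals a pointer to a SYSTEMTIME structure for the Win32 GetSystemTime function. The managed type and member names are my own; the field order follows the documented layout of eight 16-bit values.

```csharp
using System;
using System.Runtime.InteropServices;

// A value type (struct) laid out like the Win32 SYSTEMTIME structure.
[StructLayout(LayoutKind.Sequential)]
public struct SystemTime
{
    public UInt16 Year;
    public UInt16 Month;
    public UInt16 DayOfWeek;
    public UInt16 Day;
    public UInt16 Hour;
    public UInt16 Minute;
    public UInt16 Second;
    public UInt16 Milliseconds;
}

public sealed class Clock
{
    private Clock() {}

    public static SystemTime GetUtcNow()
    {
        SystemTime time;
        GetSystemTime(out time); // the address of the local struct is passed
        return time;
    }

    // LPSYSTEMTIME is a pointer to the structure, so the value type
    // is passed with the out keyword.
    [DllImport("kernel32.dll")]
    private static extern void GetSystemTime(out SystemTime time);
}
```

Had SystemTime been declared as a class instead, the variable would already be a reference and the out keyword would not be needed for this purpose.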
Marshaling Opaque Pointers: A Special Case
Sometimes in the Windows API, a function passes or returns a pointer that is opaque, meaning that the value is technically a pointer but your code does not use it directly. Instead, your code hands the pointer back to Windows for subsequent reuse.
A very common example is the notion of a handle. In Windows, internal data structures (ranging from files to buttons on the screen) are represented to application code as handles. A handle is really just an opaque pointer or a pointer-width value that the application uses to refer to an internal OS construct.
Less frequently, API functions also define opaque pointers as PVOID or LPVOID types. In Windows API definitions, these types mean that the pointer has no type.
When an opaque pointer is returned to your application (or expected from your application), you should marshal the parameter or return value as a special CLR type: System.IntPtr. When you use the IntPtr type, you typically do not use out or ref parameters, because an IntPtr is meant to hold the pointer directly. However, if you are marshaling a pointer to a pointer, a by-ref parameter of type IntPtr is appropriate.
The System.IntPtr type has a special property in the CLR type system. Unlike the other base types in the system, IntPtr does not have a fixed size. Instead, its size at run time depends on the natural pointer size of the underlying operating system. This means that in 32-bit Windows an IntPtr variable is 32 bits wide, while in 64-bit Windows the code produced by the just-in-time compiler treats IntPtr values as 64-bit values. This automatic sizing is very useful when marshaling opaque pointers between managed and unmanaged code.
Remember that any API function that returns or accepts a handle is really working with an opaque pointer. Your code should marshal Windows handles as System.IntPtr values.
You can cast IntPtr values to and from 32-bit and 64-bit integers in managed code. However, because pointers used with Windows API functions are meant to be opaque, you should not need to do anything with them other than store them and pass them to extern methods. The two exceptions to this "store and pass only" rule are when you need to pass a null pointer value to an extern method and when you need to compare an IntPtr value with null. To do this, rather than casting zero to System.IntPtr, you should use the IntPtr.Zero static public field on the IntPtr type to obtain the null value for comparison or assignment.
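The "store and pass only" rule and its null exceptions might look like this in practice. The sketch uses the Win32 GetModuleHandle function, which returns a module handle or NULL on failure; the function choice and the class and method names are assumptions made for illustration.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

public sealed class Modules
{
    private Modules() {}

    // Returns the opaque handle of a module already loaded in this process.
    public static IntPtr GetHandle(String moduleName)
    {
        IntPtr handle = GetModuleHandle(moduleName);
        // Compare against IntPtr.Zero rather than casting 0 to IntPtr.
        if (handle == IntPtr.Zero)
        {
            throw new Win32Exception(Marshal.GetLastWin32Error());
        }
        return handle; // the caller only stores it and passes it back to Windows
    }

    public static Boolean IsNull(IntPtr pointer)
    {
        return pointer == IntPtr.Zero;
    }

    // The handle return value is marshaled as IntPtr, with no ref or out.
    [DllImport("kernel32.dll", CharSet = CharSet.Auto, SetLastError = true)]
    private static extern IntPtr GetModuleHandle(String moduleName);
}
```

Because IntPtr sizes itself to the platform, the same declaration works unchanged whether the process runs as 32-bit or 64-bit.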
Marshaling Text
Working with text data is common in programming. Text throws some wrinkles into interop for two reasons. First, the underlying operating system may use Unicode to represent strings, or it may use ANSI. In some rare cases, such as the MultiByteToWideChar API function, two parameters of the same function disagree in character set.
The second reason that working with text requires special understanding when you P/Invoke is that C and the CLR deal with text differently. In C, a string is really just an array of character values, typically terminated by a null. Most Windows API functions deal with strings in these terms: as an array of character values in the ANSI case, or as an array of wide character values in the Unicode case.
Fortunately, the CLR was designed to be quite flexible when marshaling text, so you can get the job done regardless of what the Windows API function expects from your application. Here are the main considerations to keep in mind:
• Is your application passing text data to the API function, is the API function passing string data back to your application, or both?
• Which managed type should your extern method use?
• What unmanaged string format does the API function expect?
Let's address the last question first. Most Windows API functions take LPTSTR or LPCTSTR values. From the function's point of view, these are modifiable and nonmodifiable buffers, respectively, that contain null-terminated arrays of characters. The "C" stands for constant and means that information is not passed out of the function through that parameter. The "T" in LPTSTR indicates that the parameter can be Unicode or ANSI, depending on the character set you choose and the character set of the underlying operating system. Because most string parameters in the Windows API are one of these two types, the CLR default works as long as you select CharSet.Auto in the DllImportAttribute.
However, some API functions or custom DLL functions represent strings differently. If you run into such a function, you can decorate the string parameter of your extern method with MarshalAsAttribute and specify a string format other than the default LPTSTR. For more information about MarshalAsAttribute, see the MarshalAsAttribute Class topic in the Platform SDK documentation.
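As a brief sketch of MarshalAsAttribute on a string parameter, the declaration below calls the Win32 lstrlenW function, which always expects a wide-character string regardless of the host OS. The function choice and the class and method names are illustrative.

```csharp
using System;
using System.Runtime.InteropServices;

public sealed class WideText
{
    private WideText() {}

    // lstrlenW always takes a wide-character (Unicode) string, so the
    // parameter is explicitly marshaled as LPWStr rather than relying on
    // the character set chosen for the method as a whole.
    [DllImport("kernel32.dll", EntryPoint = "lstrlenW", ExactSpelling = true)]
    public static extern Int32 GetLength(
        [MarshalAs(UnmanagedType.LPWStr)] String text);
}
```

On Windows, WideText.GetLength("hello") would return 5, the number of characters before the terminating null.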
Now let's look at the direction in which string information passes between your code and the unmanaged function. There are two ways to find out which direction the information flows when dealing with strings. The first and most reliable way is to understand the purpose of the parameter in the first place. For example, if you are calling a function with a name like CreateMutex and it takes a string, you can imagine that the string information passes from your application to the API function. Meanwhile, if you are calling GetUserName, the name of the function suggests that string information passes from the function to your application.
In addition to this reasoning approach, the second way to find out which direction information flows is to look for the letter "C" in the API parameter type. For example, the first parameter of the GetUserName API function is defined as LPTSTR, which stands for a long pointer to a Unicode or ANSI string buffer. The name parameter of CreateMutex, however, is typed as LPCTSTR. Notice that it is the same type definition, but with the addition of the letter "C" to indicate that the buffer is constant and will not be written to by the API function.
Once you have established whether a text parameter is input only or input/output, you can determine which CLR type to use for the parameter. Here are the rules. If the string parameter is input only, use the System.String type. Strings are immutable in managed code, which makes them well suited for buffers that the native API function will not change.
If the string parameter can be input and/or output, use the System.StringBuilder type. StringBuilder is a very useful class library type that helps you build strings efficiently, and it happens to be great for passing buffers to native functions that fill them with string data on your behalf. Once the function has returned, you only need to call ToString on the StringBuilder object to get a String object.
The GetShortPathName API function is good for showing when to use String and when to use StringBuilder, because it takes only three parameters: an input string, an output string, and a parameter that specifies the length, in characters, of the output buffer.
Figure 3 shows the commented documentation for the unmanaged GetShortPathName function, which indicates both an input and an output string parameter. This leads to the managed extern method definition, also shown in Figure 3. Notice that the first parameter is marshaled as System.String because it is an input-only parameter. The second parameter represents an output buffer, so System.StringBuilder is used.
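Figure 3 is not reproduced in this text, so the following is only a sketch of what such a declaration and a wrapper around it might look like. The wrapper name, the initial buffer size of 260 characters, and the retry logic are my own assumptions; the retry relies on the documented behavior that GetShortPathName returns the required buffer size when the buffer is too small, and zero on failure.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;
using System.Text;

public sealed class ShortPaths
{
    private ShortPaths() {}

    public static String GetShortPath(String longPath)
    {
        StringBuilder buffer = new StringBuilder(260);
        UInt32 length = GetShortPathName(longPath, buffer, (UInt32)buffer.Capacity);
        if (length > (UInt32)buffer.Capacity)
        {
            // Buffer too small: the return value is the size required.
            buffer = new StringBuilder((Int32)length);
            length = GetShortPathName(longPath, buffer, (UInt32)buffer.Capacity);
        }
        if (length == 0)
        {
            throw new Win32Exception(Marshal.GetLastWin32Error());
        }
        return buffer.ToString(); // copy the filled buffer into a String
    }

    // Input-only text is a String; the output buffer is a StringBuilder.
    [DllImport("kernel32.dll", CharSet = CharSet.Auto, SetLastError = true)]
    private static extern UInt32 GetShortPathName(
        String longPath, StringBuilder shortPath, UInt32 bufferSize);
}
```

Callers simply pass a path and receive a String; the StringBuilder never leaves the wrapper.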
Summary
The P/Invoke features covered in this month's column are sufficient to call many of the API functions in Windows. However, if your interop needs are significant, you will eventually find yourself marshaling complex data structures, and you may even need to access memory directly through pointers in managed code. In fact, interop with native code can be a veritable Pandora's box of details and low-level bits. The CLR, C#, and managed C++ offer many features to help; perhaps I will cover advanced P/Invoke topics in a later installment of this column.
Meanwhile, whenever you find that the .NET Framework class library will not play a sound for you or perform some other bit of magic, you know how to fall back on the good old Windows API for help.