P/Invoke Fragments -- Processing Strings

Source: Internet
Author: User

A string "style" is the form in which a string exists in memory: how it is stored, how much memory it occupies, and in what order its bytes are laid out. String styles generally differ between programming languages and platforms.

1. The string style in .NET

Managed code:

    string strin = "in string";
    Console.WriteLine(strin); // set a breakpoint here
    Console.Read();

While debugging, look up the address of strin and inspect the memory:

    0x01363360  b8 97 0f 79 0a 00 00 00 09 00 00 00 69 00 6e 00  ...y........i.n.
    0x01363370  20 00 73 00 74 00 72 00 69 00 6e 00 67 00 00 00   .s.t.r.i.n.g...

One question I still have: is this the storage style of every string in .NET?

2. C style

Managed code:

    [DllImport(@"C:\Documents and Settings\Administrator\Desktop\pInvoke\CPPDLL\Debug\CPPDLL.dll")]
    private static extern void Putstring(string s);

    static void Main(string[] args)
    {
        string strin = "in string";
        Console.WriteLine(strin);
        Putstring(strin); // call the unmanaged function
        Console.Read();
    }

Unmanaged function:

    extern "C" __declspec(dllexport) void Putstring(char* str)
    {
        printf("%s\n", str); // breakpoint
    }

At the breakpoint, the memory that str points to looks like this:

    0x0012F2EC  69 6e 20 73 74 72 69 6e 67 00 00 00 01 00 00 00  in string.......

This is a C-style string in memory. The string ends with 00, so it occupies one more element than its character count; the extra element is the '\0' terminator. A C-style string is an array of ASCII or Unicode characters terminated by 0.

3. Visual Basic and Java style

● A Visual Basic string is a character array with a prefix that indicates its length (classic Visual Basic uses BSTR internally).
● A Java string is a sequence of Unicode (UTF-16) characters. It stores its length and is not terminated by 0.

4. BSTR style

BSTR is short for "Basic STRing", a standard string data type that Microsoft defined for COM/OLE. COM needed one common string type that different programming languages could all map to. In C++, that type is BSTR.
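The difference between the narrow C-style layout and the two-byte layout seen in the .NET dump can be reproduced without a debugger. The following is a minimal sketch in portable C++; `hex_dump`, `narrow_bytes`, and `wide_bytes` are hypothetical helper names, and `char16_t` stands in for the 16-bit characters that .NET uses. It assumes an ASCII-compatible execution character set.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Format `size` bytes starting at `data` as space-separated lowercase hex.
std::string hex_dump(const void* data, std::size_t size) {
    assert(data != nullptr || size == 0);
    const unsigned char* bytes = static_cast<const unsigned char*>(data);
    std::string out;
    char cell[4];
    for (std::size_t i = 0; i < size; ++i) {
        std::snprintf(cell, sizeof cell, "%02x", static_cast<unsigned>(bytes[i]));
        if (i != 0) out += ' ';
        out += cell;
    }
    return out;
}

// Bytes of the narrow form (one byte per character), terminator included.
std::string narrow_bytes() {
    const char text[] = "in string";
    return hex_dump(text, sizeof text);
}

// Bytes of the UTF-16 form (two bytes per character), terminator included.
std::string wide_bytes() {
    const char16_t text[] = u"in string";
    return hex_dump(text, sizeof text);
}
```

`narrow_bytes()` yields the same ten bytes as the C-style dump above (nine characters plus the terminator). On a little-endian machine, `wide_bytes()` starts with `69 00 6e 00`, which matches the character data in the .NET dump.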
The standard BSTR is an OLECHAR array with a length prefix and a null terminator:

    0x0012F2EC  12 00 00 00 69 00 6e 00 20 00 73 00 74 00 72 00  ....i.n. .s.t.r.
    0x0012F2FC  69 00 6e 00 67 00 00 00 22 00 00 00 00 00 00 00  i.n.g...".......

The prefix 12 00 00 00 is the length in bytes: 18, that is, nine two-byte characters.

When marshaling a string, you need to consider the character width, the string style, the direction (input or output), and who releases the memory.

When the string is an input parameter

Unmanaged function:

    // the string is used as an input parameter
    extern "C" __declspec(dllexport) void Putstring(char* str)
    {
        printf("%s\n", str);
    }

The function is simple: it prints the string to the console.

Managed code:

    [DllImport(@"C:\Documents and Settings\Administrator\Desktop\pInvoke\CPPDLL\Debug\CPPDLL.dll")]
    private static extern void Putstring(string s);

    static void Main(string[] args)
    {
        string strin = "in string";
        Putstring(strin);
        Console.Read();
    }

Because the CLR marshals the string automatically in its default mode, this appears to be equivalent to the following explicit declaration:

    // [In] is the default direction; AnsiBStr is platform-dependent
    private static extern void Putstring([In][MarshalAs(UnmanagedType.AnsiBStr)] string s);

Question 1: I do not know why this differs from the default marshaling (LPTStr) that MSDN specifies. MSDN lists the marshaling options available when a string is passed as a platform invoke method parameter.

A string passed as a parameter or returned as a value is processed in the following steps.

Step 1: string strin = "in string"; allocates managed memory and stores the string "in string" there. Its location and layout in memory are:

    0x01DEBF60  48 0d 87 61 0a 00 00 00 09 00 00 00 69 00 6e 00  H..a........i.n.
    0x01DEBF70  20 00 73 00 74 00 72 00 69 00 6e 00 67 00 00 00   .s.t.r.i.n.g...

The layout is the same as in the earlier .NET dump. I do not know what 48 0d 87 61 means.
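To make the BSTR layout concrete, here is a minimal sketch that builds a byte image with the same shape as the BSTR dump: a 4-byte length prefix, the two-byte characters, and a two-byte terminator. It is an illustration only. `make_bstr_image` and `kBstrCharOffset` are hypothetical names, and real BSTRs must be allocated with the COM string functions (`SysAllocString` and relatives), not assembled by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// A BSTR pointer points at the first character, just past the 4-byte prefix.
constexpr std::size_t kBstrCharOffset = 4;

// Build a byte image laid out like a BSTR: a 4-byte little-endian byte count,
// the UTF-16LE characters, then a 2-byte null terminator.
std::vector<std::uint8_t> make_bstr_image(const std::u16string& text) {
    assert(text.size() <= 0x7FFFFFFFu / 2);  // the byte count must fit the prefix
    const std::uint32_t byte_len = static_cast<std::uint32_t>(text.size() * 2);
    std::vector<std::uint8_t> image;
    image.reserve(kBstrCharOffset + byte_len + 2);
    // Length prefix: counts bytes, not characters, and excludes the terminator.
    for (int shift = 0; shift < 32; shift += 8)
        image.push_back(static_cast<std::uint8_t>((byte_len >> shift) & 0xFF));
    // Character data, low byte first.
    for (char16_t unit : text) {
        image.push_back(static_cast<std::uint8_t>(unit & 0xFF));
        image.push_back(static_cast<std::uint8_t>(unit >> 8));
    }
    // Two-byte null terminator.
    image.push_back(0);
    image.push_back(0);
    return image;
}
```

For `u"in string"` the image begins with `12 00 00 00 69 00 6e 00`, the same bytes as the first row of the BSTR dump.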
Other strings begin with the same data, which suggests that it identifies the type (most likely it is the String object's method-table pointer). The second DWORD, 0a 00 00 00, is here one more than the character count, and the third DWORD, 09 00 00 00, is the number of characters.

Step 2: Putstring(strin); enters the unmanaged function. When printf("%s\n", str); executes inside the function, str is a memory pointer to address 0x01debf6c, where the passed string is stored. During the call, the marshaler allocates a block in unmanaged memory and writes the converted string to it. The conversion depends on the parameter type of the unmanaged function; here the string is converted to the ANSI C style:

    69 6e 20 73 74 72 69 6e 67 00 00 00 00 00 00 00  in string.......

Step 3: Console.Read(); control returns from unmanaged code to managed code. A key action takes place here: the memory at unmanaged address 0x01debf6c is released, and what is stored there afterward is completely different from the original characters.

In .NET every character occupies two bytes, that is, characters are wide. The unmanaged function here uses narrow (ANSI) characters, and the marshaler performs the conversion by default.

What happens if the unmanaged function takes a wchar_t* parameter instead? The signature of the unmanaged function changes to:

    void Putstring(wchar_t* str)

The Immediate window then shows str at 0x002ff04c as a few unrelated wide characters ending in "g". Because narrow characters are still passed, the characters copied to unmanaged memory are packed one byte each. When wprintf processes them, it treats the two ASCII codes of "in" as a single Chinese character, and nothing is written to the console. Printing nothing is only one special case; in other cases the output consists of unrecognizable characters.

If you change the declaration in the managed code as follows, the string is printed normally and the memory is also released:

    private static extern void Putstring([MarshalAs(UnmanagedType.BStr)] string s);

In the author's observation, the marshaler's default for an input string is UnmanagedType.AnsiBStr. During the call, a block is allocated in unmanaged memory and the string is copied into it with the appropriate conversion. After the unmanaged function returns, the block is released automatically, and the string is not copied back from unmanaged memory to managed memory. In effect, the reference is passed as if it were a value.

This memory is in fact allocated with the CoTaskMemAlloc function. When the call returns, the marshaler calls CoTaskMemFree to release it. Memory allocated in any other way cannot be released by the marshaler. Other allocation methods include malloc and new, but the marshaler cannot automatically release unmanaged memory allocated with them. To test this conjecture, modify the unmanaged function as follows:

    // the string is used as an input parameter
    extern "C" __declspec(dllexport) void Putstring(char* str)
    {
        char* pNew = (char*)malloc(10); // modification: never freed
        printf("%s\n", str);
    }

You will find that the memory pNew points to is never released and cannot be released from managed code. The only way to release it is to call another unmanaged function that frees it. If you allocate the memory with CoTaskMemAlloc instead, you can release it from managed code with the Marshal.FreeCoTaskMem method.

When the string is an output parameter

The goal is to pass a string to an unmanaged function, let the function process it, and see the result in managed code without using a return value. The string must therefore act as an output parameter. By default, a string is marshaled as input ([In]) and copied to unmanaged memory, so whatever the unmanaged function does to the string is actually done to the copy in unmanaged memory.
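The ownership rule described above (memory obtained with malloc has to be released by the unmanaged side, because the marshaler only knows CoTaskMemFree) is usually handled by exporting a matching release function. The following is a minimal, portable sketch of that pairing. `MakeGreeting`, `ReleaseString`, and `DemoRoundTrip` are hypothetical names that do not appear in the original DLL, and on Windows the two exports would additionally carry `__declspec(dllexport)`.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical export: returns a heap copy of a fixed string.
// The caller owns the result and must hand it back to ReleaseString.
extern "C" char* MakeGreeting() {
    const char text[] = "abc";
    char* copy = static_cast<char*>(std::malloc(sizeof text));
    if (copy != nullptr) std::memcpy(copy, text, sizeof text);
    return copy;
}

// Hypothetical export: frees the memory with the same allocator that made it.
extern "C" void ReleaseString(char* text) {
    std::free(text);  // free(nullptr) is a no-op
}

// Stand-in for the managed caller: obtain the pointer, read it, give it back.
bool DemoRoundTrip() {
    char* text = MakeGreeting();
    if (text == nullptr) return false;
    const bool ok = std::strcmp(text, "abc") == 0;
    assert(ok);
    ReleaseString(text);
    return ok;
}
```

On the managed side, such a pair would typically be declared with IntPtr: the string is read with Marshal.PtrToStringAnsi and the pointer is then passed to the release function. That way neither side frees memory that the other side's allocator produced.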
How can the unmanaged function modify the string so that the change is visible in managed code?

1. Use StringBuilder. It is often said that what gets passed to the unmanaged function is the StringBuilder's buffer. I also used to think that it was a pointer to the string, that unmanaged and managed code worked on the same memory, and that this was why the result of the unmanaged operation showed up in managed code. The memory shows otherwise. When a StringBuilder is used as a parameter, its default direction attribute is [In, Out]. In addition, two strict requirements must be met before changes made by the unmanaged function are reflected in managed code. First, the marshaling must be set explicitly with CharSet = CharSet.Unicode; otherwise the string is not copied back completely. Second, the parameter of the unmanaged function must be a Unicode string. The memory operations are the same as described above: the string is copied to unmanaged memory and back again. Whatever the process, the result still meets the goal.

2. Use IntPtr.

    private static extern void Putstring(IntPtr ps);

    static void Main(string[] args)
    {
        IntPtr ipstr = Marshal.StringToHGlobalAnsi("123456");
        Putstring(ipstr);
        Console.Read();
        Marshal.FreeHGlobal(ipstr); // the CLR does not free this block on its own
    }

Unmanaged function:

    // the string is used as an input parameter
    extern "C" __declspec(dllexport) void Putstring(char* str)
    {
        printf("%s\n", str);
        strcpy(str, "abc");
    }

Tracing shows that what is passed really is an address, and an unmanaged one: it is 0x00657ee0 both in managed code and inside the unmanaged function. All modifications happen at that address. The only drawback is that the CLR does not release this memory automatically; it stays allocated until it is freed explicitly, for example with Marshal.FreeHGlobal.

3. Explicitly marshal the string as Unicode in the managed declaration:

    [DllImport(@"C:\Users\Administrator\Desktop\pInvoke\CPPDLL\Debug\CPPDLL.dll")]
    // [return: MarshalAs(UnmanagedType.LPWStr)]
    private static extern void Instring([MarshalAs(UnmanagedType.LPWStr)] string refstr);

    static void Main(string[] args)
    {
        string refstr = "321";
        Instring(refstr);
    }

Unmanaged function:

    extern "C" __declspec(dllexport) void Instring(wchar_t* pStr)
    {
        wcscpy(pStr, L"abc");
    }

Compared with IntPtr, this approach deserves consideration. Here the marshaler passes the unmanaged function the address of the managed string refstr = "321" itself. pStr points directly at "321", so any change made through pStr shows up in refstr. In this experiment it works as a way to pass a value out, but .NET strings are meant to be immutable, so writing through the pointer is not something to rely on; StringBuilder or a caller-allocated buffer is the safer choice.

This is the well-known pinning, used in place of the copying that happens in most cases. It has requirements. First, managed code must be calling native code, not native code calling managed code. Second, the type must be blittable, or become blittable under certain conditions. Third, the argument must not be passed by reference (with out or ref). Fourth, the caller and the callee must be in the same thread context or apartment.

A blittable type is one that has the same representation in managed and unmanaged memory. In the CLR, strings are Unicode. When the marshaling is also specified explicitly with [MarshalAsAttribute(UnmanagedType.LPWStr)], the string can be pinned and handed to unmanaged code without a copy.

When the return value is a string

Unmanaged function:

    extern "C" __declspec(dllexport) char* Outstring()
    {
        char* pStr = (char*)malloc(8);
        strcpy(pStr, "abc");
        return pStr;
    }

Managed code:

    private static extern int Outstring();

    static void Main(string[] args)
    {
        int i = Outstring();
        Console.Read();
    }

In the end, i receives a DWORD value: an address that points to the string "abc" in unmanaged memory. If the return value is received as an IntPtr instead, converting it with Marshal.PtrToStringAnsi gives the expected result. Note that if the return type is declared as string, an exception reports an attempt to access inaccessible memory. This is consistent with the marshaler trying to release the malloc-allocated block with CoTaskMemFree.
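The StringBuilder approach corresponds, on the unmanaged side, to a function that fills a buffer owned by the caller. The following is a minimal sketch of that pattern under stated assumptions: `GetMessageText` is a hypothetical export, `char16_t` stands in for the 16-bit `wchar_t` that Windows uses so that the example stays portable, and the capacity parameter is an addition that the original functions do not have.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical export: writes a UTF-16 string into a caller-owned buffer.
// Returns the number of characters the full text needs (terminator excluded),
// so the caller can detect truncation by comparing it with `capacity`.
extern "C" int GetMessageText(char16_t* buffer, int capacity) {
    assert(capacity >= 0);
    static const char16_t text[] = u"abc";
    const int needed = static_cast<int>(sizeof text / sizeof text[0]) - 1;
    if (buffer == nullptr || capacity <= 0) return needed;
    // Leave room for the terminator and never write past the caller's buffer.
    const int count = needed < capacity - 1 ? needed : capacity - 1;
    for (int i = 0; i < count; ++i) buffer[i] = text[i];
    buffer[count] = u'\0';
    return needed;
}
```

Because the caller allocates the buffer, no unmanaged allocation has to be released afterward. A matching managed declaration would presumably take a StringBuilder created with a known capacity, use CharSet = CharSet.Unicode, and pass that capacity as the second argument.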
