1. Origin of the problem
The VCU10 video download module is implemented in pure Python. The C++ code uses pythonrun.h to configure the Python runtime environment and launch the Python module, so that the UI can use its functionality.
A tricky problem! I spent the whole day printing logs over and over and comparing the strings to locate the fault, and the inspiration only came after work.
Testing showed that paths without Chinese characters download normally, while paths containing Chinese characters fail to parse. I suspected that the string encodings were inconsistent, but none of the string conversions I tried in the Python code had any effect.
The original interface wrapper code is as follows:
    [DllImport("VideoDownloader", CallingConvention = CallingConvention.Cdecl, CharSet = CharSet.Ansi)]
    private extern static bool VDDownload(IntPtr instance, string savePath, string imageSavePath, string quality, string ext, string subtitleLang);
For example, Savepath for [D:\vcu new],dll calls the Python module, the print log is:
' D:\\kvd\xd0\xc2 '
and running the Python code directly, the output is:
' d:\\kvd\xe6\x96\xb0 '
Using C # to build a demo to verify the \xd0\xc2 for the [ new ] Word gb2312 encoding, \xe6\x96\xb0 for the [ new ] Word utf-8 encoding.
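The same byte comparison can be reproduced in a few lines of Python. This is only a sketch, assuming Python 3 and its standard gb2312 codec; the author's own check was the C# demo mentioned above, which is not shown.

```python
# Encode the character from the example path in both encodings
# and compare the bytes with the two log outputs.
ch = "\u65b0"  # the Chinese character from the example path (U+65B0)

gb2312_bytes = ch.encode("gb2312")  # bytes seen when the DLL calls Python
utf8_bytes = ch.encode("utf-8")     # bytes seen when Python runs directly

print(gb2312_bytes)  # b'\xd0\xc2'
print(utf8_bytes)    # b'\xe6\x96\xb0'
```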
2. String encoding
A simple check of the default encoding in C# on the 64-bit Simplified Chinese edition of Windows 7 shows that it is GB2312, whereas the Python code, which has to handle multibyte characters such as Chinese, uses UTF-8 by default.
This is most likely the cause. The fix is to pass the DLL interface strings as UTF-8. C# has no built-in UTF-8 marshaler here, so a custom one has to be implemented.
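To illustrate why the mismatch breaks parsing, here is a minimal sketch (Python 3 assumed; the variable names are made up for illustration). It encodes the example path as GB2312, which is what an ANSI-marshaled string carries on this system, and then tries to read those bytes as UTF-8, the way the receiving side expects them.

```python
# Path from the log output, with the Chinese character written as an escape.
path = "D:\\kvd\u65b0"

# What arrives when the string is marshaled with the ANSI (GB2312) code page.
ansi_bytes = path.encode("gb2312")

# The receiving side expects UTF-8, so decoding the GB2312 bytes fails.
try:
    ansi_bytes.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

# When both sides agree on UTF-8, the path survives the round trip.
utf8_roundtrip = path.encode("utf-8").decode("utf-8")

print(decoded_ok)              # False
print(utf8_roundtrip == path)  # True
```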
3. UTF8Marshaler
    // Marshals interface strings as UTF-8
    using System;
    using System.Collections.Generic;
    using System.Runtime.InteropServices;
    using System.Text;

    public class UTF8Marshaler : ICustomMarshaler
    {
        public void CleanUpManagedData(object managedObj)
        {
        }

        // Frees the buffer allocated in MarshalManagedToNative
        public void CleanUpNativeData(IntPtr pNativeData)
        {
            Marshal.FreeHGlobal(pNativeData);
        }

        public int GetNativeDataSize()
        {
            return -1;
        }

        // Managed string -> null-terminated UTF-8 buffer
        public IntPtr MarshalManagedToNative(object managedObj)
        {
            if (object.ReferenceEquals(managedObj, null))
                return IntPtr.Zero;
            if (!(managedObj is string))
                throw new InvalidOperationException();

            byte[] utf8Bytes = Encoding.UTF8.GetBytes(managedObj as string);
            IntPtr ptr = Marshal.AllocHGlobal(utf8Bytes.Length + 1);
            Marshal.Copy(utf8Bytes, 0, ptr, utf8Bytes.Length);
            Marshal.WriteByte(ptr, utf8Bytes.Length, 0);
            return ptr;
        }

        // Null-terminated UTF-8 buffer -> managed string
        public object MarshalNativeToManaged(IntPtr pNativeData)
        {
            if (pNativeData == IntPtr.Zero)
                return null;

            List<byte> bytes = new List<byte>();
            for (int offset = 0; ; offset++)
            {
                byte b = Marshal.ReadByte(pNativeData, offset);
                if (b == 0)
                    break;
                else
                    bytes.Add(b);
            }
            return Encoding.UTF8.GetString(bytes.ToArray(), 0, bytes.Count);
        }

        private static UTF8Marshaler instance = new UTF8Marshaler();

        // Called by the runtime to obtain the marshaler
        public static ICustomMarshaler GetInstance(string cookie)
        {
            return instance;
        }
    }
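At the byte level, the marshaler does two things: it turns a string into a null-terminated UTF-8 buffer, and it reads such a buffer back up to the first zero byte. The following Python 3 sketch mirrors that logic for illustration only; the helper names to_native and from_native are hypothetical and are not part of the C# code or of any library.

```python
def to_native(s):
    """Mirror of MarshalManagedToNative: UTF-8 bytes plus a trailing NUL."""
    if s is None:
        return None
    return s.encode("utf-8") + b"\x00"


def from_native(buf):
    """Mirror of MarshalNativeToManaged: read up to the first NUL, decode as UTF-8."""
    if buf is None:
        return None
    end = buf.index(b"\x00")  # position of the terminator
    return buf[:end].decode("utf-8")


native = to_native("D:\\kvd\u65b0")
print(native)  # b'D:\\kvd\xe6\x96\xb0\x00'
print(from_native(native) == "D:\\kvd\u65b0")  # True
```

The buffer length is the UTF-8 byte count plus one, which corresponds to the utf8Bytes.Length + 1 allocation in the C# version.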
4. Update the interface's string marshaling
    [DllImport("VideoDownloader", CallingConvention = CallingConvention.Cdecl, CharSet = CharSet.Ansi)]
    private extern static bool VDDownload(
        IntPtr instance,
        [MarshalAs(UnmanagedType.CustomMarshaler, MarshalTypeRef = typeof(UTF8Marshaler))] string savePath,
        [MarshalAs(UnmanagedType.CustomMarshaler, MarshalTypeRef = typeof(UTF8Marshaler))] string imageSavePath,
        string quality,
        string ext,
        string subtitleLang);
Verification passed!
With that, one technical difficulty is solved, and I am recording it here.
How to pass UTF-8 strings when calling a DLL interface from C#