02. Unicode-to-Chinese-character transcoding tool
When working on a Windows app, data is exchanged with the server in JSON format, and Chinese characters
in it are encoded as Unicode escape sequences (\uXXXX), which is inconvenient during debugging. So I wrote a small tool that converts
the Unicode escapes back into Chinese characters for easier debugging. I wrote it a few months ago, but it was stored on my company computer,
which made it a little troublesome to use elsewhere, so I am putting it on my blog.
The tool is simple and works as follows:
1. On the XAML page, place two WebBrowser controls: the one on the left displays the Unicode string and the one on the right displays the transcoded result.
Browser controls are used instead of TextBlock (or TextBox) controls because those WPF controls cannot be used directly for copying or pasting text.
<Grid>
    <Grid.ColumnDefinitions>
        <ColumnDefinition Width="1*"/>
        <ColumnDefinition Width="Auto"/>
        <ColumnDefinition Width="1*"/>
    </Grid.ColumnDefinitions>
    <WebBrowser x:Name="webSource" Grid.Column="0"/>
    <Button Content="convert" Click="Button_Click" Height="100" HorizontalAlignment="Center" Grid.Column="1" VerticalAlignment="Top" Margin="10"/>
    <WebBrowser x:Name="webResult" Grid.Column="2"/>
</Grid>
2. In the HTML string loaded into each WebBrowser control, place a textarea form control and set its CSS width and height to 100%.
The constructor displays some test text by default:
public MainWindow()
{
    InitializeComponent();
    // The encoding must be specified as UTF-8; otherwise it defaults to GB2312.
    string html = @"
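Step 2 describes the page only in prose, so here is a minimal sketch of what such an HTML string might look like. The class HtmlPageSketch, the helper BuildTextAreaPage, and the exact markup are assumptions made for illustration and are not the original code. The sketch declares UTF-8 in a meta tag and stretches a single textarea to fill the page.

```csharp
using System;
using System.Net;

public static class HtmlPageSketch
{
    // Hypothetical helper (not from the original post): wraps the given text
    // in a page whose only content is a full-size textarea.
    public static string BuildTextAreaPage(string text)
    {
        return "<!DOCTYPE html><html><head>"
             // Declare UTF-8 explicitly so the page does not fall back to a locale default.
             + "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\"/>"
             + "<style>html, body { height: 100%; margin: 0; } "
             + "textarea { width: 100%; height: 100%; }</style>"
             + "</head><body><textarea>"
             // Escape <, > and & so the text cannot break the surrounding markup.
             + WebUtility.HtmlEncode(text)
             + "</textarea></body></html>";
    }
}
```

A string built this way could then be handed to the WPF control, for example with webSource.NavigateToString(HtmlPageSketch.BuildTextAreaPage(text)); whether the original code loads its page this way is not shown in the post.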
3. The WebBrowser control has a Document property, which represents the document object of the hosted HTML page. Because it is typed as object,
you cannot use it directly to read the HTML DOM tree (for example, o.body.innerHTML to get the content of the body tag). You need to add
a reference to the Microsoft.mshtml assembly and use the mshtml.HTMLDocument type.
Click event handler of the button:
private void Button_Click(object sender, RoutedEventArgs e)
{
    // Get the document object that represents the HTML page.
    // Using mshtml.HTMLDocument requires a reference to the Microsoft.mshtml assembly.
    mshtml.HTMLDocument o = webSource.Document as mshtml.HTMLDocument;
    // Read the HTML DOM content through the HTMLDocument object
    // and convert it to Chinese characters.
    string strResult = ToChinsesWord(o.body.innerHTML);
    // The encoding must be UTF-8; otherwise it defaults to GB2312.
    string html = @"
4. Logic for converting Unicode escape sequences to Chinese characters:
// Requires: using System.Globalization; using System.Text; using System.Text.RegularExpressions;

/// <summary>
/// Converts Unicode escape sequences to a Chinese character string.
/// </summary>
/// <param name="str">String containing Unicode escape sequences</param>
/// <returns>Chinese character string</returns>
public static string ToChinsesWord(string str)
{
    // Find every \uXXXX escape; the two groups capture the high and low byte as hex digits.
    MatchCollection mc = Regex.Matches(str, @"\\u([0-9a-f]{2})([0-9a-f]{2})", RegexOptions.Compiled | RegexOptions.IgnoreCase);
    byte[] bts = new byte[2];
    foreach (Match m in mc)
    {
        // Encoding.Unicode is little-endian UTF-16, so the low byte goes first.
        bts[0] = (byte)int.Parse(m.Groups[2].Value, NumberStyles.HexNumber);
        bts[1] = (byte)int.Parse(m.Groups[1].Value, NumberStyles.HexNumber);
        // Decode the two bytes into a one-character string.
        string newWord = Encoding.Unicode.GetString(bts);
        str = str.Replace(m.Value, newWord);
    }
    return str;
}
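To make the byte order concrete: for the escape \u4e2d, the code above puts 0x2D into bts[0] and 0x4E into bts[1], and decoding those two bytes as little-endian UTF-16 yields the character 中. The same conversion can also be sketched without the byte array, by casting the parsed hex value to a char inside Regex.Replace. This is an alternative shown for comparison, not the author's method; the names UnicodeEscapeSketch and Decode are made up for the example.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class UnicodeEscapeSketch
{
    // Matches a literal backslash, the letter u, then exactly four hex digits.
    private static readonly Regex Escape = new Regex(@"\\u([0-9a-fA-F]{4})");

    // Hypothetical helper: replaces each \uXXXX escape with the UTF-16
    // code unit that its four hex digits name.
    public static string Decode(string input)
    {
        return Escape.Replace(input, m =>
            ((char)int.Parse(m.Groups[1].Value, NumberStyles.HexNumber)).ToString());
    }
}
```

For example, UnicodeEscapeSketch.Decode(@"{""name"":""\u4e2d\u6587""}") returns {"name":"中文"}. Replacing inside a single pass also avoids rescanning the whole string once per match.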