JavaScript itself uses the charCodeAt method to obtain Unicode encoding of a character and converts Unicode encoding into corresponding characters through the fromCharCode method.
The charCodeAt method, however, should be a 16-bit integer that takes two bytes per character. Transmission over the network is generally UTF-8 encoded, and JavaScript does not provide such a method by itself. However, there is an easy way to implement UTF-8 encoding and decoding.
Web requirements URL query string is UTF-8 encoded, for some special characters or Chinese, etc., will be encoded into a number of bytes, into the form of the corresponding 16 binary code. For example: Chinese characters will be encoded as%E4%B8%AD.
This javascript provides a combination of encodeuricomponent and decodeURIComponent methods to encode and decode query strings. With this, we can process the encoded string of the encodeURIComponent method and finally get the corresponding byte array. The code is as follows:
function EncodeUtf8(text){ ConstCode= encodeURIComponent(text); Constbytes=[]; for(varI= 0;I< Code.length;I++){ ConstC= Code.charAt(i); if(c=== '% '){ ConstHex= Code.charAt(I+ 1)+ Code.charAt(I+ 2); ConstHexval= parseint(Hex, -); bytes.Push(Hexval);I+= 2; } Else bytes.Push(C.charCodeAt(0)); } returnbytes;}
The function of this method is to get a string corresponding to the UTF-8 encoded byte sequence, which can be decoded into the corresponding string by the System.Text.Encoding.UTF8.GetString (bytes) method in the service-side language, such as C #.
Instead, the JavaScript method that decodes the UTF-8 encoded byte sequence to string is:
function DecodeUtf8 (bytes) { var encoded = Span class= "St" "" " for (var i = 0 ; I < bytes . length I++ ) { encoded += + bytes[i]. tostring (16 ) } return decodeuricomponent (encoded) Span class= "OP"; }
The method converts each byte into a representation of a% plus 16 binary number, which is then decoded by the decodeURIComponent method to obtain the corresponding string. Examples of use are:
var=encodeUtf8(‘ab热cd!‘);console.log(array); // 打印 [97, 98, 231, 131, 173, 99, 100, 33] var=decodeUtf8(array);console.log(content); // 打印 ab热cd!
The corresponding C # usage examples are as follows:
varbytes= System.Text.Encoding.UTF8.GetBytes("AB Hot cd!");//The following cycle will print the 98 231 131 173foreach(varBinchbytesConsole.Write(b+ " ");Console.Write("\ n");varContent= System.Text.Encoding.UTF8.GetString(bytes);Console.WriteLine(content); //Print AB hot cd!
By combining the above methods, the data can be exchanged in binary form between the front end and the back end by WebSocket, which facilitates the formulation of the Protocol.
This article is reproduced in the following address: JavaScript for UTF-8 encoding and decoding
JavaScript for UTF-8 encoding and decoding