PHP character encoding conversion class,
support for ANSI, Unicode, Unicode big endian, UTF-8, Utf-8+bom to convert each other.
Four common text file encoding methods
ANSI Code:
No file header (file encoding at the beginning of the symbolic byte)
ANSI encoded alphanumeric account of one byte, Chinese characters accounted for two bytes
Carriage return line break, single byte, hexadecimal representation 0d
[Unicode] character encoding table information, unicode character encoding
The UTF-8 is somewhat similar to the Haffman encoding, which encodes Unicode:
0x00-0x7F characters, expressed in a single byte;
The character 0x80-0x7FF is expressed in two bytes;
0x800-0xFFFF characters are represented in 3 bytes;
① The unicode
Let's first explain why we need to convert Chinese to unicode encoding. Unicode plays an important role in general international standards. It is more byte-saving than traditional character encoding, enabling the design of web pages to be displayed on platforms of different languages, therefore, as long as the Chinese character is converted to Unicode, no garbled
When a Unicode string is written into a text file or other storage, the Unicode scalar in the string is encoded in several encoding formats defined by Unicode. The small block encoding in each string is called a code unit. These include the UTF-8 encoding format (the coded string is a 8-bit code unit), the UTF-16 encoding format (the code unit that encodes the st
Python _ str _ (self) and _ unicode _ (self) ,__ str ____ unicode __
Official documents: https://docs.python.org/2.7/reference/datamodel.html? Highlight =__ mro __
Object.
_ Str __
(
Self
)
Called byStr ()Built-in function and byPrintStatement to compute the "informal" string representation of an object. This differs from_ Repr __()In that it does not have to be a valid Python expression: a mo
I have been using the KB Rom stm32, but I only recently used kb. I want to use fatfs to display the support for long file names. I found that the Rom is not enough after cc936.c is added, it is decided to store this two-way code table in external memory, flash or SD card, only can read;
Encoding conversion function in cc936.c after modificationWchar ff_convert (/* converted code, 0 means Conversion error */Wchar SRC,/* Character code to be converted */Uint dir/* 0:
Reading ANSI, Unicode, Unicode big endian, and UTF-8 text files by row in vc ansi Environment
1. Question proposal
The file class cstdiofile provided by MFC. One of the functions readstring implements row-based reading of files, but it cannot meet the needs of reading different types of text files by row. To solve this problem, I
Preliminary StudySome coding knowledge is provided. Based on some online mate
First go to an article
Article :
I also encountered this problem when I was a beginner at windows SDK programming. I believe many beginners of Windows programming have also encountered this problem. Later, I gradually understood this problem, but sometimes I am not quite clear about it when asked by others. Today, I would like to take this opportunity to organize my own ideas and explain this question in detail in the following article, hoping to help my friends who have this question.
Unicode Character Set and encoding method, unicode Character Set Encoding
Generally, a set of all characters that can be expressed in a standard is called a character set. For example, the character set defined by ISO/Unicode is Unicode. In Unicode, each character occupies a
ASCII (sbcs): It is a byte encoded string of all English characters and contains several control characters.
MBCS: countries modify and expand ASCII based on their own needs, and use several bytes to indicate local characters. There is no unified standard.
UNICODE: logically, all character encodings in the world are unified, that is, all characters in the world are uniquely encoded and not specific implementations are involved (there is no rule on h
Java uses the UTF-8 format string, where Java communicates with C + + to convert UTF-8 strings to Unicode strings, such as "ABCDABCD Chinese people xiasha 123." After converting to Unicode, the appearance is as follows:\u41\u42\u43\u44\u61\u62\u63\u64\u4e2d\u56fd\u4eba\u6c11\u4e0b\u6c99\u31\u32\u33\u2e\u676d\u5ddeVC in the following processing can be turned backCString cxxclass::unicodetostring () {CString
This is a PHP function that converts Chinese characters to Unicode encoding, and supports GBK and UTF8 encoding.
function Uni_decode ($uncode)
{
$word = Json_decode (Preg_replace_callback ('/# (\d{5});/', Create_function (' $dec ', ' return ' \\u\ '. Dechex ($dec [1 ]; '), ' "'. $uncode. '"));
return $word;
}
Convert Unicode to Kanji
function Uni_decode ($uncode)
{
$word = Json_decode (Preg_replace_callback
Label:Unicode characters are standard characters, such as English, numerals, and Chinese characters are not supported.Non-Unicode is the inclusion of Chinese characters and some special charactersnvarchar supports kanji, but takes two bytes per characterFor example, there is a field such as: [Name] [nvarchar] (50) We insert "xiaoming" This record, only two characters actually occupy 4 bytes. We insert "xiaoming" 8 English characters, which actually oc
Name lookup of C ++: qualified name lookup1. Introduction
In C ++, Name lookup is a process of finding the corresponding statement through name. For a function, name lookup may match multiple declarations. If it is a function, you can use Argument-dependent lookup to find the statement. If it is a function Template, yo
Code sharing of online conversion tools for php Unicode encoding and decoding
The code is as follows:
Function unicode_encode ($ name)
{
$ Name = iconv ('utf-8', 'ucs-2', $ name );
$ Len = strlen ($ name );
$ Str = '';
For ($ I = 0; $ I {
$ C = $ name [$ I];
$ C2 = $ name [$ I + 1];
If (ord ($ c)> 0)
{// Two-byte text
$ Str. = '\ U '. base_convert (ord ($ c), 10, 16 ). str_pad (base_convert (ord ($ c2), 10, 16), 2, 0, STR_PAD_LEFT );
}
Else
{
$ Str.
Code sharing of online conversion tools for php Unicode encoding and decoding
The code is as follows:
Function unicode_encode ($ name){$ Name = iconv ('utf-8', 'ucs-2', $ name );$ Len = strlen ($ name );$ Str = '';For ($ I = 0; $ I {$ C = $ name [$ I];$ C2 = $ name [$ I + 1];If (ord ($ c)> 0){// Two-byte text$ Str. = '\ U '. base_convert (ord ($ c), 10, 16 ). str_pad (base_convert (ord ($ c2), 10, 16), 2, 0, STR_PAD_LEFT );}Else{$ Str. = $ c2;}}Re
The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion;
products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the
content of the page makes you feel confusing, please write us an email, we will handle the problem
within 5 days after receiving your email.
If you find any instances of plagiarism from the community, please send an email to:
info-contact@alibabacloud.com
and provide relevant evidence. A staff member will contact you within 5 working days.