Encoding problems in Python basics

Source: Internet
Author: User
Tags: string format

Python basics: encoding issues covered in this section
    1. The origin of the string encoding problem
    2. String Encoding Solution
1. The origin of the string encoding problem

String encodings evolved from ASCII to Unicode and its encoding forms (UTF-8, UTF-16, UTF-32 and so on), alongside regional encodings such as China's GBK. Beyond the ASCII range these encodings map the same characters to different bytes, so they are incompatible with one another. As a result, software that has to run across languages and platforms can end up displaying garbled characters.
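A minimal sketch of how garbled characters arise, assuming Python 3. The sample text is a hypothetical choice and any non-ASCII string would behave similarly: the same characters become different bytes under UTF-8 and GBK, and reading bytes with the wrong encoding yields unrelated characters.

```python
# Hypothetical sample text (two Chinese characters); not taken from the article.
text = "编码"

utf8_bytes = text.encode("utf-8")
gbk_bytes = text.encode("gbk")

# The same characters map to different byte sequences in each encoding.
print(utf8_bytes)
print(gbk_bytes)
print(utf8_bytes == gbk_bytes)

# Reading UTF-8 bytes as if they were GBK gives garbled text.
# errors="replace" keeps the sketch from raising on invalid byte pairs.
garbled = utf8_bytes.decode("gbk", errors="replace")
print(garbled == text)
```

Printing `garbled` on a terminal that can display it shows characters that have nothing to do with the original text, which is the typical symptom of an encoding mismatch.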

Some background:

    1. In Python 2 the default encoding is ASCII; in Python 3 it is UTF-8 (source files are read as UTF-8 by default, and strings are Unicode by default)
    2. Unicode is a character set rather than a byte format. UTF-32 (4 bytes per character), UTF-16 (2 or 4 bytes) and UTF-8 (1 to 4 bytes) are different ways of encoding it, so UTF-8 is one encoding of Unicode
    3. In Python 3, encode() converts a string to the bytes type in the target encoding, and decode() turns bytes back into a string
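The points above can be checked with a short sketch, assuming Python 3. The sample characters are hypothetical choices made for illustration.

```python
ch = "中"  # hypothetical sample character (U+4E2D)

# One code point, different byte lengths depending on the encoding.
# The "-le" variants are used so that no byte order mark is prepended.
sizes = {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}
print(sizes)  # {'utf-8': 3, 'utf-16-le': 2, 'utf-32-le': 4}

# An ASCII character needs only 1 byte in UTF-8.
ascii_len = len("A".encode("utf-8"))
print(ascii_len)  # 1

# A character outside the Basic Multilingual Plane needs 4 bytes
# in UTF-8 and also 4 bytes (a surrogate pair) in UTF-16.
emoji = "\U0001F600"
emoji_sizes = (len(emoji.encode("utf-8")), len(emoji.encode("utf-16-le")))
print(emoji_sizes)  # (4, 4)

# encode() returns bytes; decode() returns str.
encoded = ch.encode("utf-8")
decoded = encoded.decode("utf-8")
print(type(encoded).__name__, type(decoded).__name__)  # bytes str
```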
2. String Encoding Solution

First, understand that Unicode can represent the characters of all the other encodings, so it acts as the bridge when converting between them. To convert ASCII-encoded data to GBK, for example, you first decode it into Unicode and then re-encode it as GBK. Converting from another encoding to Unicode is called decoding (decode), and converting from Unicode to another encoding is called encoding (encode). PS: UTF-8 and GBK are not directly compatible; UTF-8 data must first be decoded to Unicode before it can be encoded as GBK.
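The decode-then-encode bridge described above can be sketched as a small helper, assuming Python 3. The function name `transcode` and the sample data are hypothetical and are not part of Python or of the original article.

```python
def transcode(data: bytes, src: str, dst: str) -> bytes:
    """Convert bytes from encoding `src` to encoding `dst` via a Unicode str."""
    text = data.decode(src)   # source encoding -> Unicode
    return text.encode(dst)   # Unicode -> target encoding


# ASCII -> GBK: the bytes are unchanged because GBK is ASCII-compatible.
ascii_to_gbk = transcode(b"Tesla", "ascii", "gbk")
print(ascii_to_gbk)

# UTF-8 -> GBK for non-ASCII text: the bytes differ, the characters do not.
utf8_bytes = "编码".encode("utf-8")  # hypothetical sample text
gbk_bytes = transcode(utf8_bytes, "utf-8", "gbk")
print(utf8_bytes != gbk_bytes)
print(gbk_bytes.decode("gbk") == utf8_bytes.decode("utf-8"))
```

There is no direct bytes-to-bytes conversion in this sketch: every conversion passes through a `str`, which is the role the article assigns to Unicode.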

Encoding problems involve the following three things:

    1. The encoding of the source file
    2. The encoding of the string
    3. The encoding of the terminal that displays the string

The file encoding and the string encoding must both be consistent with the terminal encoding for the desired string to be displayed correctly.
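As a rough illustration of these three places, the sketch below (assuming Python 3) prints the encodings the interpreter reports and writes a file with an explicit encoding. The temporary path and the sample text are hypothetical, and the printed values vary by platform.

```python
import locale
import os
import sys
import tempfile

# Encodings the interpreter reports; the results depend on the platform.
print(sys.getdefaultencoding())            # default for str.encode() / bytes.decode()
print(sys.stdout.encoding)                 # encoding used when printing to the terminal
print(locale.getpreferredencoding(False))  # typical default for open() without encoding=

text = "编码"  # hypothetical sample text
path = os.path.join(tempfile.mkdtemp(), "sample.txt")

# Passing the encoding explicitly keeps the file independent of platform defaults.
with open(path, "w", encoding="gbk") as f:
    f.write(text)

# Reading with the same encoding restores the original string.
with open(path, "r", encoding="gbk") as f:
    round_trip = f.read()

# The raw bytes on disk are the GBK encoding of the text.
with open(path, "rb") as f:
    raw = f.read()

print(round_trip == text)
print(raw == text.encode("gbk"))
```

Opening the same file with a different `encoding=` value would either raise `UnicodeDecodeError` or return garbled text, which is the mismatch the paragraph above warns about.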

Python has two methods for transcoding: the encode() encoding method and the decode() decoding method. encode() takes the target encoding that the string should be converted into, and decode() takes the encoding that the data is currently in. The test code is as follows; the original data is a UTF-8 encoded string:

    # Start from UTF-8 bytes (in Python 3 a str has no decode() method)
    s = "Tesla".encode("utf-8")
    s_to_unicode = s.decode("utf-8")                  # decode into Unicode
    print(s)
    print(s_to_unicode)
    unicode_to_gbk = s_to_unicode.encode("gbk")       # encode into GBK
    print(unicode_to_gbk)
    gbk_to_unicode = unicode_to_gbk.decode("gbk")     # decode into Unicode
    print(gbk_to_unicode)
    unicode_to_utf8 = gbk_to_unicode.encode("utf-8")  # encode into UTF-8
    print(unicode_to_utf8)
