Python: a summary of misunderstandings about encode and decode

Source: Internet
Author: User

While learning Python recently, I found that I had misunderstood encoding.

The following was my erroneous understanding:

encode(): encoding. Converts an object into the specified encoding format. Taking the word literally, I had always thought it meant converting other encoding formats into Unicode.

decode(): decoding, the inverse of encoding. I took it to mean parsing, that is, converting Unicode into other formats.


After checking some references and blog posts by more experienced developers, I arrived at the correct understanding:

decode converts a string in some other encoding into Unicode. For example, str1.decode('gb2312') converts the gb2312-encoded string str1 into Unicode.

encode converts Unicode into a string in some other encoding. For example, str2.encode('gb2312') converts the Unicode string str2 into gb2312.
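
The two calls can be sketched as follows. This is a minimal example that assumes Python 3, where text is str and a byte stream is bytes, so the roles that Python 2 gives to unicode and str are played by str and bytes. The sample text and variable names are illustrative only.

```python
# Assumption: Python 3. Text is str (Unicode); a byte stream is bytes.
text = "你好"                            # Unicode text

# encode: Unicode text -> bytes in the named encoding
gb_bytes = text.encode("gb2312")
print(gb_bytes)                          # b'\xc4\xe3\xba\xc3'

# decode: bytes in the named encoding -> Unicode text
text_again = gb_bytes.decode("gb2312")
print(text_again == text)                # True
```

The direction is the point: encode starts from Unicode text, decode ends with Unicode text.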


Python is a language in which encoding problems come up easily, so I am writing down these notes according to my own understanding.

First, there are several concepts to understand.

* Bytes: the unit in which computer data is stored. One byte is 8 binary bits and can represent an unsigned integer from 0 to 255. Below, a sequence of bytes is called a "byte stream".

* Characters: for example the English characters "abc", or the Chinese characters for "you, me, he". A character by itself says nothing about how it is stored in a computer. Below, the word "string" is avoided and "text" is used to denote a sequence of characters.

* Encode (verb): converts "text" into a "byte stream" according to a certain rule (the rule itself is called an encoding, as a noun). (In Python 2: unicode becomes str. In Python 3: str becomes bytes.)

* Decode (verb): converts a "byte stream" into "text" according to a certain rule. (In Python 2: str becomes unicode. In Python 3: bytes becomes str.)

* Note: in fact, anything represented in a computer has to be encoded. For example, a video is encoded before it is saved to a file, and it has to be decoded again when it is played.

Unicode: Unicode defines the correspondence between a "character" and a "number" (its code point), but does not specify how that number is stored in the computer. (It is like an integer in C, which can be stored as an int or a short: Unicode does not say whether an int or a short should be used to hold a character.)

UTF-8: an encoding of Unicode. It uses the character-to-number mapping defined by Unicode and additionally specifies how each number is stored as bytes. UTF-16 is another such encoding of Unicode.
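
To illustrate the split between the Unicode number and its stored bytes, here is a small sketch assuming Python 3. The character chosen is arbitrary, and big-endian UTF-16 without a byte order mark is used so that the bytes are easy to compare with the code point.

```python
# Assumption: Python 3. The sample character is arbitrary.
ch = "中"

# Unicode assigns the character a number (code point) ...
code_point = ord(ch)
print(hex(code_point))              # 0x4e2d

# ... and each encoding decides how that number becomes bytes.
utf8_bytes = ch.encode("utf-8")
utf16_bytes = ch.encode("utf-16-be")
print(utf8_bytes.hex())             # e4b8ad (3 bytes)
print(utf16_bytes.hex())            # 4e2d   (2 bytes)
```

The same character and the same code point give two different byte streams, one for each encoding.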



Summary:

Encoding converts text (a string of characters) into a byte stream: from Unicode to some other encoding format.

Decoding converts a byte stream into text (a string of characters): from some other encoding format to Unicode.
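
One practical consequence of this summary is that converting between two encodings goes through Unicode text: decode first, then encode. The sketch below assumes Python 3. The helper name transcode is hypothetical and is not a standard library function.

```python
# Assumption: Python 3. transcode is a hypothetical helper, not a library API.
def transcode(data: bytes, source_encoding: str, target_encoding: str) -> bytes:
    """Re-encode a byte stream by going through Unicode text."""
    text = data.decode(source_encoding)      # bytes -> Unicode text
    return text.encode(target_encoding)      # Unicode text -> bytes

gb_bytes = b"\xc4\xe3\xba\xc3"               # "你好" encoded as gb2312
utf8_bytes = transcode(gb_bytes, "gb2312", "utf-8")
print(utf8_bytes)                            # b'\xe4\xbd\xa0\xe5\xa5\xbd'

# Decoding with the wrong rule fails: these bytes are not valid UTF-8.
try:
    gb_bytes.decode("utf-8")
    wrong_codec_failed = False
except UnicodeDecodeError:
    wrong_codec_failed = True
print(wrong_codec_failed)                    # True
```

The failure in the last step shows why the source encoding of a byte stream must be known before it can be decoded.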

