This article compares Python 2 and Python 3, with a detailed look at how each version handles character encoding.
I. Version comparison
Python versions fall into two main lines:
Python 2.x, known as Python2: the most widely used line, for example Python 2.7.3.
Python 3.x, known as Python3: the latest line, for example Python 3.1. In the long run, it is the future trend.
[Differences between Python2 and Python3]
1. From Python2 to Python3, many basic function interfaces have changed, and some libraries and functions have been removed or renamed.
The changes affect many of the most basic and commonly used functions. The most typical example is print, the most commonly used of all: it is a statement in Python2 and a function in Python3.
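As a small illustration of the print change, the sketch below uses the Python3 function form. In Python2 the first call would instead be written as the statement print "Hello", "world". The helper name render is made up for this example and only captures what print() would write.

```python
import io


def render(*values, sep=" ", end="\n"):
    """Hypothetical helper: return the text that print() would write."""
    buffer = io.StringIO()
    # In Python3, print is an ordinary function, so it accepts
    # keyword arguments such as sep, end, and file.
    print(*values, sep=sep, end=end, file=buffer)
    return buffer.getvalue()


print(render("Hello", "world"), end="")
print(render("2", "7", "3", sep="."), end="")
```

Because print is a function in Python3, it can be passed around, wrapped, or given keyword arguments like any other callable, which the Python2 statement form does not allow.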
2. Third-party library support is currently best on Python2 and is often missing on Python3.
One reason Python is so powerful is its large number of capable third-party libraries.
Currently, many third-party Python libraries support only Python 2.
Even where a Python 3 version is available, it is not necessarily mature.
II. Encoding comparison
In both Python2 and Python3, there are only two kinds of character data:
General Unicode characters;
Characters in a specific encoding, such as UTF-8 or GBK (the result of encoding Unicode).
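To make the two kinds concrete, here is a minimal Python3 sketch. The sample character is an arbitrary choice for illustration: its Unicode code point is the same everywhere, while its encoded form depends on the encoding chosen.

```python
text = "\u4e2d"  # the single character 中, written as an escape

# The Unicode view: one code point, independent of any encoding.
code_point = ord(text)

# The encoded view: different byte sequences for different encodings.
utf8_bytes = text.encode("utf-8")
gbk_bytes = text.encode("gbk")

print(hex(code_point))
print(len(utf8_bytes), len(gbk_bytes))
```

Both byte sequences decode back to the same Unicode character, provided each is decoded with the encoding that produced it.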
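Character types in Python2: unicode holds Unicode text, and str holds encoded bytes.
Character types in Python3: str holds Unicode text, and bytes holds encoded bytes.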
We can think of a string as having two states: a text state and a byte (binary) state. The two character types in Python2 and in Python3 correspond to these two states, and encoding and decoding convert between them. Encoding converts a text string into bytes, which involves the internal representation of the string. Decoding converts bytes back into a text string, interpreting the bits as characters.
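The conversion between the two states can be sketched in Python3 as follows. The function names to_bytes and to_text are hypothetical wrappers introduced only for this example, and UTF-8 is an assumed default.

```python
def to_bytes(text, encoding="utf-8"):
    """Text state -> byte state (encoding)."""
    return text.encode(encoding)


def to_text(data, encoding="utf-8"):
    """Byte state -> text state (decoding)."""
    return data.decode(encoding)


original = "Python \u7f16\u7801"  # "Python 编码", written with escapes
data = to_bytes(original)
restored = to_text(data)

print(type(original).__name__, type(data).__name__)
print(restored == original)
```

The round trip only returns the original text when the same encoding is used in both directions.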
In Python2, both str and unicode have encode and decode methods. However, calling encode on a str or decode on a unicode is not recommended; that both are possible is a design defect in Python2. Python3 fixes this: str has only an encode method, which converts text into bytes, and bytes has only a decode method, which converts bytes into a text string.
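A quick way to check the Python3 side of this claim is to inspect the two types directly. This is a small sketch for Python3 only; the variable names are illustrative.

```python
# In Python3, each type offers only the conversion that makes sense for it.
str_has_encode = hasattr(str, "encode")
str_has_decode = hasattr(str, "decode")
bytes_has_decode = hasattr(bytes, "decode")
bytes_has_encode = hasattr(bytes, "encode")

print("str:   encode =", str_has_encode, " decode =", str_has_decode)
print("bytes: encode =", bytes_has_encode, " decode =", bytes_has_decode)
```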
In Python2, str and unicode are both subclasses of basestring, so they can be concatenated directly. In Python3, bytes and str are two independent types and cannot be concatenated with each other.
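The Python3 behaviour can be observed with a short sketch. The helper try_concat is a made-up name for this example; it only reports whether the concatenation succeeded.

```python
def try_concat(left, right):
    """Hypothetical helper: concatenate, or describe the TypeError raised."""
    try:
        return left + right
    except TypeError as exc:
        return "TypeError: {}".format(exc)


# Mixing the two types fails in Python3.
mixed = try_concat("abc", b"def")

# Decoding the bytes first makes both operands text.
fixed = try_concat("abc", b"def".decode("ascii"))

print(mixed)
print(fixed)
```

The explicit decode call is the point: Python3 requires the programmer to state which encoding turns the bytes into text, rather than guessing.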
In Python2, an ordinary string literal enclosed in quotation marks is a str. Its encoding is the encoding in which the Python file is saved; on Chinese-language Windows, for example, GBK is commonly the default. In Python3, a string literal enclosed in single or double quotes is a Unicode str.
For a str literal to end up in the intended encoding, some prerequisites must hold:
The Python file declares the corresponding encoding at the top.
The Python file itself is actually saved using this encoding.
The two encodings are the same (for example, both UTF-8 or both GBK).
In this way, the Python parser can correctly parse the text into a str in the corresponding encoding.
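The prerequisites above can be sketched in Python3 using the standard library function tokenize.detect_encoding, which reads the encoding declaration from the first lines of a source file. The file contents below are an in-memory stand-in made up for this example; no real file is read.

```python
import io
import tokenize

# A tiny source file that declares GBK at the top.
source_text = '# -*- coding: gbk -*-\ns = "\u4f60\u597d"\n'

# Prerequisite: the file is actually saved in the declared encoding.
saved = source_text.encode("gbk")

# Read the declaration the same way Python tooling does.
declared, _ = tokenize.detect_encoding(io.BytesIO(saved).readline)
print(declared)

# When declaration and saved bytes agree, the text is recovered intact.
print(saved.decode(declared) == source_text)

# When they disagree (here: GBK bytes read as UTF-8), decoding fails.
try:
    saved.decode("utf-8")
    mismatch_detected = False
except UnicodeDecodeError:
    mismatch_detected = True
print(mismatch_detected)
```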
In general, character encoding has been greatly improved in Python3 and is no longer as troublesome as in Python2. In Python3, text is always Unicode and is represented by the str type, while binary data is represented by bytes. Python3 does not silently mix str and bytes, which makes the difference between the two much more obvious.
Summary
That is all for this article. I hope it helps you learn or use Python. If you have any questions, please leave a message.