1. How the Python interpreter executes a .py file, for example python test.py
Stage one: the Python interpreter starts, which is comparable to launching a text editor.
Stage two: the Python interpreter, acting like a text editor, opens test.py and reads its contents from the hard disk into memory. (Quick review: Python is interpreted, so the interpreter only cares about the contents of the file, not about the file suffix.)
Stage three: the Python interpreter interprets and executes the test.py code that was just loaded into memory. (PS: only in this stage is Python syntax recognized. When execution reaches name = "Egon", memory space is allocated to hold the string "Egon".)
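The three stages can be imitated by hand. The following is only a rough sketch, not how the interpreter is actually implemented; the temporary file name test.txt and the variable names are assumptions made for illustration. It shows that the suffix does not matter and that the file is just bytes until the code is compiled and executed.

```python
import os
import tempfile

# Stage two: the file suffix is irrelevant; only the bytes on disk matter.
# "test.txt" is a hypothetical file created just for this demo.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.txt")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write('name = "Egon"\n')

    with open(path, "rb") as fh:
        raw = fh.read()  # raw bytes read from disk into memory

# Decode the bytes into text, as a text editor would before displaying them.
source = raw.decode("utf-8")

# Stage three: only now is the text treated as Python syntax and executed.
namespace = {}
exec(compile(source, "test.txt", "exec"), namespace)
print(namespace["name"])  # Egon
```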
2. Unicode, UTF-8
2.1 The origin of Unicode: every character is given a uniform 2 bytes. 2**16 = 65536 distinct values (0 to 65535) can represent more than 60,000 characters, enough to cover the world's major languages. (This describes the original 16-bit design; Unicode has since grown beyond 65,536 code points.)
Features of Unicode: simple and blunt, every character is 2 bytes. The advantage is that converting between characters and numbers is fast; the disadvantage is that it takes up a lot of space.
2.2 The origin of UTF-8: for text that is entirely English, the 2-byte encoding doubles the storage space compared with ASCII (binary data is ultimately stored as electrical or magnetic states on the storage medium).
This led to UTF-8, which uses only 1 byte for English characters and 3 bytes for most Chinese characters.
Features of UTF-8: precise, with different lengths for different characters. The advantage is that it saves space; the disadvantage is that converting between characters and numbers is slower, because each time the number of bytes needed to represent the character has to be worked out.
- The encoding used in memory is Unicode, trading space for time (a program has to be loaded into memory to run, so memory access should be as fast as possible).
- On the hard disk and in network transmission, UTF-8 is used: network or disk I/O latency is far greater than the UTF-8 conversion delay, and I/O should save as much bandwidth as possible to keep data transmission stable.
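The size trade-off can be checked directly. This is a small sketch in which the utf-16-le codec stands in for the fixed 2-byte form described above (an assumption made for illustration; it is not a statement about how Python stores strings internally).

```python
# Compare the byte length of a fixed-width 2-byte encoding with UTF-8.
# "utf-16-le" is used here as a stand-in for the 2-bytes-per-character form.
for ch in ["a", "中"]:
    utf8_len = len(ch.encode("utf-8"))
    fixed_len = len(ch.encode("utf-16-le"))
    print(repr(ch), utf8_len, fixed_len)

# For purely English text, the fixed-width form is twice as large.
english = "hello world"
utf8_size = len(english.encode("utf-8"))
fixed_size = len(english.encode("utf-16-le"))
print(utf8_size, fixed_size)  # 11 22
```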
2.3 Use of character encodings
Unicode -------> encode -------> UTF-8
UTF-8 -------> decode -------> Unicode
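In Python 3 the two arrows correspond to str.encode and bytes.decode. A minimal round trip, with the sample text chosen only for illustration:

```python
text = "中文"                     # str: Unicode text in memory

data = text.encode("utf-8")      # Unicode -> encode -> UTF-8 bytes
print(data)                      # b'\xe4\xb8\xad\xe6\x96\x87'

restored = data.decode("utf-8")  # UTF-8 bytes -> decode -> Unicode
print(restored == text)          # True
```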
3.1 Analysis Process
Flushing a file from memory to the hard disk is called saving the file for short.
Reading a file from the hard disk into memory is called reading the file for short.
Notes:
If the header # -*- coding: utf-8 -*- is not specified in the Python file, the default encoding is used.
The default is ASCII in Python 2 and UTF-8 in Python 3.
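The effect of the header can be seen without creating a file, because compile also accepts source code as bytes and honors the coding declaration. The snippet below is a sketch under that setup; the GBK-encoded source and the variable names are made up for the demonstration.

```python
import sys

# Python 3 reports UTF-8 as its default string encoding.
print(sys.getdefaultencoding())  # utf-8

# Source code saved as GBK bytes, with a header telling the interpreter so.
source = "# -*- coding: gbk -*-\ns = '中'\n".encode("gbk")

namespace = {}
exec(compile(source, "<demo>", "exec"), namespace)

# The header let the interpreter decode the GBK bytes correctly.
print(namespace["s"] == "中")     # True
```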
3.2 The two string types in Python 3: str and bytes
str is Unicode
# coding:utf-8
s = 'forest'  # when the program runs, no u prefix is needed; 'forest' is stored in new memory space as Unicode
# s can be encoded directly into any encoding format
s.encode('utf-8')
s.encode('gbk')
print(type(s))  # <class 'str'>
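To make the difference between the two types concrete, here is a short sketch (the variable names are illustrative only). It shows that encoding a str produces a separate bytes object, that the two are indexed differently, and that they cannot be mixed directly.

```python
s = "forest"
b = s.encode("utf-8")

print(type(s))  # <class 'str'>
print(type(b))  # <class 'bytes'>

# str is indexed by character, bytes by integer byte value.
print(s[0])     # f
print(b[0])     # 102

# Mixing the two types without an explicit encode/decode is an error.
try:
    s + b
    mixed_ok = True
except TypeError:
    mixed_ok = False
print(mixed_ok)  # False
```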
Section summary
One.
1. Whatever encoding was used to save the data must also be used to read it back
PS: memory always uses Unicode
What we can control is the encoding chosen when storing to the hard disk or transmitting over the network
2. Data is first generated in memory in Unicode format and must be converted to bytes before it can be stored or transmitted
Unicode ----------> encode (utf-8) ----------> bytes
bytes ----------> decode (with the same encoding that produced the bytes, e.g. gbk for GBK bytes) ----------> Unicode
3. A string in Python 3 is recognized as Unicode
Encoding a string in Python 3 yields bytes
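Rule 1 can be demonstrated with a temporary file. This is a minimal sketch; the file name demo.txt and the sample text are assumptions for the example. The file is saved as GBK, read back correctly as GBK, and fails when read as UTF-8.

```python
import os
import tempfile

text = "中文"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")  # hypothetical file name

    # Save with GBK.
    with open(path, "w", encoding="gbk") as fh:
        fh.write(text)

    # Reading with the same encoding returns the original text.
    with open(path, "r", encoding="gbk") as fh:
        same = fh.read()

    # Reading with a different encoding fails for this data.
    try:
        with open(path, "r", encoding="utf-8") as fh:
            fh.read()
        mismatch_failed = False
    except UnicodeDecodeError:
        mismatch_failed = True

print(same == text)     # True
print(mismatch_failed)  # True
```

A mismatch does not always raise an error; with other byte sequences it can silently produce garbled text instead, which is why the encoding should be stated explicitly on both sides.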
Two.
Opening a file
1. A system call is issued to the operating system, and the operating system opens the file
2. In the Python program a value is produced that refers to the file the operating system opened, and we can assign that value to a variable
Reclaiming resources
1. f.close(): closes the file opened by the operating system, that is, reclaims the operating system resource
2. del f: there is no need to do this, because once the Python program has finished running, all memory associated with the program is cleaned up automatically
f = open(r'aaaaa.py', 'r', encoding='utf-8')
# print(f.read())
# print(f.readline(), end='')
print(f.readlines())
f.close()
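A common alternative to calling f.close() by hand is the with statement, which closes the file when the block ends, even if an exception is raised. The sketch below is not from the original notes; it creates its own temporary aaaaa.py so that it can run anywhere.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "aaaaa.py")

    # Create a small file to read; its contents are made up for the demo.
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("print('hello')\n")

    # The operating system resource is released when the with block exits.
    with open(path, "r", encoding="utf-8") as f:
        lines = f.readlines()

    print(lines)     # ["print('hello')\n"]
    print(f.closed)  # True
```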
Python character encoding (DAY10)