Python Road 3 (Knowledge Points): Python Encoding and File Operations in Plain Language

Source: Internet
Author: User
Tags python script

Python File Header Template

Let's start with a small tip: how to add file header information automatically whenever a file is created!

Path: File--Settings. Opening the settings page through File--Settings every time is too much trouble! You can show the toolbar instead via: View--

The modified effect:

First, the first line of the Python script template

This line simply tells the system which interpreter to use to run the script. If you run the script with python python_file_name.py, this line has no effect and can be left out.

But if you want to run it directly as ./python_file_name.py, you have to add it!

Why?

Take Linux and Windows as examples. What is the default interpreter that Linux itself uses for executable script files? The shell. If you don't tell the system to use Python to interpret the file, it falls back to the shell, and since the shell doesn't understand Python code, it fails.

Windows also has its own interpreter, PowerShell .... So do you understand?
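To make the role of the first line concrete, here is a minimal sketch of how the interpreter named on that line can be read out of a script. The real lookup is done by the operating system when the file is executed, not by Python code; the function name interpreter_from_shebang is hypothetical and only illustrates the idea.

```python
def interpreter_from_shebang(source):
    """Return the interpreter command named on a script's first line, or None."""
    first_line = source.splitlines()[0] if source else ""
    if not first_line.startswith("#!"):
        return None  # no "#!" line: the system falls back to its default
    return first_line[2:].strip()


script = "#!/usr/bin/env python\nprint('hello')\n"
print(interpreter_from_shebang(script))            # /usr/bin/env python
print(interpreter_from_shebang("print('hello')"))  # None
```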

Second, the third line of the Python script template

This needs no special explanation; it's the author comment.

Let's look at the second line alone.

Python Encoding--painstaking

In a 2.7 environment we write this line: # -*- coding:utf-8 -*- Why do we add it? It declares that the file's encoding is UTF-8!

Before looking at this question, have we ever thought about the following?

Why can we see text, numbers, pictures, characters, and so on on the display? Everyone knows the computer itself can only recognize combinations of 0 and 1, so how does it show these things? How do we communicate with computers?

If we communicated with the computer in combinations of 0 and 1, could we still see these things? And another problem: combinations of 0 and 1 are almost impossible for us to read, right?

So what do we do? How do we make the computer understand our language, and how do we understand the computer's?

Here's a vivid comparison: with a Chinese-English dictionary we can translate between English and Chinese, right? It's the same with computers. They need a standard lookup table. And what was the first such standard called? The ASCII table.

ASCII (American Standard Code for Information Interchange) is a computer encoding system based on the Latin alphabet, used mainly to display modern English and other Western European languages. It is the most widely used single-byte encoding system today and is equivalent to the international standard ISO/IEC 646.

Let's look at this table:

It contains special symbols, uppercase letters, lowercase letters, and digits (note that the digits 0~9 here are characters), and to the left of each character is a decimal number. The computer can't understand decimal directly, because it only understands 0 and 1, but converting between decimal and binary is very easy!

See an article I wrote earlier: Decimal & Binary Conversions

For example, when I press the letter A on my keyboard, what I actually send to the computer is the number 65. That is the mechanism for communicating with the computer, and with the ASCII table we can communicate with any computer. Nice.
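The mapping between the letter A, the decimal number 65, and its binary form can be checked with Python's built-in functions. This is only a small illustration; the variable names are made up for the example.

```python
letter = "A"
code = ord(letter)            # character -> decimal code in the table
bits = format(code, "08b")    # decimal -> 8-digit binary string

print(code)                   # 65
print(bits)                   # 01000001
print(chr(int(bits, 2)))      # A  (binary -> decimal -> character)
```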

Here's a small fact: what is the smallest unit in a computer? The bit. A bit is one binary digit, either 0 or 1.

But the bit is too small a unit, so we use the byte instead. The conversion rules are as follows (these should look familiar, right?):

8b = 1B  # lowercase b = bit; uppercase B = byte
1024B = 1KB
1024KB = 1MB
1024MB = 1GB
1024GB = 1TB

To store English text we need at least 1 byte per letter, that is, 8 bits. Looking at the ASCII table, 1 byte can represent every character English needs. Isn't that efficient!

Why design it this way? Storage space on early computers was very precious!

You will also notice that 1 byte is 8 bits, so the largest value it can store is 2 ** 8 - 1 = 255, which means one byte can represent up to 256 different characters (values 0 through 255). Western countries used only 128 of them (values 0 through 127). So what about the rest? They were left for extension: Westerners did consider other countries, so they left the extension positions.
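The arithmetic behind one byte can be verified directly. The sketch below counts the values a byte can hold and checks that every ASCII character fits in a single byte; the variable names are illustrative only.

```python
values_per_byte = 2 ** 8              # distinct values one byte can hold
largest_value = values_per_byte - 1   # largest number that fits in one byte

# The ASCII table covers codes 0 through 127, one byte each.
ascii_chars = [chr(n) for n in range(128)]
all_single_byte = all(len(c.encode("ascii")) == 1 for c in ascii_chars)

print(values_per_byte)   # 256
print(largest_value)     # 255
print(len(ascii_chars))  # 128
print(all_single_byte)   # True
```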

However, there was a problem. The computer was invented by Westerners, and if only English needs to be supported, these 128 characters can express everything English uses. But nobody considered big China! When ASCII arrived in China, people found that the most commonly used Chinese characters alone number more than 6,000. The leftover positions were nowhere near enough!

So what did we do? The Chinese are very smart: they extended the original extension positions to create their own character encodings such as GBK, GB2312, and GB18030.

How did they extend it? For example, a position from 128 upward in the ASCII table no longer stands for one character but points to a separate table. Smart, right? Other countries designed their encodings the same way!
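The extension idea can be observed with Python's built-in gbk codec: an ASCII letter still takes one byte, while a Chinese character takes two bytes whose first byte lies in the extension range (128 and above). This is a small observation of the codec's output, not a description of how the GBK tables are laid out internally.

```python
ascii_letter = "A".encode("gbk")
chinese_char = "中".encode("gbk")

print(len(ascii_letter))        # 1
print(len(chinese_char))        # 2
print(chinese_char[0] >= 128)   # True: the first byte is in the extension range
```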

China is a big power in East Asia, and our country is rather bold: we wanted to be compatible with the encodings commonly used in other countries too, for example Korea and Japan. But Koreans and Japanese have their own encodings and pay no attention to ours. That is why, for example, a Korean game downloaded and installed in China may show garbled text. What the heck?

Garbled text basically comes from two situations:

1. The character encoding is missing

2. The character encodings conflict: the character set specified by the people who wrote the program and the character set we use do not map the same positions to the same characters.

And it's not just Asian countries; European and African countries have this problem too. Faced with this chaos, an international standards organization said: all of you stop using your own, we will give you a unified one. What is it? Unicode, the "universal code".

Unicode (also called the unified code, universal code, or single code) is a character encoding used on computers. Unicode was created to overcome the limitations of traditional character encoding schemes; it assigns a uniform and unique binary code to every character in every language.

In its 16-bit storage form (UTF-16), characters and symbols are represented by at least 16 bits (2 bytes), that is, 2 ** 16 = 65536 values. Note: this says at least 2 bytes; it may be more.

That raises another problem: using more bytes per character directly doubles the space used! For example, take the same article made up of ASCII characters such as ABCD: if it takes 1M in ASCII, it takes at least 2M, possibly more, when stored as Unicode.

UTF-8 was created to solve this problem.

UTF-8 encoding is a compression and optimization of Unicode encoding. It no longer uses a minimum of 2 bytes; instead it sorts all characters and symbols into categories: content from the ASCII table is stored in 1 byte, European characters in 2 bytes, East Asian characters in 3 bytes ...

and so on, in an extensible way.
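The size difference can be measured by encoding one character from each category. The sample characters below are chosen only for illustration, and utf-16-le is used for the 16-bit form so that no byte-order mark is counted.

```python
samples = ["A", "é", "中"]   # ASCII, European, East Asian

utf8_sizes = [len(ch.encode("utf-8")) for ch in samples]
utf16_sizes = [len(ch.encode("utf-16-le")) for ch in samples]

print(utf8_sizes)    # [1, 2, 3]
print(utf16_sizes)   # [2, 2, 2]
```

For text that is mostly ASCII, the UTF-8 form is therefore about half the size of the 16-bit form.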

OK, from the above we now understand:

1. What ASCII encoding is

2. What Unicode encoding is

3. What UTF-8 encoding is

To review, the causes of garbled characters are: 1. a missing character set; 2. a character set conflict

Now look back: why do we need to declare the encoding on the second line? In Python 2.x, when the interpreter reads a .py file it assumes ASCII encoding by default. So if you don't declare the encoding in version 2.7 and the .py file contains characters that are not in the ASCII table, the interpreter cannot make sense of them and reports an error instead of running the script!

However, this problem does not exist in Python 3, because Python 3 reads source files as UTF-8 by default and its strings are Unicode .....
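The Python 3 behaviour can be checked from inside the interpreter. Note that sys.getdefaultencoding() reports the default used when converting between str and bytes, which is a related but separate setting from the source-file default; the sample text is chosen only for illustration.

```python
import sys

default_encoding = sys.getdefaultencoding()
text = "中文"   # a non-ASCII literal, with no encoding declaration needed

print(default_encoding)            # utf-8
print(len(text))                   # 2  (characters, not bytes)
print(len(text.encode("utf-8")))   # 6  (bytes once encoded)
```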

Python Encoding Conversion

Here is a question: since there is a unified Unicode encoding, why do we still need encoding conversion? Wouldn't it be fine if everyone just used one encoding?

1.

Don't ask me why; let me ask you a question instead. If the world had one world language, would you give up Chinese and switch to it? This is a pit, a problem left over from history.

Still, a world language would slowly replace our everyday languages, and once we all communicated in it there would be no communication barrier, right? (Just an example.)

2.

What is the other situation? Why is a Korean game garbled when it arrives in China? From the earlier answer we can guess that whoever wrote the game probably never considered exporting it to other countries. If the game doesn't use Unicode, it is bound to show up garbled here.

So we need to transcode the text into a Unicode (UTF-8) encoding. Then the Korean text displays normally! (This only converts the encoding; it does not translate the text into Chinese, so don't confuse the two!)
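The Korean game scenario can be imitated in a few lines. This sketch assumes the game text was saved in the euc_kr encoding and that the Chinese system reads it as gbk; both choices are assumptions made for illustration, since the article does not name the encodings involved.

```python
korean_text = "한글"
legacy_bytes = korean_text.encode("euc_kr")   # what the game ships with (assumed)

# Reading the bytes with the wrong table gives garbled text.
garbled = legacy_bytes.decode("gbk", errors="replace")

# Reading them with the right table, then re-encoding as UTF-8, fixes it.
recovered = legacy_bytes.decode("euc_kr")
utf8_bytes = recovered.encode("utf-8")

print(garbled == korean_text)                      # False
print(recovered == korean_text)                    # True
print(utf8_bytes.decode("utf-8") == korean_text)   # True
```

The text stays Korean throughout; only the bytes used to store it change.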

First, encoding conversion in Python 3

# Because strings are Unicode by default in Python 3

#!/usr/bin/env python
# -*- coding:utf-8 -*-
# author luotianshuai
tim = '天帅'
# Convert to UTF-8 encoding
print(tim.encode('UTF-8'))
# Convert to GBK encoding
print(tim.encode('GBK'))
# Convert to ASCII encoding (raises an error. Why? Because the ASCII table has no '帅' character)
print(tim.encode('ASCII'))

Second, encoding conversion in Python 2.x

# Because the default encoding in Python 2.x is ASCII, you declare the file's encoding as UTF-8. But UTF-8 cannot be converted directly to GBK; Unicode is needed as a transfer station.

#!/usr/bin/env python
# -*- coding:utf-8 -*-
# author luotianshuai
import chardet
tim = '你好'
print chardet.detect(tim)
# Decode to Unicode first, then encode from Unicode to GBK
new_tim = tim.decode('UTF-8').encode('GBK')
print chardet.detect(new_tim)
# Result
'''
{'confidence': 0.75249999999999995, 'encoding': 'utf-8'}
{'confidence': 0.35982121203616341, 'encoding': 'TIS-620'}
'''
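For comparison, the same UTF-8 to GBK conversion through Unicode can be written in Python 3, where bytes and str are separate types. This is a sketch using only the standard library; the helper name transcode is hypothetical, and chardet is left out because it is a third-party package.

```python
def transcode(data, source_encoding, target_encoding):
    """Decode bytes to a Unicode str, then encode to the target encoding."""
    return data.decode(source_encoding).encode(target_encoding)


utf8_bytes = "你好".encode("utf-8")
gbk_bytes = transcode(utf8_bytes, "utf-8", "gbk")

print(len(utf8_bytes))                     # 6
print(len(gbk_bytes))                      # 4
print(gbk_bytes.decode("gbk") == "你好")   # True
```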
