Python default Character Set

Source: Internet
Author: User

This article briefly introduces the history of the character set Python uses to parse programs and how to configure it.

Background: When writing scripts, you will inevitably handle variable content that contains Chinese characters. For a Python beginner (including me), configuring Python to correctly recognize Chinese content in a program is a headache. This article briefly introduces how to configure the Python character set, along with some historical information.

 

Python default Character Set

The default character set of Python has changed across several major versions. The defaults for each version are:

  • Python 2.1 and earlier: latin1
  • Python 2.3 and later, before Python 2.5: latin1 (but a warning is issued for non-ASCII characters when no encoding is declared)
  • Python 2.5 and later: ASCII

In addition, a later PEP (PEP 3120) proposed changing the default character set to UTF-8, and Python 3 adopted it: a source file without an encoding declaration is read as UTF-8.
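As a rough illustration of the default, the sketch below asks the standard library which encoding it would use for a piece of source code. It assumes a Python 3 interpreter. The helper name source_encoding is made up for this example, and tokenize.detect_encoding is the standard library function that applies the declaration rules.

```python
import io
import tokenize


def source_encoding(source_bytes):
    """Return the encoding Python 3 would use to decode this source.

    source_encoding is a hypothetical helper for illustration only.
    """
    # detect_encoding expects a readline callable that yields bytes.
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


# No declaration: Python 3 falls back to UTF-8.
print(source_encoding(b"print('hello')\n"))  # utf-8

# An explicit declaration overrides the default (the name is normalized).
print(source_encoding(b"# -*- coding: latin-1 -*-\nprint('hello')\n"))  # iso-8859-1
```

On Python 2 the fallback would be different, as listed above, so the first result applies only to Python 3.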

 

How to configure the default character set (before Python2.5)

Before 2.5 it is difficult to configure the default character set used to parse the current Python script file, because the oldest of these versions do not support a coding declaration placed after the shebang (it was introduced by PEP 263 in Python 2.3). Although versions earlier than 2.5 are out of date, it is still recommended to configure the character set in these versions. The configuration is done through the sys.setdefaultencoding() function. Strictly speaking, this function sets the encoding used for implicit conversions between byte strings and Unicode strings, not the encoding of the source file itself. The catch is that site.py (a script that runs automatically when Python starts) deletes this function from the sys module. As a result, the following workarounds circulate on the Internet:

  • Call reload(sys) to restore the function, then call sys.setdefaultencoding()
  • Modify sitecustomize.py to configure the global default character set

Both methods work, but neither is elegant. For more detailed instructions, refer to the discussion on Stack Overflow.
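The reload(sys) workaround can be sketched roughly as follows. This is an illustration of the hack described above, not a recommended practice. The helper name force_default_encoding is made up for this example, and the Python 2 branch is guarded so that the snippet does nothing on Python 3, where sys.setdefaultencoding() does not exist.

```python
import sys


def force_default_encoding(name="utf-8"):
    """Apply the reload(sys) workaround on Python 2; do nothing on Python 3.

    force_default_encoding is a hypothetical helper for illustration only.
    """
    if sys.version_info[0] == 2:
        # site.py removed sys.setdefaultencoding at startup;
        # reloading the sys module brings the function back.
        reload(sys)  # built-in function in Python 2 only
        sys.setdefaultencoding(name)
    # Report the encoding that is now in effect.
    return sys.getdefaultencoding()


print(force_default_encoding("utf-8"))
```

The sitecustomize.py variant works on the same principle: that module is imported before site.py deletes the function, so the call to sys.setdefaultencoding() can be made there without reload.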

 

How to configure the default character set (Python2.5 and later)

From Python 2.5 onward, configuring the default character set is much simpler. Just add a character set configuration line directly after the shebang (that is, the #!/usr/bin/python line). The configuration line must match the regular expression coding[:=]\s*([-\w.]+). That is, all of the following forms take effect:

#!/usr/bin/python
# coding=utf8

Or

#!/usr/bin/python
# -*- coding: utf8 -*-

Or

#!/usr/bin/python
# vim: set fileencoding=<encoding name> :

All of these can work.
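To see why each form is accepted, the regular expression quoted above can be tried against the three comment lines. The snippet below is a minimal sketch: the helper name declared_encoding is made up for this example, and utf-8 stands in for the <encoding name> placeholder in the vim form.

```python
import re

# The pattern quoted in the text above.
CODING_RE = re.compile(r"coding[:=]\s*([-\w.]+)")


def declared_encoding(line):
    """Return the encoding named in a comment line, or None if there is none.

    declared_encoding is a hypothetical helper for illustration only.
    """
    match = CODING_RE.search(line)
    return match.group(1) if match else None


print(declared_encoding("# coding=utf8"))                    # utf8
print(declared_encoding("# -*- coding: utf8 -*-"))           # utf8
print(declared_encoding("# vim: set fileencoding=utf-8 :"))  # utf-8
print(declared_encoding("# an ordinary comment"))            # None
```

The vim form matches because the word fileencoding ends in "coding", which is all the pattern looks for.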
