Solving garbled Chinese output in Jython: a reference

Source: Internet
Author: User

The Jython Chinese problem: Chinese characters may come out garbled when a script outputs them.

Some people learning Jython report garbled text when they use it to output Chinese characters. To fix this, start by placing an encoding declaration on the first line of the script (the second line if the first is a #! line). The declaration follows the same rules as in Python, described below.

Python 2, the language version that Jython implements, has two string types: unicode and str. Encoding converts unicode -> str. Decoding goes the other way, str -> unicode.
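The two directions can be sketched as a round trip. This is a minimal illustration rather than code from the original article; it uses u'' and b'' literals so that it behaves the same on Python 2.7, Jython 2.7 and Python 3 (where the two types are called str and bytes). The variable names are chosen only for this example.

```python
# -*- coding: utf-8 -*-
# Escapes are used so the example does not depend on the file's own encoding.
text = u'\u4e2d\u6587'                  # the unicode text "Chinese" (zhong wen)

# Encoding: unicode -> byte string. The bytes differ per target encoding.
utf8_bytes = text.encode('utf-8')       # 3 bytes per character here
gb_bytes = text.encode('gb18030')       # 2 bytes per character here

# Decoding: byte string -> unicode. The encoding named must match the bytes.
from_utf8 = utf8_bytes.decode('utf-8')
from_gb = gb_bytes.decode('gb18030')
```

Both decoded values compare equal to the original text, even though the two byte strings are different.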

What remains is deciding when encoding or decoding is required. For example, some libraries work with unicode throughout, so the values their functions return must be encoded into a suitable byte encoding before they are transmitted or written to a file.
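As a sketch of encoding at the boundary, the helpers below keep text as unicode in memory and convert only when writing to or reading from a file. The names save_text, load_text and round_trip are hypothetical helpers for this example, not part of any library, and the temporary file stands in for whatever destination a real script would use.

```python
import io
import os
import tempfile


def save_text(path, text, encoding='utf-8'):
    """Encode unicode text explicitly, then write the resulting bytes."""
    with io.open(path, 'wb') as handle:
        handle.write(text.encode(encoding))


def load_text(path, encoding='utf-8'):
    """Read raw bytes, then decode them with the same encoding."""
    with io.open(path, 'rb') as handle:
        return handle.read().decode(encoding)


def round_trip(text, encoding):
    """Write text to a temporary file and read it back.

    Returns the decoded text and the file size in bytes.
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        save_text(path, text, encoding)
        return load_text(path, encoding), os.path.getsize(path)
    finally:
        os.remove(path)
```

Calling round_trip(u'\u4e2d\u6587', 'gb18030') returns the same two characters back; the file on disk holds the gb18030 bytes, not unicode objects.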

The "encoding declaration" at the beginning of the file is the comment # -*- coding: <encoding-name> -*-. By default, Python treats script files as ASCII encoded. When a file contains characters outside the ASCII range, the declaration tells Python which encoding the file actually uses.
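To make the shape of the declaration concrete, the sketch below looks for it in the first two lines of a script using the regular expression given in PEP 263. The function declared_encoding is a hypothetical helper written for this example; it only approximates what the interpreter does and is not a replacement for it.

```python
import re

# Pattern from PEP 263: a comment containing "coding:" or "coding=".
CODING_RE = re.compile(r'^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)')


def declared_encoding(source_text):
    """Return the encoding named in the first two lines, or None."""
    for line in source_text.splitlines()[:2]:
        match = CODING_RE.match(line)
        if match:
            return match.group(1)
    return None


script = '#!/usr/bin/env python\n# -*- coding: utf-8 -*-\ns = 1\n'
found = declared_encoding(script)
```

A declaration on the third line or later is ignored by this helper, which mirrors the rule that it must appear on the first or second line.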

As for the default encoding, reported by sys.getdefaultencoding(): Python uses it whenever a str has to be decoded and no encoding was specified explicitly. For example, I have the following code:

 
 
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-

    s = '中文'  # note that s is of the str type, not the unicode type
    s.encode('gb18030')

This code is meant to re-encode s as gb18030, which is a unicode -> str conversion. Because s is of the str type, Python first decodes s to unicode automatically and then encodes the result as gb18030. Because the decoding is implicit and we never specified an encoding for it, Python decodes with the default encoding. In many installations the default encoding is ASCII, so if s contains anything other than ASCII bytes, an error occurs.

In this case, my default encoding is ASCII, while s is encoded the same way as the file, which is UTF-8, so an error occurs:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 0: ordinal not in range(128)
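The failing step can be reproduced on its own by decoding UTF-8 bytes with the ASCII codec explicitly, which is what the implicit conversion does behind the scenes in Python 2. This is an illustrative sketch: try_decode is a hypothetical helper, and the byte string is the UTF-8 form of two Chinese characters, whose first byte is 0xe4.

```python
# UTF-8 bytes for two Chinese characters; the first byte is 0xe4.
utf8_bytes = b'\xe4\xb8\xad\xe6\x96\x87'


def try_decode(data, encoding):
    """Return (text, None) on success or (None, error message) on failure."""
    try:
        return data.decode(encoding), None
    except UnicodeDecodeError as exc:
        return None, str(exc)


# Wrong codec: every byte here is above 127, so ASCII rejects the first one.
ascii_text, ascii_error = try_decode(utf8_bytes, 'ascii')

# Right codec: the same bytes decode cleanly.
utf8_text, utf8_error = try_decode(utf8_bytes, 'utf-8')
```

The ASCII attempt fails on the very first byte, while the UTF-8 attempt yields two characters and no error.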

In this case, we have two ways to correct the error:

First, state the encoding of s explicitly.

 
 
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-

    s = '中文'
    s.decode('utf-8').encode('gb18030')
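The explicit decode-then-encode step can be wrapped in a small function. The sketch below assumes the input is a byte string whose encoding is known; transcode is a hypothetical name for this example, and the b'' literal keeps it runnable on Python 2.7, Jython 2.7 and Python 3 alike.

```python
def transcode(data, source_encoding, target_encoding):
    """Convert a byte string from one encoding to another.

    Decoding is done explicitly, so the interpreter's default
    encoding never takes part in the conversion.
    """
    return data.decode(source_encoding).encode(target_encoding)


utf8_bytes = b'\xe4\xb8\xad\xe6\x96\x87'      # two Chinese characters in UTF-8
gb_bytes = transcode(utf8_bytes, 'utf-8', 'gb18030')
back_again = transcode(gb_bytes, 'gb18030', 'utf-8')
```

Because both encodings are named at the call site, the result does not change when the script runs on a machine with a different default encoding.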

Second, change the default encoding to match the file's encoding.

 
 
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-

    import sys
    reload(sys)  # sys.setdefaultencoding is deleted after Python 2.5 initialization, so we need to reload sys
    sys.setdefaultencoding('utf-8')

    s = '中文'
    s.encode('gb18030')
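A guarded version of this second approach is sketched below. It assumes a Python 2 style interpreter is the only place where the reload trick applies; on Python 3 the default encoding is already UTF-8 and sys.setdefaultencoding does not exist, so the function simply reports the current value. The name ensure_default_encoding is hypothetical, and whether a given Jython release honors the change is an assumption to verify on your own installation.

```python
import sys


def ensure_default_encoding(name='utf-8'):
    """Switch the default encoding on Python 2 only; return the active one."""
    if sys.version_info[0] == 2 and sys.getdefaultencoding() != name:
        # reload() is a builtin on Python 2; reloading sys restores
        # setdefaultencoding, which is removed during startup.
        reload(sys)  # noqa: F821
        sys.setdefaultencoding(name)
    return sys.getdefaultencoding()


active = ensure_default_encoding('utf-8')
```

Changing the default affects every implicit conversion in the process, including those inside libraries, so the explicit decode of the first approach is the more contained of the two fixes.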

This should solve the problem of garbled Chinese characters in Jython.

 
