How to delete all Chinese characters in a script

Source: Internet
Author: User

Today a user asked me how to delete all Chinese characters in a script. The question stumped me at first: I had deleted one or two specific Chinese characters from a script before, but this was the first time I needed to match every Chinese character. Thinking it over, I realized that Chinese characters are stored in the computer in a particular encoding, such as the commonly mentioned GB2312 and GB18030, so the problem should be easy to solve: any byte sequence that fits the encoding's range for Chinese characters is a Chinese character. I therefore searched online for the Chinese character encoding rules and found the following:
In GB2312-1980, every Chinese character is encoded in two bytes. To distinguish these from the basic ASCII character set, the highest bit of each byte of a Chinese character code is set to 1. For example, the character "啊" ("ah") is encoded as 0xB0A1. The GB2312 rules for Chinese characters are: the first byte ranges from 0xB0 to 0xF7, and the second byte ranges from 0xA1 to 0xFE. GBK extends GB2312-1980 to cover the Chinese characters of GB12345 and GB13000; every encoding already in GB2312 is unchanged, and more code points are added. Its rules are roughly: the first byte ranges from 0x81 to 0xFE, and the second byte ranges from 0x40 to 0xFE.
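As a rough illustration of the ranges above, the following shell sketch checks whether a byte pair falls inside the GB2312 Chinese character area or the wider GBK double-byte area. The function names `in_range`, `is_gb2312_hanzi` and `is_gbk_double_byte` are made up for this example. The GBK check also excludes 0x7F as a second byte, which GBK does not use.

```shell
#!/bin/sh
# Hypothetical helpers: each takes bytes written as hex digits (e.g. B0 A1)
# and succeeds (exit status 0) when the pair lies in the given range.

in_range() {
    # in_range VALUE LOW HIGH, all given as hex digits without the 0x prefix
    [ $((0x$1)) -ge $((0x$2)) ] && [ $((0x$1)) -le $((0x$3)) ]
}

is_gb2312_hanzi() {
    # First byte 0xB0-0xF7, second byte 0xA1-0xFE
    in_range "$1" B0 F7 && in_range "$2" A1 FE
}

is_gbk_double_byte() {
    # First byte 0x81-0xFE, second byte 0x40-0xFE except 0x7F
    in_range "$1" 81 FE && in_range "$2" 40 FE && [ $((0x$2)) -ne $((0x7F)) ]
}

if is_gb2312_hanzi B0 A1; then
    echo "B0 A1 is in the GB2312 Chinese character area"
fi
```

A pair such as 81 40 passes the GBK check but not the GB2312 one, which is why a pattern built on the GBK ranges is the broader of the two.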
The rest is simple: use sed to replace everything that matches these encoding rules with an empty string.
The sed command is as follows:
# sed -r "s/[\x81-\xFE][\x40-\xFE]//g" file
Running this command revealed a problem: the system's locale setting did not suit byte-level matching. Overriding it for the command fixes this:
# LANG=C sed -r "s/[\x81-\xFE][\x40-\xFE]//g" file
C is the default POSIX locale, which in an English environment amounts to ASCII; under it sed matches single bytes. I ran the command again and everything worked.
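To try the command without touching a real script, here is a small self-contained sketch. It assumes GNU sed (for `-r` and the `\xHH` escapes) and input encoded in GBK or GB2312; the function name `strip_gbk` and the sample text are made up for illustration. It sets `LC_ALL` rather than `LANG`, because `LC_ALL` takes precedence over `LANG` if both are set in the environment.

```shell
#!/bin/sh
# Assumes GNU sed and GBK/GB2312-encoded input. Do not run this on UTF-8
# files: their multi-byte sequences would be cut apart by the same pattern.

strip_gbk() {
    # Read stdin, delete every GBK double-byte sequence, write to stdout.
    # LC_ALL=C makes sed match single bytes regardless of the system locale.
    LC_ALL=C sed -r 's/[\x81-\xFE][\x40-\xFE]//g'
}

# Build a sample line: "echo ", the bytes B0 A1 (the "ah" character in
# GB2312), then " done". Octal escapes are used because \x is not portable
# in printf.
sample=$(mktemp)
printf 'echo \260\241 done\n' > "$sample"

strip_gbk < "$sample"    # prints: echo  done
rm -f "$sample"
```

Note that the pattern removes every double-byte sequence in the GBK range, including full-width punctuation, not only Chinese characters.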
 
This article is from the "Xiao Miao" blog
