Daily Python essentials: bytes

Last Update:2016-01-31 Source: Internet

Author: User

A bytes object in Python is written in the form b'xxx'. Each x can be given as an ASCII character or as a hexadecimal escape \xnn, where nn runs from 00 to ff, for 256 possible byte values in total.
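As a minimal sketch of the notation described above, the following shows that a character, a hex escape, and an integer value are just different spellings of the same byte. The variable names are chosen here for illustration only.

```python
# Three spellings of the same one-byte value.
as_char = b"d"           # ASCII character
as_escape = b"\x64"      # hexadecimal escape
as_int = bytes([100])    # built from the integer value
print(as_char == as_escape == as_int)   # True

# Every possible byte value, 0x00 through 0xff.
all_values = bytes(range(256))
print(len(all_values))                  # 256
print(all_values[0], all_values[255])   # 0 255
```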

I. Basic operations

The basic operations on bytes are shown below. As you can see, a bytes object behaves much like a string:

    In [40]: b = b"abcd\x64"
    In [41]: b
    Out[41]: b'abcdd'
    In [42]: type(b)
    Out[42]: bytes
    In [43]: len(b)
    Out[43]: 5
    In [44]: b[4]
    Out[44]: 100  # 100 is \x64 in hexadecimal
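One difference from strings is worth spelling out: indexing a bytes object gives an integer, while slicing gives another bytes object. A small sketch, with variable names picked for illustration:

```python
data = b"abcd\x64"

first = data[0]       # indexing returns an int
head = data[:2]       # slicing returns bytes
codes = list(data)    # iterating yields ints

print(first)          # 97
print(head)           # b'ab'
print(codes)          # [97, 98, 99, 100, 100]

# Many string-like methods exist, but they take and return bytes.
print(data.upper())       # b'ABCDD'
print(data.find(b"cd"))   # 2
```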

A bytes object is immutable. To change one of its bytes, convert it to a bytearray first and modify that:

    In [46]: barr = bytearray(b)
    In [47]: type(barr)
    Out[47]: bytearray
    In [48]: barr[0] = 110
    In [49]: barr
    Out[49]: bytearray(b'nbcdd')
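To make the immutability concrete, here is a short sketch showing that assigning to a bytes object raises TypeError, and that a bytearray can be converted back to bytes once the edits are done. The names `raised`, `buf`, and `result` are illustrative.

```python
data = b"abcdd"

# Item assignment on bytes is not allowed.
raised = False
try:
    data[0] = 110
except TypeError:
    raised = True
print(raised)                # True

# A bytearray is a mutable copy of the same bytes.
buf = bytearray(data)
buf[0] = ord("n")            # ord("n") == 110
buf.extend(b"!")             # bytearrays can also grow

result = bytes(buf)          # freeze it back into bytes
print(result)                # b'nbcdd!'
print(data)                  # b'abcdd' (the original is unchanged)
```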

II. Relationship between bytes and characters

As mentioned above, bytes are very similar to strings, and the two can in fact be converted into each other: a sequence of bytes corresponds to a sequence of characters under a given encoding. A string is converted to bytes with the encode() method, and bytes are converted back to a string with the decode() method:
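A minimal round-trip sketch, using a short sample string chosen here for illustration. It also shows that the bytes() and str() constructors accept an encoding and do the same job as encode() and decode().

```python
text = "café"

utf8 = text.encode("utf-8")       # str -> bytes
latin1 = text.encode("latin-1")   # same text, different bytes

print(utf8)      # b'caf\xc3\xa9'
print(latin1)    # b'caf\xe9'

# Character count and byte count are not the same thing.
print(len(text), len(utf8), len(latin1))   # 4 5 4

# bytes -> str, using the encoding that produced the bytes.
print(utf8.decode("utf-8") == text)        # True
print(latin1.decode("latin-1") == text)    # True

# Constructor forms that are equivalent to encode()/decode().
print(bytes(text, "utf-8") == utf8)        # True
print(str(utf8, "utf-8") == text)          # True
```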

In [50]: s = "My life is short, I use Python" In [51]: B = s. encode ('utf-8') In [52]: bOut [52]: B '\ xe4 \ xba \ xe7 \ x94 \ x9f \ xe8 \ x8b \ xa6 \ xe7 \ x9f \ xad \ xef \ xbc \ x8c \ xe6 \ x88 \ x91 \ xe7 \ x94 \ xa8Python 'In [53]: c = s. encode ('gb18030') In [54]: cOut [54]: B '\ xc8 \ xcb \ xc9 \ xfa \ xbf \ xe0 \ xb6 \ xcc \ xa3 \ xac \ xce \ xd2 \ xd3 \ xc3python' In [55]: B. decode ('utf-8') Out [55]: 'Life is short, I use python' In [56]: c. decode ('gb18030') Out [56]: 'Life is short. I use Python 'In [57]: c. decode ('utf-8') Traceback (most recent call last): exec (code_obj, self. user_global_ns, self. user_ns) File "<ipython-input-57-8b50aa70bce9>", line 1, in <module> c. decode ('utf-8') UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc8 in position 0: invalid continuation byteIn [58]: B. decode ('gb18030') Out [58'

As you can see, the same characters map to completely different bytes under different encodings. If you encode with one encoding and decode with another, you get garbled characters or the conversion fails outright, because each encoding defines its own set of valid byte sequences. In the example above, UTF-8 reads \xc8 as the first byte of a two-byte sequence, but the next byte, \xcb, is not a valid continuation byte, so decoding fails.

III. Application

As a simple example, suppose we want to fetch the page that Baidu returns for a search on Python. Baidu uses UTF-8 encoding. If the response is not decoded, it is just one very long byte string; once it is decoded correctly, it displays as a normal HTML page.

    import urllib.request
    url = "http://www.baidu.com/s?ie=utf-8&wd=python"
    page = urllib.request.urlopen(url)
    mybytes = page.read()
    encoding = "utf-8"
    print(mybytes.decode(encoding))
    page.close()
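The example above hard-codes the encoding. As a hedged sketch of one alternative, the charset can be read from the Content-Type header when the server declares one. The helpers `charset_from_content_type` and `decode_body` are hypothetical names introduced here, and the sample header and body are made up so the snippet runs without network access.

```python
from email.message import Message


def charset_from_content_type(content_type, default="utf-8"):
    """Return the charset declared in a Content-Type value, or a default."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset() or default


def decode_body(raw, content_type):
    """Decode response bytes with the declared charset (hypothetical helper)."""
    encoding = charset_from_content_type(content_type)
    # errors="replace" keeps a mislabelled page from raising.
    return raw.decode(encoding, errors="replace")


# Sample data standing in for a real response.
body = "<p>人生苦短</p>".encode("gb18030")
print(charset_from_content_type("text/html; charset=GB18030"))   # gb18030
print(charset_from_content_type("text/html"))                    # utf-8
print(decode_body(body, "text/html; charset=GB18030"))           # <p>人生苦短</p>
```

With a live response from urllib.request.urlopen, the same lookup should be available as page.headers.get_content_charset(), which returns None when no charset is declared.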

That is all for this article. I hope it helps you learn Python programming.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.