import struct
pack, unpack, pack_into, unpack_from
# ref: http://blog.csdn.net/JGood/archive/2009/06/22/4290158.aspx

import struct

# pack - unpack
print
print '===== pack - unpack ====='

str = struct.pack("ii", 20, 400)
print 'str:', str
print 'len(str):', len(str)  # len(str): 8

a1, a2 = struct.unpack("ii", str)
print "a1:", a1  # a1: 20
print "a2:", a2  # a2: 400

print 'struct.calcsize:', struct.calcsize("ii")  # struct.calcsize: 8

# unpack
print
print '===== unpack ====='

string = 'test astring'
format = '5s 4x 3s'
print struct.unpack(format, string)  # ('test ', 'ing')

string = 'he is not very happy'
format = '2s 1x 2s 5x 4s 1x 5s'
print struct.unpack(format, string)  # ('he', 'is', 'very', 'happy')

# pack
print
print '===== pack ====='

a = 20
b = 400

str = struct.pack("ii", a, b)
print 'length:', len(str)  # length: 8
print str
print repr(str)  # '\x14\x00\x00\x00\x90\x01\x00\x00'

# pack_into - unpack_from
print
print '===== pack_into - unpack_from ====='
from ctypes import create_string_buffer

buf = create_string_buffer(12)
print repr(buf.raw)

struct.pack_into("iii", buf, 0, 1, 2, -1)
print repr(buf.raw)

print struct.unpack_from("iii", buf, 0)

Output:
[work@db-testing-com06-vm3.db01.baidu.com python]$ python struct_pack.py
===== pack - unpack =====
str: ?
len(str): 8
a1: 20
a2: 400
struct.calcsize: 8
===== unpack =====
('test ', 'ing')
('he', 'is', 'very', 'happy')
===== pack =====
length: 8
?
'\x14\x00\x00\x00\x90\x01\x00\x00'
===== pack_into - unpack_from =====
'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
'\x01\x00\x00\x00\x02\x00\x00\x00\xff\xff\xff\xff'
(1, 2, -1)
==============================================================================
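Python is a very concise language. Unlike languages that predefine many data types (C#, for example, defines 8 integer types alone), it has only six basic types: string, integer, float, tuple, list, and dictionary. These six types cover most work. But when Python has to talk to other platforms over a network, values of these types must be converted to and from the types used by the other platform or language. For example: a client written in C++ sends an int (4 bytes) to a server written in Python. Python receives the 4 bytes that represent this integer, so how does it turn them into an integer Python understands? The standard module struct exists to solve exactly this problem.
The struct module is small and not hard to learn. The most commonly used functions are introduced below: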
struct.pack
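struct.pack converts Python values into a string according to a format string (Python 2 has no separate byte type, so think of this string as a byte stream or byte array). Its prototype is struct.pack(fmt, v1, v2, ...). The fmt parameter is the format string, described further below, and v1, v2, ... are the Python values to convert. The following example converts two integers into a string (byte stream):
import struct
a = 20
b = 400
str = struct.pack("ii", a, b)  # str is a string, but it plays the role of a byte stream (byte array) in other languages and can be sent over the network
print 'length:', len(str)
print str
print repr(str)
#---- result
#length: 8
# ---- garbled binary output appears here
#'\x14\x00\x00\x00\x90\x01\x00\x00'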
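The format character "i" means convert to int, and 'ii' means two int values. The packed result is 8 bytes long (an int occupies 4 bytes, so two ints take 8). Printing the result directly shows garbage because it is binary data. Use Python's built-in repr function to get a readable string, in which the hexadecimal values 0x00000014 and 0x00000190 represent 20 and 400, each stored least significant byte first on this machine.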
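The listing above is Python 2. As a minimal sketch of the same call under Python 3 (an assumption, since the original article targets Python 2), pack returns a bytes object rather than a str. The "<" prefix is also an assumption added here: it forces little-endian byte order with standard sizes so the result does not depend on the machine.

```python
import struct

# "<" = little-endian, standard sizes (chosen here for a reproducible result).
packed = struct.pack("<ii", 20, 400)

print(type(packed))  # <class 'bytes'>
print(len(packed))   # 8
print(packed.hex())  # 1400000090010000
```

Reading the hex output in 4-byte groups gives 14 00 00 00 (20) and 90 01 00 00 (400, that is 0x190).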
struct.unpack
struct.unpack does the reverse of struct.pack: it converts a byte stream back into Python values. Its prototype is struct.unpack(fmt, string), and it always returns a tuple. Here is a simple example:
str = struct.pack("ii", 20, 400)
a1, a2 = struct.unpack("ii", str)
print 'a1:', a1
print 'a2:', a2
#---- result:
#a1: 20
#a2: 400
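To connect unpack to the scenario from the introduction (a C++ client sending a 4-byte int), here is a small Python 3 sketch. The payload bytes and the assumption that the client is little-endian are hypothetical, made up for illustration. It also shows why both sides must agree on byte order.

```python
import struct

# Hypothetical payload: the 4 bytes a little-endian client would send for int 400.
payload = b"\x90\x01\x00\x00"

# unpack always returns a tuple, even for a single value.
(value,) = struct.unpack("<i", payload)
print(value)  # 400

# Reading the same bytes as big-endian ("network order") yields a different number.
(swapped,) = struct.unpack(">i", payload)
print(swapped)
```

Writing an explicit "<", ">", or "!" prefix on both ends avoids relying on whatever the native byte order happens to be.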
struct.calcsize
struct.calcsize computes the length, in bytes, of the result that a format string produces. For example, struct.calcsize('ii') returns 8, because two ints occupy 8 bytes.
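A short Python 3 sketch of calcsize follows. The "=" prefix (standard sizes, no alignment padding) is an assumption added here so the printed numbers are the same everywhere. Without a prefix, struct uses native sizes and alignment, so mixed formats can come out larger.

```python
import struct

# Standard mode: sizes match the "Standard size" column, with no padding.
print(struct.calcsize("=ii"))  # 8
print(struct.calcsize("=ci"))  # 5

# Native mode may insert padding before the int; the value is platform dependent.
native_size = struct.calcsize("ci")
print(native_size)

# calcsize tells you how many bytes a packed record occupies.
record = struct.pack("=ci", b"A", 7)
print(len(record) == struct.calcsize("=ci"))  # True
```

A common use is to call calcsize first to learn how many bytes to read from a socket or file before calling unpack.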
struct.pack_into, struct.unpack_from
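The Python manual describes these two functions but gives no usage example. In practice they are not used very often. After a long search on Google I found one example, which I am sharing here: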
import struct
from ctypes import create_string_buffer
buf = create_string_buffer(12)
print repr(buf.raw)
struct.pack_into("iii", buf, 0, 1, 2, -1)
print repr(buf.raw)
print struct.unpack_from('iii', buf, 0)
#---- result
#'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
#'\x01\x00\x00\x00\x02\x00\x00\x00\xff\xff\xff\xff'
#(1, 2, -1)
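The example above uses a ctypes buffer. As an alternative sketch in Python 3 (an assumption, not the original author's code), a plain bytearray also works as the writable buffer. The "<" prefix is again added only to make the bytes reproducible.

```python
import struct

buf = bytearray(12)  # 12 zero bytes, writable in place

# Write three ints into buf starting at offset 0.
struct.pack_into("<iii", buf, 0, 1, 2, -1)
print(buf.hex())  # 0100000002000000ffffffff

# unpack_from reads at an offset without slicing the buffer first.
print(struct.unpack_from("<iii", buf, 0))  # (1, 2, -1)
print(struct.unpack_from("<i", buf, 8))    # (-1,)
```

The offset argument is what distinguishes these functions from pack and unpack: one preallocated buffer can hold several records, each written or read at its own position.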
For details, see the struct module in the Python manual.
Python manual, struct module: http://docs.python.org/library/struct.html#module-struct
Repost notice: this article is reposted from http://blog.csdn.net/JGood/archive/2009/06/22/4290158.aspx
Format | C Type | Python type | Standard size | Notes
x | pad byte | no value | |
c | char | string of length 1 | 1 |
b | signed char | integer | 1 | (3)
B | unsigned char | integer | 1 | (3)
? | _Bool | bool | 1 | (1)
h | short | integer | 2 | (3)
H | unsigned short | integer | 2 | (3)
i | int | integer | 4 | (3)
I | unsigned int | integer | 4 | (3)
l | long | integer | 4 | (3)
L | unsigned long | integer | 4 | (3)
q | long long | integer | 8 | (2), (3)
Q | unsigned long long | integer | 8 | (2), (3)
f | float | float | 4 | (4)
d | double | float | 8 | (4)
s | char[] | string | |
p | char[] | string | |
P | void * | integer | | (5), (3)
Notes:
(1) The '?' conversion code corresponds to the _Bool type defined by C99. If this type is not available, it is simulated using a char. In standard mode, it is always represented by one byte.
    New in version 2.6.
(2) The 'q' and 'Q' conversion codes are available in native mode only if the platform C compiler supports C long long, or, on Windows, __int64. They are always available in standard modes.
    New in version 2.2.
(3) When attempting to pack a non-integer using any of the integer conversion codes, if the non-integer has a __index__() method then that method is called to convert the argument to an integer before packing. If no __index__() method exists, or the call to __index__() raises TypeError, then the __int__() method is tried. However, the use of __int__() is deprecated, and will raise DeprecationWarning.
    Changed in version 2.7: Use of the __index__() method for non-integers is new in 2.7.
    Changed in version 2.7: Prior to version 2.7, not all integer conversion codes would use the __int__() method to convert, and DeprecationWarning was raised only for float arguments.
(4) For the 'f' and 'd' conversion codes, the packed representation uses the IEEE 754 binary32 (for 'f') or binary64 (for 'd') format, regardless of the floating-point format used by the platform.
(5) The 'P' format character is only available for the native byte ordering (selected as the default or with the '@' byte order character). The byte order character '=' chooses to use little- or big-endian ordering based on the host system. The struct module does not interpret this as native ordering, so the 'P' format is not available.
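The note on 'f' and 'd' can be checked with a short Python 3 sketch. The ">" prefix is an assumption chosen here so the hex output reads in the usual most-significant-byte-first order.

```python
import struct

# 1.0 as IEEE 754 binary32 is 0x3f800000.
single = struct.pack(">f", 1.0)
print(single.hex())  # 3f800000

# 1.0 as IEEE 754 binary64 is 0x3ff0000000000000.
double = struct.pack(">d", 1.0)
print(double.hex())  # 3ff0000000000000

# 'f' keeps only single precision, so 0.1 does not survive a round trip exactly.
(roundtrip,) = struct.unpack(">f", struct.pack(">f", 0.1))
print(roundtrip == 0.1)  # False
```

Because Python floats are doubles, 'd' round-trips exactly, while 'f' rounds to the nearest single-precision value.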
A format character may be preceded by an integral repeat count. For example, the format string '4h' means exactly the same as 'hhhh'.
Whitespace characters between formats are ignored; a count and its format must not contain whitespace though.
For the 's' format character, the count is interpreted as the size of the string, not a repeat count like for the other format characters; for example, '10s' means a single 10-byte string, while '10c' means 10 characters. For packing, the string is truncated or padded with null bytes as appropriate to make it fit. For unpacking, the resulting string always has exactly the specified number of bytes. As a special case, '0s' means a single, empty string (while '0c' means 0 characters).
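The rules for repeat counts, whitespace, and the 's' count can be illustrated with a small Python 3 sketch (in Python 3 the 's' and 'c' codes take and return bytes, which differs from the Python 2 strings used earlier in this article).

```python
import struct

# A repeat count is shorthand for repeating the format character.
same = struct.pack("<4h", 1, 2, 3, 4) == struct.pack("<hhhh", 1, 2, 3, 4)
print(same)  # True

# For 's' the count is a byte length: short input is null-padded, long input is truncated.
padded = struct.pack("5s", b"ab")
truncated = struct.pack("3s", b"abcdef")
print(padded)     # b'ab\x00\x00\x00'
print(truncated)  # b'abc'

# '10s' yields one 10-byte value, while '10c' yields ten 1-byte values.
data = b"0123456789"
print(len(struct.unpack("10s", data)))  # 1
print(len(struct.unpack("10c", data)))  # 10

# Whitespace between codes is ignored; 'x' skips pad bytes.
fields = struct.unpack("5s 4x 3s", b"test astring")
print(fields)  # (b'test ', b'ing')
```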
The 'p' format character encodes a “Pascal string”, meaning a short variable-length string stored in a fixed number of bytes, given by the count. The first byte stored is the length of the string, or 255, whichever is smaller. The bytes of the string follow. If the string passed in to pack() is too long (longer than the count minus 1), only the leading count-1 bytes of the string are stored. If the string is shorter than count-1, it is padded with null bytes so that exactly count bytes in all are used. Note that for unpack(), the 'p' format character consumes count bytes, but that the string returned can never contain more than 255 characters.
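A brief Python 3 sketch of the Pascal string layout described above. The count of 5 is an arbitrary choice for illustration: it leaves one length byte plus up to four data bytes.

```python
import struct

# Short input: length byte 2, the data, then null padding up to 5 bytes total.
short = struct.pack("5p", b"ab")
print(short)  # b'\x02ab\x00\x00'

# Long input: only the leading count-1 = 4 bytes are stored.
long_ = struct.pack("5p", b"abcdefg")
print(long_)  # b'\x04abcd'

# Unpacking consumes all 5 bytes and returns only the bytes covered by the length.
print(struct.unpack("5p", short))  # (b'ab',)
```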
For the 'P' format character, the return value is a Python integer or long integer, depending on the size needed to hold a pointer when it has been cast to an integer type. A NULL pointer will always be returned as the Python integer 0. When packing pointer-sized values, Python integer or long integer objects may be used. For example, the Alpha and Merced processors use 64-bit pointer values, meaning a Python long integer will be used to hold the pointer; other platforms use 32-bit pointers and will use a Python integer.
For the '?' format character, the return value is either True or False. When packing, the truth value of the argument object is used. Either 0 or 1 in the native or standard bool representation will be packed, and any non-zero value will be True when unpacking.
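The behavior of '?' can be sketched in Python 3 as follows. The "=" prefix (standard mode, where a bool is always one byte) is an assumption made here to keep the output platform independent.

```python
import struct

# Packing uses the truth value of the argument, whatever its type.
print(struct.pack("=?", 5))     # b'\x01'
print(struct.pack("=?", []))    # b'\x00'
print(struct.pack("=?", "no"))  # b'\x01' (a non-empty string is truthy)

# Any non-zero byte unpacks as True.
print(struct.unpack("=?", b"\x07"))  # (True,)
print(struct.unpack("=?", b"\x00"))  # (False,)
```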