[Python] Using ctypes and struct to serialize C-style structured data

1. Structured data processing in C/C++

When developing communication protocols, you often need a language that can express and process the protocol's data structures efficiently. This is a natural strength of C/C++: with struct, union, and bit-fields, C/C++ can handle such problems in the most efficient and natural way.

For example, consider part of a communication protocol that a smart grid uses for remote automatic meter reading.

In C, it can be described as follows:

  struct Req
  {
    unsigned char   urouter:1;   // route identification
    unsigned char   usubnode:1;  // attached-node identification
    unsigned char   ucm:1;       // communication module identification
    unsigned char   ucd:1;       // conflict detection
    unsigned char   ulevel:4;    // relay level
    unsigned char   uchannel:4;  // channel identification
    unsigned char   uerrbate:4;  // error-correcting code identification
    unsigned char   uresbytes;   // expected number of response bytes
    unsigned short  uspeed:15;   // communication baud rate, BIN format
    unsigned short  uunit:1;     // 0: bps; 1: kbps
    unsigned char   ureserve;
  };

This not only describes a message structure that fully conforms to the communication protocol, but also has at least two further advantages:
1. Assigning a value to any member of the structure is extremely convenient, for example:

    struct Req r;
    r.ucd = 0;
    r.uchannel = 0x0f;

There is no need to compute offsets by hand. If the communication protocol is upgraded later, only the structure definition has to change; the rest of the code stays the same.
2. More importantly, the structure is laid out in memory in the same serial order as the communication protocol (assuming endianness and bit-field ordering are set up correctly), so sending it takes only:

    struct Req r;
    ...
    send((unsigned char *)&r, sizeof(r));

This turns the data into a byte stream in exactly the format the protocol defines and sends it. Receiving and parsing are just as convenient:

    struct Req rs;
    unsigned char rcv_buffer[100];
    ...
    rcv(rcv_buffer, sizeof(struct Req));
    memcpy(&rs, rcv_buffer, sizeof(rs));


2. Structured data processing implemented in Python

Now the question is: can Python handle structured data just as easily? That means meeting three requirements: access data fields by variable name, without calculating offsets by hand; handle bit-level fields; and easily produce the byte stream for serial communication, as well as parse data from a received byte stream.

One might think this is no problem: why not use a Python dictionary? On reflection, a dictionary satisfies only the first requirement, access to fields by name. As a high-level language, Python provides a single int type, whereas protocol fields are often bit-level, or one, two, or three bytes wide. Python's native data types alone cannot access bit-level fields directly, and it is not even easy to tell how many bytes the whole message occupies.
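To make the limitation concrete, here is a minimal sketch of what handling the first protocol byte looks like with plain ints and hand-written shifts and masks. The helper names `pack_flags` and `unpack_flags` are hypothetical, and the bit positions (urouter in bit 0, ulevel in bits 4 to 7) are an assumption made for illustration.

```python
# Hand-rolled handling of the first protocol byte using plain ints.
# Assumption: urouter occupies bit 0 and ulevel occupies bits 4-7.

def pack_flags(urouter, usubnode, ucm, ucd, ulevel):
    """Combine five small fields into one byte value (0-255)."""
    return ((urouter & 0x1)
            | ((usubnode & 0x1) << 1)
            | ((ucm & 0x1) << 2)
            | ((ucd & 0x1) << 3)
            | ((ulevel & 0xF) << 4))


def unpack_flags(value):
    """Split one byte value back into its five fields."""
    return {
        'urouter': value & 0x1,
        'usubnode': (value >> 1) & 0x1,
        'ucm': (value >> 2) & 0x1,
        'ucd': (value >> 3) & 0x1,
        'ulevel': (value >> 4) & 0xF,
    }


flags = pack_flags(urouter=0, usubnode=0, ucm=1, ucd=0, ulevel=0xF)
print(hex(flags))                      # 0xf4
print(unpack_flags(flags)['ulevel'])   # 15
```

Every field needs its own shift and mask, and every protocol change means revisiting them, which is the bookkeeping a structure definition avoids.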

The essence of the solution is to go back down to the level of C. The good news is that Python provides the ctypes library, which lets us do in Python much of what C does.

>>> from ctypes import *
>>> class Req(Structure):
    _fields_ = [('urouter', c_ubyte, 1),
                ('usubnode', c_ubyte, 1),
                ('ucm', c_ubyte, 1),
                ('ucd', c_ubyte, 1),
                ('ulevel', c_ubyte, 4),
                ('uchannel', c_ubyte, 4),
                ('uerrbate', c_ubyte, 4),
                ('uresbytes', c_ubyte),
                ('uspeed', c_ushort, 15),
                ('uunit', c_ushort, 1),
                ('ureserve', c_ubyte)]

>>> r = Req()
>>> sizeof(r)
8
>>> r.uunit = 1
>>> print(r.uunit)
1
>>> r.uunit = 2
>>> print(r.uunit)
0
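As a smaller, self-contained variation on the same idea, the sketch below packs five bit fields into a single byte and shows that values too wide for a field are silently truncated. The class name `Flags` is hypothetical; keyword initialization is a standard feature of ctypes structures.

```python
from ctypes import Structure, c_ubyte, sizeof


class Flags(Structure):
    # Five bit fields sharing one unsigned byte (1 + 1 + 1 + 1 + 4 = 8 bits).
    _fields_ = [('urouter', c_ubyte, 1),
                ('usubnode', c_ubyte, 1),
                ('ucm', c_ubyte, 1),
                ('ucd', c_ubyte, 1),
                ('ulevel', c_ubyte, 4)]


f = Flags(ucm=1, ulevel=0xF)   # fields can be set by keyword at construction
print(sizeof(f))               # 1
print(f.ucm, f.ulevel)         # 1 15

f.ulevel = 0x1F                # only the low 4 bits are kept
print(f.ulevel)                # 15
f.ucd = 2                      # binary 10 does not fit in a 1-bit field
print(f.ucd)                   # 0
```

The truncation is the same behavior seen with `r.uunit = 2` in the session: no error is raised, so range checks have to be done by the caller if they matter.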

The main purpose of the ctypes library is to let Python programs call libraries and DLLs built by a C compiler, but here we use only its data structures.

Keep the following in mind when using ctypes. A custom structure class must inherit from Structure or Union. It must define a list named _fields_ in which each element is a tuple describing one data member, in the form ('member name', data type[, number of bits]). Once the class is defined, sizeof(class name) returns the number of bytes the data occupies, just as in C, and instance.member accesses the corresponding data member. If you define an __init__() method in the subclass, you can also initialize the instance there.

3. Serial data stream processing

With Structure, two of the three requirements above are met. As for the third, ctypes does provide a cast() function, but in my experiments I could only use it for simple pointer conversions such as array types, not to treat the address of a structure object as a byte address the way C does. For that, another Python library is needed: struct.

struct is a library dedicated to converting between values and byte streams. The main functions we use are pack() and unpack(). The documentation describes pack() as follows:

struct.pack(fmt, v1, v2, ...)
Return a bytes object (a string in Python 2) containing the values v1, v2, ... packed according to the given format. The arguments must match the values required by the format exactly.

For example:

>>> from struct import *
>>> pack('BHB', 1, 2, 3)
b'\x01\x00\x02\x00\x03'

pack() is used much like format(): the first argument is a string that describes the conversion. For example, 'B' denotes an 8-bit unsigned integer and 'H' a 16-bit unsigned integer; the struct section of the Python documentation lists the rest. Here 'BHB' means that the three numbers that follow are converted into a byte stream, the first as an 8-bit unsigned number, the second as a 16-bit unsigned number, and the third as an 8-bit unsigned number.

Wait a minute! Something looks wrong. Two 8-bit unsigned numbers plus one 16-bit unsigned number should add up to 4 bytes, yet the result '\x01\x00\x02\x00\x03' is five bytes long. Is that a bug?

The documentation explains this too: it is the difference between the machine's native format and the standard format. In short, unless told otherwise, a C compiler aligns each member of a structure to an address the CPU can access efficiently, so a 16-bit value normally starts at an even offset. In native format, struct follows the same rule: here a padding byte is inserted after the first 8-bit value so that the 16-bit value is aligned (on a little-endian machine the five bytes are 01, padding 00, 02 00, and 03). This wastes a little memory but makes access more efficient. If you need exactly 8 and 16 bits with no padding, use the standard format by putting a qualifier at the start of the format string, for example:

>>> pack('>BHB', 1, 2, 3)
b'\x01\x00\x02\x03'
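The size difference between native and standard formats can be checked directly with struct.calcsize(). The following sketch compares the three prefixes for the same 'BHB' layout; the native size is printed without an expected value because it depends on the platform's alignment rules.

```python
import struct

# Native format (no prefix): the size may include alignment padding.
print(struct.calcsize('BHB'))

# Standard formats: fixed sizes, no padding.
print(struct.calcsize('>BHB'))   # 4 (big-endian)
print(struct.calcsize('<BHB'))   # 4 (little-endian)

packed = struct.pack('>BHB', 1, 2, 3)
print(packed)                          # b'\x01\x00\x02\x03'
print(struct.unpack('>BHB', packed))   # (1, 2, 3)

# Same values, little-endian: only the 16-bit value changes byte order.
print(struct.pack('<BHB', 1, 2, 3))    # b'\x01\x02\x00\x03'
```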

In '>BHB', the leading '>' tells pack() to use the standard format (no padding) with big-endian byte order.

4. Converting a structure to a byte stream

With pack() in hand, let us return to converting the earlier structure to a byte stream. There is still a problem: pack() handles single-byte and double-byte values, but it cannot operate on bit fields. How can that be solved?

In fact, I did not find a really good solution; after all, pack() requires us to list the variables by hand and to specify their order and byte lengths. The workaround I offer here is to borrow a union.

Taking the previous structure as the example again, it can be written another way:

>>> class flag_struct(Structure):
    _fields_ = [('urouter', c_ubyte, 1),
                ('usubnode', c_ubyte, 1),
                ('ucm', c_ubyte, 1),
                ('ucd', c_ubyte, 1),
                ('ulevel', c_ubyte, 4)]

>>> class flag_union(Union):
    _fields_ = [('whole', c_ubyte),
                ('flag_struct', flag_struct)]

>>> class channel_struct(Structure):
    _fields_ = [('uchannel', c_ubyte, 4),
                ('uerrbate', c_ubyte, 4)]

>>> class channel_union(Union):
    _fields_ = [('whole', c_ubyte),
                ('channel_struct', channel_struct)]

>>> class speed_struct(Structure):
    _fields_ = [('uspeed', c_ushort, 15),
                ('uunit', c_ushort, 1)]

>>> class speed_union(Union):
    _fields_ = [('whole', c_ushort),
                ('speed_struct', speed_struct)]

>>> class Req(Structure):
    _pack_ = 1
    _fields_ = [('flag', flag_union),
                ('channel', channel_union),
                ('uresbytes', c_ubyte),
                ('speed', speed_union),
                ('ureserve', c_ubyte)]

In short, every group of bit fields is represented by a union that overlays a child struct. (_pack_ = 1 requests 1-byte alignment, analogous to the native versus standard format distinction in the previous section.) The purpose is to allow both bit-field access and whole-byte conversion, for example:

>>> from struct import pack
>>> r = Req()
>>> r.speed.speed_struct.uunit = 1
>>> r.flag.flag_struct.ulevel = 0xf
>>> pack('>BBBHB', r.flag.whole, r.channel.whole, r.uresbytes, r.speed.whole, r.ureserve)
b'\xf0\x00\x00\x80\x00\x00'
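The same unions can plausibly be used in the receiving direction: unpack the whole bytes with struct.unpack() and assign them to the `whole` members, then read the bit fields. This is a sketch rather than the author's method; it redefines two of the unions so that it runs on its own, and the bit positions it prints assume a little-endian host.

```python
from ctypes import Structure, Union, c_ubyte, c_ushort
from struct import unpack


class flag_struct(Structure):
    _fields_ = [('urouter', c_ubyte, 1), ('usubnode', c_ubyte, 1),
                ('ucm', c_ubyte, 1), ('ucd', c_ubyte, 1),
                ('ulevel', c_ubyte, 4)]


class flag_union(Union):
    _fields_ = [('whole', c_ubyte), ('flag_struct', flag_struct)]


class speed_struct(Structure):
    _fields_ = [('uspeed', c_ushort, 15), ('uunit', c_ushort, 1)]


class speed_union(Union):
    _fields_ = [('whole', c_ushort), ('speed_struct', speed_struct)]


# The six bytes produced by the pack() call in the session above.
data = b'\xf0\x00\x00\x80\x00\x00'
flag_byte, channel_byte, uresbytes, speed_word, ureserve = unpack('>BBBHB', data)

flag = flag_union()
flag.whole = flag_byte        # write the whole byte ...
speed = speed_union()
speed.whole = speed_word      # ... or the whole 16-bit word

# ... then read the individual bit fields (little-endian host assumed).
print(flag.flag_struct.ulevel)      # 15
print(speed.speed_struct.uunit)     # 1
print(speed.speed_struct.uspeed)    # 0
```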

5. A simpler way to convert to a byte stream

Later, after reading the documentation more carefully, I found that ctypes offers a simpler way to produce the byte stream:

string_at(addressof(r), sizeof(r))

addressof() and string_at() are both functions provided by ctypes. This is the closest approach to native C, and it makes the unions unnecessary.

>>> class Req(Structure):
    _pack_ = 1
    _fields_ = [('urouter', c_ubyte, 1),
                ('usubnode', c_ubyte, 1),
                ('ucm', c_ubyte, 1),
                ('ucd', c_ubyte, 1),
                ('ulevel', c_ubyte, 4),
                ('uchannel', c_ubyte, 4),
                ('uerrbate', c_ubyte, 4),
                ('uresbytes', c_ubyte),
                ('uspeed', c_ushort, 15),
                ('uunit', c_ushort, 1),
                ('ureserve', c_ubyte)]


>>> sizeof(Req)
6
>>> r = Req()
>>> r.uunit = 1
>>> r.ucm = 1
>>> string_at(addressof(r), sizeof(r))
b'\x04\x00\x00\x00\x80\x00'
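In Python 3, ctypes structures also support the buffer protocol, so bytes(r) should give the same result as the string_at() call. The sketch below checks this on a small structure; the name `Header` is hypothetical, it uses only byte-sized members so that no padding is involved, and the printed byte values assume a little-endian host.

```python
from ctypes import Structure, c_ubyte, sizeof, addressof, string_at


class Header(Structure):
    # Two 4-bit fields share the first byte; uresbytes is the second byte.
    _fields_ = [('uchannel', c_ubyte, 4),
                ('uerrbate', c_ubyte, 4),
                ('uresbytes', c_ubyte)]


h = Header()
h.uchannel = 0xF
h.uresbytes = 0x10

raw = string_at(addressof(h), sizeof(h))
print(raw)                # b'\x0f\x10' on a little-endian host
print(bytes(h) == raw)    # True
```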

If you need a big-endian layout, inherit from BigEndianStructure instead. Bit fields are then allocated from the high bits to the low bits, so the order of the definitions has to be adjusted as follows:

>>> class Req(BigEndianStructure):
    _pack_ = 1
    _fields_ = [('ulevel', c_ubyte, 4),
                ('ucd', c_ubyte, 1),
                ('ucm', c_ubyte, 1),
                ('usubnode', c_ubyte, 1),
                ('urouter', c_ubyte, 1),
                ('uerrbate', c_ubyte, 4),
                ('uchannel', c_ubyte, 4),
                ('uresbytes', c_ubyte),
                ('uunit', c_ushort, 1),
                ('uspeed', c_ushort, 15),
                ('ureserve', c_ubyte)]


>>> r = Req()
>>> r.ulevel = 0xf
>>> r.uunit = 1
>>> string_at(addressof(r), sizeof(r))
b'\xf0\x00\x00\x80\x00\x00'
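The byte-order effect on multi-byte members can be seen in isolation with a structure that holds a single 16-bit value. This is a minimal sketch; the class names `SpeedBE` and `SpeedLE` are hypothetical, and it deliberately leaves bit fields out.

```python
from ctypes import BigEndianStructure, LittleEndianStructure, c_ushort


class SpeedBE(BigEndianStructure):
    _fields_ = [('whole', c_ushort)]


class SpeedLE(LittleEndianStructure):
    _fields_ = [('whole', c_ushort)]


# The same value is stored with opposite byte orders.
print(bytes(SpeedBE(0x8000)))   # b'\x80\x00'
print(bytes(SpeedLE(0x8000)))   # b'\x00\x80'
```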

Finally, someone may ask: Python is a high-level language, so why do such low-level work with it? Every tool has its specialty: for embedded communications, Python is very convenient for building high-level auxiliary test tools.

(Added 2015-12-14) A way to parse a byte stream into a structure:

r = Req()
s = io_rcv()        # receive a byte stream from I/O
memmove(addressof(r), s, sizeof(Req))
...
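A runnable sketch of the same idea follows, together with the from_buffer_copy() class method that ctypes provides as an alternative. The `Header` structure and the `data` bytes are stand-ins for the real structure and received data, and the printed field values assume a little-endian host.

```python
from ctypes import Structure, c_ubyte, sizeof, addressof, memmove


class Header(Structure):
    _fields_ = [('uchannel', c_ubyte, 4),
                ('uerrbate', c_ubyte, 4),
                ('uresbytes', c_ubyte)]


# Stand-in for received bytes; the trailing byte is extra data.
data = b'\x0f\x10\xff'
if len(data) < sizeof(Header):
    raise ValueError('not enough bytes for a Header')

# Approach 1: copy the bytes over an existing instance with memmove().
h1 = Header()
memmove(addressof(h1), data, sizeof(h1))

# Approach 2: build a new instance from a copy of the bytes.
h2 = Header.from_buffer_copy(data[:sizeof(Header)])

print(h1.uchannel, h1.uresbytes)   # 15 16
print(h2.uchannel, h2.uresbytes)   # 15 16
```

Checking the length first matters with memmove(), because it copies exactly the number of bytes it is told to and does not know how long the source is.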
----------------------------------------------------------------------------------------------------------------------
    import struct

    # Values passed to write_char and write_utf must be bytes objects in Python 3.
    class Buffer:
        def __init__(self, bytes=None, recogn=10000):
            if bytes is None:
                # Outgoing packet: the payload starts with the message number.
                self.length = 4
                self.buffer = struct.pack('i', recogn)
            else:
                # Incoming packet: wrap the bytes received from the server.
                self.length = len(bytes)
                self.buffer = bytes

        def write_char(self, c):
            self.buffer = self.buffer + struct.pack('c', c)
            self.length = self.length + 1

        def write_short(self, value):
            self.buffer = self.buffer + struct.pack('h', value)
            self.length = self.length + 2

        def write_int(self, value):
            self.buffer = self.buffer + struct.pack('i', value)
            self.length = self.length + 4

        def write_utf(self, strings):
            self.write_short(len(strings))
            self.buffer = self.buffer + struct.pack('%ds' % len(strings), strings)
            self.length = self.length + len(strings)

        def to_bytes(self):
            # Prefix the payload with the total length, header included.
            bytes = struct.pack('i', self.length + 4) + self.buffer
            return bytes

        def read_char(self):
            c, self.buffer = struct.unpack('c%ds' % (len(self.buffer) - 1), self.buffer)
            return c

        def read_short(self):
            value, self.buffer = struct.unpack('h%ds' % (len(self.buffer) - 2), self.buffer)
            return value

        def read_int(self):
            value, self.buffer = struct.unpack('i%ds' % (len(self.buffer) - 4), self.buffer)
            return value

        def read_utf(self):
            length = self.read_short()
            string, self.buffer = struct.unpack('%ds%ds' % (length, len(self.buffer) - length), self.buffer)
            return string

For a long time I had wanted to write a game server, starting from the bottom: reactor, socket, epoll. I finally started two months ago and called it the DL server engine. So far the low-level network communication module is basically finished, and I have begun dividing the system into communicating processes such as the authentication service, the gateway, and the logic engine.

To test the reactor's performance I was too lazy to write the client in C++, so I decided to write a client test tool in Python, which I had learned for an interview at a game company. Python's Twisted networking library works well, so I decided to use Python plus Twisted for client testing. While writing it, I found no obvious way in Python to convert the basic types into a binary stream. A search on Baidu turned up the struct module, which is very easy to use and resembles C's printf format strings. For details on its usage, please search online.

Modeled on the AS classes for unpacking and writing network packets, I wrote this Buffer class in Python. To send a packet, use the constructor Buffer(recogn=10000); packets sent to the server generally carry a message number, and recogn is that message number. After writing the various pieces of data, call to_bytes() to prepend the packet length and obtain the complete packet. For a network packet received from the server, use the constructor Buffer(bytes=package), where package is the binary stream the server sent, and then call the read_* methods.
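The framing described here (a length header, then a message number, then the data) can be sketched independently of the Buffer class. The function names `build_packet` and `parse_packet` are hypothetical, and the explicit little-endian '<i' format is an assumption; the class above uses the native 'i' format.

```python
import struct


def build_packet(message_id, payload):
    """Return length header + message number + payload (payload is bytes)."""
    body = struct.pack('<i', message_id) + payload
    # The header stores the total packet size, including its own 4 bytes.
    return struct.pack('<i', len(body) + 4) + body


def parse_packet(packet):
    """Split a packet made by build_packet into (message_id, payload)."""
    total_length, message_id = struct.unpack_from('<ii', packet, 0)
    if total_length != len(packet):
        raise ValueError('length header does not match packet size')
    return message_id, packet[8:]


pkt = build_packet(10000, b'hello')
print(len(pkt))            # 13
print(parse_packet(pkt))   # (10000, b'hello')
```

Fixing the byte order in the format string keeps the wire format identical across machines, which matters once the client and server run on different platforms.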

Lately I have been busy at work and busy writing the DL engine at the same time, and it has been too much: problems kept coming up at work, and in the evenings I kept thinking about how to improve the DL engine, writing until two or three in the morning. I feel a little tired. So I am putting down the pen for now and pausing the DL engine for a while; I will continue when I have free time, since a good piece of work needs to be done in the best condition.

Reposted from: http://blog.csdn.net/ball32109/article/details/7881881

http://blog.csdn.net/machael_sonic/article/details/50266499

