This article describes how to use Python to implement the functionality of the Linux xxd -i command.
I. The Linux xxd -i Option
The xxd command in Linux displays file content in binary or hexadecimal format. If the outfile parameter is not specified, the result is shown on the terminal; otherwise, it is written to outfile. For detailed usage, see the Linux xxd command.
This article focuses on the -i option of xxd. With this option, xxd outputs a C array definition named after inputfile. For example, after running echo 12345 > test and then xxd -i test, the output is:
unsigned char test[] = {
  0x31, 0x32, 0x33, 0x34, 0x35, 0x0a
};
unsigned int test_len = 6;
As shown, the array name is the name of the input file (if the name has a suffix, the dot is replaced with an underscore). Note that 0x0a is the line feed character LF, that is, '\n'.
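The naming rule and output layout described above can be sketched in a few lines. This is only an illustration written for Python 3 (the article's own listings target Python 2.7); the helper name xxd_i_text is hypothetical, it handles only the dot-to-underscore case mentioned above, and it puts all values on one line rather than wrapping them.

```python
def xxd_i_text(file_name, data):
    """Return C source resembling `xxd -i` output for the bytes in `data`.

    Hypothetical helper: only the dot in the file name is replaced,
    and the values are not wrapped across lines.
    """
    name = file_name.replace(".", "_")
    # Iterating over a bytes object in Python 3 yields integers
    body = ", ".join("0x%02x" % b for b in data)
    return ("unsigned char %s[] = {\n  %s\n};\n"
            "unsigned int %s_len = %d;" % (name, body, name, len(data)))


# "12345" followed by the line feed that echo appends
print(xxd_i_text("test", b"12345\n"))
```

The trailing 0x0a in the result comes from the line feed that echo writes after 12345.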
II. Common Usage of xxd -i
When a device has no file system or does not support dynamic memory management, binary files (such as boot programs and firmware) are sometimes stored in static arrays in C code. In this case, the xxd command can be used to generate the array content automatically. Example:
1) Use the Linux command xxd to convert the binary file VdslBooter.bin to the hexadecimal text file DslBooter.txt:
xxd -i < VdslBooter.bin > DslBooter.txt
The '-i' option selects C include-file style output (array mode). The redirection symbol '<' feeds the content of VdslBooter.bin to standard input. When the input comes from standard input, xxd omits the array declaration and the length variable definition, so the output contains only the hexadecimal values.
2) Define the corresponding static array in the C source file:
static const uint8 bootImageArray[] = {
#include "../../DslBooter.txt"
};
TargetImage bootImage = {
    (uint8 *) bootImageArray,
    sizeof(bootImageArray) / sizeof(bootImageArray[0])
};
When the source code is compiled, the content of DslBooter.txt is automatically expanded into the array above. Using the #include preprocessor directive saves the trouble of copying the array content by hand.
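The text file produced in step 1 is simply a comma-separated list of hexadecimal literals that can sit between '{' and '}'. A rough Python 3 sketch of generating such a body follows; the name hex_lines and the default of 12 values per line are assumptions made for illustration, and the exact spacing is not guaranteed to match xxd byte for byte.

```python
def hex_lines(data, cols=12):
    """Format bytes as comma-separated hex literals, `cols` values per line.

    Hypothetical helper; the result contains no array declaration, so it
    is suitable for pulling into an initializer with #include.
    """
    lines = []
    for i in range(0, len(data), cols):
        chunk = data[i:i + cols]
        lines.append("  " + ", ".join("0x%02x" % b for b in chunk))
    # Separate the lines with a comma so the whole text is one initializer list
    return ",\n".join(lines)


print(hex_lines(bytes(range(14))))
```

Because no trailing comma is written after the last value, the text is a valid initializer list under any C standard.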
III. A Python Implementation of an xxd -i-like Function
This section uses Python 2.7 to implement functionality similar to xxd -i.
Because the author is still learning the language, the code deliberately uses several different ways of writing the same or similar operations, to offer varied syntax for reference. Please bear with this.
First, here is a short but complete program (saved as xxdi.py):
#!/usr/bin/python
# coding=utf-8

# Determine whether a name is a C language keyword
CKeywords = ("auto", "break", "case", "char", "const", "continue", "default",
             "do", "double", "else", "enum", "extern", "float", "for", "goto",
             "if", "int", "long", "register", "return", "short", "signed",
             "static", "sizeof", "struct", "switch", "typedef", "union",
             "unsigned", "void", "volatile", "while", "_Bool")  # _Bool is a keyword added in C99

def IsCKeywords(name):
    for x in CKeywords:
        if cmp(x, name) == 0:
            return True
    return False

if __name__ == '__main__':
    print IsCKeywords('const')
    #Xxdi()
This code checks whether a given string is a C language keyword. Run E:\PyTest>python xxdi.py at the Windows cmd prompt; the result is True.
The following code snippets omit the script header and encoding declaration, as well as the 'main' section at the end.
Before generating the C array, make sure that the array name is valid. A C identifier can consist only of letters, digits, and underscores, and cannot start with a digit. In addition, keywords cannot be used as identifiers. Invalid characters therefore have to be handled; for the rules, see the code comments:
import re

def GenerateCArrayName(inFile):
    # Convert every character other than letters, digits, and underscores to an underscore.
    # Note: 'int $=5;' compiles under GCC 4.1.2, but '$' is still not treated as an identifier character here.
    inFile = re.sub('[^0-9a-zA-Z\_]', '_', inFile)  # change '_' to '' to delete invalid characters instead
    # Prefix a leading digit with a double underscore
    if inFile[0].isdigit() == True:
        inFile = '__' + inFile
    # If the input file name is a C keyword, uppercase it and append an underscore to form the array name.
    # Only uppercasing it, or only prefixing an underscore, would easily clash with user-defined names.
    if IsCKeywords(inFile) is True:
        inFile = '%s_' % inFile.upper()
    return inFile
Running print GenerateCArrayName('1a%if1%%4.txt') converts the input string to __1a_if1__4_txt. Similarly, _Bool is converted to _BOOL_.
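For readers who want to try the naming rules in isolation, here is a self-contained sketch for Python 3. It follows the same three rules (replace invalid characters, prefix a leading digit with a double underscore, uppercase keywords and append an underscore); the names C_KEYWORDS and c_array_name are hypothetical and are not part of xxdi.py.

```python
import re

# Same keyword set as the CKeywords tuple in the article
C_KEYWORDS = frozenset((
    "auto", "break", "case", "char", "const", "continue", "default", "do",
    "double", "else", "enum", "extern", "float", "for", "goto", "if", "int",
    "long", "register", "return", "short", "signed", "static", "sizeof",
    "struct", "switch", "typedef", "union", "unsigned", "void", "volatile",
    "while", "_Bool",
))


def c_array_name(file_name):
    """Derive a valid C identifier from a file name (illustrative sketch)."""
    # Anything other than a letter, digit, or underscore becomes an underscore
    name = re.sub(r"[^0-9a-zA-Z_]", "_", file_name)
    # An identifier cannot start with a digit
    if name[:1].isdigit():
        name = "__" + name
    # A keyword cannot be used as an identifier
    if name in C_KEYWORDS:
        name = name.upper() + "_"
    return name


for sample in ("1a%if1%%4.txt", "_Bool", "VdslBooter.bin"):
    print(sample, "->", c_array_name(sample))
```

Using a frozenset makes the keyword test a single membership check instead of a loop.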
To mimic the Linux command style as closely as possible, the script also needs to accept command line options and arguments. Parsing is done with OptionParser from the optparse module. For details about its usage, see Python command line parsing. The command line handling for the xxd -i function is implemented as follows:
#def ParseOption(base, cols, strip, inFile, outFile):
def ParseOption(base = 16, cols = 12, strip = False, inFile = '', outFile = None):
    from optparse import OptionParser
    custUsage = '\n xxdi(.py) [options] inFile [outFile]'
    parser = OptionParser(usage=custUsage)
    parser.add_option('-b', '--base', dest='base',
                      help='represent values according to BASE(default:16)')
    parser.add_option('-c', '--column', dest='col',
                      help='COL octets per line(default:12)')
    parser.add_option('-s', '--strip', action='store_true', dest='strip',
                      help='only output C array elements')

    (options, args) = parser.parse_args()
    if options.base is not None:
        base = int(options.base)
    if options.col is not None:
        cols = int(options.col)
    if options.strip is not None:
        strip = True
    if len(args) == 0:
        print 'No argument, at least one(inFile)!\nUsage:%s' % custUsage
    if len(args) >= 1:
        inFile = args[0]
    if len(args) >= 2:
        outFile = args[1]

    return ([base, cols, strip], [inFile, outFile])
The commented-out def ParseOption(...) was originally called as follows:
base = 16; cols = 12; strip = False; inFile = ''; outFile = ''
([base, cols, strip], [inFile, outFile]) = ParseOption(base, cols, strip, inFile, outFile)
The intention was to modify base, cols, strip, and the other values all at once. However, this approach is quite awkward. Defining the function with default parameter values is better: the call is then simply ParseOption(). If readers know a better way to write this, please do share it.
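One possible alternative, offered only as a sketch, is the standard argparse module (available in Python 2.7 and in Python 3.2 and later). It applies the defaults and the integer conversion itself, so the function needs no parameters at all. The function name parse_options and the attribute name cols are choices made for this example; the option letters and help strings mirror those above.

```python
import argparse


def parse_options(argv=None):
    """Parse xxdi-style options; argv=None means use sys.argv[1:]."""
    parser = argparse.ArgumentParser(prog="xxdi.py")
    parser.add_argument("-b", "--base", type=int, default=16,
                        help="represent values according to BASE (default: 16)")
    parser.add_argument("-c", "--column", dest="cols", type=int, default=12,
                        help="COL octets per line (default: 12)")
    parser.add_argument("-s", "--strip", action="store_true",
                        help="only output C array elements")
    parser.add_argument("inFile", help="input file")
    # nargs="?" makes the output file optional
    parser.add_argument("outFile", nargs="?", default=None,
                        help="output file (default: standard output)")
    return parser.parse_args(argv)


opts = parse_options(["-c", "5", "-b", "10", "-s", "123456789ABCDEF.txt"])
print(opts.base, opts.cols, opts.strip, opts.inFile, opts.outFile)
```

Unlike the optparse version, a missing inFile makes argparse print a usage message and exit, so the manual len(args) checks are no longer needed.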
Running the script with the -h option prints a help message that closely resembles the Linux style:
E:\PyTest>python xxdi.py -h
Usage:
 xxdi(.py) [options] inFile [outFile]

Options:
  -h, --help            show this help message and exit
  -b BASE, --base=BASE  represent values according to BASE(default:16)
  -c COL, --column=COL  COL octets per line(default:12)
  -s, --strip           only output C array elements
With the groundwork above in place, the main part of this article can now be completed:
def Xxdi():
    # Parse command line options and arguments
    ([base, cols, strip], [inFile, outFile]) = ParseOption()

    import os
    if os.path.isfile(inFile) is False:
        print ''''%s' is not a file!''' % inFile
        return

    with open(inFile, 'rb') as file:  # binary files must be opened in 'b' mode
    #file = open(inFile, 'rb')  # Python versions earlier than 2.5 do not support the with...as syntax
    #if True:
        # Do not use 'for line in file' or readline(s), so the content is not split at 0x0a bytes
        content = file.read()

    # Split the file content into a list of per-byte strings
    if base == 16:    # hexadecimal
        content = map(lambda x: hex(ord(x)), content)
    elif base == 10:  # decimal
        content = map(lambda x: str(ord(x)), content)
    elif base == 8:   # octal
        content = map(lambda x: oct(ord(x)), content)
    else:
        print '[%s]: Invalid base or radix for C language!' % base
        return

    # Construct the array definition header and the length variable
    cArrayName = GenerateCArrayName(inFile)
    if strip is False:
        cArrayHeader = 'unsigned char %s[] = {' % cArrayName
    else:
        cArrayHeader = ''
    cArrayTailer = '};\nunsigned int %s_len = %d;' % (cArrayName, len(content))
    if strip is True: cArrayTailer = ''

    # print automatically appends a newline after each line of output
    if outFile is None:
        print cArrayHeader
        for i in range(0, len(content), cols):
            line = ', '.join(content[i:i+cols])
            print '  ' + line + ','
        print cArrayTailer
        return

    with open(outFile, 'w') as file:
    #file = open(outFile, 'w')  # Python versions earlier than 2.5 do not support the with...as syntax
    #if True:
        file.write(cArrayHeader + '\n')
        for i in range(0, len(content), cols):
            line = reduce(lambda x, y: ', '.join([x, y]), content[i:i+cols])
            file.write('  %s,\n' % line)
        file.flush()
        file.write(cArrayTailer)
Python 2.4 and earlier versions do not support the with...as syntax, and the Linux system used for debugging has only Python 2.4.3 installed. Therefore, to run xxdi.py on Linux, the files can only be opened in the form file = open(...). However, this requires handling file closing and exceptions explicitly. For details, see the with...as syntax in Python. Note: to use the with...as syntax in Python 2.5, you must declare from __future__ import with_statement.
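The explicit closing mentioned above is usually written with try...finally. The following is a minimal sketch; the helper name read_binary is hypothetical, and the demo at the end is written for Python 3, although the function body itself uses only syntax that predates the with statement.

```python
import os
import tempfile


def read_binary(path):
    """Read a whole file as bytes without using the with statement."""
    f = open(path, 'rb')
    try:
        return f.read()
    finally:
        # Runs whether read() succeeds or raises, so the file is always closed
        f.close()


# Demo: round-trip a few bytes, including 0x0a, through a temporary file
fd, demo_path = tempfile.mkstemp()
os.write(fd, b"12345\n")
os.close(fd)
demo_data = read_binary(demo_path)
os.remove(demo_path)
print(repr(demo_data))
```

The finally clause gives the same guarantee as the with block: the file is closed even if reading fails.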
You can use platform.python_version() to obtain the Python version. For example:
import platform

# Determine whether the running Python is version major.minor or later
def IsForwardPyVersion(major, minor):
    # python_version() returns 'major.minor.patchlevel', such as '2.7.11'
    ver = platform.python_version().split('.')
    # Compare as a tuple so that, for example, 3.0 counts as later than 2.5
    if (int(ver[0]), int(ver[1])) >= (major, minor):
        return True
    return False
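As an aside, the same check can be made without parsing a string, because sys.version_info already holds the version numbers as integers. A small sketch follows; the function name is_python_at_least is hypothetical.

```python
import sys


def is_python_at_least(major, minor):
    """Return True if the running interpreter is version major.minor or later."""
    # Tuples compare element by element, so (3, 0) >= (2, 5) is True
    return sys.version_info[:2] >= (major, minor)


print(is_python_at_least(2, 5))
```

Comparing tuples handles a higher major version with a lower minor version correctly, which comparing the two numbers separately would not.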
After testing on both Windows and Linux, Xxdi() works as expected. Taking the file 123456789ABCDEF.txt (whose content is '123456789ABCDEF') as an example, the test results are as follows:
E:\PyTest>python xxdi.py -c 5 -b 2 -s 123456789ABCDEF.txt
[2]: Invalid base or radix for C language!

E:\PyTest>python xxdi.py -c 5 -b 10 -s 123456789ABCDEF.txt
  49, 50, 51, 52, 53,
  54, 55, 56, 57, 65,
  66, 67, 68, 69, 70,

E:\PyTest>python xxdi.py -c 5 -b 10 123456789ABCDEF.txt
unsigned char __123456789ABCDEF_txt[] = {
  49, 50, 51, 52, 53,
  54, 55, 56, 57, 65,
  66, 67, 68, 69, 70,
};
unsigned int __123456789ABCDEF_txt_len = 15;

E:\PyTest>python xxdi.py -c 5 -b 8 123456789ABCDEF.txt
unsigned char __123456789ABCDEF_txt[] = {
  061, 062, 063, 064, 065,
  066, 067, 070, 071, 0101,
  0102, 0103, 0104, 0105, 0106,
};
unsigned int __123456789ABCDEF_txt_len = 15;

E:\PyTest>python xxdi.py 123456789ABCDEF.txt
unsigned char __123456789ABCDEF_txt[] = {
  0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37, 0x38, 0x39, 0x41, 0x42, 0x43,
  0x44, 0x45, 0x46,
};
unsigned int __123456789ABCDEF_txt_len = 15;
Now take a slightly larger binary file as an example. After python xxdi.py VdslBooter.bin booter.c is executed, the content of booter.c is as follows (only the beginning and end are shown):
unsigned char VdslBooter_bin[] = {
  0xff, 0x31, 0x0, 0xb, 0xff, 0x3, 0x1f, 0x5a, 0x0, 0x0, 0x0, 0x0,
  //... ... ... ...
  0x0, 0x0, 0x0, 0x0, 0xff, 0xff, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
  0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
};
unsigned int VdslBooter_bin_len = 53588;
In summary, the xxdi module implemented here closely resembles the Linux xxd -i function, with its own advantages and disadvantages. Its advantages are a more thorough validity check on the array name (the keyword check) and more ways to represent the array content (octal and decimal in addition to hexadecimal). Its disadvantages are that redirection is not supported and that the value width is not fixed (for example, 0xb and 0xff). These shortcomings are not hard to eliminate. For example, '0x%02x' % val can be used instead of hex(val) to control the output width. However, such improvements would inevitably make the code more complex.
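The fixed-width idea can be sketched as follows for Python 3. The helper name format_byte is hypothetical, and the widths chosen for decimal and octal are assumptions made for this example; only the '0x%02x' form comes from the text above.

```python
def format_byte(val, base=16):
    """Format one byte value (0-255) as a fixed-width C literal."""
    if base == 16:
        return "0x%02x" % val   # always two hex digits: 0x0b, 0xff
    if base == 10:
        return "%3d" % val      # right-aligned in three columns
    if base == 8:
        return "0%03o" % val    # leading 0 marks a C octal literal
    raise ValueError("Invalid base or radix for C language: %r" % base)


print(", ".join(format_byte(b) for b in b"\xff\x31\x00\x0b"))
```

With this formatting every value in a column has the same width, at the cost of one more helper function.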