Python-based webpage document processing script implementation

Last Update: 2016-12-12 Source: Internet

Author: User

Embedded web servers differ from traditional servers: the web pages have to be converted to C arrays and stored in flash so that the lwIP network interface can serve them. Recently, business requirements have meant that the pages change frequently, and compressing and converting them by hand each time is tedious. I therefore used my knowledge of Python to write a script that processes the web files in batch, compressing them and converting them into arrays.

Script environment (compatible with later versions):

Python 3.5.1 (for download, installation, and configuration, refer to an online tutorial)

Node.js v4.4.7, with the uglifyjs package installed to support compressing JS files

UglifyJS is the engine used to compress the JS files; for installation details, see http://www.zhangxinxu.com/wordpress/2013/01/uglifyjs-compress-js/
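As a rough sketch of how uglifyjs might be called from Python, the snippet below builds the command as an argument list and runs it with subprocess. The function names build_uglify_command and compress_js are hypothetical and do not come from the original script; the only flags used are -m (mangle names) and -o (output file).

```python
import shutil
import subprocess


def build_uglify_command(inpath, outpath, executable="uglifyjs"):
    """Return the argument list for: uglifyjs <inpath> -m -o <outpath>."""
    # -m mangles (shortens) local names, -o names the output file.
    return [executable, inpath, "-m", "-o", outpath]


def compress_js(inpath, outpath):
    """Run uglifyjs if it is on PATH; return True on success."""
    # shutil.which consults PATHEXT on Windows, so it can locate
    # the uglifyjs.cmd shim that npm installs there.
    executable = shutil.which("uglifyjs")
    if executable is None:
        print("uglifyjs not found; install it with npm first")
        return False
    command = build_uglify_command(inpath, outpath, executable)
    result = subprocess.run(command)
    return result.returncode == 0
```

Passing a list instead of a concatenated string means paths that contain spaces do not need to be quoted by hand.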

The specific implementation code is as follows:

#!/usr/bin/python
import os
import shutil
from functools import partial


# Remove '\n' and '\t' from html/css text; runs of spaces collapse to one space
def FileReduce(inpath, outpath):
    with open(inpath, "r", encoding="UTF-8") as infp, \
            open(outpath, "w", encoding="UTF-8") as outfp:
        for li in infp.readlines():
            if li.split():
                li = li.replace('\n', '').replace('\t', '')
                li = ' '.join(li.split())
                outfp.writelines(li)
    print(outpath + " compressed successfully")


# Shell command line call (use uglifyjs2 to compress js files)
def ShellReduce(inpath, outpath):
    # Quote both paths so that spaces in them do not split the arguments
    Command = 'uglifyjs "' + inpath + '" -m -o "' + outpath + '"'
    print(Command)
    os.system(Command)


# Read the file in binary mode and save it as a C array
def filehex(inpath, outpath):
    i = 0
    count = 0
    a = ''
    with open(inpath, 'rb') as inf:
        # Read one byte at a time until read() returns b''
        records = iter(partial(inf.read, 1), b'')
        for r in records:
            r_int = int.from_bytes(r, byteorder='big')
            a += hex(r_int) + ', '
            i += 1
            count += 1
            if i == 16:  # 16 values per output line
                a += '\n'
                i = 0
    # Array name is the output file name without its extension, e.g. index_html
    name = os.path.splitext(os.path.basename(outpath))[0]
    a = "const static char " + name + "[" + str(count) + "] = {\n" + a + "\n};\n"
    with open(outpath, 'w') as outf:
        outf.write(a)
    print(outpath + " converted to array successfully")


# Create a new folder
def mkdir(path):
    path = path.strip()
    # Create the folder only if it does not exist yet
    if not os.path.exists(path):
        print(path + ' created successfully')
        os.makedirs(path)
    return path


# Delete a folder (including all files inside it)
def deldir(path):
    path = path.strip()
    # Delete the folder only if it exists
    if os.path.exists(path):
        print(path + " deleted successfully")
        shutil.rmtree(path)


def WebProcess(path):
    # Original web pages      ..\basic\
    # Compressed web pages    ..\reduce\
    # Converted C files       ..\program\
    BasicPath = os.path.join(path, "basic")
    ProgramPath = os.path.join(path, "program")
    ReducePath = os.path.join(path, "reduce")

    # Delete the old output folders, then create a new program folder
    deldir(ProgramPath)
    deldir(ReducePath)
    mkdir(ProgramPath)

    for root, dirs, files in os.walk(BasicPath):
        for item in files:
            ext = item.split('.')
            InFilePath = os.path.join(root, item)
            # Mirror the sub-folder of basic under reduce
            OutReduceDir = mkdir(ReducePath + root[len(BasicPath):])
            OutReducePath = os.path.join(OutReduceDir, item)
            OutProgramPath = os.path.join(ProgramPath, item.replace('.', '_') + '.c')

            # Processing depends on the suffix:
            #   html/css      remove '\n' and '\t', keep one space
            #   js            compress with uglifyjs2
            #   gif/jpg/ico   copy directly
            #   others        copy directly, no array conversion
            # All but "others" are also converted to a hexadecimal array in a .c file
            if ext[-1] in ["html", "css"]:
                FileReduce(InFilePath, OutReducePath)
                filehex(OutReducePath, OutProgramPath)
            elif ext[-1] in ["js"]:
                ShellReduce(InFilePath, OutReducePath)
                filehex(OutReducePath, OutProgramPath)
            elif ext[-1] in ["gif", "jpg", "ico"]:
                shutil.copy(InFilePath, OutReducePath)
                filehex(OutReducePath, OutProgramPath)
            else:
                shutil.copy(InFilePath, OutReducePath)


# Obtain the folder that contains this script
path = os.path.split(os.path.realpath(__file__))[0]
WebProcess(path)

The script works as follows:

1. Traverse the folder to be processed (path: ..\basic; you need to create it yourself, copy the files to be processed into it, and place the script in the folder that contains it) -- WebProcess

2. Create the compressed-page folder (..\reduce, used to store the compressed files). The script creates it and processes the files as follows:

HTML and CSS: remove unnecessary spaces and line breaks from the text.

JS: call uglifyjs to compress the file.

GIF, JPG, ICO, and others: copy directly.

3. Create the converted-page folder (..\program, used to store the converted C files). The script creates it and processes the files as follows:

Read each file in binary mode, convert it to a hexadecimal array, and write it to a .c file in this folder.
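Steps 2 and 3 can be tried on an in-memory string before touching any files. The following is a minimal sketch under a few assumptions: the helper names collapse_whitespace and to_c_array are hypothetical, lines are joined with a single space (one way to keep neighbouring tokens from merging), and the array name index_html is only an example.

```python
def collapse_whitespace(text):
    """Drop blank lines and squeeze each run of whitespace to one space."""
    pieces = []
    for line in text.splitlines():
        words = line.split()
        if words:
            pieces.append(" ".join(words))
    # Join the lines with one space so adjacent tokens do not merge.
    return " ".join(pieces)


def to_c_array(data, name, per_line=16):
    """Format bytes as a C array definition, per_line values per row."""
    rows = []
    for start in range(0, len(data), per_line):
        chunk = data[start:start + per_line]
        # Iterating over bytes yields ints, which hex() formats as 0x..
        rows.append(", ".join(hex(b) for b in chunk))
    body = ",\n".join(rows)
    return "const static char %s[%d] = {\n%s\n};\n" % (name, len(data), body)


html = "<html>\n\t<body>\n\n\t\t<p>Hi   there</p>\n\t</body>\n</html>\n"
small = collapse_whitespace(html)
print(small)
print(to_c_array(small.encode("UTF-8"), "index_html"))
```

The array length is the number of bytes after compression, so the declared size always matches the data that ends up in flash.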

Open a Windows command line in that folder (Shift + right-click) and run python web.py; this performs all three steps and processes every file.

Note: All files to be processed must be saved in UTF-8. Otherwise, a "gbk" read error is reported when they are read.
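One way to catch offending files before running the script is to check whether each file decodes as UTF-8. This is only a sketch; the helper name is_utf8 and the temporary file names are hypothetical.

```python
import os
import tempfile


def is_utf8(path):
    """Return True if the whole file decodes as UTF-8."""
    with open(path, "rb") as fp:
        data = fp.read()
    try:
        data.decode("UTF-8")
    except UnicodeDecodeError:
        return False
    return True


# Small demonstration with two temporary files: the same text saved
# once as UTF-8 and once as GBK.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.html")
    bad = os.path.join(tmp, "bad.html")
    with open(good, "wb") as fp:
        fp.write("<p>网页</p>".encode("UTF-8"))
    with open(bad, "wb") as fp:
        fp.write("<p>网页</p>".encode("GBK"))
    print(good, is_utf8(good))
    print(bad, is_utf8(bad))
```

A file that fails the check can be re-saved as UTF-8 in an editor before it is copied into the basic folder.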

The result is as follows:

HTML file: (image not included)

Converted array: (image not included)

For a complete example, see:

http://files.cnblogs.com/files/zc110747/webreduce.7z

In addition, here is a small script that counts the total lines, empty lines, and code lines of the selected source file types in the current directory and its subfolders (derived from testing the script above):

#!/usr/bin/python
import os

total_count = 0
empty_count = 0


# Add one file's line counts to the global totals
def CountLine(path):
    global total_count
    global empty_count
    with open(path) as tempfile:
        for lines in tempfile:
            total_count += 1
            if len(lines.strip()) == 0:
                empty_count += 1


# Walk the tree and count every file with a matching suffix
def TotalLine(path):
    for root, dirs, files in os.walk(path):
        for item in files:
            ext = item.split('.')
            ext = ext[-1]
            if ext in ["cpp", "c", "h", "java", "php"]:
                subpath = root + "/" + item
                CountLine(subpath)


path = os.path.split(os.path.realpath(__file__))[0]
TotalLine(path)
print("Input Path:", path)
print("total lines: ", total_count)
print("empty lines: ", empty_count)
print("code lines: ", (total_count - empty_count))
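The same counting can also be written without global variables, which makes it easier to reuse and test. This is a sketch of one possible variant, not the author's version: the names count_file, count_tree, and SOURCE_EXTENSIONS are hypothetical, and reading as UTF-8 with errors="replace" is an assumption made so that a badly encoded file does not stop the count.

```python
import os

SOURCE_EXTENSIONS = {"cpp", "c", "h", "java", "php"}


def count_file(path):
    """Return (total, empty) line counts for one file."""
    total = empty = 0
    # errors="replace" keeps counting even if a file is not valid UTF-8.
    with open(path, encoding="UTF-8", errors="replace") as fp:
        for line in fp:
            total += 1
            if not line.strip():
                empty += 1
    return total, empty


def count_tree(top):
    """Sum the counts over every matching source file under top."""
    total = empty = 0
    for root, dirs, files in os.walk(top):
        for name in files:
            ext = os.path.splitext(name)[1].lstrip(".")
            if ext in SOURCE_EXTENSIONS:
                t, e = count_file(os.path.join(root, name))
                total += t
                empty += e
    return total, empty


if __name__ == "__main__":
    total, empty = count_tree(os.getcwd())
    print("total lines:", total)
    print("empty lines:", empty)
    print("code lines:", total - empty)
```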

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
