Several ways to extract compressed archives in Python
.gz, .tar, .tgz, .zip, .rar
Brief introduction
gz: That is, gzip. It usually compresses only a single file. Combined with tar, files can be packaged first and then compressed.
tar: The packaging tool on Linux systems. It only packages; it does not compress.
tgz: That is, tar.gz: files that are packaged with tar first and then compressed with gzip.
zip: Unlike gzip, although it uses a similar algorithm, it can package and compress multiple files. It compresses each file separately, however, so its compression ratio is usually lower than that of tar.gz.
rar: A packaging and compression format, first used on DOS and now found mainly on Windows. Its compression ratio is higher than zip's, but compression is slower and so is random access.
For a detailed comparison of zip and rar, see:
http://www.comicer.com/stronghorse/water/software/ziprar.htm
gz
Because gzip generally compresses only one file, it is normally used together with a packaging tool. For example, you can first pack files with tar into xxx.tar and then compress that into xxx.tar.gz.
Decompressing a gz file therefore amounts to reading a single file. The Python method is as follows:
    import gzip
    import os

    def un_gz(file_name):
        """Decompress a gz file."""
        # Get the output file name by removing the .gz suffix
        f_name = file_name.replace(".gz", "")
        # Create the gzip object
        g_file = gzip.GzipFile(file_name)
        # read() returns the decompressed bytes, which are written
        # to the file created by open() (binary mode)
        open(f_name, "wb+").write(g_file.read())
        # Close the gzip object
        g_file.close()
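The function above reads the whole decompressed file into memory before writing it. As a rough sketch of an alternative, the version below streams the data with shutil.copyfileobj and lets `with` close both files. The name un_gz_stream and the hello.txt.gz sample file are made up for this illustration.

```python
import gzip
import os
import shutil
import tempfile


def un_gz_stream(file_name):
    """Decompress file_name next to itself and return the new path."""
    # Strip only a trailing ".gz"; str.replace would also hit ".gz"
    # anywhere else in the path.
    out_name = file_name[:-3] if file_name.endswith(".gz") else file_name + ".out"
    # Both files are opened in binary mode and closed by the with block.
    with gzip.open(file_name, "rb") as src, open(out_name, "wb") as dst:
        # copyfileobj copies in chunks instead of loading everything at once.
        shutil.copyfileobj(src, dst)
    return out_name


# Small self-check in a temporary directory (hypothetical sample data).
work_dir = tempfile.mkdtemp()
gz_path = os.path.join(work_dir, "hello.txt.gz")
with gzip.open(gz_path, "wb") as f:
    f.write(b"hello gzip\n")

restored = un_gz_stream(gz_path)
with open(restored, "rb") as f:
    restored_bytes = f.read()
print(restored_bytes)
```

For small files the difference does not matter; streaming mainly helps when the decompressed data is large.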
tar
Decompressing xxx.tar.gz yields xxx.tar, which still has to be unpacked.
* Note: tgz and tar.gz are the same format. Old versions of DOS limited extensions to three characters, hence tgz.
Since a tar package contains multiple files, we first read all the file names and then extract them, as follows:
    import os
    import tarfile

    def un_tar(file_name):
        """Unpack a tar file."""
        tar = tarfile.open(file_name)
        names = tar.getnames()
        # Unpacking produces many files, so create a folder
        # named after the archive beforehand
        if os.path.isdir(file_name + "_files"):
            pass
        else:
            os.mkdir(file_name + "_files")
        for name in names:
            tar.extract(name, file_name + "_files/")
        tar.close()
* Note: A tgz file is extracted in the same way as a tar file.
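To illustrate the note, here is a small sketch that opens a plain .tar and a .tgz with the very same call. The helper list_members and the demo file names are hypothetical; the only point is that tarfile.open in its default read mode detects the compression by itself.

```python
import os
import tarfile
import tempfile


def list_members(archive_path):
    """Return the member names of a .tar, .tar.gz or .tgz archive."""
    # Plain "r" (the same as "r:*") lets tarfile detect the compression.
    with tarfile.open(archive_path, "r") as tar:
        return tar.getnames()


# Build one uncompressed and one gzip-compressed archive of the same file.
work_dir = tempfile.mkdtemp()
sample = os.path.join(work_dir, "note.txt")
with open(sample, "w") as f:
    f.write("same content in both archives\n")

plain_tar = os.path.join(work_dir, "demo.tar")
gzipped_tar = os.path.join(work_dir, "demo.tgz")
for path, mode in ((plain_tar, "w"), (gzipped_tar, "w:gz")):
    with tarfile.open(path, mode) as tar:
        tar.add(sample, arcname="note.txt")

plain_names = list_members(plain_tar)
tgz_names = list_members(gzipped_tar)
print(plain_names, tgz_names)
```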
zip
As with tar, read the file names first and then extract them, as follows:
    import os
    import zipfile

    def un_zip(file_name):
        """Unzip a zip file."""
        zip_file = zipfile.ZipFile(file_name)
        # Create a folder named after the archive if it does not exist yet
        if os.path.isdir(file_name + "_files"):
            pass
        else:
            os.mkdir(file_name + "_files")
        # Extract every member into that folder
        for name in zip_file.namelist():
            zip_file.extract(name, file_name + "_files/")
        zip_file.close()
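The same result can probably be reached more compactly with ZipFile.extractall and a `with` block. The sketch below assumes Python 3; un_zip_all and the demo.zip contents are names invented for this example.

```python
import os
import tempfile
import zipfile


def un_zip_all(file_name):
    """Extract every member into '<file_name>_files' and return that folder."""
    target_dir = file_name + "_files"
    # exist_ok replaces the explicit isdir check.
    os.makedirs(target_dir, exist_ok=True)
    with zipfile.ZipFile(file_name) as zip_file:
        zip_file.extractall(target_dir)
    return target_dir


# Build a small archive with a nested member, then extract it.
work_dir = tempfile.mkdtemp()
zip_path = os.path.join(work_dir, "demo.zip")
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("a.txt", "first")
    zf.writestr("sub/b.txt", "second")

out_dir = un_zip_all(zip_path)
# Collect the extracted files as archive-style relative paths.
extracted = sorted(
    os.path.relpath(os.path.join(root, name), out_dir).replace(os.sep, "/")
    for root, _dirs, files in os.walk(out_dir)
    for name in files
)
print(extracted)
```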
rar
RAR is used mostly on Windows, and extracting it requires the additional Python package rarfile.
Download address: http://sourceforge.net/projects/rarfile.berlios/files/rarfile-2.4.tar.gz/download
Unpack it into the /scripts/ directory of the Python installation directory, open a command line in that directory, and enter:
python setup.py install
The installation is then complete.
    import os
    import rarfile

    def un_rar(file_name):
        """Extract a rar file."""
        rar = rarfile.RarFile(file_name)
        # Create a folder named after the archive if it does not exist yet
        if os.path.isdir(file_name + "_files"):
            pass
        else:
            os.mkdir(file_name + "_files")
        # Extract everything into that folder
        rar.extractall(file_name + "_files")
        rar.close()
Tar packaging
When writing packaging code, tar.add() adds a file to the package together with the path it was given. Passing arcname as well lets you store the file in the tar package under a name that follows your own naming rules. The code:
    #!/usr/bin/env python
    # encoding: utf-8
    import os
    import tarfile
    import time

    start = time.time()
    tar = tarfile.open('/path/to/your.tar', 'w')
    for root, dirs, files in os.walk('/path/to/dir/'):
        for file in files:
            fullpath = os.path.join(root, file)
            # arcname=file stores only the bare file name, without its path
            tar.add(fullpath, arcname=file)
    tar.close()
    # Elapsed time in seconds
    print(time.time() - start)
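With arcname=file, two files that share a name in different sub-directories are stored under the same name in the package. One possible naming rule, sketched below, is to use each file's path relative to the directory being packed. The helper pack_dir and the src/sub layout are assumptions made for this example, not part of the original script.

```python
import os
import tarfile
import tempfile


def pack_dir(src_dir, tar_path):
    """Pack src_dir into tar_path, naming each file by its path relative to src_dir."""
    with tarfile.open(tar_path, "w") as tar:
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                full_path = os.path.join(root, name)
                # A relative arcname keeps sub-directories, so equally named
                # files in different folders stay distinguishable.
                tar.add(full_path, arcname=os.path.relpath(full_path, src_dir))


# Two files named a.txt, one of them in a sub-directory.
work_dir = tempfile.mkdtemp()
src_dir = os.path.join(work_dir, "src")
os.makedirs(os.path.join(src_dir, "sub"))
for rel in ("a.txt", os.path.join("sub", "a.txt")):
    with open(os.path.join(src_dir, rel), "w") as f:
        f.write(rel)

# The package is written outside src_dir so it does not pack itself.
tar_path = os.path.join(work_dir, "out.tar")
pack_dir(src_dir, tar_path)
with tarfile.open(tar_path) as tar:
    packed_names = sorted(tar.getnames())
print(packed_names)
```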
Compression can be selected when packaging. For example, tar = tarfile.open('/path/to/your.tar.gz', 'w:gz') packages with gzip compression. tarfile.open accepts many modes, as the following table shows:
mode            | action
'r' or 'r:*'    | Open for reading with transparent compression (recommended).
'r:'            | Open for reading exclusively without compression.
'r:gz'          | Open for reading with gzip compression.
'r:bz2'         | Open for reading with bzip2 compression.
'a' or 'a:'     | Open for appending with no compression. The file is created if it does not exist.
'w' or 'w:'     | Open for uncompressed writing.
'w:gz'          | Open for gzip compressed writing.
'w:bz2'         | Open for bzip2 compressed writing.
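A short sketch of how a few of these modes behave in practice: the package is written with 'w:gz', read back with 'r:gz' and 'r:*', and opening it with 'r:' (no compression) is expected to fail with tarfile.ReadError. The file names and variable names are made up for the demonstration.

```python
import os
import tarfile
import tempfile

# Prepare a sample file in a temporary directory.
work_dir = tempfile.mkdtemp()
payload = os.path.join(work_dir, "data.txt")
with open(payload, "w") as f:
    f.write("x" * 10000)

# 'w:gz': write a gzip-compressed package.
archive = os.path.join(work_dir, "data.tar.gz")
with tarfile.open(archive, "w:gz") as tar:
    tar.add(payload, arcname="data.txt")

# 'r:gz' names the compression explicitly; 'r:*' detects it.
with tarfile.open(archive, "r:gz") as tar:
    names_explicit = tar.getnames()
with tarfile.open(archive, "r:*") as tar:
    names_auto = tar.getnames()

# 'r:' insists on an uncompressed package, so the gzip data is rejected.
try:
    tarfile.open(archive, "r:").close()
    uncompressed_read_failed = False
except tarfile.ReadError:
    uncompressed_read_failed = True

print(names_explicit, names_auto, uncompressed_read_failed)
```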
Tar unpacking
A tar package can likewise be opened for unpacking according to its compression format.
    #!/usr/bin/env python
    # encoding: utf-8
    import tarfile
    import time

    start = time.time()
    # "r:" opens an uncompressed tar package for reading
    t = tarfile.open("/path/to/your.tar", "r:")
    t.extractall(path='/path/to/extractdir/')
    t.close()
    # Elapsed time in seconds
    print(time.time() - start)
The code above extracts everything at once. The members can also be processed one by one, but if the tar package contains too many files, watch your memory use.
    tar = tarfile.open(filename, 'r:gz')
    for tar_info in tar:
        # Open one member at a time as a file-like object
        file = tar.extractfile(tar_info)
        do_something_with(file)
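As one possible shape for do_something_with, the sketch below computes a SHA-256 digest per member, reading in chunks so that no member has to be held in memory whole. It also skips members for which extractfile returns None, such as directories. The function checksum_members, the chunk size and the demo package are assumptions for illustration only.

```python
import hashlib
import io
import os
import tarfile
import tempfile


def checksum_members(archive_path, chunk_size=64 * 1024):
    """Return {member name: SHA-256 hex digest} for the files in a tar.gz."""
    digests = {}
    with tarfile.open(archive_path, "r:gz") as tar:
        for tar_info in tar:
            # extractfile() returns None for members that are not regular
            # files or links, such as directories.
            member_file = tar.extractfile(tar_info)
            if member_file is None:
                continue
            digest = hashlib.sha256()
            # Read in chunks so a large member is never loaded whole.
            for chunk in iter(lambda: member_file.read(chunk_size), b""):
                digest.update(chunk)
            digests[tar_info.name] = digest.hexdigest()
    return digests


# Build a package holding one directory entry and one file.
work_dir = tempfile.mkdtemp()
archive = os.path.join(work_dir, "demo.tar.gz")
content = b"payload"
with tarfile.open(archive, "w:gz") as tar:
    folder = tarfile.TarInfo("docs")
    folder.type = tarfile.DIRTYPE
    tar.addfile(folder)
    info = tarfile.TarInfo("docs/readme.txt")
    info.size = len(content)
    tar.addfile(info, io.BytesIO(content))

digests = checksum_members(archive)
print(digests)
```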