Several ways to extract compressed archives in Python

Source: Internet
Author: User
Tags: extract, gz

.gz, .tar, .tgz, .zip, .rar

Brief introduction

GZ: gzip. It usually compresses only a single file. Combined with tar, files can be packaged first and then compressed.

Tar: a packaging tool on Linux systems. It only packages; it does not compress.

Tgz: tar.gz, that is, files packaged with tar and then compressed with gz.

Zip: unlike gzip, although it uses a similar algorithm, zip can package and compress multiple files. It compresses each file separately, so its compression ratio is lower than that of tar.gz.

RAR: a packaging and compression format, originally used on DOS and now mainly on Windows. Its compression ratio is higher than zip's, but compression is slow and so is random access.

For a detailed comparison of zip and rar, see:

http://www.comicer.com/stronghorse/water/software/ziprar.htm

Gz

Because gz generally compresses only a single file, it is usually combined with another packaging tool. For example, you can first pack files with tar into xxx.tar and then compress that into xxx.tar.gz.
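
The two steps described here (package with tar, then compress the single .tar file with gzip) can be sketched roughly as follows. The function name make_tar_gz and its parameters are made up for illustration; only standard-library calls are used.

```python
import gzip
import os
import shutil
import tarfile


def make_tar_gz(src_dir, tar_path):
    """Pack src_dir into tar_path, then gzip it to tar_path + '.gz'."""
    # Step 1: package only, no compression
    top_name = os.path.basename(os.path.normpath(src_dir))
    with tarfile.open(tar_path, "w") as tar:
        tar.add(src_dir, arcname=top_name)
    # Step 2: compress the single .tar file with gzip
    gz_path = tar_path + ".gz"
    with open(tar_path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return gz_path
```

In practice tarfile.open(path, "w:gz") does both steps at once; the split version is only meant to show that gz wraps exactly one file.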

Extracting a gz file therefore amounts to reading a single file. The Python method is as follows:

    import gzip

    def un_gz(file_name):
        """Decompress a .gz file."""
        # Get the output file name by removing the .gz suffix
        f_name = file_name.replace(".gz", "")
        # Create the gzip object
        g_file = gzip.GzipFile(file_name)
        # Read the gzip object with read() and write the bytes
        # to the file created by open()
        open(f_name, "wb").write(g_file.read())
        # Close the gzip object
        g_file.close()
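
For large files, a variant that copies the data in chunks instead of calling read() once may be preferable. This is a minimal sketch, not the author's method; the name un_gz_stream and the ".out" fallback suffix are assumptions made for the example.

```python
import gzip
import shutil


def un_gz_stream(file_name):
    """Decompress file_name next to itself and return the output path."""
    # Strip only a trailing ".gz"; otherwise fall back to a made-up suffix
    if file_name.endswith(".gz"):
        out_name = file_name[:-3]
    else:
        out_name = file_name + ".out"
    with gzip.open(file_name, "rb") as src, open(out_name, "wb") as dst:
        # Copy in chunks so the whole file never has to fit in memory
        shutil.copyfileobj(src, dst)
    return out_name
```

The with statement closes both files even if the copy fails partway through.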

Tar

Decompressing xxx.tar.gz yields xxx.tar, which still has to be unpacked.

* Note: tgz and tar.gz are the same format. Older versions of DOS limited extensions to three characters, hence tgz.

Since there are multiple files here, we first read all the file names and then extract them, as follows:

    import os
    import tarfile

    def un_tar(file_name):
        """Extract a tar archive."""
        tar = tarfile.open(file_name)
        names = tar.getnames()
        # The archive holds many files, so create a folder
        # named after it beforehand
        if os.path.isdir(file_name + "_files"):
            pass
        else:
            os.mkdir(file_name + "_files")
        for name in names:
            tar.extract(name, file_name + "_files/")
        tar.close()

* Note: tgz files are extracted the same way as tar files.
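
Because tar and tgz are handled alike, one function can cover both. The sketch below assumes the same "_files" naming as above and lets tarfile detect the compression; un_tar_all and its dest parameter are names invented for this example.

```python
import os
import tarfile


def un_tar_all(file_name, dest=None):
    """Extract every member of a tar, tar.gz or tgz archive into dest."""
    dest = dest or file_name + "_files"
    os.makedirs(dest, exist_ok=True)
    # Mode "r:*" detects gzip or bzip2 compression automatically
    with tarfile.open(file_name, "r:*") as tar:
        tar.extractall(dest)
    return dest
```

As with any extraction code, this is best used on archives you trust, since member names inside a tar file can contain absolute paths or "..".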

Zip

As with tar, read the file names and extract them one by one, as follows:

    import os
    import zipfile

    def un_zip(file_name):
        """Extract a zip archive."""
        zip_file = zipfile.ZipFile(file_name)
        if os.path.isdir(file_name + "_files"):
            pass
        else:
            os.mkdir(file_name + "_files")
        for names in zip_file.namelist():
            zip_file.extract(names, file_name + "_files/")
        zip_file.close()
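
Since zip compresses each file separately, a single entry can be read without unpacking the rest. A minimal sketch, with read_zip_member as a made-up helper name:

```python
import zipfile


def read_zip_member(zip_path, member):
    """Return the bytes of one entry without extracting the others."""
    with zipfile.ZipFile(zip_path) as zf:
        # read() decompresses only the named entry
        return zf.read(member)
```

For example, read_zip_member("demo.zip", "sub/b.txt") would return the contents of that one entry as bytes, assuming such an archive exists.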

Rar

Because RAR is mostly used on Windows, the additional Python package rarfile is required.

Download address: http://sourceforge.net/projects/rarfile.berlios/files/rarfile-2.4.tar.gz/download

Unpack it into the /Scripts/ directory of the Python installation, open a command line in that directory, and enter:

python setup.py install

Installation is then complete.

    import os
    import rarfile

    def un_rar(file_name):
        """Extract a rar archive."""
        rar = rarfile.RarFile(file_name)
        if os.path.isdir(file_name + "_files"):
            pass
        else:
            os.mkdir(file_name + "_files")
        # Extract straight into the target folder; calling os.chdir()
        # first could invalidate a relative file_name
        rar.extractall(file_name + "_files")
        rar.close()


Tar packaging

When writing packaging code, adding a file with tar.add() stores it in the archive under the path you passed in. With the arcname argument you can store it under a name that follows your own naming rules. Code:
    #!/usr/bin/env python
    # encoding: utf-8
    import os
    import tarfile
    import time

    start = time.time()
    tar = tarfile.open('/path/to/your.tar', 'w')
    for root, dirs, files in os.walk('/path/to/dir/'):
        for file in files:
            fullpath = os.path.join(root, file)
            # arcname=file stores only the bare file name, without directories
            tar.add(fullpath, arcname=file)
    tar.close()
    print(time.time() - start)
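
With arcname=file, two files that share a name in different subdirectories end up with the same entry name. One possible naming rule, sketched here under the assumption that the directory layout should be kept, is to store each path relative to the packed directory. The name pack_dir is invented for the example.

```python
import os
import tarfile


def pack_dir(src_dir, tar_path):
    """Pack src_dir into tar_path, storing paths relative to src_dir."""
    with tarfile.open(tar_path, "w") as tar:
        for root, dirs, files in os.walk(src_dir):
            for name in files:
                full_path = os.path.join(root, name)
                # arcname controls the name stored inside the archive
                rel_path = os.path.relpath(full_path, src_dir)
                tar.add(full_path, arcname=rel_path)
```

The archive itself should be written outside src_dir, otherwise os.walk may pick up the half-written tar file.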

Compression can be chosen at packaging time. For example, tar = tarfile.open('/path/to/your.tar.gz', 'w:gz') packages with gzip compression. tarfile.open accepts many modes, as in the following table:

mode            action
'r' or 'r:*'    Open for reading with transparent compression (recommended).
'r:'            Open for reading exclusively without compression.
'r:gz'          Open for reading with gzip compression.
'r:bz2'         Open for reading with bzip2 compression.
'a' or 'a:'     Open for appending with no compression. The file is created if it does not exist.
'w' or 'w:'     Open for uncompressed writing.
'w:gz'          Open for gzip compressed writing.
'w:bz2'         Open for bzip2 compressed writing.
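
To see the modes in action, the following sketch writes one small in-memory member with a given write mode and reads it back with 'r:*', which detects the compression by itself. The function name roundtrip and the member name hello.txt are placeholders chosen for the example.

```python
import io
import tarfile


def roundtrip(path, write_mode):
    """Write one small member with write_mode, read it back with 'r:*'."""
    data = b"hello tar"
    info = tarfile.TarInfo(name="hello.txt")
    info.size = len(data)
    with tarfile.open(path, write_mode) as tar:
        tar.addfile(info, io.BytesIO(data))
    # 'r:*' works for both uncompressed and compressed archives
    with tarfile.open(path, "r:*") as tar:
        return tar.extractfile("hello.txt").read()
```

Calling roundtrip with 'w' and with 'w:gz' should give back the same bytes; only the file on disk differs.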
Tar unpacking

A tar archive can likewise be unpacked according to its compression format.
    #!/usr/bin/env python
    # encoding: utf-8
    import tarfile
    import time

    start = time.time()
    # 'r:' reads an uncompressed tar; use 'r:gz' or 'r:*' for compressed ones
    t = tarfile.open("/path/to/your.tar", "r:")
    t.extractall(path='/path/to/extractdir/')
    t.close()
    print(time.time() - start)
The code above extracts everything. Members can also be processed one at a time, as below, but if the tar archive contains a very large number of files, watch the memory usage.
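    import tarfile

    # filename and do_something_with() are placeholders to fill in
    tar = tarfile.open(filename, 'r:gz')
    for tar_info in tar:
        # extractfile() returns a file-like object (None for directories)
        file = tar.extractfile(tar_info)
        do_something_with(file)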
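
As one concrete, hypothetical stand-in for do_something_with, the sketch below counts the bytes of each regular file in the archive, reading in chunks so that no member has to be held in memory whole. The function name and the chunk_size default are assumptions for illustration.

```python
import tarfile


def count_bytes_per_member(archive_path, chunk_size=64 * 1024):
    """Return {member name: byte count}, reading each member in chunks."""
    sizes = {}
    with tarfile.open(archive_path, "r:*") as tar:
        for tar_info in tar:
            if not tar_info.isfile():
                continue  # skip directories, links and other non-files
            member = tar.extractfile(tar_info)
            total = 0
            while True:
                chunk = member.read(chunk_size)
                if not chunk:
                    break
                total += len(chunk)
            sizes[tar_info.name] = total
    return sizes
```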
