First, the right way:
The correct approach is to create a ZipFile and then iterate over its entries. Each entry is either a file or a directory: when an entry is a directory, create that directory; otherwise create a file, and use ZipFile.getInputStream(entry) to get the input stream of that entry's content (note that this is the input stream of the entry's decompressed data, not the input stream of the compressed archive itself). Then write it out with a writer. Simple enough. Below is an example that reads a GBK-encoded archive whose files are also GBK-encoded (i.e. files written and zipped on Windows) and extracts them as UTF-8 (for cross-platform use).
    import java.io.{File, FileOutputStream, InputStreamReader, OutputStream, OutputStreamWriter}
    import org.apache.commons.compress.archivers.zip.ZipFile
    import scala.collection.JavaConverters._

    def decompressZip(source: File, dest: String,
                      sourceCharset: String = "GBK",
                      destCharset: String = "UTF-8"): Unit = {
      if (source.exists) {
        val zipFile = new ZipFile(source, sourceCharset)
        // getEntries returns a java.util.Enumeration, hence asScala
        zipFile.getEntries.asScala.foreach { entry =>
          if (entry.isDirectory) {
            new File(dest + entry.getName).mkdirs()
          } else {
            var os: OutputStream = null
            var inputStream: InputStreamReader = null
            var outWriter: OutputStreamWriter = null
            try {
              val path = dest + entry.getName
              val content = new Array[Char](entry.getSize.toInt)
              // The entry's decompressed stream, decoded as GBK
              inputStream = new InputStreamReader(zipFile.getInputStream(entry), sourceCharset)
              val read = inputStream.read(content)
              val entryFile = new File(path)
              checkFileParent(entryFile) // helper defined elsewhere: creates missing parent dirs
              os = new FileOutputStream(entryFile)
              outWriter = new OutputStreamWriter(os, destCharset)
              outWriter.write(new String(content, 0, read))
            } catch {
              case e: Throwable => e.printStackTrace()
            } finally {
              // Close the writer first so it flushes into os before os is closed
              if (outWriter != null) { outWriter.flush(); outWriter.close() }
              else if (os != null) { os.flush(); os.close() }
              if (inputStream != null) inputStream.close()
            }
          }
        }
        zipFile.close()
      }
    }
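The same pattern can be sketched without commons-compress at all: the JDK's own `java.util.zip.ZipFile` has also accepted a `Charset` for entry names since Java 7. This is a minimal, self-contained stand-in for the code above; the `transcodeZip` name and the buffered copy loop are my own illustration, not part of the original code.

```scala
import java.io.{File, FileOutputStream, InputStreamReader, OutputStreamWriter}
import java.nio.charset.Charset
import java.util.zip.ZipFile

// Read an archive whose entry names and contents are GBK; write the
// extracted files back out encoded as UTF-8.
def transcodeZip(source: File, dest: String,
                 srcCharset: String = "GBK",
                 dstCharset: String = "UTF-8"): Unit = {
  val zip = new ZipFile(source, Charset.forName(srcCharset))
  try {
    val zipEntries = zip.entries()
    while (zipEntries.hasMoreElements) {
      val entry = zipEntries.nextElement()
      val target = new File(dest, entry.getName)
      if (entry.isDirectory) target.mkdirs()
      else {
        target.getParentFile.mkdirs() // stands in for checkFileParent
        // The entry's decompressed stream, decoded as GBK...
        val reader = new InputStreamReader(zip.getInputStream(entry), srcCharset)
        // ...and re-encoded as UTF-8 on the way out.
        val writer = new OutputStreamWriter(new FileOutputStream(target), dstCharset)
        try {
          val buf = new Array[Char](4096)
          var n = reader.read(buf)
          while (n != -1) { writer.write(buf, 0, n); n = reader.read(buf) }
        } finally { writer.close(); reader.close() }
      }
    }
  } finally zip.close()
}
```

The buffered loop also avoids relying on a single `read` call filling the whole array, which the original one-shot `inputStream.read(content)` quietly assumes.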
Error Demonstration:
I do not know why, but many online tutorials use ZipArchiveInputStream to extract archives, even though the commons-compress documentation says:
ZipFile is preferred over ZipArchiveInputStream when reading from files, as ZipArchiveInputStream is limited by not being able to read the central directory header before returning entries. In particular ZipArchiveInputStream
- may return entries that are not part of the central directory at all and shouldn't be considered part of the archive.
- may return several entries with the same name.
- will not return internal or external attributes.
- may return incomplete extra field data.
- may return unknown sizes and CRC values for entries until the next entry has been reached if the archive uses the data descriptor feature.
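The last point is easy to observe even with the JDK's own streaming reader, `java.util.zip.ZipInputStream`, which shares the same limitation: when an archive uses data descriptors (the default when deflated entries are written to a non-seekable stream), the entry's size is unknown at the moment the entry header is read. A small self-contained sketch, using the JDK classes as an assumed analogue of the commons-compress ones:

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
import java.util.zip.{ZipEntry, ZipInputStream, ZipOutputStream}

// Write one deflated entry to a non-seekable stream: sizes and CRC go
// into a trailing data descriptor, not the local file header.
val baos = new ByteArrayOutputStream()
val zos = new ZipOutputStream(baos)
zos.putNextEntry(new ZipEntry("hello.txt"))
zos.write("hello".getBytes("UTF-8"))
zos.closeEntry()
zos.close()

// Read it back as a stream: the size is not yet known when the entry
// header is returned, only after the entry's data has been consumed.
val zis = new ZipInputStream(new ByteArrayInputStream(baos.toByteArray))
val entry = zis.getNextEntry()
val sizeAtHeader = entry.getSize // -1: unknown at this point
val data = zis.readAllBytes()    // Java 9+; drains this entry's bytes
zis.close()
```

A ZipFile, by contrast, reads the central directory first and so knows every entry's size up front.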
ZipFile has been the recommended class since commons-compress 1.3.
I tried ZipArchiveInputStream myself and ran into a problem. Creating one is awkward: you have to supply an InputStream, and the API documentation describes the constructors like this:
Constructors:
- ZipArchiveInputStream(InputStream inputStream): create an instance using UTF-8 encoding
- ZipArchiveInputStream(InputStream inputStream, String encoding): create an instance using the specified encoding
- ZipArchiveInputStream(InputStream inputStream, String encoding, boolean useUnicodeExtraFields): create an instance using the specified encoding
- ZipArchiveInputStream(InputStream inputStream, String encoding, boolean useUnicodeExtraFields, boolean allowStoredEntriesWithDataDescriptor): create an instance using the specified encoding

Parameters: inputStream - the stream to wrap
The documentation never says what the inputStream parameter actually has to be. Following examples found online, I tried:
    val zipFile = new ZipFile(source, sourceCharset)
    zipFile.getEntries.asScala.foreach { entry =>
      if (entry != null) {
        try {
          val name = entry.getName
          val path = dest + name
          val content = new Array[Char](entry.getSize.toInt)
          // WRONG: this wraps the decompressed stream of one entry,
          // not the stream of the zip archive itself
          zais = new ZipArchiveInputStream(zipFile.getInputStream(entry))
          val entryFile = new File(path)
          checkFileParent(entryFile)
          os = new FileOutputStream(entryFile)
          IOUtils.copy(zais, os)
          ...
The data read out was empty: reading with zais.read into an Array[Byte] and converting it to a String gave nothing but whitespace, and printing the Array[Byte] directly showed all zeros. Reading the documentation later made the reason clear: ZipArchiveInputStream expects to read a zip file, but ZipFile.getInputStream returns the already-decompressed stream of a single entry, hence the problem. I tried this both with the commons-compress 1.4 release that Spark depends on (from 2012) and with the latest 1.14, and it fails the same way in both, so I suspect the blog posts reposted since 2012 were forwarded without their authors ever testing the code themselves. The split between ZipFile and ZipArchiveInputStream still feels strange to me...
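In other words, a streaming zip reader must wrap the input stream of the archive file itself, not an entry stream obtained from a ZipFile. With commons-compress that would presumably look like `new ZipArchiveInputStream(new FileInputStream(source), "GBK")`, iterating entries with `getNextZipEntry`. The correct shape can be shown self-contained with the JDK's `java.util.zip.ZipInputStream`, which I am using here as a stand-in because it ships with the JDK and also takes a `Charset` since Java 7:

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
import java.nio.charset.Charset
import java.util.zip.{ZipEntry, ZipInputStream, ZipOutputStream}

// Build a tiny in-memory archive with a GBK-encoded entry name and content.
val baos = new ByteArrayOutputStream()
val zos = new ZipOutputStream(baos, Charset.forName("GBK"))
zos.putNextEntry(new ZipEntry("目录/文件.txt"))
zos.write("内容".getBytes("GBK"))
zos.closeEntry()
zos.close()

// Correct: the streaming reader wraps the stream of the WHOLE archive.
val zis = new ZipInputStream(new ByteArrayInputStream(baos.toByteArray), Charset.forName("GBK"))
val entry = zis.getNextEntry()
val name = entry.getName                         // entry name decoded as GBK
val text = new String(zis.readAllBytes(), "GBK") // entry bytes decoded as GBK
zis.close()
```

Wrapping the archive's own stream is what lets the reader see the local file headers; handing it an entry's decompressed bytes gives it data that simply is not a zip file, which is why the reads above came back empty.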
Using commons-compress to extract a GBK-format WinZip archive to UTF-8, and the fix for ZipArchiveInputStream reading out all-empty data.