Hadoop can compress files, and gzip, LZO, and Snappy are among the available compression algorithms. For LZO, two codecs are commonly used, LzoCodec and LzopCodec, and both can compress SequenceFile and TextFile data. However, once a TextFile has been compressed, MapReduce cannot split the compressed file by default. You need to index the LZO-compressed file first, which generates a .lzo.index file; after that, map tasks can split the input.

    hadoop jar hadoop-lzo.jar com.hadoop.compression.lzo.LzoIndexer xxx.lzo

When indexing finishes, the .lzo.index file is generated in the same directory as the LZO-compressed file.

Two points must be noted:

1. SequenceFile output does not produce files in .lzo format. SequenceFile can be compressed with LZO, but only output stored as TextFile (STORED AS TEXTFILE) produces files with the .lzo suffix.

2. If LzopCodec is set, the generated files have the .lzo suffix, and LzoIndexer can build the index that makes them splittable. If LzoCodec is set, the generated files have the .lzo_deflate suffix, and index creation is not supported for them.
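The suffix rule above (files ending in .lzo can be indexed, files ending in .lzo_deflate cannot) can be sketched as a small dry-run shell helper. This is only an illustration under stated assumptions: the function name index_cmd and the HADOOP_LZO_JAR variable are hypothetical, and the helper only prints the command it would run instead of calling Hadoop.

```shell
#!/bin/sh
# Dry-run sketch: print the indexer command for a file that can be indexed.
# HADOOP_LZO_JAR is an assumed variable; point it at your hadoop-lzo jar.
HADOOP_LZO_JAR="${HADOOP_LZO_JAR:-hadoop-lzo.jar}"

# index_cmd is a hypothetical helper, not part of hadoop-lzo.
index_cmd() {
    case "$1" in
        *.lzo)
            # .lzo suffix: LzoIndexer can build a .lzo.index file for it
            printf 'hadoop jar %s com.hadoop.compression.lzo.LzoIndexer %s\n' \
                "$HADOOP_LZO_JAR" "$1"
            ;;
        *.lzo_deflate)
            # .lzo_deflate suffix: index creation is not supported
            printf '# skip %s: .lzo_deflate files cannot be indexed\n' "$1"
            ;;
        *)
            printf '# skip %s: not a .lzo file\n' "$1"
            ;;
    esac
}
```

For example, calling index_cmd xxx.lzo prints the same indexer command shown in the text, while index_cmd xxx.lzo_deflate prints only a skip comment. Printing rather than executing keeps the sketch safe to try; piping the output to sh would run the commands, assuming hadoop is on the PATH.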