Why do ls and du show different sizes for the same file?

Source: Internet
Author: User
Tags: disk usage

Sometimes, when I used ls and du to check the size of a file, I found that the two reported different sizes, for example:

bl@d3:~/test/sparse_file$ ls -l fs.img
-rw-r--r-- 1 bl bl 1073741824 2012-02-17 05:09 fs.img
bl@d3:~/test/sparse_file$ du -sh fs.img
0 fs.img

Here ls shows the size of fs.img as 1073741824 bytes (1 GiB), while du shows it as 0.

I had never looked into this question closely, so today I set out to fill that gap.

There are two main reasons for this difference: sparse files, and the fact that ls and du mean different things by "size".

Let's take a look at sparse files first. A sparse file is simply a file that contains "holes". For example, this C program creates a file with a hole:

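#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    /* O_CREAT requires a mode argument; 0644 gives rw-r--r-- */
    int fd = open("sparse.file", O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return 1;

    /* Move the file offset 1024 bytes past the end of the empty file ... */
    lseek(fd, 1024, SEEK_CUR);
    /* ... then write one byte, which leaves a 1024-byte hole before it */
    write(fd, "n", 1);

    close(fd);
    return 0;
}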

As this program shows, creating a file with a hole mainly comes down to using lseek to move the file offset past the end of the file and then calling write, which leaves a hole behind.

You can also create sparse files using the shell:

$ dd if=/dev/zero of=sparse_file.img bs=1M seek=1024 count=0
0+0 records in
0+0 records out
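
Because count=0, the dd command above writes no data at all; it only extends the file to the seek offset. A minimal Python sketch of the same idea follows. It assumes a Unix-like system and a file system that supports sparse files; the helper name make_sparse is made up for illustration and does not come from the original article.

```python
import os
import tempfile

def make_sparse(path, size):
    """Create (or overwrite) `path` and extend it to `size` bytes.

    No data is written, so on file systems that support sparse files
    the whole file is a single hole.
    """
    with open(path, "wb") as f:
        f.truncate(size)  # sets the length without writing any bytes
    return os.stat(path)

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        # 1 GiB, the same length as bs=1M seek=1024 gives
        st = make_sparse(os.path.join(d, "sparse_file.img"), 1024 * 1024 * 1024)
        print("apparent size:", st.st_size)
        # st_blocks is counted in 512-byte units on Linux
        print("allocated bytes:", st.st_blocks * 512)
```

On a file system with sparse-file support, the allocated figure should stay far below the apparent size.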

The advantages of using sparse files are as follows (quoted from Wikipedia):

The advantage of sparse files is that storage is only allocated when actually needed: disk space is saved, and large files can be created even if there is insufficient free space on the file system.

That is, the holes in a sparse file do not have to occupy any storage space.

Now let's look at what the file sizes printed by ls and du mean (again quoting Wikipedia):

The du command prints the occupied space, while ls prints the apparent size.

In other words, ls displays the logical size of the file, while du displays its physical size; that is, the size du shows is computed from the number of blocks the file occupies on disk. As an example:

bl@d3:~/test/sparse_file$ echo -n 1 > 1B.txt
bl@d3:~/test/sparse_file$ ls -l 1B.txt
-rw-r--r-- 1 bl bl 1 2012-02-19 05:17 1B.txt
bl@d3:~/test/sparse_file$ du -h 1B.txt
4.0K 1B.txt

Here we first create a file, 1B.txt, that is one byte long. ls reports its size as 1 byte, whereas du reports the space taken by the N blocks that 1B.txt occupies on disk, computed from the size of each block (4.0K in total here). I write N rather than a specific number because a lot of details are hidden behind the scenes, such as the fragment size, which we'll discuss later.
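
The two figures correspond to two different fields of the file's stat record. The following is a rough Python sketch, assuming a Unix-like system where st_blocks is available and counted in 512-byte units (as on Linux); the function name is hypothetical.

```python
import os
import sys

def apparent_and_allocated(path):
    """Return (apparent size, allocated size) in bytes for `path`.

    st_size is the logical length, the number `ls -l` prints.
    st_blocks * 512 is the space actually allocated, which is the
    quantity `du` reports (st_blocks is not available on Windows).
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512

if __name__ == "__main__":
    for name in sys.argv[1:]:
        apparent, allocated = apparent_and_allocated(name)
        print(f"{name}: apparent={apparent} allocated={allocated}")
```

For a one-byte file the apparent size is 1, while the allocated size is whatever the file system chose to reserve, typically one whole block.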

Of course, these are only the default behaviors of ls and du; both provide options to change them. Examples are the -s option of ls (print the allocated size of each file, in blocks) and the --apparent-size option of du (print apparent sizes, rather than disk usage; although the apparent size is usually smaller, it may be larger due to holes in ('sparse') files, internal fragmentation, indirect blocks, and the like).

In addition, when copying sparse files, cp applies an optimization by default to speed up the copy. For example:

strace cp fs.img fs.img.copy >log 2>&1

Opening the log file, we find that cp only calls read and lseek, and never calls write.

stat("fs.img.copy", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
stat("fs.img", {st_mode=S_IFREG|0644, st_size=1073741824, ...}) = 0
stat("fs.img.copy", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
open("fs.img", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=1073741824, ...}) = 0
open("fs.img.copy", O_WRONLY|O_TRUNC) = 4
fstat(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
mmap(NULL, 532480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f90df965000
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 524288) = 524288
lseek(4, 524288, SEEK_CUR) = 524288
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 524288) = 524288
lseek(4, 524288, SEEK_CUR) = 1048576
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 524288) = 524288
lseek(4, 524288, SEEK_CUR) = 1572864

This is related to cp's --sparse option; see the cp man page:

By default, sparse SOURCE files are detected by a crude heuristic and the corresponding DEST file is made sparse as well. That is the behavior selected by --sparse=auto. Specify --sparse=always to create a sparse DEST file whenever the SOURCE file contains a long enough sequence of zero bytes. Use --sparse=never to inhibit creation of sparse files.

Looking at the source code of cp, we find that after every read, cp checks whether the data it just read is all zeros; if so, it only calls lseek and does not call write.
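
The read-then-seek pattern described above can be sketched in a few lines of Python. This is only an illustration of the technique, not cp's actual code, and the name sparse_copy is made up here. The chunk size matches the 524288-byte reads in the strace log.

```python
import os

CHUNK = 512 * 1024  # 524288 bytes, the read size seen in the strace log

def sparse_copy(src, dst, chunk=CHUNK):
    """Copy src to dst, seeking over all-zero chunks instead of writing them."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            buf = fin.read(chunk)
            if not buf:
                break
            if buf.count(0) == len(buf):
                # Every byte is zero: only advance the output offset,
                # which leaves a hole in the destination.
                fout.seek(len(buf), os.SEEK_CUR)
            else:
                fout.write(buf)
        # If the source ends in zeros, nothing was written at the tail,
        # so set the final length explicitly.
        fout.truncate(fout.tell())
```

The final truncate matters: seeking past the end of a file does not by itself change its length, so without it a copy whose last chunk is all zeros would come out too short.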

Of course, all of this sparse-file handling is transparent to the user.

Category: Linux. Tags: linux, ls, du, sparse file

About Sparse Files

Technote (FAQ)

Question

About sparse files

Answer

This document describes sparse files, the exposure due to sparse files, and the effects of certain commands on sparse files. This document applies to all versions of AIX.

Overview
Creating a sparse file
The effect of certain commands on sparse files

Overview

Many applications, particularly databases, maintain data in sparse files. A sparse file is a file with empty spaces, or gaps, left open for future addition of data. If the empty spaces are filled with the ASCII null character and the spaces are large enough, the file will be sparse, and disk blocks will not be allocated to it.

This creates an exposure: a large file will be created, but the disk blocks will not be allocated. Then, as data is added to the file, the disk blocks will be allocated, but there may not be enough free disk blocks in the file system. The file system can then become full, and writes to any file in that file system will fail.

You can prevent this problem either by ensuring that you have no sparse files on your system or by planning to have enough free space in the file system for the future allocation of the blocks.

You also need to be aware of how you manipulate sparse or potentially sparse files, because you can easily change them from sparse to not sparse or vice versa.

Creating a sparse file

An example sparse file can be created fairly easily. To do this, open the file, seek to a large address, and write some data. This can be demonstrated with the dd command, as follows. First, create a regular file:

   date > notsparse
   ls -l

The output of the ls command will be similar to:

   total 8
   -rw-r--r--   1 root     sys           29 Dec 08:12 notsparse

Use the fileplace command to see how many allocated and unallocated blocks the file includes.

(Note: Performance Analysis and Control Commands [perfagent.tools] must be installed to run the fileplace command on AIX 4.x and 5.x.)

   fileplace notsparse

The output will look similar to:

    File: notsparse  Size: 29 bytes  Vol: /dev/lv03
    Blk Size: 4096  Frag Size: 4096  Nfrags: 1
      Logical Fragment
      ----------------
      00716                   1 frags         4096 Bytes,  100%

The du command will also reflect how many 512-byte blocks a file occupies.

   du -rs *

Example output:

   8 notsparse

Now create a sparse file using the regular file notsparse as input:

   touch sparse.1
   dd if=notsparse of=sparse.1 seek=100

Example output:

   dd: 0+1 records in.
   dd: 0+1 records out.

The dd command takes the data from the regular file and places it 100 512-byte blocks into the sparse.1 file. Note that nothing is written to the initial 100 512-byte blocks. The following steps show the characteristics of the resulting file. The ls command reports the distance from byte zero to the last byte in the file:

   ls -l

Example output:

   total
   -rw-r--r--   1 root     sys           29 Dec 08:12 notsparse
   -rw-r--r--   1 root     sys        51229 Dec 08:13 sparse.1

The fileplace command tells the story accurately: there are 12 unallocated 4K blocks and one allocated 4K block in the file:

   fileplace sparse.1

Example output:

   File: sparse.1  Size: 51229 bytes  Vol: /dev/lv03
   Blk Size: 4096  Frag Size: 4096  Nfrags: 1  Compress: no
   Logical Fragment
   ----------------
   unallocated                     12 frags   49152 Bytes,   0%
   0000769                          1 frags    4096 Bytes, 100%
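
The numbers in this output can be checked by hand. The sketch below, in Python, assumes that the file system allocates space in whole 4096-byte fragments and leaves fragments that were skipped entirely unallocated; the function name and parameters are invented for this illustration.

```python
def sparse_layout(seek_blocks, data_len, dd_block=512, frag=4096):
    """Return (apparent_size, allocated_frags, unallocated_frags).

    Models `dd seek=<seek_blocks>` writing `data_len` (> 0) bytes into
    an empty file on a file system with `frag`-byte fragments.
    """
    start = seek_blocks * dd_block   # offset where dd starts writing
    apparent = start + data_len      # length reported by ls -l
    first = start // frag            # first fragment that holds data
    last = (apparent - 1) // frag    # last fragment that holds data
    allocated = last - first + 1
    unallocated = first              # fragments skipped before the data
    return apparent, allocated, unallocated

print(sparse_layout(100, 29))  # -> (51229, 1, 12)
```

With seek=100 and 29 bytes of data, the file is 100 * 512 + 29 = 51229 bytes long, the data lands entirely in one fragment, and the 12 fragments before it (12 * 4096 = 49152 bytes) stay unallocated.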
The du command reports the number of allocated blocks the file takes:

   du -rs *

Example output:

   8 notsparse
   8 sparse.1

The effect of certain commands on sparse files

backup/restore (by name and inode)

The restore command aggressively preserves sparseness. In fact, the restore command will unallocate any blocks filled with zeroes, thus making a file sparse.

cp

The cp command does not preserve the sparseness of a file.

cpio

If you create a backup using the cpio command on sparse files, you will need to use the pax command to restore that data. Using the cpio command to restore the data will not preserve sparseness.
