Analysis of Linux Cat command source code

Last Update:2014-11-07 Source: Internet

Author: User

I have been reading APUE recently, and reading only pays off if you practice along the way. Conveniently, many Linux commands are open source, so you can read their source code directly. GNU coreutils is a good choice: the source package contains the commands we use most, such as ls and cat, and each command is fairly short and well suited to reading. Here are some notes from my reading of the cat command.

Download the source code here. In the root directory of the source tree, running ./configure; make compiles everything, and running make again recompiles after you modify the code. The command sources are in the src/ directory; some auxiliary functions and constants are defined in the lib/ directory.

1. Command-line parsing

Nearly all Linux commands use the getopt function to parse command-line arguments, and cat is no exception. cat uses getopt_long to handle long options and stores the option values in a set of bool variables. There is not much more to say about this part.
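As a rough illustration of this pattern, the sketch below parses three of cat's options with getopt_long and records them in bool fields. The struct cat_flags type and the parse_cat_flags function are hypothetical names chosen for this example; they are not taken from cat.c, which handles more options than shown here.

```c
#include <assert.h>
#include <getopt.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for a subset of cat's options. */
struct cat_flags {
    bool number;        /* -n, --number */
    bool show_ends;     /* -E, --show-ends */
    bool squeeze_blank; /* -s, --squeeze-blank */
};

/* Parse the options in argv into *flags.
   Returns the index of the first non-option argument,
   or -1 if an unknown option was seen. */
int parse_cat_flags(int argc, char **argv, struct cat_flags *flags)
{
    static const struct option long_options[] = {
        {"number", no_argument, NULL, 'n'},
        {"show-ends", no_argument, NULL, 'E'},
        {"squeeze-blank", no_argument, NULL, 's'},
        {NULL, 0, NULL, 0}
    };
    int c;

    assert(flags != NULL);
    flags->number = false;
    flags->show_ends = false;
    flags->squeeze_blank = false;

    optind = 1; /* start scanning at argv[1] */
    while ((c = getopt_long(argc, argv, "nEs", long_options, NULL)) != -1) {
        switch (c) {
        case 'n': flags->number = true; break;
        case 'E': flags->show_ends = true; break;
        case 's': flags->squeeze_blank = true; break;
        default:  return -1; /* getopt_long already printed a message */
        }
    }
    return optind;
}
```

The arguments from the returned index onward are the file names to concatenate.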

2. Detect if the input and output files are the same

For example, in cat test.txt > test.txt the input and output files are the same, which is not allowed.

The input for cat is given on the command line and defaults to standard input (stdin); the output is standard output (stdout). Since cat only sees the output as a file descriptor, comparing file names as strings cannot determine whether the input and output are the same file. In addition, some special files, such as a tty, are allowed to be both input and output: cat /dev/tty > /dev/tty is legal. cat therefore performs the check only for regular files, comparing the device number and the inode number. Non-regular files are not checked. The code for this part is as follows:

Get the attributes of the output file.

    if (fstat (STDOUT_FILENO, &stat_buf) < 0)
      error (EXIT_FAILURE, errno, _("standard output"));

Extract the device number and inode number of the file. For non-regular files the check is skipped.

    if (S_ISREG (stat_buf.st_mode))
      {
        out_dev = stat_buf.st_dev;
        out_ino = stat_buf.st_ino;
      }
    else
      {
        check_redirection = false;
      }

Perform the check. Nothing is checked when check_redirection is false.

    if (fstat (input_desc, &stat_buf) < 0)  /* input_desc is the input file descriptor */
      {
        error (0, errno, "%s", infile);
        ok = false;
        goto contin;
      }

    if (check_redirection
        && stat_buf.st_dev == out_dev && stat_buf.st_ino == out_ino
        && (input_desc != STDIN_FILENO))
      {
        error (0, 0, _("%s: input file is output file"), infile);
        ok = false;
        goto contin;
      }
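The same test can be packaged as one small function. This is a minimal sketch under the assumption that both files are already open; input_is_output is a hypothetical name, and unlike cat it reports an fstat failure with perror and simply answers false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns true when in_fd and out_fd refer to the same regular file.
   Non-regular output (tty, pipe, ...) and standard input are never
   reported as a clash, mirroring the conditions in the code above. */
bool input_is_output(int in_fd, int out_fd)
{
    struct stat in_st, out_st;

    assert(in_fd >= 0 && out_fd >= 0);
    if (fstat(out_fd, &out_st) < 0 || fstat(in_fd, &in_st) < 0) {
        perror("fstat");
        return false;
    }
    if (!S_ISREG(out_st.st_mode))
        return false;   /* only regular output files are checked */
    if (in_fd == STDIN_FILENO)
        return false;   /* standard input is exempt */

    /* Same device and same inode means the same file. */
    return in_st.st_dev == out_st.st_dev && in_st.st_ino == out_st.st_ino;
}
```

Comparing the (st_dev, st_ino) pair works even when the two descriptors were opened through different path names, which is exactly the case a string comparison would miss.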

Tips: '-' stands for standard input; for example, cat - reads bytes from standard input. This lets cat be combined with pipes, as in: echo abcd | cat file1 - file2. Running cat with no arguments also reads from standard input by default.
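A minimal sketch of how the '-' convention can be handled when opening inputs. The helper name open_input is hypothetical; the only assumption is that a file name of exactly "-" should map to the already open standard input descriptor.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Returns a readable descriptor for name, or -1 on failure.
   The name "-" selects standard input instead of a file on disk. */
int open_input(const char *name)
{
    assert(name != NULL);
    if (strcmp(name, "-") == 0)
        return STDIN_FILENO;
    return open(name, O_RDONLY);
}
```

A caller must remember not to close the descriptor it gets back for "-" the way it closes ordinary files, since standard input may be needed again later.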

3. Number of bytes read and write at one time

cat is built on the read and write functions, and the number of bytes transferred per call affects the program's performance.

The insize and outsize variables hold the number of bytes read and written per call, respectively.

    insize = io_blksize (stat_buf);

    enum { IO_BUFSIZE = 128*1024 };
    static inline size_t
    io_blksize (struct stat sb)
    {
      /* The value of the ST_BLKSIZE () macro depends on the system;
         it is defined in lib/stat-size.h */
      return MAX (IO_BUFSIZE, ST_BLKSIZE (sb));
    }

The outsize value is set in a similar way to insize.
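To make the size selection concrete, here is a small sketch that reads st_blksize directly instead of going through the ST_BLKSIZE macro. The names choose_io_size and SKETCH_IO_BUFSIZE are made up for this example, and the fallback of 512 bytes for a non-positive st_blksize is an assumption of the sketch, not something taken from coreutils.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

enum { SKETCH_IO_BUFSIZE = 128 * 1024 }; /* same floor as IO_BUFSIZE above */

/* Pick the transfer size: the file system's preferred block size,
   but never less than SKETCH_IO_BUFSIZE. */
size_t choose_io_size(const struct stat *sb)
{
    assert(sb != NULL);
    /* Assumed fallback: treat a missing or odd block size as 512 bytes. */
    size_t preferred = sb->st_blksize > 0 ? (size_t) sb->st_blksize : 512;
    return preferred > SKETCH_IO_BUFSIZE ? preferred : SKETCH_IO_BUFSIZE;
}
```

With a typical st_blksize of 4096 the floor wins and each call transfers 128 KiB; only a file system that prefers larger blocks raises the size further.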

4. simple_cat

If cat is invoked without any formatting options such as -v or -T, it calls simple_cat to do the work. The advantage of simple_cat is speed, and on some systems the files can also be read and written in binary mode. See man 3 freopen.

    if (! (number || show_ends || squeeze_blank))
      {
        /* On Linux O_BINARY is 0 and has no effect, but on some
           systems it opens files in binary mode */
        file_open_mode |= O_BINARY;
        if (O_BINARY && ! isatty (STDOUT_FILENO))
          /* Calls freopen with error handling and changes the mode
             of the output stream to "wb" */
          xfreopen (NULL, "wb", stdout);
      }

When no formatting option is given, simple_cat is called:

    if (! (number || show_ends || show_nonprinting
           || show_tabs || squeeze_blank))
      {
        insize = MAX (insize, outsize);
        /* xmalloc allocates memory; on failure it calls xalloc_die ()
           to report the error and terminate the program */
        inbuf = xmalloc (insize + page_size - 1);
        ok &= simple_cat (ptr_align (inbuf, page_size), insize);
      }

ptr_align is a helper function. Because I/O is performed a page at a time, ptr_align rounds the starting address of the buffer up to an integer multiple of the page size to improve I/O efficiency. This is also why the buffer is allocated with page_size - 1 extra bytes.

    static inline void *
    ptr_align (void const *ptr, size_t alignment)
    {
      char const *p0 = ptr;
      char const *p1 = p0 + alignment - 1;
      return (void *) (p1 - (size_t) p1 % alignment);
    }
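To see the arithmetic with numbers: for an address of 0x1003 and an alignment of 0x1000, p1 is 0x1003 + 0xFFF = 0x2002, the remainder 0x2002 % 0x1000 is 2, and the result is 0x2000, the next page boundary. The sketch below shows the same over-allocate-then-align idea in a self-contained form; align_up and alloc_aligned are hypothetical names for this example and use malloc rather than xmalloc.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Round ptr up to the next multiple of alignment (unchanged if it is
   already aligned). */
void *align_up(void *ptr, size_t alignment)
{
    assert(alignment > 0);
    uintptr_t remainder = (uintptr_t) ptr % alignment;
    if (remainder == 0)
        return ptr;
    return (char *) ptr + (alignment - remainder);
}

/* Allocate size + alignment - 1 bytes so that an aligned region of
   size bytes always fits. The raw pointer is stored in *raw_out and
   is the one that must later be passed to free(). */
char *alloc_aligned(size_t size, size_t alignment, void **raw_out)
{
    assert(raw_out != NULL);
    void *raw = malloc(size + alignment - 1);
    if (raw == NULL)
        return NULL;
    *raw_out = raw;
    return align_up(raw, alignment);
}
```

In the worst case the raw address is one byte past a boundary, so alignment - 1 bytes are skipped; the extra allocation covers exactly that case.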

The simple_cat function itself is simple:

    static bool
    simple_cat (
         /* Pointer to the buffer, used by reads and writes.  */
         char *buf,

         /* Number of characters preferably read or written by each read and
            write call.  */
         size_t bufsize)
    {
      /* Actual number of characters read, and therefore written.  */
      size_t n_read;

      /* Loop until the end of the file.  */

      while (true)
        {
          /* Read a block of input.  */
          /* A plain read may be interrupted by a signal */
          n_read = safe_read (input_desc, buf, bufsize);
          if (n_read == SAFE_READ_ERROR)
            {
              error (0, errno, "%s", infile);
              return false;
            }

          /* End of this file?  */

          if (n_read == 0)
            return true;

          /* Write this block out.  */

          {
            /* The following is ok, since we know that 0 < n_read.  */
            size_t n = n_read;
            /* full_write and safe_read are generated by macros from
               shared function bodies; lib/safe-write.c shows the key
               to how this is done.  */
            if (full_write (STDOUT_FILENO, buf, n) != n)
              error (EXIT_FAILURE, errno, _("write error"));
          }
        }
    }
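For readers who want to try the loop without the gnulib helpers, here is a minimal sketch that uses only read and write. The function name copy_fd is hypothetical; the retry on EINTR and the inner loop for short writes stand in for what safe_read and full_write provide in the real code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Copy everything from in_fd to out_fd through buf.
   Returns true at end of input, false on a read or write error. */
bool copy_fd(int in_fd, int out_fd, char *buf, size_t bufsize)
{
    assert(buf != NULL && bufsize > 0);
    for (;;) {
        ssize_t n_read = read(in_fd, buf, bufsize);
        if (n_read < 0) {
            if (errno == EINTR)
                continue;           /* interrupted: try the read again */
            return false;
        }
        if (n_read == 0)
            return true;            /* end of file */

        /* write may accept fewer bytes than requested, so loop. */
        size_t written = 0;
        while (written < (size_t) n_read) {
            ssize_t n = write(out_fd, buf + written,
                              (size_t) n_read - written);
            if (n < 0) {
                if (errno == EINTR)
                    continue;       /* interrupted: try the write again */
                return false;
            }
            written += (size_t) n;
        }
    }
}
```

Calling copy_fd(STDIN_FILENO, STDOUT_FILENO, buf, sizeof buf) gives the behaviour of cat run with no arguments.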

5. The safe_rw and full_rw functions

read and write may be interrupted by a signal before the first byte is transferred; safe_rw retries the interrupted call. The function is rather tricky: its name safe_rw and the rw it calls are actually macros, so conditional compilation turns this one function body into two functions, safe_read and safe_write.

    size_t  /* The original read () function returns ssize_t */
    safe_rw (int fd, void const *buf, size_t count)
    {
      /* Work around a bug in Tru64 5.1.  Attempting to read more than
         INT_MAX bytes fails with errno == EINVAL.  See
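Since the listing above breaks off, the following sketch illustrates the two ideas described in this section: retrying a call that fails with EINTR, and producing a read variant and a write variant from one body. It is an approximation rather than the gnulib code. The DEFINE_RETRYING_RW macro and the names retrying_read and retrying_write are hypothetical, and the sketch generates both functions with a function-like macro in a single file, whereas the text describes conditional compilation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical template: expands to a wrapper around SYSCALL that
   retries when the call is interrupted before any byte is moved.
   Like safe_rw it returns size_t, with (size_t) -1 meaning error. */
#define DEFINE_RETRYING_RW(NAME, SYSCALL, BUF_TYPE)                  \
    size_t NAME(int fd, BUF_TYPE buf, size_t count)                  \
    {                                                                \
        assert(buf != NULL);                                         \
        for (;;) {                                                   \
            ssize_t result = SYSCALL(fd, buf, count);                \
            if (result >= 0)                                         \
                return (size_t) result;                              \
            if (errno != EINTR)                                      \
                return (size_t) -1; /* a real error */               \
            /* EINTR: nothing was transferred, so try again */       \
        }                                                            \
    }

/* One template, two functions. */
DEFINE_RETRYING_RW(retrying_read, read, void *)
DEFINE_RETRYING_RW(retrying_write, write, const void *)
```

The only differences between the two expansions are the system call and the constness of the buffer, which is why a single body is enough.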

A read or write may also be interrupted by a signal partway through, so that fewer bytes are transferred than requested. full_rw keeps going until the requested number of bytes has been read or written, the end of the file (EOF) is reached, or an error occurs. It returns the number of bytes transferred so far. The name full_rw is likewise defined by a macro; the code actually implements full_read() and full_write().

    /* Write (read) COUNT bytes at BUF to (from) descriptor FD, retrying if
       interrupted or if a partial write (read) occurs.  Return the number
       of bytes transferred.
       When writing, set errno if fewer than COUNT bytes are written.
       When reading, if fewer than COUNT bytes are read, you must examine
       errno to distinguish failure from EOF (errno == 0).  */
    size_t
    full_rw (int fd, const void *buf, size_t count)
    {
      size_t total = 0;
      const char *ptr = (const char *) buf;

      while (count > 0)
        {
          size_t n_rw = safe_rw (fd, ptr, count);
          if (n_rw == (size_t) -1)  /* error */
            break;
          if (n_rw == 0)            /* reached EOF */
            {
              errno = ZERO_BYTE_TRANSFER_ERRNO;
              break;
            }
          total += n_rw;
          ptr += n_rw;
          count -= n_rw;
        }

      return total;
    }

Tips: See the safe-read.c and safe-write.c files in the lib directory to see how this function is expanded into two different functions.

6. The cat function: processing formatted output

simple_cat copies the input to the output unchanged, without any processing. Everything related to formatted output is in the cat function.

The implementation of the cat function contains many tricks. For example, it uses a sentinel '\n' to mark the end of the input buffer. In addition, the line number is kept in a character array, so that systems without 64-bit integer support can still count over a large range.

The following is the code for this line counter.

    /* Position in 'line_buf' where printing starts.  This will not change
       unless the number of lines is larger than 999999.  */
    static char *line_num_print = line_buf + LINE_COUNTER_BUF_LEN - 8;

    /* Position of the first digit in 'line_buf'.  */
    static char *line_num_start = line_buf + LINE_COUNTER_BUF_LEN - 3;

    /* Position of the last digit in 'line_buf'.  */
    static char *line_num_end = line_buf + LINE_COUNTER_BUF_LEN - 3;

    static void
    next_line_num (void)
    {
      char *endp = line_num_end;
      do
        {
          if ((*endp)++ < '9')
            return;
          *endp-- = '0';
        }
      while (endp >= line_num_start);
      if (line_num_start > line_buf)
        *--line_num_start = '1';
      else
        *line_buf = '>';
      if (line_num_start < line_num_print)
        line_num_print--;
    }
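The counter can be tried in isolation with the sketch below, which keeps the decimal digits right-aligned in a character array and increments them the way one adds 1 by hand. The struct text_counter type, its functions, and the width of 20 are hypothetical choices for this example; the layout is simpler than line_buf in cat.c, and it uses indices instead of pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { COUNTER_WIDTH = 20 }; /* assumed capacity in digits */

struct text_counter {
    char digits[COUNTER_WIDTH + 1]; /* right-aligned digits plus NUL */
    size_t start;                   /* index of the first digit */
};

/* Start the counter at "0". */
void text_counter_init(struct text_counter *c)
{
    assert(c != NULL);
    memset(c->digits, ' ', COUNTER_WIDTH);
    c->digits[COUNTER_WIDTH] = '\0';
    c->digits[COUNTER_WIDTH - 1] = '0';
    c->start = COUNTER_WIDTH - 1;
}

/* Add one, propagating the carry from the last digit leftwards. */
void text_counter_increment(struct text_counter *c)
{
    size_t i = COUNTER_WIDTH;
    while (i > c->start) {
        i--;
        if (c->digits[i] < '9') {
            c->digits[i]++;         /* no carry needed: done */
            return;
        }
        c->digits[i] = '0';         /* 9 rolls over, carry continues */
    }
    if (c->start > 0)
        c->digits[--c->start] = '1'; /* grow by one digit */
    else
        c->digits[0] = '>';          /* overflow marker, as in next_line_num */
}

/* The current value as a printable string. */
const char *text_counter_text(const struct text_counter *c)
{
    return c->digits + c->start;
}
```

Because the digits are already text, printing a line number needs no integer-to-string conversion, and most increments touch only the last character.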

The key to understanding the cat function is the role of the newlines variable. The main work of formatted output is detecting line boundaries and runs of consecutive blank lines, and newlines tracks the number of blank lines seen: 0 means the read position in inbuf is at the beginning of a line, 1 means one blank line has been seen, and -1 means a line has just been parsed and the scan is ready to move on to the next line. You can see that both break statements in the final while (true) loop of the cat function set newlines to -1.

In essence, formatted output in cat is a matter of scanning the input buffer character by character and storing the converted characters in the output buffer along the way.
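A minimal sketch of such a scan, limited to two conversions: '$' before each newline (the -E behaviour) and "^I" for each tab (the -T behaviour). The function name format_block is hypothetical, and the sketch leaves out line numbering, blank-line squeezing, and the sentinel technique used by the real cat function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Convert n bytes from in into out and return the number of bytes
   produced. out must have room for 2 * n bytes, since each input
   byte expands to at most two output bytes. */
size_t format_block(const char *in, size_t n, char *out,
                    bool show_ends, bool show_tabs)
{
    assert(in != NULL && out != NULL);
    size_t o = 0;
    for (size_t i = 0; i < n; i++) {
        char ch = in[i];
        if (ch == '\n' && show_ends) {
            out[o++] = '$';         /* mark the end of the line */
            out[o++] = '\n';
        } else if (ch == '\t' && show_tabs) {
            out[o++] = '^';         /* make the tab visible */
            out[o++] = 'I';
        } else {
            out[o++] = ch;          /* everything else is copied */
        }
    }
    return o;
}
```

Sizing the output buffer for the worst-case expansion up front keeps bounds checks out of the inner loop.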

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.