This article is reproduced from http://www.ruanyifeng.com/blog/2012/08/how_to_read_diff.html (all rights reserved by the original author).
1. Preface
diff is a very important utility on UNIX systems.
It compares two text files and reports their differences, and it is one of the cornerstones of version control. At the command line, you run diff followed by the names of the two files, the file before the change first and the file after the change second.
diff then tells you how the two files differ. Its output is not easy to understand at first, so this article shows how to read it.
2. Three different formats for diff
For historical reasons, diff has three different output formats: the normal format, the context format, and the merged (unified) format.
We'll look at each in turn.
3. Sample Files
For ease of explanation, first create two sample files.
The first file is called F1. Each line contains the single letter a, and there are 7 lines in total.
The second file, called F2, is a modified copy of F1: the 4th line becomes B, and the rest is unchanged.
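The two sample files can also be created programmatically. The following is a minimal sketch in Python; the scratch directory and the variable names (f1_lines, f2_lines, workdir) are assumptions made for illustration and are not part of the original article.

```python
import tempfile
from pathlib import Path

# F1: seven lines, each containing only the letter "a".
f1_lines = ["a"] * 7

# F2: a copy of F1 whose 4th line (index 3) is changed to "B".
f2_lines = list(f1_lines)
f2_lines[3] = "B"

# Write both files into a scratch directory (the location is arbitrary).
workdir = Path(tempfile.mkdtemp())
(workdir / "F1").write_text("\n".join(f1_lines) + "\n")
(workdir / "F2").write_text("\n".join(f2_lines) + "\n")

print("sample files written to", workdir)
```

Any other way of producing the same two files (a text editor, for instance) works just as well.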
4. diff in normal format
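Now compare F1 and F2 by running diff with F1 and F2 as its two arguments.
With no options, diff displays the result in the normal format. For the sample files, the result has four lines.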
The first line is a summary that gives the position of the change.
It has three parts. The leading "4" indicates that line 4 of F1 is changed. The middle "c" indicates that the kind of change is a content change; the other kinds are "a" (add) and "d" (delete). The trailing "4" indicates that the changed content becomes line 4 of F2.
The second line is divided into two parts.
The less-than sign at the front indicates that this line is to be removed from F1 (that is, line 4), and the "a" that follows is the content of that line.
The third line separates the F1 content from the F2 content.
The fourth line is similar to the second.
The greater-than sign at the front indicates that F2 adds this line, and the "B" that follows is the content of the line. The earliest Unix (that is, the AT&T version of Unix) used diff in this format.
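The reading rules above can be sketched as a small parser. This is an illustrative Python sketch, not part of diff itself: the function name parse_change_command and the regular expression are assumptions, and the four sample lines are the output described above for F1 and F2.

```python
import re

# <old range><command><new range>, for example "4c4" or "2,3d1".
CHANGE_RE = re.compile(r"^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$")
ACTIONS = {"a": "add", "c": "change", "d": "delete"}

def parse_change_command(line):
    """Split a normal-format change command into its three parts."""
    m = CHANGE_RE.match(line)
    if m is None:
        raise ValueError(f"not a change command: {line!r}")
    old_first, old_last, op, new_first, new_last = m.groups()
    return {
        # A single number means a one-line range.
        "old": (int(old_first), int(old_last or old_first)),
        "action": ACTIONS[op],
        "new": (int(new_first), int(new_last or new_first)),
    }

# The normal-format output described above for the sample files.
sample = ["4c4", "< a", "---", "> B"]

info = parse_change_command(sample[0])
# "< " lines come from the first file, "> " lines from the second.
removed = [line[2:] for line in sample[1:] if line.startswith("< ")]
added = [line[2:] for line in sample[1:] if line.startswith("> ")]

print(info)
print("removed from F1:", removed)
print("added in F2:", added)
```

The "---" separator starts with neither "< " nor "> ", so it is simply skipped when the two sides are collected.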
5. diff in context format
In the early 1980s, when the University of California, Berkeley released the BSD version of UNIX, its developers found the normal diff output too terse and felt that changes are easier to understand when shown in context. The context format was therefore introduced.
It is selected by adding the -c option (c for context), that is, by running diff -c on F1 and F2.
The results appear as follows:
The result is divided into four parts.
The first part consists of two lines that give basic information about the two files: the file name and the timestamp.
"***" marks the file before the change, and "---" marks the file after the change.
The second part is a line of 15 asterisks, which separates the file information from the changes.
The third part shows the file before the change, namely F1.
It shows not only the changed line 4 but also the three lines before it and the three lines after it, for a total of 7 lines. That is why the header line "*** 1,7 ****" gives the range as lines 1 through 7.
In addition, each line of file content is preceded by a marker. A blank marker means the line is unchanged; an exclamation point (!) means the line was changed; a minus sign (-) means the line was deleted; and a plus sign (+) means the line was added.
The fourth part shows the file after the change, namely F2.
In addition to the changed line (line 4), three lines of context are shown on each side, for a total of 7 lines.
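The structure of a context-format hunk can be sketched in Python as follows. The function name parse_context_hunk is an assumption made for illustration, and the sample hunk is written out by hand from the description above; the two file-information lines are left out because their timestamps depend on the machine.

```python
import re

# Range lines look like "*** 1,7 ****" (old file) or "--- 1,7 ----" (new file).
RANGE_RE = re.compile(r"^(\*\*\*|---) (\d+)(?:,(\d+))? (?:\*\*\*\*|----)$")
MARKERS = {" ": "unchanged", "!": "changed", "-": "deleted", "+": "added"}

def parse_context_hunk(lines):
    """Group the lines of one context-format hunk by file."""
    sections = {}
    current = None
    for line in lines:
        m = RANGE_RE.match(line)
        if m:
            side = "old" if m.group(1) == "***" else "new"
            first = int(m.group(2))
            last = int(m.group(3) or first)
            current = sections[side] = {"range": (first, last), "lines": []}
        elif current is not None:
            # The first character is the marker; the content starts at column 3.
            current["lines"].append((MARKERS[line[0]], line[2:]))
    return sections

# Hand-written hunk for the sample files, following the description above.
hunk = [
    "***************",
    "*** 1,7 ****",
    "  a", "  a", "  a", "! a", "  a", "  a", "  a",
    "--- 1,7 ----",
    "  a", "  a", "  a", "! B", "  a", "  a", "  a",
]

parsed = parse_context_hunk(hunk)
for side in ("old", "new"):
    changed = [text for kind, text in parsed[side]["lines"] if kind == "changed"]
    print(side, parsed[side]["range"], "changed:", changed)
```

Each side repeats all seven lines even though only one differs, which is the duplication the next format sets out to remove.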
6. diff in merged format
If the two files are very similar, a context-format diff shows a lot of duplicated content, which wastes space. The merged format, better known as the unified format, was introduced in 1990 and adopted by GNU diff; it combines the context of F1 and F2 into a single listing.
It is selected by adding the -u option (u for unified), that is, by running diff -u on F1 and F2.
The results appear as follows:
The first part likewise gives basic information about the files.
"---" marks the file before the change, and "+++" marks the file after the change.
The second part gives the position of the change, enclosed between a pair of "@@" markers at the start and the end.
The leading "-1,7" has three parts: the minus sign indicates the first file (that is, F1), "1" means line 1, and "7" means 7 consecutive lines. Together, they mean that what follows covers 7 consecutive lines of the first file, starting from line 1. Similarly, "+1,7" means that what follows covers 7 consecutive lines of the second file after the change, starting from line 1.
The third part is the actual content of the change.
In addition to the changed lines, three lines of context are shown on each side. The context of the two files is merged into one listing, hence the name "merged format". The first character of each line is a marker: a space means no change, a minus sign means a line deleted from the first file, and a plus sign means a line added in the second file.
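Because the merged format interleaves both files, each side can be recovered from a single hunk. The Python sketch below illustrates this; the function name split_hunk is an assumption, and the sample hunk is written by hand from the description above (file-information lines omitted).

```python
import re

# Hunk header: "@@ -<old start>,<old count> +<new start>,<new count> @@".
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")

def split_hunk(lines):
    """Return the header ranges plus the old and new text of one hunk."""
    m = HUNK_RE.match(lines[0])
    if m is None:
        raise ValueError(f"not a hunk header: {lines[0]!r}")
    old_start, old_count, new_start, new_count = m.groups()
    header = {
        # When the count is omitted, the range covers a single line.
        "old": (int(old_start), int(old_count or 1)),
        "new": (int(new_start), int(new_count or 1)),
    }
    old, new = [], []
    for line in lines[1:]:
        marker, text = line[0], line[1:]
        if marker in (" ", "-"):   # unchanged or deleted: part of the old file
            old.append(text)
        if marker in (" ", "+"):   # unchanged or added: part of the new file
            new.append(text)
    return header, old, new

# Hand-written hunk for the sample files, following the description above.
hunk = ["@@ -1,7 +1,7 @@", " a", " a", " a", "-a", "+B", " a", " a", " a"]

header, old_text, new_text = split_hunk(hunk)
print(header)
print("old:", old_text)
print("new:", new_text)
```

The counts in the header match the number of lines recovered for each side, which is a quick consistency check when reading a hunk by hand.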
7. diff in git format
The version control system git uses a variant of the merged-format diff, produced by its git diff command.
The results appear as follows:
The first line indicates that the result is a diff in git format.
The comparison is between F1 in version a (before the change) and F1 in version b (after the change).
The second line gives the git hash values of the two versions (object 6f8a38c in the index, compared with object 449b072 in the working directory); the final six digits are the object's mode (regular file, 644 permissions).
The third part names the two files being compared.
"---" marks the version before the change, and "+++" marks the version after the change.
The remaining lines are the same as in the standard merged-format diff.
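The git-specific header lines can be picked apart in the same way. The following Python sketch is illustrative only: the function name parse_git_header is an assumption, the two hash values are the ones quoted above, and the path F1 is assumed to match the sample file name used in this article.

```python
import re

GIT_RE = re.compile(r"^diff --git a/(\S+) b/(\S+)$")
# "index <old hash>..<new hash> <mode>"; the mode may be absent.
INDEX_RE = re.compile(r"^index ([0-9a-f]+)\.\.([0-9a-f]+)(?: (\d{6}))?$")

def parse_git_header(lines):
    """Extract paths, hashes and mode from the first two header lines."""
    git = GIT_RE.match(lines[0])
    index = INDEX_RE.match(lines[1])
    if git is None or index is None:
        raise ValueError("not a git diff header")
    mode = index.group(3)
    return {
        "old_path": git.group(1),
        "new_path": git.group(2),
        "old_hash": index.group(1),
        "new_hash": index.group(2),
        # First three digits: object type (100 = regular file).
        "file_type": mode[:3] if mode else None,
        # Last three digits: permission bits.
        "permissions": mode[3:] if mode else None,
    }

# Header lines as described above (file name assumed to be F1).
header_lines = [
    "diff --git a/F1 b/F1",
    "index 6f8a38c..449b072 100644",
    "--- a/F1",
    "+++ b/F1",
]

print(parse_git_header(header_lines))
```

Everything after these header lines is an ordinary merged-format hunk and can be read exactly as in the previous section.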
8. Reading materials
* diff (Wikipedia)
* How to read a patch or diff
* How to work with diff representation in git