The Linux diff command in detail

Source: Internet
Author: User
Tags: diff

I recently joined the company and had to compare IP lists offline. I ended up doing it by hand, which was tedious and would not be feasible with a large amount of data.

I tried to use the Linux diff command for the comparison, but found its output confusing. Time was tight, so in the end I did the work by hand.

Now that I have time, I am learning this command properly.

The following content is excerpted from the web:

diff is a very important utility on UNIX systems.

It compares two text files and reports the differences between them, and it is one of the cornerstones of version control. At the command line, you enter:

$ diff <file before change> <file after change>

diff then tells you how the two files differ. Its output is not easy to understand at first, so below I explain how to read it.

I. Three diff formats

For historical reasons, diff has three different output formats:

* Normal format (normal diff)

* Context format (context diff)

* Unified format (unified diff)

We will look at each in turn.

II. Sample files

To make the explanation easier, first create two sample files.

The first file is called f1. Each line contains the single letter a, and there are 7 lines in total.

a
a
a
a
a
a
a

The second file, called f2, is a modified copy of f1: line 4 becomes b and the rest is unchanged.

a
a
a
b
a
a
a
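
As a minimal sketch, the two sample files could also be generated with a few lines of Python instead of being typed by hand. The scratch directory and the variable names are assumptions made for illustration; only the file names f1 and f2 and their contents come from the description above.

```python
import tempfile
from pathlib import Path

# f1: seven lines, each containing the single letter "a".
f1_lines = ["a"] * 7

# f2: a copy of f1 whose 4th line (index 3) is changed to "b".
f2_lines = list(f1_lines)
f2_lines[3] = "b"

# Hypothetical scratch directory, so nothing in the current directory is overwritten.
workdir = Path(tempfile.mkdtemp())
(workdir / "f1").write_text("\n".join(f1_lines) + "\n")
(workdir / "f2").write_text("\n".join(f2_lines) + "\n")

print(workdir)  # run diff on the two files inside this directory
```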

III. Normal format diff

Now compare f1 and f2:

$ diff f1 f2

diff then displays the result in normal format:

4c4
< a
---
> b

The first line is a change command that gives the location of the change.

4c4

It has three parts. The leading "4" means that line 4 of f1 is affected. The middle "c" means the type of change is a content change; the other types are "a" (add) and "d" (delete). The trailing "4" means that after the change the content becomes line 4 of f2.

The second line has two parts.

< a

The leading less-than sign means the line is removed from f1 (that is, line 4), and the "a" after it is the content of that line.

The third line separates the f1 content from the f2 content.

---

The fourth line is similar to the second.

> b

The leading greater-than sign means the line is added in f2, and the "b" after it is the content of that line.
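
The structure of the change command can be sketched in Python as follows. This is only an illustration: the function name parse_change_command and the dictionary keys are hypothetical, and the sketch assumes that a range is written either as a single line number or as "first,last".

```python
import re

# A change command is: <range in file 1><a|c|d><range in file 2>,
# where a range is either "N" or "N,M" (first and last line, inclusive).
COMMAND = re.compile(r"^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$")
KINDS = {"a": "add", "c": "change", "d": "delete"}

def parse_change_command(line):
    """Split a normal-format change command such as '4c4' into its parts."""
    match = COMMAND.match(line)
    if match is None:
        raise ValueError(f"not a change command: {line!r}")
    old_first, old_last, kind, new_first, new_last = match.groups()
    return {
        "kind": KINDS[kind],
        # A single number means the range covers just that one line.
        "old": (int(old_first), int(old_last or old_first)),
        "new": (int(new_first), int(new_last or new_first)),
    }

print(parse_change_command("4c4"))
# {'kind': 'change', 'old': (4, 4), 'new': (4, 4)}
```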

The earliest Unix (that is, the AT&T version of Unix) used diff in this format.

IV. Context format diff

In the early 1980s, when the University of California, Berkeley released its BSD version of UNIX, the developers felt that the normal diff output was too terse and that changes would be easier to understand if the surrounding context were shown. So the context format was introduced.

It is selected with the -c option (for context).

$ diff -c f1 f2

The result looks like this:

*** f1 2012-08-29 16:45:41.000000000 +0800
--- f2 2012-08-29 16:45:51.000000000 +0800
***************
*** 1,7 ****
  a
  a
  a
! a
  a
  a
  a
--- 1,7 ----
  a
  a
  a
! b
  a
  a
  a

The result has four parts.

The first part is two lines giving basic information about the two files: file name and modification time.

*** f1 2012-08-29 16:45:41.000000000 +0800
--- f2 2012-08-29 16:45:51.000000000 +0800

"***" marks the file before the change, and "---" marks the file after the change.

The second part is a line of 15 asterisks, which separates the file information from the changes.

***************

The third part shows the file before the change, namely f1.

*** 1,7 ****
  a
  a
  a
! a
  a
  a
  a

This shows not only the changed line 4 but also the three lines before it and the three lines after it, 7 lines in total. That is why the header reads "*** 1,7 ****": it covers lines 1 through 7 of f1.

In addition, each line of file content begins with a marker. A blank marker means the line is unchanged, an exclamation mark (!) means the line was changed, a minus sign (-) means the line was deleted, and a plus sign (+) means the line was added.
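
The marker rules can be sketched as a small lookup table in Python. In real diff -c output the marker is followed by a space, so each content line starts with a two-character prefix; the names MARKERS and classify are hypothetical and exist only for this illustration.

```python
# Meaning of the two-character prefix at the start of each content line
# in a context-format hunk, following the rules described above.
MARKERS = {
    "  ": "unchanged",
    "! ": "changed",
    "- ": "deleted",
    "+ ": "added",
}

def classify(line):
    """Return (meaning, text) for one content line of a context-format hunk."""
    marker, text = line[:2], line[2:]
    return MARKERS[marker], text

# The f1 part of the hunk produced for the sample files.
old_part = ["  a", "  a", "  a", "! a", "  a", "  a", "  a"]

for number, line in enumerate(old_part, start=1):
    meaning, text = classify(line)
    if meaning != "unchanged":
        print(number, meaning, text)  # prints: 4 changed a
```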

The fourth part shows the file after the change, namely f2.

--- 1,7 ----
  a
  a
  a
! b
  a
  a
  a

Besides the changed line (line 4), three lines of context are shown on each side, 7 lines in total.

V. Unified format diff

If the two files are very similar, context format shows a lot of duplicated content, which wastes space. In 1990 the unified format was introduced, and GNU diff adopted it shortly afterwards. It merges the context of f1 and f2 into a single listing.

It is selected with the -u option (for unified).

$ diff -u f1 f2

The result looks like this:

--- f1 2012-08-29 16:45:41.000000000 +0800
+++ f2 2012-08-29 16:45:51.000000000 +0800
@@ -1,7 +1,7 @@
 a
 a
 a
-a
+b
 a
 a
 a

The first part is again the basic information about the files.

--- f1 2012-08-29 16:45:41.000000000 +0800
+++ f2 2012-08-29 16:45:51.000000000 +0800

"---" marks the file before the change, and "+++" marks the file after the change.

The second part gives the location of the change, enclosed between two pairs of @ signs.

@@ -1,7 +1,7 @@

The leading "-1,7" has three parts: the minus sign stands for the first file (that is, f1), "1" means line 1, and "7" means 7 consecutive lines. Together they mean that what follows covers 7 consecutive lines of the first file, starting at line 1. Similarly, "+1,7" means that after the change the corresponding part of the second file is 7 consecutive lines starting at line 1.
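
As a rough sketch, the position line can be taken apart with a regular expression in Python. The function name parse_hunk_header and the dictionary keys are hypothetical. The sketch also assumes the usual unified-format convention that a missing ",count" means a count of 1.

```python
import re

# "@@ -start,count +start,count @@"; the ",count" part may be omitted.
HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")

def parse_hunk_header(line):
    """Return the start line and line count for both files of a unified hunk."""
    match = HUNK_HEADER.match(line)
    if match is None:
        raise ValueError(f"not a hunk header: {line!r}")
    old_start, old_count, new_start, new_count = match.groups()
    return {
        "old_start": int(old_start),
        "old_count": int(old_count) if old_count is not None else 1,
        "new_start": int(new_start),
        "new_count": int(new_count) if new_count is not None else 1,
    }

print(parse_hunk_header("@@ -1,7 +1,7 @@"))
# {'old_start': 1, 'old_count': 7, 'new_start': 1, 'new_count': 7}
```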

The third part is the actual content of the change.

 a
 a
 a
-a
+b
 a
 a
 a

Besides the changed lines, three lines of context are shown on each side. Because the context of the two files is merged into a single listing, the format is called unified. Each line begins with a marker: a space means the line is unchanged, a minus sign means the line was deleted from the first file, and a plus sign means the line was added in the second file.
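
To illustrate how the markers encode both files at once, here is a minimal Python sketch that rebuilds the old and the new lines from the body of a unified hunk. The function name split_hunk is hypothetical, and the sketch ignores details of real patches such as the "\ No newline at end of file" line.

```python
def split_hunk(body):
    """Rebuild the old and new versions of the lines covered by a unified hunk."""
    old, new = [], []
    for line in body:
        marker, text = line[:1], line[1:]
        if marker == " ":    # context: present in both files
            old.append(text)
            new.append(text)
        elif marker == "-":  # deleted from the first file
            old.append(text)
        elif marker == "+":  # added in the second file
            new.append(text)
        else:
            raise ValueError(f"unexpected marker in {line!r}")
    return old, new

# Body of the hunk produced for the sample files f1 and f2.
hunk_body = [" a", " a", " a", "-a", "+b", " a", " a", " a"]
old, new = split_hunk(hunk_body)
print(old)  # ['a', 'a', 'a', 'a', 'a', 'a', 'a']
print(new)  # ['a', 'a', 'a', 'b', 'a', 'a', 'a']
```

Both lists have 7 entries, which matches the two counts in the "@@ -1,7 +1,7 @@" line.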

VI. Git format diff

The version control system Git uses a variant of the unified format.

$ git diff

The result looks like this:

diff --git a/f1 b/f1
index 6f8a38c..449b072 100644
--- a/f1
+++ b/f1
@@ -1,7 +1,7 @@
 a
 a
 a
-a
+b
 a
 a
 a

The first line indicates that the result is a diff in git format.

diff --git a/f1 b/f1

The comparison is between version a of f1 (before the change) and version b of f1 (after the change).

The second line gives the Git hash values of the two versions (object 6f8a38c in the index, compared with object 449b072 in the working directory). The last six digits are the object's mode (regular file, permissions 644).

index 6f8a38c..449b072 100644
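
The first two header lines, which git writes as "diff --git a/f1 b/f1" and "index 6f8a38c..449b072 100644", can be picked apart with a short Python sketch. The function name parse_git_header and the dictionary keys are hypothetical. The sketch also assumes file names without spaces and an index line that ends with a six-digit mode, as in this example.

```python
import re

DIFF_LINE = re.compile(r"^diff --git a/(\S+) b/(\S+)$")
INDEX_LINE = re.compile(r"^index ([0-9a-f]+)\.\.([0-9a-f]+) (\d{6})$")

def parse_git_header(diff_line, index_line):
    """Extract paths, abbreviated hashes and file mode from a git diff header."""
    diff_match = DIFF_LINE.match(diff_line)
    index_match = INDEX_LINE.match(index_line)
    if diff_match is None or index_match is None:
        raise ValueError("unrecognized git diff header")
    old_hash, new_hash, mode = index_match.groups()
    return {
        "old_path": diff_match.group(1),
        "new_path": diff_match.group(2),
        "old_hash": old_hash,
        "new_hash": new_hash,
        "mode": mode,
        "permissions": mode[-3:],  # last three digits, e.g. 644
    }

header = parse_git_header("diff --git a/f1 b/f1", "index 6f8a38c..449b072 100644")
print(header["old_hash"], header["new_hash"], header["permissions"])
# 6f8a38c 449b072 644
```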

The third and fourth lines name the two files being compared.

--- a/f1
+++ b/f1

"---" marks the version before the change, and "+++" marks the version after the change.

The remaining lines are the same as in standard unified format.

@@ -1,7 +1,7 @@
 a
 a
 a
-a
+b
 a
 a
 a


This article is reposted from http://www.ruanyifeng.com/blog/2012/08/how_to_read_diff.html
