Patch vs. diff

Source: Internet
Author: User

I. Overview

diff and patch are a pair of complementary tools. In mathematical terms, diff resembles taking the difference of two sets, and patch resembles taking their sum. diff compares two files, or two collections of files, and records the differences in a diff file, which is what we usually call a patch file, or simply a patch. patch applies a diff file to one of the two original collections to produce the other.

For example, given file A and file B, diff produces a patch file C. This is equivalent to |A - B| = C. Applying the patch is then B + C = A or A - C = B (or, with the roles swapped, A + C = B or B - C = A). So as long as we have any two of the three files A, B, and C, diff and patch can produce the third. This is the beauty of diff and patch.
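The relationship above can be tried out with a short script. This is a minimal sketch that assumes GNU diff and patch are installed; the file names a, b, c, a_patched, and b_reverted are made up for the illustration.

```shell
# Work in a throwaway directory (all file names here are illustrative).
work=$(mktemp -d)
cd "$work"

printf 'one\ntwo\nthree\n' > a
printf 'one\n2\nthree\n'   > b

# C = diff(A, B). diff exits with status 1 when the files differ,
# so "|| true" keeps the script going if "set -e" is in effect.
diff -u a b > c || true

# A + C = B: apply the patch to a copy of a.
cp a a_patched
patch a_patched c

# B - C = A: reverse-apply the same patch to a copy of b.
cp b b_reverted
patch -R b_reverted c

cmp a_patched b  && echo "a + c == b"
cmp b_reverted a && echo "b - c == a"
```

The copies are there only so that the originals stay available for the comparison at the end; in normal use patch modifies the file in place.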

patch was invented by Larry Wall, the gifted programmer who also created Perl. It answered the need to exchange program source code efficiently. As open-source projects flourished, with Linux development as the prime example, the concept of the patch became part of the collective unconscious of open-source initiators, contributors, and participants. A patch contains only the modified parts of the source code. This matters for the collaborative development model of the open-source community: new releases, bug fixes, and improvements can be published as much smaller files, which reduces network traffic and makes life easier for software maintainers.

Patch files come in several formats, and the formats supported vary between platforms, but the most common are the context format and the unified format. The context format is widely used and is the de facto standard for patch files. It contains the changed lines together with several adjacent lines, called the context. The context lines are not changed themselves, but including them in the patch file makes applying the patch more tolerant of small shifts in the target file. The unified format is the one commonly used with the GNU patch implementation, and Linux kernel patches are published in this format.
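To see how the two formats differ, the same change can be written out both ways. A minimal sketch, assuming GNU diff; the file names old.txt, new.txt, context.patch, and unified.patch are invented for the example.

```shell
work=$(mktemp -d)
cd "$work"

printf 'alpha\nbeta\ngamma\ndelta\nepsilon\n' > old.txt
printf 'alpha\nbeta\nGAMMA\ndelta\nepsilon\n' > new.txt

# Context format: changed lines are marked with "!", and the old and new
# versions of the block are listed one after the other.
diff -c old.txt new.txt > context.patch || true

# Unified format: a single block in which "-" marks removed lines and
# "+" marks added lines.
diff -u old.txt new.txt > unified.patch || true

cat context.patch
echo
cat unified.patch
```

In both files the unchanged neighbours of the modified line (alpha, beta, delta, epsilon) appear as context.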

There are also other, less-used formats, such as the normal format, side-by-side mode, ed scripts, and RCS scripts. Side-by-side mode exists so that users can view file differences easily; most of the other formats exist for compatibility with older patch programs.

II. Using the tools

1. Usage of diff

diff takes two file names or two directory names as arguments, for example:

diff [options] oldfile newfile

If one argument is a directory name and the other is a file name, diff compares the file with the file of the same name in that directory. For example:

diff /usr/xu mine

compares the file named mine in the directory /usr/xu with the file mine in the current directory.

The most commonly used diff options are:

-r  when comparing directories, compare recursively; used to generate a patch for an entire code tree

-u  output the unified format. diff has a "traditional" and a "unified" format, and the unified format is now the usual choice. Its output is larger than that of the traditional format, but it contains more information, which makes it easier to read and to locate changes

-N  treat a file that does not exist as an empty file; used to produce patches that add or delete files

-a  treat all files as text, so that the patch can include binary files

By default diff prints to standard output, so the output is usually redirected to a file with a .patch suffix, known as a patch file.

A note on binary files: binary content can be stored in a patch file as-is. diff can generate such a patch (with the -a option) and patch can recognize it. If you find such a patch file too ugly, one workaround is to process the binary files with uuencode first.

If both arguments are directories, diff compares all the files directly inside them, but not recursively. For a recursive comparison, use the -r option. Without any options, diff produces a simple format that marks only the line numbers and content that differ. A more detailed format that includes the context around each difference helps the patch command locate the changes; use the -c switch for that. Table 1 lists the command-line options and parameters of diff.

Table 1 diff command-line options and parameters

Option        Description
-a            treat all files as text, even if a file looks binary, and compare line by line
-b            ignore changes in the amount of whitespace
-B            ignore changes that only insert or delete blank lines
-c            produce output in the context format, with 3 lines of context before and after each block
-C num        produce output in the context format, with num lines of context before and after each block
-H            change how diff handles large files
-i            ignore case
-I regexp     ignore inserted or deleted lines that match the regular expression regexp
-l            pass the output through the pr command to add page numbers
-p            show the C function in which each block appears
-q            report only whether the files differ
-r            when comparing directories, compare recursively; used to produce a patch for an entire code tree
-s            report when two files are identical (by default identical files are not reported)
-t            expand tabs to spaces in the output
-u            produce output in the unified format, with 3 lines of context before and after each block
-U num        produce output in the unified format, with num lines of context before and after each block
-v            print the diff version number
-w            ignore whitespace when comparing lines
-W cols       in side-by-side output (see -y), limit the output to cols characters in width
-x pattern    when comparing directories, ignore files and subdirectories whose names match pattern
-y            produce output in side-by-side format

Example:

hello.c:

#include <stdio.h>
int main(void) { printf("Hello the world!\n"); return 0; }

hello-new.c:

#include <stdio.h>
int main(void) { printf("HELLO the World!\n"); return 0; }

Use the following command to generate the patch file hello.patch:

$ diff -u hello.c hello-new.c > hello.patch

diff can also compare entire directories and generate a patch from them. For example, given two directories hello-1.0 and hello-1.1, where hello-1.1 is the updated version of hello-1.0, the command is:

$ diff -ruNa hello-1.0 hello-1.1 > hello-1.1.patch
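The effect of -N in a directory comparison is easiest to see with a file that exists in only one tree. The following is a sketch under the assumption that GNU diff is used; the file NOTES and the output names without_N.patch and with_N.patch are hypothetical.

```shell
work=$(mktemp -d)
cd "$work"

mkdir hello-1.0 hello-1.1
printf 'int x = 1;\n' > hello-1.0/hello.c
printf 'int x = 2;\n' > hello-1.1/hello.c
printf 'New in 1.1\n' > hello-1.1/NOTES    # exists only in the new tree

# Without -N, a file present on one side only is merely reported by name.
diff -ru  hello-1.0 hello-1.1 > without_N.patch || true

# With -N, the missing side is treated as an empty file, so the content
# of the new file is carried in the patch.
diff -ruN hello-1.0 hello-1.1 > with_N.patch || true

# Count the patch lines that carry the new file's content.
grep -c '^+New in 1.1$' without_N.patch || true
grep -c '^+New in 1.1$' with_N.patch    || true
```

A patch made without -N cannot create NOTES on the receiving side, which is why whole-tree patches are normally generated with it.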

2. Using diff and patch together

patch generates the target file from the original file and a patch file. For example:

patch A C

produces B. This step is called patching A to B, and the patch is named C. After this step, your file A has become file B. What if you want to get back to A after patching?

patch -R B C

restores A. So there is no need to worry about losing A.

In practice you usually do not need to name the original file when running patch, because the patch file already records the path and name of the original file, and patch is smart enough to pick them up. Sometimes, though, this causes a small problem.

For example, a diff of two directories records the names of the original directories, but when applying the patch we often change into the directory first and run patch there. In that case you need to tell patch how to treat the paths in the patch file. The -pN switch (N is a number) tells patch how many leading path components to strip. An example:

The original file a is under dir_a, and its modified version is under dir_b. dir_a and dir_b normally sit side by side in a parent directory dir_p. To diff all the files in both directories in one go, we typically run the following command in dir_p:

diff -rc dir_a dir_b > C

The patch file C then records the path of the original file as dir_a/a.

Now another user has obtained file a and file C, and his copy of a also lives in a directory named dir_a. He would normally prefer to run patch inside dir_a, like this:

patch < C

The paths recorded in C start with dir_a/ (for example dir_a/a), but relative to the current directory the file is ./a. Without a -p option, patch ignores the directory part entirely and looks only for the bare file name, so a file at the top level may be found by luck, but a file in a subdirectory will not be. To make the recorded paths line up with the current directory, use the -p1 option:

patch -p1 < C

patch now strips everything up to and including the first "/" in each path and takes the original file to be ./a, which is correct.
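The scenario above can be reproduced end to end. This sketch assumes GNU diff and patch; the subdirectory sub and the directory name received are additions made for the demonstration, chosen so that the path stripping actually matters.

```shell
work=$(mktemp -d)
cd "$work"

mkdir -p dir_p/dir_a/sub dir_p/dir_b/sub
printf 'old line\n' > dir_p/dir_a/sub/a
printf 'new line\n' > dir_p/dir_b/sub/a

# Create the patch from the parent directory, as in the text.
# The recorded paths are dir_a/sub/a and dir_b/sub/a.
(cd dir_p && diff -rc dir_a dir_b > ../C) || true

# The receiving user has only a copy of the original tree.
cp -R dir_p/dir_a received

# Inside that tree, -p1 strips the leading "dir_a/" so that the recorded
# path "dir_a/sub/a" is looked up as "sub/a".
(cd received && patch -p1 < ../C)

cat received/sub/a
```

Had the patch been created from one level higher, so that the paths read dir_p/dir_a/sub/a, the matching option would be -p2.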

patch comes with good built-in help that lists many options, but 99% of the time two of them are all we need:

patch -p1 < [patchfile]

patch -R < [patchfile] (used to undo a patch)

The -p1 option strips one leading directory level from the file paths recorded in the patch file, since the name of the top-level directory differs from machine to machine. To use it, put the patch in the directory you want to patch, and then run patch -p1 < [patchfile] in that directory.

The files being compared can also come from standard input: if file1 or file2 is given as "-", diff reads it from standard input.

Examples:

cat build.xml | diff -y -W 100 - build-1.10.xml

Prints the differences between build.xml and build-1.10.xml to the screen (standard output) side by side, with the output limited to 100 characters in width.

diff -c web.xml web2.xml > web.xml.diff

Writes a context-format patch of web2.xml relative to web.xml to web.xml.diff.

diff -crN src src_xfire > xfire-patch.diff

Writes a context-format patch of the code tree src_xfire relative to the code tree src to xfire-patch.diff. The contents of files that are new in src_xfire are also included in the patch.

Patches are applied with the command-line tool patch. Its basic usage is:

patch -pnum < patchfile

Before patching, change the working directory to the top-level directory of the source code to be patched (note: this should correspond to the directory in which diff was run when the patch was generated, adjusted by the -p level).

3. How to determine the number after -p

This number is the number of leading directory levels to strip from the paths in the patch file. It depends on where the working directory was relative to the code directory when the patch was created, and the patch author usually states it in the documentation accompanying the patch. If it is not stated, you can work it out by comparing the full paths listed in the patch file with the relative paths of the same files in your code tree.
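One way to do that comparison is to print the header lines of the patch and then try the candidate level without modifying anything. The sketch below assumes GNU diff and patch (the --dry-run option is specific to GNU patch); the tree names project-1.0 and project-1.1 and the file fix.patch are hypothetical.

```shell
work=$(mktemp -d)
cd "$work"

mkdir -p project-1.0/src project-1.1/src
printf 'v1\n' > project-1.0/src/main.c
printf 'v2\n' > project-1.1/src/main.c
diff -ruN project-1.0 project-1.1 > fix.patch || true

# The "---" and "+++" header lines show the paths recorded in the patch.
grep -E '^(---|\+\+\+) ' fix.patch

# The recorded path is project-1.0/src/main.c. Seen from inside the source
# tree the file is src/main.c, so one leading component has to go: -p1.
# --dry-run reports what would happen without touching any file.
cd project-1.0
patch -p1 --dry-run < ../fix.patch
```

If the dry run reports that it cannot find the file to patch, the level is wrong and the next candidate can be tried the same way.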

4. If patch fails

When patch succeeds, by default it does not create a backup file (note: the patch tool on FreeBSD saves a backup by default). If you want one, add the -b switch; the file as it was before modification is then saved as "originalfilename.orig". If you prefer a different suffix, older versions of patch accept "-b suffix", and GNU patch uses "-z suffix" for this.

If patching fails, patch still applies the hunks that do apply and, regardless of the backup setting, generates a backup file and a .rej file. The backup of the original file is named "originalfilename.orig" (the backup naming options mentioned above apply here as well). The .rej file contains the hunks that could not be applied, which have to be applied by hand. This situation can arise when the original code has been upgraded.
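A failed hunk can be provoked on purpose to see what patch leaves behind. This is a sketch assuming GNU patch; the file names orig.txt, new.txt, target.txt, and change.patch are made up, and target.txt stands in for a file that changed upstream after the patch was written.

```shell
work=$(mktemp -d)
cd "$work"

printf 'line 1\nline 2\nline 3\n'   > orig.txt
printf 'line 1\nline two\nline 3\n' > new.txt
diff -u orig.txt new.txt > change.patch || true

# Simulate an upgrade that rewrote the whole file, so the hunk has
# nothing to match against.
printf 'completely\ndifferent\ncontent\n' > target.txt

# patch exits with a non-zero status when a hunk is rejected.
if patch target.txt change.patch; then status=0; else status=$?; fi
echo "patch exit status: $status"

# The rejected hunk is saved next to the target file.
ls target.txt*
```

The .rej file holds the hunk that could not be placed, so it can be read and merged into target.txt by hand.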

5. Making changes to an entire package

Often we need to modify an entire package and generate a patch file from the result. The typical procedure, which uses several of the switches described earlier, is:

tar xzvf software.tar.gz          # unpack the original package; its directory is software

cp -a software software-orig      # make a backup before changing anything

cd software

[modify, test ...]

cd ..

diff -ruNa software-orig software > software-my.patch

Now we can keep software-my.patch as the result of this round of changes; the original package itself need not be kept. The next time changes are needed, apply the patch to a fresh copy of the original package with the patch command and carry on working. For work on the Linux kernel, for example, this avoids saving tens of megabytes of modified source code each time.

That is the first benefit. The second is ease of maintenance: because the unified patch format allows a degree of fuzzy matching, it reduces the maintenance work caused by upgrades to the original package.
