The first GitHub pitfall: automatic line break conversion

Source: Internet
Author: User

How it started

I had always wanted to publish projects on GitHub and contribute to other people's, but Git is hard to learn. I bought the book "Git Authoritative Guide", turned a few pages, and was overwhelmed by how complex it was. With Cygwin and the command line on top of that, I was too scared to go on.

Then one day I found that GitHub has a Windows client and decided to give it a try. It worked well: I could create a project on GitHub without learning much about Git's principles or commands. My first attempt at contributing to an open source project, however, turned into quite a spectacle.

What happened

I carefully forked the Eventproxy project by Pauling (@JacksonTian), made my changes locally, synced them to the server, and excitedly opened a pull request... Then I noticed a problem: the diff did not show just the lines I had modified. The entire file was marked as changed.

That looked very strange, so I hurriedly withdrew the pull request and went looking for the cause myself.

My first suspicion was the file's line breaks: my local file used Windows line breaks, while the project clearly used UNIX line breaks. That was a strong lead, so I repeatedly compared the web and local copies of each version and eventually narrowed the problem down.

Background

Each operating system uses a different line break in text files. UNIX/Linux uses 0x0A (LF). Early Mac OS used 0x0D (CR), and OS X became consistent with UNIX after its kernel was replaced. DOS/Windows has always used 0x0D 0x0A (CRLF). (I don't know what Bill Gates was thinking. Compatibility in both directions?)
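To see which style a file really uses, you can count the raw bytes yourself. Below is a minimal sketch, assuming a POSIX shell (Git Bash on Windows should do). The helper name count_eol and the sample file names are made up for illustration.

```shell
# count_eol FILE: print how many CR (0x0D) and LF (0x0A) bytes FILE contains.
# count_eol is a hypothetical helper name, not a Git or system command.
count_eol() {
    cr=$(tr -cd '\r' < "$1" | wc -c | tr -d ' ')
    lf=$(tr -cd '\n' < "$1" | wc -c | tr -d ' ')
    echo "CR=$cr LF=$lf"
}

# Build two tiny sample files, one per style.
sample_dir=$(mktemp -d)
printf 'one\ntwo\n' > "$sample_dir/unix.txt"          # LF only
printf 'one\r\ntwo\r\n' > "$sample_dir/windows.txt"   # CRLF

count_eol "$sample_dir/unix.txt"      # prints CR=0 LF=2
count_eol "$sample_dir/windows.txt"   # prints CR=2 LF=2
```

A file where the CR count equals the LF count is most likely pure Windows style, and a file with no CR at all is UNIX style. Anything in between deserves a closer look.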

This inconsistency does cause problems when files move between platforms. Decent text editors and IDEs support all of these line breaks, but a stored file still has to follow one fixed standard. For the source code of a cross-platform project, which line break style should be saved?

Git, as a version control system, offers a "solution" to this problem with what looks to me like a slightly too clever attitude.

Git was developed by the famous Linus and at first ran only on *nix systems, so it recommends storing only UNIX-style line breaks. It also takes cross-platform collaboration into account and provides an automatic line break conversion feature.

With a default installation on Windows, this feature is in automatic mode: when you check out a file, it tries to replace UNIX line breaks (LF) with Windows line breaks (CRLF), and when you commit a file, it tries to replace CRLF with LF.
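The two directions of the conversion can be imitated with a couple of small filters. This is only a sketch of the idea under my own assumptions, not Git's actual implementation. The names to_crlf and to_lf are hypothetical.

```shell
# Hypothetical helpers that imitate the two directions of the conversion.
# They are stand-ins for illustration, not what Git runs internally.

# LF -> CRLF, roughly what happens to a text file on checkout
to_crlf() {
    awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }'
}

# CRLF -> LF, roughly what happens to a text file on commit
to_lf() {
    awk '{ sub(/\r$/, ""); print }'
}

# Round trip: dump the bytes that would reach the repository.
printf 'a\nb\n' | to_crlf | to_lf | od -c
```

When both directions work, the round trip gives back exactly the bytes that were stored. The trouble described in this article starts when the second step silently does not happen.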

(Did you catch that? A version control system changes your files without telling you. Really cool, isn't it?)

Defects

Git's automatic line break conversion sounds smart and considerate: it tries to keep files in the repository consistent (UNIX style) while keeping local files compatible (Windows style). Unfortunately, the feature has a bug that is unlikely to be fixed in the short term.

The problem is this: if the file you are working on is UTF-8 encoded and contains Chinese characters, the conversion does not take effect at commit time (conversion at checkout works fine). I suspect the module simply gives up when it meets the combination of Chinese characters and CRLF.

This may not be the only trigger (I don't have the energy to test it exhaustively), but one pit like this is enough.

Falling into the pit

This is a pretty big pit, and Chinese developers on Windows will almost always fall into it. For example, you check out a file on Windows with Git in its default state, write a Chinese comment (or the file already contains Chinese), then save and commit... Without noticing, you have damaged the file.

Because the conversion did not happen at commit time, the file you submitted to the repository is now entirely Windows style (UNIX line breaks became Windows line breaks at checkout and were never converted back). Every line counts as modified (see the beginning of this article), yet the modification is invisible, since most diff tools do not show line breaks clearly. In the end nobody can tell what you actually changed in this commit.

And it's not over. If kind teammates notice the problem and change the file back, and then you repeat the tragedy, the editing history of the file becomes basically unreadable.

Developers outside China are very unlikely to step into this pit, which keeps the bug obscure. But a casual search online shows that I am not the only victim. For example, this fellow's experience was even more painful than mine.
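When a diff marks the whole file as changed, a quick way to confirm the diagnosis is to compare the two versions with line breaks normalized. A minimal sketch, assuming a POSIX shell; same_ignoring_eol and the sample file names are hypothetical.

```shell
# same_ignoring_eol A B: succeed when files A and B differ only in line breaks.
# Hypothetical helper; it strips a trailing CR from every line before comparing.
same_ignoring_eol() {
    norm_a=$(mktemp)
    norm_b=$(mktemp)
    awk '{ sub(/\r$/, ""); print }' "$1" > "$norm_a"
    awk '{ sub(/\r$/, ""); print }' "$2" > "$norm_b"
    cmp -s "$norm_a" "$norm_b"
    result=$?
    rm -f "$norm_a" "$norm_b"
    return "$result"
}

work=$(mktemp -d)
printf 'x = 1\ny = 2\n' > "$work/in_repo.txt"         # LF, as stored
printf 'x = 1\r\ny = 2\r\n' > "$work/committed.txt"   # CRLF, as submitted

if same_ignoring_eol "$work/in_repo.txt" "$work/committed.txt"; then
    echo "only the line breaks changed"
else
    echo "the content changed too"
fi
```

Inside a repository, reasonably recent Git versions offer something similar with git diff --ignore-cr-at-eol, which hides changes that consist only of a CR at the end of a line.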

Prevention

First, don't rush to tinker with Git; get your own house in order. Your team needs to agree on one line break standard (UNIX style is recommended). Then each team member configures their own code editor and IDE to meet both of these requirements:

    • Use the team's line break standard by default when creating a new file
    • Keep existing line break formatting unchanged when opening a file (do not convert automatically)

On the one hand, this keeps the project's code as consistent as possible. On the other, even if some nonconforming files remain in the existing code, repeated conversions will not cause confusion. (Of course, being obsessive about such things, I wish every project ran in a rigorous, orderly way from the very start.)

Next, we can set up Git. My advice is to turn this clever automatic line break conversion off completely. With it off, Git does nothing to your line breaks, and you control the line break style entirely yourself, with predictable results.

The steps for each Git client are described below.

Git for Windows

This one is the official Git build. During installation it pitches the automatic line break conversion feature to you, and I estimate that most people pick the first option (automatic conversion) without hesitation after reading the glowing description. Please resist the temptation and choose the last option (which leaves your files untouched).

If you have already made the wrong choice, there is no need to reinstall; you can change the setting from the command line. Open Git Bash, enter the following command, and press Enter:

git config --global core.autocrlf false

TortoiseGit

Many people coming from TortoiseSVN will probably choose TortoiseGit as their main client, and it needs configuring too. Right-click in a Windows Explorer window, choose "TortoiseGit → Settings → Git", and set the line break options there so that automatic conversion is off.

(TortoiseGit is actually a GUI shell on top of Git for Windows, so the settings made in the previous section affect the state of these options, and they may already be what you want.)

GitHub's Windows Client

This is today's second defendant. It is very easy to use and well suited to beginners; I mainly use it to clone projects to my machine with one click. Perhaps to keep its simple, friendly image, it does not offer rich options the way TortoiseGit does (details such as automatic line break conversion are completely hidden, so a beginner like me gets burned without ever knowing why...). So we have to modify its configuration by hand.

GitHub's Windows client is actually a shell that ships with a portable version of Git for Windows. This portable version and a Git for Windows that you install yourself are independent of each other, but both use the same configuration file (the .gitconfig file in the current user's home directory).

So if you have already configured your own Git for Windows, there is nothing to worry about. But if only GitHub's Windows client is installed on your machine, the simplest way to configure it is to edit the configuration file by hand.

Modify the global configuration file for Git

Go to the current user's home directory (usually C:\Documents and Settings\yourname on XP, and C:\Users\yourname on Vista and Win7) and open the .gitconfig file with your favorite text editor.

Find autocrlf in the [core] section and change its value to false. If it is not there, add the line to the [core] section, so that the section ends up containing:

[core]
    autocrlf = false

In fact, all the command-line and GUI methods described above have the same final effect, because they all ultimately modify this configuration file.

One more thing

Having turned off Git's automatic line break conversion and lost its "protection", you may feel a bit insecure. You might ask: what if I accidentally mix a few Windows line breaks into a file? Can such accidents be prevented?

In fact, Git really can help you catch this mistake. It provides a line break check (core.safecrlf) that checks at commit time whether a file mixes different styles of line breaks. (Git's own documentation describes it as a check that line break conversion is reversible, so it is worth confirming how it behaves on your setup once conversion is off.) Its options are:

    • false: do not check
    • warn: check on commit and warn
    • true: check on commit and reject the commit if mixed line breaks are found

I recommend the strictest option, true.

As with core.autocrlf, you can change this option in three ways: the command line, the graphical interface, or the configuration file. I won't repeat the steps; you can work them out by analogy.
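For the command-line route, the two settings can be applied and read back as follows. This is a small sketch rather than an official recipe: the wrapper names apply_line_break_settings and show_line_break_settings are hypothetical, while the git config calls inside them are the ones discussed in this article.

```shell
# apply_line_break_settings is a hypothetical wrapper name; the two
# git config calls inside it are the settings recommended in this article.
apply_line_break_settings() {
    git config --global core.autocrlf false   # no automatic conversion
    git config --global core.safecrlf true    # strictest check
}

# show_line_break_settings prints the values currently stored globally.
show_line_break_settings() {
    echo "core.autocrlf=$(git config --global --get core.autocrlf)"
    echo "core.safecrlf=$(git config --global --get core.safecrlf)"
}
```

Run apply_line_break_settings once in Git Bash, then show_line_break_settings to confirm that both values ended up in your .gitconfig.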

Finally

You might also ask: what if my editor accidentally converts the whole file's line breaks to another format? Can that be prevented?

That... I can't help you with. So I recommend paying closer attention to the file status before you commit:

If the number of changed lines is unexpectedly large and the numbers of added and deleted lines are the same, be on guard for exactly this accident. People spoiled by GUIs often lack patience, ignore what the system tells them, and click whatever button they see, which easily lets a small oversight turn into a big accident. So the experts' preference for the command line is not unreasonable.
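The "same number of added and deleted lines" check can be automated on top of git diff --numstat, which prints added lines, deleted lines and the path separated by tabs. A rough sketch; the helper name flag_eol_suspects and the default threshold of 20 lines are my own assumptions.

```shell
# flag_eol_suspects [MIN]: read `git diff --numstat` lines on stdin and print
# the paths whose added and deleted counts are equal and at least MIN (default 20).
# Both the helper name and the threshold are assumptions for illustration.
flag_eol_suspects() {
    awk -F '\t' -v min="${1:-20}" \
        '$1 != "-" && $1 == $2 && $1 + 0 >= min + 0 { print $3 }'
}

# Sample numstat output: added<TAB>deleted<TAB>path
printf '120\t120\tsrc/app.js\n3\t1\tREADME.md\n' | flag_eol_suspects
# prints src/app.js
```

Before a real commit you could run git diff --cached --numstat | flag_eol_suspects. A file listed there is not necessarily broken, but it is worth a second look before you press the button.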
