Git in Detail: Getting Started

Source: Internet
Author: User

This chapter covers what you need to know before you start using Git. We begin with some background on version control tools, then get Git running on your system and finish the initial configuration so that you can start working. By the end of the chapter you should understand why Git is so popular and why you should start using it.


1.1 About Version Control

What is version control, and do you really need it? Version control is a system that records changes to one or more files over time so that you can recall specific versions later. The examples in this book only version text files containing software source code, but in fact you can put any type of file under version control.

If you are a graphic or web designer and want to keep every revision of an image or page layout (which you most likely do), using a version control system (VCS) is a wise choice. It lets you revert a file to a previous state, or roll the entire project back to its state at a certain point in time. You can compare changes in detail, see who last modified something that might be causing a strange problem, and find out who introduced a defect and when. Using a VCS also generally means that if you break things or delete files anywhere in the project, you can easily recover them, and all of this comes at very little extra cost.

Local Version Control System

Many people keep different versions of their work by copying the entire project directory, perhaps adding the backup time to the name to tell the copies apart. The only benefit of this approach is simplicity. It has plenty of drawbacks: it is easy to lose track of which directory you are working in, and once you write to the wrong file or overwrite files by mistake, there is no way to undo the damage.

To solve this problem, local version control systems were developed long ago. Most of them use a simple database to record the differences between successive revisions of each file (see Figure 1-1).

Figure 1-1. Local Version Control System

One of the most popular of these tools was RCS, which is still found on many computers today. Even the popular Mac OS X operating system includes the rcs command once you install the Developer Tools. RCS works by saving and managing patches. A patch is a text file in a specific format that records how a file's content changed from one revision to the next. By applying these patches one after another, RCS can reconstruct the content of any version of the file.

Centralized Version Control System

The next major problem people ran into was how to let developers on different systems work together. Centralized version control systems (CVCS) were developed to solve it. These systems, such as CVS, Subversion, and Perforce, have a single centrally managed server that stores every revision of all files; collaborators connect to it through a client to check out the latest files or commit their updates. For many years this has been the standard practice for version control (see Figure 1-2).

Figure 1-2. Centralized Version Control System

This approach brings many benefits, especially compared with the older local VCS. Everyone can see, to some extent, what the other people on the project are doing. Administrators can easily control what each developer is allowed to do, and administering a CVCS is far easier than maintaining a local database on every client.

However, this setup also has serious downsides. The most obvious is the single point of failure that the central server represents. If the server goes down for an hour, then during that hour nobody can commit updates or collaborate at all. If the central server's disk fails and there is no backup, or the backup is not recent enough, you risk losing data. In the worst case you lose the entire history of the project, except for whatever snapshots people happen to have checked out on their own machines, and even then there is no guarantee that everything was checked out beforehand. Local version control systems suffer from the same problem: whenever the entire history of a project is kept in a single place, you risk losing all of it.

Distributed Version Control System

This is where distributed version control systems (DVCS) come in. In a DVCS such as Git, Mercurial, Bazaar, or Darcs, the client does not just check out the latest snapshot of the files; it fully mirrors the original repository. If any server used for collaboration fails, it can be restored from any client's mirrored copy, because every checkout is really a complete backup of the repository (see Figure 1-3).

Figure 1-3. Distributed Version Control System

Furthermore, many of these systems can work with several remote repositories at once, so you can collaborate with different groups of people in different ways within the same project. This lets you set up workflows as needed, such as hierarchical models, that are not possible in centralized systems.



1.2 A Brief History of Git

Like many great things in life, Git was born in a period of fierce controversy and great innovation. The Linux kernel is an open source project with a very large number of contributors. For most of the early life of the project, much of the maintenance effort went into the tedious work of passing changes around as patches and archived files. In 2002, the project began using the distributed version control system BitKeeper to manage and maintain the code.

In 2005, the relationship between the commercial company that developed BitKeeper and the Linux kernel community broke down, and the company revoked the right to use BitKeeper free of charge. This pushed the Linux development community (and in particular Linus Torvalds, the creator of Linux) to draw on the lessons learned and develop a version control system of its own. They set a number of goals for the new system:

* Speed
* Simple design
* Strong support for non-linear development (thousands of parallel branches)
* Fully distributed
* Ability to handle very large projects like the Linux kernel efficiently (speed and data size)

Since its birth in 2005, Git has matured and become easy to use while still holding to these original goals. It is fast, very well suited to large projects, and has an incredible branching system for non-linear development (see Chapter 3) that can cope with complex project needs.




1.3 Git Basics

So, what is Git in a nutshell? This section is important: if you understand the ideas behind Git and the basics of how it works, using it will be much easier. As you learn Git, try not to map its concepts onto other version control systems such as Subversion and Perforce; doing so makes it easy to confuse what each operation really means. Although Git's commands look quite similar to those of other systems, it stores and thinks about information very differently. Understanding these differences will help you use Git's tools accurately.

Snapshots, Not Differences

The main difference between Git and other version control systems is that Git cares about the file data as a whole, while most other systems care about the specific differences in file content. Such systems (CVS, Subversion, Perforce, Bazaar, and so on) record, for each update, which files changed and which lines changed in them (see Figure 1-4).

Figure 1-4. Other systems record the specific differences between files in each version

Git does not store its data as differences between versions. Instead, it takes a snapshot of the files and records it in a miniature filesystem. Every time you commit an update, Git looks at the fingerprint of every file, takes a snapshot of what the files look like at that moment, and stores a reference to that snapshot. To be efficient, if a file has not changed, Git does not store it again; it just links to the identical version it has already stored. Git works as shown in Figure 1-5.

Figure 1-5. Git saves the file snapshot for each update

This is an important distinction between Git and other systems. It makes Git reconsider almost every aspect of traditional version control. Git is more like a small filesystem with powerful tools built on top of it than a simple VCS. We will look at the benefits of this design when we cover branching in Chapter 3.
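The snapshot behavior described above can be observed directly. The following is a minimal sketch, assuming Git and a POSIX shell are available; the temporary repository, the file names, and the identity "Example User" are made up for illustration. It commits twice and compares the object IDs Git recorded for each file in the two snapshots.

```shell
# Create a throwaway repository (directory and file names are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'stable\n' > unchanged.txt
printf 'version 1\n' > changed.txt
git add .
git commit -q -m "first snapshot"

# Change only one of the two files, then commit again.
printf 'version two\n' > changed.txt
git commit -q -a -m "second snapshot"

# Object IDs of each file as recorded in the two snapshots.
old_same=$(git rev-parse HEAD~1:unchanged.txt)
new_same=$(git rev-parse HEAD:unchanged.txt)
old_diff=$(git rev-parse HEAD~1:changed.txt)
new_diff=$(git rev-parse HEAD:changed.txt)

echo "unchanged.txt: $old_same -> $new_same"
echo "changed.txt:   $old_diff -> $new_diff"
```

The unchanged file should show the same ID in both snapshots, meaning the second commit points at the object already stored, while the edited file gets a new ID.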

Almost All Operations Are Local

Most operations in Git need only local files and resources, with no network connection required. With a CVCS, by contrast, almost every operation has to go over the network. Because Git keeps the entire history of the project on your local disk, most operations are fast.

For example, to browse the history of a project, Git does not need to fetch anything from a remote server; it reads the history from your local database and displays it, so you can see it at any time without waiting. If you want to see what changed between the current version of a file and the version from a month ago, Git looks up the month-old snapshot and computes the difference locally, instead of asking a remote server to do it or downloading the older version of the file for a local comparison.
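As a rough illustration of these two local operations, here is a small sketch that assumes Git is installed. The repository, the file notes.txt, and the backdated commit date are invented for the example; the first commit is backdated only so that a commit older than one month exists.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Backdate the first commit so that it is older than a month.
printf 'old text\n' > notes.txt
git add notes.txt
GIT_AUTHOR_DATE="2020-01-01T12:00:00" GIT_COMMITTER_DATE="2020-01-01T12:00:00" \
    git commit -q -m "old version"

printf 'newer text\n' > notes.txt
git commit -q -a -m "current version"

# History summary, read entirely from the local database.
git --no-pager log --oneline

# Find the latest commit made more than a month ago, then diff against it locally.
old_commit=$(git rev-list -1 --before="1 month ago" HEAD)
git --no-pager diff "$old_commit" HEAD -- notes.txt
```

None of these commands contacts a server; they work the same way with the network unplugged.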

With a CVCS, you can do very little without a network connection or when you are off the VPN. With Git, even on a plane or a train, you can commit happily and upload your work to a remote repository once you have a network connection again. On the way home, you can keep working without connecting to a VPN. With many other version control systems this is either impossible or painful. With Perforce, for example, you can do almost nothing when you are not connected to the server (by default you cannot even run p4 edit file to start editing a file, because Perforce has to go online to tell the server who is modifying it; manually changing the file permissions gets around this restriction, but you still cannot submit the change afterward). With Subversion or CVS, you can edit files, but you cannot commit changes because the database is on the network. None of this may seem like a big deal, but you may be pleasantly surprised by how much difference it makes in practice.

Always maintain data integrity

Everything in Git is checksummed before it is stored, and the checksum is then used as the unique identifier and index of the data. This means that you cannot change the content of any file or directory without Git knowing about it. This capability is built into Git at the lowest level and is integral to its design philosophy. If a file is corrupted in transit or data is lost to disk damage, Git can detect it immediately.

Git uses the SHA-1 algorithm to compute these checksums: it calculates a SHA-1 hash from the content of a file or directory structure and uses it as a fingerprint string. The string consists of 40 hexadecimal characters (0-9 and a-f).
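To make the fingerprint concrete, the sketch below computes one by hand and compares it with what Git reports. It assumes Git with its default SHA-1 object format and the GNU coreutils sha1sum command (some systems ship shasum instead); the sample content "hello world" is arbitrary. For a file's content, Git hashes a short header of the form "blob <size>" followed by a NUL byte, and then the content itself.

```shell
# Hash the 12-byte content "hello world\n" the way Git hashes a blob:
# SHA-1 over the header "blob 12", a NUL byte, and then the content.
manual_id=$(printf 'blob 12\0hello world\n' | sha1sum | cut -d ' ' -f 1)

# Ask Git for the object ID of the same content.
git_id=$(printf 'hello world\n' | git hash-object --stdin)

echo "manual: $manual_id"
echo "git:    $git_id"
```

Both lines should print the same 40-character string, which shows that the ID depends only on the content and not on any file name.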


Git relies on these fingerprint strings for everything it does, so you will see such hash values often. In fact, everything stored in the Git database is indexed by the hash of its content rather than by file name.

Most operations only add data

Most common Git operations only add data to the database. It is very hard to make Git do something that cannot be undone or that erases data, since irreversible operations are what make it difficult to roll back or reproduce an earlier version. As in any VCS, you can lose or mess up changes that you have not committed yet. Once you commit a snapshot into Git, however, it is very difficult to lose, especially if you regularly push your data to another repository.

This reliability brings peace of mind during development: you can experiment freely without the danger of losing data. How Git stores and recovers data internally is covered in detail in Chapter 9.

Three States of a File

Now pay attention, because the concepts discussed next are very important. Any file in Git is in one of three states: committed, modified, or staged. Committed means the file is safely stored in your local database; modified means you have changed the file but have not committed it yet; staged means you have marked a modified file to be included in your next commit.

This leads to the three areas that files move between in a Git project: the working directory, the staging area (the temporary storage area), and the Git directory (the local repository).

Figure 1-6. Working directory, staging area, and local repository

Each project has a Git directory (if you obtained the project with git clone, this is the .git directory inside it; if you used git clone --bare, the new directory itself is the Git directory). This is where Git stores the metadata and the object database for the project. This directory is very important: it is what actually gets copied every time you clone a repository.

The working directory is a checkout of one particular version of the project: all of its files and directories, extracted from the compressed object database in the Git directory and placed on disk for you to edit.

The staging area, or temporary storage area, is a simple file, generally kept in the Git directory. It is sometimes called the index, but staging area is the name used here.

The basic Git workflow is as follows:

1. Modify some files in the working directory.
2. Stage the modified files, which saves snapshots of them to the staging area.
3. Commit, which takes the snapshots saved in the staging area and stores them permanently in the Git directory.

You can therefore tell the state of a file from where it sits: if a particular version of the file is stored in the Git directory, it is committed; if it has been modified and added to the staging area, it is staged; and if it has been changed since it was last checked out but has not been staged, it is modified. Chapter 2 covers these states in more detail, including how to act on them and how to skip the staging area and commit directly.
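The three workflow steps and the resulting file states can be traced with a short script. This is only a sketch, assuming Git is installed; the repository, the README file, and the identity "Example User" are placeholders. It uses git status --short, whose two status columns describe the staging area and the working directory respectively.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'first draft\n' > README
git add README
git commit -q -m "initial commit"
state_committed=$(git status --short)   # empty: nothing modified or staged

# 1. Modify a file in the working directory.
printf 'second draft\n' > README
state_modified=$(git status --short)    # " M README": modified, not staged

# 2. Stage a snapshot of the file.
git add README
state_staged=$(git status --short)      # "M  README": staged

# 3. Commit the staged snapshot to the Git directory.
git commit -q -m "update README"
state_after=$(git status --short)       # empty again: committed

printf '[%s] [%s] [%s] [%s]\n' \
    "$state_committed" "$state_modified" "$state_staged" "$state_after"
```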



1.4 Installing Git

It is time to try Git, but first you have to install it. There are several ways to do so, which fall into two main groups: compiling it from source, or installing a precompiled package for your platform.

Install from source code

If you can, installing from source has advantages, not least that you get the most recent version. Each release of Git tries to improve the user experience, so compiling the latest version yourself is usually worthwhile. Some Linux distributions ship packages that lag behind, so unless you are on a very recent distribution or using backports, installing from source may be the best choice.

Git depends on the curl, zlib, openssl, expat, and libiconv libraries, so install these dependencies first. On a system with yum (such as Fedora) or apt-get (such as a Debian-based system), run one of the following commands:

$ yum install curl-devel expat-devel gettext-devel \
  openssl-devel zlib-devel

$ apt-get install libcurl4-gnutls-dev libexpat1-dev gettext \
  libz-dev libssl-dev

Then download the latest source code from the official Git site:

Then compile and install:

$ tar -zxf git-
$ cd git-
$ make prefix=/usr/local all
$ sudo make prefix=/usr/local install

The git command is now available. You can use git itself to clone the Git project repository to your local machine, which makes future updates easier:

$ git clone git://

To install a precompiled Git binary package on Linux, use the package management tool that comes with your distribution. On Fedora, install with yum:

$ yum install git-core

On Debian-based systems such as Ubuntu, install with apt-get:

$ apt-get install git-core

Install on Mac

There are two ways to install Git on a Mac. The easiest is to use the graphical Git installer (see Figure 1-7), available from:

Figure 1-7. Git installer for OS X

The other way is through MacPorts. If you have MacPorts installed, run the following command to install Git:

$ sudo port install git-core +svn +doc +bash_completion +gitweb

This way you do not have to install the dependencies yourself; MacPorts takes care of them. The variants listed above are generally enough. The +svn variant is the one you need if you want to use Git with Subversion repositories, which is covered in Chapter 8. Another method is to use Homebrew: brew install git.

Install on Windows

Installing Git on Windows is also easy. A project called msysGit provides an installer. Download the installer EXE file from the project's Google Code page and run it:

After installation, you can use the git command-line tool (an SSH client is included) as well as a graphical Git project management tool.



1.5 Configuration Before Running Git for the First Time

On a new system, you generally need to set up your Git environment first. You only have to do this once, and the settings are preserved across upgrades. You can change them at any time by running the same commands again.

Git provides a tool called git config (strictly speaking the git-config command, which you can invoke as git followed by the command name) for reading and setting configuration variables. These variables control how Git works and behaves at every step. They can be stored in three different places:

  • /etc/gitconfig file: settings that apply to every user on the system. If you pass the --system option to git config, it reads and writes this file.
  • ~/.gitconfig file: the configuration file in your home directory, which applies only to your user. If you pass the --global option to git config, it reads and writes this file.
  • The configuration file in the Git directory of the current project (that is, the .git/config file in the working directory): settings here apply only to the current project. Each level overrides the same setting from the level above it, so values in .git/config override variables of the same name in /etc/gitconfig.

On Windows, Git looks for the .gitconfig file in the user's home directory, that is, the directory given by the $HOME variable, usually C:\Documents and Settings\$USER. Git also tries to find /etc/gitconfig, but resolves it relative to the directory where Git was installed, which it treats as the root.
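The override order between levels can be checked with a short experiment. The sketch below assumes Git is installed and points HOME at a scratch directory so that your real ~/.gitconfig is left alone; the repository location and the editor values are arbitrary. It leaves the system level out because writing /etc/gitconfig normally requires administrator rights.

```shell
# Use a scratch HOME so the real ~/.gitconfig is not touched.
scratch_home=$(mktemp -d)
export HOME="$scratch_home"
unset XDG_CONFIG_HOME

repo=$(mktemp -d)
cd "$repo"
git init -q

git config --global core.editor vim     # written to $HOME/.gitconfig
git config --local  core.editor emacs   # written to .git/config

effective=$(git config core.editor)     # value Git will actually use here
global_only=$(git config --global core.editor)

echo "effective: $effective (global: $global_only)"
```

Inside this repository the project-level value wins, while any other repository without its own setting would still see the global one.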

User Information

The first thing to set is your user name and email address. These two settings matter because every Git commit uses them to record who made the update, and they are stored permanently in the history together with the changes:

$ git config --global user.name "John Doe"
$ git config --global user.email johndoe@example.com

With the --global option, the file that gets changed is the one in your home directory, and all your projects will use this user information by default. To use a different name or email in a particular project, run the command again without --global inside that project; the new settings are saved in the project's .git/config file.
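The following sketch shows how a commit picks up these settings, including a per-project override. It assumes Git is installed and uses a scratch HOME so that your real configuration is not modified; the address john@work.example.com and the file name are hypothetical.

```shell
# Scratch HOME keeps the real ~/.gitconfig untouched.
scratch_home=$(mktemp -d)
export HOME="$scratch_home"
unset XDG_CONFIG_HOME

git config --global user.name "John Doe"
git config --global user.email johndoe@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q

# Project-specific override, stored in this repository's .git/config.
git config user.email john@work.example.com

printf 'hello\n' > file.txt
git add file.txt
git commit -q -m "first commit"

# The commit records the global name and the project-level email.
author=$(git --no-pager log -1 --format='%an <%ae>')
echo "$author"
```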

Text Editor

Next, set your default text editor. Git automatically launches an external text editor whenever it needs you to type a message. By default it uses your operating system's default editor, which is usually vi or Vim. If you prefer a different one, such as Emacs, you can change the setting:

$ git config --global core.editor emacs

Difference Analysis Tools

Another commonly used setting is the diff tool that Git uses to resolve merge conflicts. For example, to use vimdiff:

$ git config --global merge.tool vimdiff

Git accepts kdiff3, tkdiff, meld, xxdiff, emerge, vimdiff, gvimdiff, ecmerge, and opendiff as merge tools. You can also set up a tool of your own; see Chapter 7 for details.

View configuration information

To check your existing settings, use the git config --list command:

$ git config --list
user.name=Scott Chacon
user.email=[email protected]
color.status=auto
color.branch=auto
color.interactive=auto
color.diff=auto
...

You may see the same variable name more than once, which means it comes from different configuration files (for example, /etc/gitconfig and ~/.gitconfig). In that case Git uses the last value it sees.

You can also check the value of a specific variable by passing its name:

$ git config user.name
Scott Chacon



1.6 Getting Help

If you want to know how to use any of Git's tools, there are three ways to read the help for a command:

$ git help <verb>
$ git <verb> --help
$ man git-<verb>

For example, to learn how to use the config command, run:

$ git help config

You can read this help at any time, even without an Internet connection. If it is not enough, you can ask for help in the #git or #github channel on the Freenode IRC server. Each of these channels has hundreds of people in it, most of whom know Git well and are willing to help.



1.7 Conclusion

You should now have a basic understanding of what Git is and how it differs from the CVCS you may have used before. You should also have Git installed on your system and set up with your name and email. Next, we move on to the basics of using Git.
