Getting started with Git: notes on Git basics and how to install Git

Source: Internet
Author: User
Tags gettext git workflow hash openssl perforce svn file permissions git clone

Introduction to Git Basics

So, in a nutshell, what kind of system is Git? This section is important: if you understand what Git is and the basic principles of how it works, using it effectively will be much easier. As you learn Git, try not to map its concepts onto other version control systems such as Subversion and Perforce; doing so makes it easy to misunderstand what each operation actually does. Git stores and handles information very differently from those systems, even though many of the commands look similar. Understanding these differences will help you use Git's tools accurately.

Record snapshots directly, not differences

The main difference between Git and other version control systems is how it treats its data. Git cares about the state of the file data as a whole, while most other systems care about the specific differences in file content. Those systems (CVS, Subversion, Perforce, Bazaar, and so on) record which files changed in each update and which lines changed in them, as shown in Figure 1-4.


Figure 1-4. Other systems record the specific differences in each version

Git does not store this difference data. Instead, Git works more like taking a snapshot of the files and recording it in a miniature file system. Each time you commit, Git goes through the fingerprint of every file, takes a snapshot of the files as they are at that moment, and stores a reference to that snapshot. For efficiency, if a file has not changed, Git does not store it again; it only stores a link to the identical file it has already saved. Git works as shown in Figure 1-5.


Figure 1-5. Git saves a snapshot of a file each time it is updated
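The reuse of unchanged files can be observed directly. The following is a minimal sketch, assuming Git is already installed; the temporary directory, the file names a.txt and b.txt, and the commit identity are made up for illustration. It makes two commits in which only b.txt changes, then compares the object IDs that each commit records for both files.

```shell
# Throwaway repository in a temporary directory (location is arbitrary).
repo=$(mktemp -d)
cd "$repo"
git init -q

# Helper with an inline identity, so the sketch does not depend on global config.
commit() {
  git -c user.name=Example -c user.email=example@example.com commit -q -m "$1"
}

printf 'stable content\n' > a.txt
printf 'version one\n' > b.txt
git add a.txt b.txt
commit "first snapshot"

printf 'version two, with changes\n' > b.txt   # only b.txt changes
git add b.txt
commit "second snapshot"

# Object ID recorded for each file in the previous and the current commit.
a_old=$(git rev-parse HEAD~1:a.txt)
a_new=$(git rev-parse HEAD:a.txt)
b_old=$(git rev-parse HEAD~1:b.txt)
b_new=$(git rev-parse HEAD:b.txt)

echo "a.txt: $a_old -> $a_new"
echo "b.txt: $b_old -> $b_new"
```

The two IDs printed for a.txt should be identical, because the second snapshot points at the object that was already stored, while b.txt gets a new ID.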

This is an important difference between Git and other systems. Git reconsiders almost every aspect of traditional version control rather than copying earlier designs. It is more like a small file system with a number of powerful tools built on top of it than a simple VCS. When we discuss Git branch management in Chapter 3, we will see the benefits this design brings.

Almost all operations are locally executed

Most operations in Git need only local files and resources, with no network access. With a CVCS, by contrast, almost every operation has to contact the server. Because Git keeps the entire history of the project on the local disk, most operations are fast.

For example, to browse the history of a project, Git does not need to fetch data from a remote server; it reads it directly from the local database, so you see the history right away without waiting. If you want to see the difference between the current version of a file and the version from one month ago, Git looks up the snapshot from a month ago and computes the difference locally, instead of asking a remote server to do it or pulling the old version of the file from the server first.
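Both operations can be tried without any network connection. This is a sketch under assumptions, not a prescribed workflow: the repository, the file name report.txt, the commit identity, and the back-dated commit date are all invented so that a commit older than one month exists. It lists the history and then diffs the current file against the newest commit that is at least a month old.

```shell
# Throwaway repository; directory, file name, identity, and dates are made up.
repo=$(mktemp -d)
cd "$repo"
git init -q

# First commit, back-dated so that it is clearly more than one month old.
printf 'draft\n' > report.txt
git add report.txt
GIT_AUTHOR_DATE='2005-04-07T22:13:13' GIT_COMMITTER_DATE='2005-04-07T22:13:13' \
  git -c user.name=Example -c user.email=example@example.com commit -q -m "old version"

# Second commit, made now.
printf 'draft\nfinal section\n' > report.txt
git add report.txt
git -c user.name=Example -c user.email=example@example.com commit -q -m "current version"

# Browse the history; everything is read from the local .git directory.
git --no-pager log --oneline

# Newest commit that is at least one month old, then a local diff against it.
old=$(git rev-list -1 --before='1 month ago' HEAD)
changes=$(git diff "$old" -- report.txt)
echo "$changes"
```

The diff should show "final section" as an added line, computed entirely from data on the local disk.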

With a CVCS, there is very little you can do without a network connection or when disconnected from the VPN. With Git, even on a plane or a train, you can happily keep committing and then upload your commits to the remote repository once you have a network connection again. Likewise, you can keep working on the way home without connecting to the VPN. In many other version control systems this is either impossible or very cumbersome. In Perforce, for example, you can do almost nothing without a connection to the server: by default you cannot even run p4 edit file to start editing a file, because Perforce needs the network to tell the server that the file is being revised. You can work around this by changing the file permissions manually, but you still cannot commit your changes when you are done. In Subversion or CVS, you can edit files, but you cannot commit because the database is on the server. These may not seem like big problems, but once you have tried working this way you may be pleasantly surprised by how much difference it makes.

Keep data integrity at all times

Everything in Git is checksummed before it is stored, and the resulting checksum serves as the unique identifier and index for that data. In other words, it is impossible to change a file or directory without Git knowing about it. This property is part of Git's design philosophy and is built into the lowest levels of its architecture. If a file is damaged in transit, or disk corruption causes file data to go missing, Git can detect it immediately.

Git computes these checksums with the SHA-1 algorithm: it calculates a SHA-1 hash of the contents of a file or the structure of a directory and uses it as a fingerprint string. The string consists of 40 hexadecimal characters (0-9 and a-f) and looks like this:

24b9da6552252987aa493b52f8696cd6d3b00373

Git's work relies entirely on this type of fingerprint string, so you will often see such a hash value. In fact, all the things that are saved in the Git database are indexed by this hash value, not by the filename.
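To see such a fingerprint being produced, the git hash-object command can be used; it prints the ID Git would assign to some content. The sketch below assumes Git is installed and uses made-up sample content. It hashes the same content twice and a slightly different content once.

```shell
# Same content hashed twice, then a one-character variation.
id1=$(printf 'hello\n' | git hash-object --stdin)
id2=$(printf 'hello\n' | git hash-object --stdin)
id3=$(printf 'hello!\n' | git hash-object --stdin)

echo "$id1"      # 40 hexadecimal characters
echo "$id2"      # identical to the line above
echo "$id3"      # a completely different string
```

Identical content always yields the same ID, and any change to the content yields a different one, which is why Git can index its database by these values instead of by file name.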

Most operations only add data

Most common Git operations only add data to the database. Irreversible operations, such as deleting data, would make it hard to roll back or reproduce earlier versions, so Git avoids them. As with any VCS, changes that have not been committed yet can still be lost or mangled; but once a snapshot is committed to Git, it is very hard to lose, especially if you get into the habit of regularly pushing to another repository.

This reliability makes development work much safer: we can experiment freely without fear of losing data. How Git stores and recovers data internally is covered in Chapter 9, which discusses Git internals.

Three states of the file

Now pay attention, because the next concept is very important. In Git, a file can be in one of three states: committed, modified, or staged. Committed means the file is safely stored in the local database. Modified means the file has been changed but the change has not been committed yet. Staged means a modified file has been marked to go into the next commit.

This leads to the three areas a file moves through when Git manages a project: the working directory, the staging area, and the local repository (the Git directory).


Figure 1-6. Working directory, staging area, and local repository

Each project has a Git directory, where Git keeps the project's metadata and object database. (If the project was created with git clone, this is the .git directory inside it; if it was created with git clone --bare, the new directory itself is the Git directory.) This directory is very important: whenever you clone a repository, what actually gets copied is the data in this directory.
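The difference between the two kinds of clone can be checked on disk. This is a small sketch, assuming Git is installed; the directory names source-repo, normal, and bare.git, the file f.txt, and the commit identity are hypothetical. It clones one local repository both ways and looks at where the Git directory ends up.

```shell
# Temporary area holding a small source repository to clone from.
base=$(mktemp -d)
cd "$base"
git init -q source-repo
(
  cd source-repo
  printf 'content\n' > f.txt
  git add f.txt
  git -c user.name=Example -c user.email=example@example.com commit -q -m "initial"
)

git clone -q source-repo normal            # ordinary clone
git clone -q --bare source-repo bare.git   # bare clone

# Ask each clone whether it is bare.
normal_is_bare=$(git -C normal rev-parse --is-bare-repository)
bare_is_bare=$(git -C bare.git rev-parse --is-bare-repository)
echo "normal: bare=$normal_is_bare"
echo "bare.git: bare=$bare_is_bare"

# The ordinary clone keeps its Git directory in .git next to the working files;
# the bare clone has the object database directly at its top level.
ls -d normal/.git bare.git/objects
```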

The working directory holds all the files and directories of one version of the project, checked out so that you can work on them. These files are extracted from the compressed object database in the Git directory and can then be edited in the working directory.

The staging area is just a simple file, usually kept in the Git directory. Some people call this file the index, but the standard term is the staging area.

The basic Git workflow is as follows:

    1. Modify some files in the working directory.
    2. Take a snapshot of the modified files and save it to the staging area.
    3. Commit, which permanently stores the snapshot from the staging area in the Git directory.

So the state of a file can be determined from where it is: if a particular version of the file is stored in the Git directory, it is committed; if it has been modified and added to the staging area, it is staged; and if it has changed since it was last checked out but has not been staged, it is modified. Chapter 2 goes into more detail, including how to track file status and how to skip the staging area and commit directly.
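The three states can be followed with git status. The following sketch assumes Git is installed; the repository location, the file name notes.txt, and the commit identity are made up. It uses the --porcelain option because its two-column status codes are stable: the first column describes the staging area and the second the working directory.

```shell
# Throwaway repository; the file name and identity are only for illustration.
repo=$(mktemp -d)
cd "$repo"
git init -q

printf 'one\n' > notes.txt
git add notes.txt
git -c user.name=Example -c user.email=example@example.com commit -q -m "add notes"
state_committed=$(git status --porcelain)   # empty: the file matches the last commit

printf 'two\n' >> notes.txt                 # step 1: modify in the working directory
state_modified=$(git status --porcelain)    # " M notes.txt" (modified, not staged)

git add notes.txt                           # step 2: snapshot into the staging area
state_staged=$(git status --porcelain)      # "M  notes.txt" (staged)

git -c user.name=Example -c user.email=example@example.com commit -q -m "update notes"
state_final=$(git status --porcelain)       # step 3: committed, empty again

printf 'committed: [%s]\n' "$state_committed"
printf 'modified:  [%s]\n' "$state_modified"
printf 'staged:    [%s]\n' "$state_staged"
printf 'committed: [%s]\n' "$state_final"
```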

Git installation process

It is time to try Git, but first you have to install it. There are several ways to do so; the two main ones are to compile and install from source code, or to use a precompiled package for your platform.

Installing from source code

If you can, installing from source has advantages; at the very least, you get the most recent version. Each Git release tends to include usability improvements, so building the latest version from source is worthwhile. Some Linux distributions ship packages that lag behind, so unless you are on a very up-to-date distro or are using backports, installing from source may be the best choice.

Git depends on the curl, zlib, openssl, expat, and libiconv libraries, so install these dependencies first. On a system with yum (such as Fedora) or apt-get (such as a Debian-based system), you can install them with one of the following commands:

$ yum install curl-devel expat-devel gettext-devel \
  openssl-devel zlib-devel
$ apt-get install libcurl4-gnutls-dev libexpat1-dev gettext \
  libz-dev libssl-dev

After that, download the latest source release from the official Git site:

http://git-scm.com/download

Then compile and install:

$ tar -zxf git-1.7.2.2.tar.gz
$ cd git-1.7.2.2
$ make prefix=/usr/local all
$ sudo make prefix=/usr/local install
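After an installation like this, it can be worth checking which Git the shell actually picks up. A short sketch, assuming the install prefix above and that /usr/local/bin appears on PATH before any distribution-provided Git:

```shell
# Version of the git binary found first on PATH.
installed=$(git --version)
echo "$installed"

# Location of that binary; with prefix=/usr/local it would typically be /usr/local/bin/git.
git_path=$(command -v git)
echo "$git_path"
```

If the reported version is older than the one just built, another git earlier on PATH is probably shadowing the new installation.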

Now you can use Git itself to clone the Git project repository locally, so that you can pull later updates:

$ git clone git://git.kernel.org/pub/scm/git/git.git

Installing on Linux

If you want to install a precompiled Git binary package on Linux, you can use the package management tool that comes with your distribution. On Fedora, install with yum:

$ yum install git-core

On a Debian-based system such as Ubuntu, install with apt-get:

$ apt-get install git

Installing on Mac

There are two ways to install Git on a Mac. The easiest is the graphical Git installer, shown in Figure 1-7, which can be downloaded from:

http://code.google.com/p/git-osx-installer

Figure 1-7. Git OS X Installation Tool

The other way is to install through MacPorts (http://www.macports.org). If you have MacPorts installed, install Git with the following command:

$ sudo port install git-core +svn +doc +bash_completion +gitweb

This way you do not have to install the dependency libraries yourself; MacPorts takes care of them. The variants listed above are usually sufficient. If you want to use Git with Subversion repositories, include the +svn variant; this is covered in Chapter 8. Another option is Homebrew (https://github.com/mxcl/homebrew): brew install git.

Installing on Windows

Installing Git on Windows is also easy. The msysGit project provides an installer that you can download from its GitHub page and run:

http://msysgit.github.com/

Once installation is complete, you have both the command-line Git tool (which includes an SSH client) and a graphical Git project management tool.

Note on Windows usage: you should use Git with the provided msysGit shell (Unix style), which lets you use the complex command lines given in this book. If you need, for some reason, to use the native Windows shell / command line console, you have to use double quotes instead of single quotes (for parameters with spaces in them), and you must quote parameters ending with the circumflex accent (^) if they are last on the line, as it is a continuation symbol in Windows.

Git version control

What is version control, and why should you care? Version control is a system that records changes to one or more files over time so that specific revisions can be reviewed later. The examples in this book only version text files containing software source code, but in practice you can put any type of file under version control.

If you are a graphic or web designer, you may want to keep every revision of an image or page layout file (a feature you may well be eager to have). Using a version control system (VCS) is a wise choice. It lets you revert a file to a previous state, or even revert the entire project to a point in the past. You can compare changes over time in detail, see who last modified something that may be causing a problem, find out who introduced a defect and when, and so on. Using a VCS also generally means that if you break things or delete files anywhere in the project, you can easily recover. And all of this comes with very little extra work.

Local version control system

Many people keep different versions by copying the entire project directory, perhaps renaming the copy with the backup time to tell versions apart. The only advantage of this approach is its simplicity. The disadvantages are many: it is easy to mix up working directories, and once you write to the wrong file or lose data there is no way to undo it.

To solve this problem, programmers long ago developed local version control systems, most of which use a simple database to record the differences between successive updates (see Figure 1-1).


Figure 1-1. Local version control system

One of the most popular of these was RCS, which can still be found on many computers today. Even the popular Mac OS X includes the rcs command once the Developer Tools are installed. It works by saving and managing file patches. A file patch is a text file in a specific format that records the changes between one revision of a file and the next. By applying the patches for each revision in turn, RCS can reconstruct the contents of any version of the file.
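The patch idea can be illustrated with the standard diff and patch utilities. This is only an analogy, not RCS's own storage format or commands, and it assumes both utilities are installed; the file names are made up. One revision is stored in full, the next only as a patch, and the second revision is then rebuilt from the two.

```shell
# Scratch directory; v1.txt and v2.txt stand for two revisions of one file.
work=$(mktemp -d)
cd "$work"
printf 'line one\nline two\n' > v1.txt
printf 'line one\nline 2\nline three\n' > v2.txt

# Record only the difference between the revisions.
# diff exits with status 1 when the files differ, so that status is ignored here.
diff -u v1.txt v2.txt > v1-to-v2.patch || true
cat v1-to-v2.patch

# Rebuild revision 2 from revision 1 plus the patch.
cp v1.txt rebuilt.txt
patch -s rebuilt.txt < v1-to-v2.patch

# Confirm the rebuilt file matches the real second revision.
cmp -s rebuilt.txt v2.txt && echo "rebuilt revision 2 from revision 1 and the patch"
```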

Centralized version control system

The next question was how to let developers on different systems work together. Centralized version control systems (CVCS) were developed to address it. These systems, such as CVS, Subversion, and Perforce, have a single, centrally managed server that stores every revision of every file; collaborators connect to that server through a client to check out the latest files or commit updates. For many years this has been the standard practice for version control (see Figure 1-2).


Figure 1-2. Centralized version control system

This setup has many benefits, especially compared with the old local VCSs. Everyone can see, to some extent, what everyone else on the project is doing. Administrators can easily control what each developer is allowed to do, and administering one CVCS is far easier than maintaining local databases on every client.

There are downsides as well. The most obvious is the single point of failure that the central server represents. If the server is down for an hour, nobody can commit or collaborate during that hour. If the central server's disk fails and there is no backup, or the backup is out of date, data can be lost. In the worst case the entire change history of the project is gone, and the only hope of recovery is whatever snapshots happen to have been checked out onto client machines. Even then, there is no guarantee that all the data was checked out. Local version control systems have the same problem: whenever the entire history of a project is kept in a single place, there is a risk of losing all of it.

Distributed version control system

This is where distributed version control systems (DVCS) come in. In these systems, such as Git, Mercurial, Bazaar, and Darcs, clients do not just check out the latest snapshot of the files; they fully mirror the repository. If any server used for collaboration fails, it can be restored afterwards from any of the mirrored local repositories, because every checkout is really a full backup of the repository (see Figure 1-3).

Figure 1-3. Distributed version control system

Furthermore, many of these systems can work with several different remote repositories. This lets you collaborate with different groups of people on the same project at the same time. Depending on your needs, you can set up workflows, such as hierarchical models, that are not possible in centralized systems.
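The claim that a clone is a full backup can be checked on a small scale. The sketch below assumes Git is installed; the directory names origin-repo and mirror, the file name, and the commit identity are hypothetical, and a local directory stands in for the shared server. It clones a repository, deletes the original, and then reads an old revision from the clone.

```shell
# A local directory plays the role of the shared server.
base=$(mktemp -d)
cd "$base"
git init -q origin-repo
(
  cd origin-repo
  printf 'v1\n' > file.txt
  git add file.txt
  git -c user.name=Example -c user.email=example@example.com commit -q -m "first"
  printf 'version two\n' > file.txt
  git add file.txt
  git -c user.name=Example -c user.email=example@example.com commit -q -m "second"
)

# Cloning copies the whole history, not just the latest files.
git clone -q origin-repo mirror
origin_count=$(git -C origin-repo rev-list --count HEAD)
mirror_count=$(git -C mirror rev-list --count HEAD)
echo "commits in origin: $origin_count, commits in mirror: $mirror_count"

# Simulate losing the server, then read an old revision from the clone.
rm -rf "$base/origin-repo"
recovered=$(git -C mirror show HEAD~1:file.txt)
echo "first revision recovered from the mirror: $recovered"
```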
