1.3 Getting Started - Git Basics

Last Update:2015-04-01 Source: Internet

Author: User

Tags git workflow perforce


Git Basics

So, simply put, what kind of system is Git? The content that follows is very important: if you understand the ideas behind Git and the basic principles of how it works, using it effectively will come much more easily. When you start learning Git, try not to map its concepts onto other version control systems (such as Subversion and Perforce), or you can easily confuse what each operation actually means. Git stores and handles information quite differently from other version control systems, even though the commands look very similar. Understanding these differences will help you use the tools Git provides accurately.

Direct recording of snapshots, rather than differential comparisons

The main difference between Git and other version control systems is that Git cares only about whether the file data as a whole has changed, while most other systems care about the specific differences in file content. Systems of that kind (CVS, Subversion, Perforce, Bazaar, etc.) record which files were updated at each commit and which lines changed; see Figure 1-4.

Figure 1-4. Other systems record the specific differences between each file in each version

Git does not store the difference data between the before and after states. Instead, Git is more like taking a snapshot of the changed files and recording it in a miniature file system. Each time you commit an update, it scans the fingerprint information of all files, takes a snapshot of them, and saves an index that points to that snapshot. To improve performance, if a file has not changed, Git does not save it again but only stores a link to the previously saved snapshot. Git works as shown in Figure 1-5.

Figure 1-5. Git saves a snapshot of the file each time it is updated
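The snapshot idea can be illustrated with a small Python sketch. This is a toy model only, not Git's real storage format: the class name SnapshotStore and its methods are hypothetical, and plain SHA-1 over the raw content stands in for the fingerprint.

```python
import hashlib


class SnapshotStore:
    """Toy content-addressed store; not Git's real on-disk format."""

    def __init__(self):
        self.objects = {}    # fingerprint -> file content (stored once per unique content)
        self.snapshots = []  # one {file name: fingerprint} mapping per commit

    def commit(self, files):
        """Record a snapshot of every file and return its position in the history."""
        snapshot = {}
        for name, content in files.items():
            fingerprint = hashlib.sha1(content).hexdigest()
            # Unchanged content hashes to the same key, so it is not stored again.
            self.objects.setdefault(fingerprint, content)
            snapshot[name] = fingerprint
        self.snapshots.append(snapshot)
        return len(self.snapshots) - 1

    def checkout(self, index):
        """Rebuild the complete file set of one snapshot."""
        return {name: self.objects[fp] for name, fp in self.snapshots[index].items()}


store = SnapshotStore()
v1 = store.commit({"a.txt": b"alpha\n", "b.txt": b"beta\n"})
v2 = store.commit({"a.txt": b"alpha v2\n", "b.txt": b"beta\n"})

# b.txt did not change, so both snapshots link to the same stored object.
print(store.snapshots[v1]["b.txt"] == store.snapshots[v2]["b.txt"])  # True
print(len(store.objects))  # 3 unique contents stored for 2 snapshots
```

Each snapshot lists every file, yet only changed content takes up new space, which is the behaviour the paragraph above describes.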

This is an important difference between Git and other systems. It overturns the conventions of traditional version control and redesigns how many aspects of it are implemented. Git is more like a small file system with a number of powerful tools built on top of it than a simple VCS. Later, in Chapter 3, we'll look at the benefits of this design when we discuss Git branch management.

Nearly all operations are performed locally

The vast majority of operations in Git need only local files and resources, with no network connection. With a CVCS (centralized version control system), almost every operation requires a network connection. Because Git keeps the entire history of the project on the local disk, these operations are fast.

For example, if you want to browse the project's history, Git does not have to go to an outside server to fetch the data; it simply reads it from the local database. So you can flip through the history at any time without waiting. If you want to see the difference between the current version of a file and the version from one month ago, Git looks up the snapshot from a month ago and computes the difference against the current file locally, instead of asking a remote server to do it or pulling the old version of the file from the server for comparison.
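As a rough illustration of a purely local comparison, the following Python sketch diffs two versions that are both already on the local machine. The local_history dictionary, its date keys, and the diff_versions helper are hypothetical stand-ins for Git's local database, not anything Git itself exposes.

```python
import difflib

# Hypothetical local history: both versions are already on disk, no server involved.
local_history = {
    "2015-03-01": "line one\nline two\n",
    "2015-04-01": "line one\nline 2\nline three\n",
}


def diff_versions(old_key, new_key):
    """Compute a unified diff between two locally stored versions."""
    old = local_history[old_key].splitlines(keepends=True)
    new = local_history[new_key].splitlines(keepends=True)
    return list(difflib.unified_diff(old, new, fromfile=old_key, tofile=new_key))


changes = diff_versions("2015-03-01", "2015-04-01")
print("".join(changes))
```

The difference is derived on demand from two complete versions; nothing in the sketch needs a network round trip.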

With a CVCS, you can do almost nothing when the network is down or the VPN is disconnected. With Git, even on a plane or a train, you can happily commit updates as often as you like and upload them to the remote repository once a network is available. Likewise, on the way home you can keep working without a VPN connection. With other version control systems this is nearly impossible or very cumbersome. With Perforce, for example, there is almost nothing you can do without a server connection (note: by default you cannot issue the command p4 edit file to start editing a file, because Perforce needs a network connection to announce who is revising the file. Manually changing the file permissions bypasses this restriction, but you still cannot commit the update afterward). With Subversion or CVS you can edit files, but you cannot commit updates, because the database is on the network. None of this may seem like a big deal, but after some real experience you will be pleasantly surprised at how much difference it makes.

Maintain data integrity at all times

Before anything is saved to Git, a checksum of its content is computed, and the result serves as the unique identifier and index of that data. In other words, you cannot modify any file or directory without Git knowing about it. This feature is part of Git's design philosophy and is built into the lowest level of the architecture. So if a file is corrupted in transit, or disk damage causes file data to go missing, Git notices immediately.

Git computes the checksum using the SHA-1 algorithm: it hashes the contents of a file or the structure of a directory to obtain a SHA-1 hash value that serves as a fingerprint string. The string consists of 40 hexadecimal characters (0-9 and a-f) and looks like this:

24b9da6552252987aa493b52f8696cd6d3b00373

Git's work relies entirely on fingerprint strings of this kind, so you will see such hash values often. In fact, everything stored in the Git database is indexed by this hash value, not by file name.
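To make the fingerprint concrete, here is a minimal Python sketch of how a file's content can be hashed the way Git hashes a blob object: the SHA-1 is taken over a short header ("blob", the byte length, and a NUL byte) followed by the content. The function names git_blob_id and is_intact are hypothetical helpers for this illustration, and the sketch covers file content only, not directories or commits.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """SHA-1 fingerprint of file content, hashed with Git's blob header."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def is_intact(object_id: str, content: bytes) -> bool:
    """Re-hash the content and compare it with the recorded fingerprint."""
    return git_blob_id(content) == object_id


original = b"hello world\n"
object_id = git_blob_id(original)

print(object_id)                                # 40 hexadecimal characters
print(is_intact(object_id, original))           # True
print(is_intact(object_id, b"hello w0rld\n"))   # False: one changed byte is detected
```

Because the identifier is derived from the content, any change to the content, however small, produces a different identifier, which is how corruption becomes visible.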

Most operations only add data

Most common Git operations simply add data to the database, because any irreversible operation, such as deleting data, makes it difficult to roll back or reproduce a historical version. As in other VCSs, changes that have not yet been committed can be lost or muddled, but in Git, once a snapshot has been committed, there is little need to worry about losing data, especially if you are in the habit of pushing to other repositories regularly.

This high level of reliability gives us great peace of mind in our development work: we can try all kinds of experiments without fear of losing data. As for how Git stores and recovers data internally, we'll discuss that in Chapter 9, on Git internals.

The three states of a file

Now pay attention, because the next concept is very important. Any file in Git is in one of only three states: committed, modified, or staged. Committed means the file has been safely stored in the local database. Modified means the file has been changed but not yet committed. Staged means the modified file has been placed on the list of what will be saved in the next commit.

From this we can see the three working areas that files flow through when Git manages a project: the working directory, the staging area, and the local repository.

Figure 1-6. Working directory, staging area, and local repository

Each project has a Git directory (note: if the project was obtained with git clone, this is the .git directory inside it; with git clone --bare, the new directory itself is the Git directory), which is where Git stores the metadata and the object database. This directory is very important: each time you clone a repository, what is actually copied is the data in this directory.

Checking out all the files and directories of one version of the project gives you the working directory, where the follow-up work begins. These files are extracted from the compressed object database in the Git directory and can then be edited in the working directory.

The so-called staging area is just a simple file, typically kept in the Git directory. People sometimes call this file the index, but the standard term is staging area.

The basic Git workflow is as follows:

Modify some files in the working directory.
Take a snapshot of the modified files and save it to the staging area.
Commit the update, which permanently stores the file snapshot saved in the staging area in the Git directory.

Therefore, we can judge a file's state from where it is: if a specific version of the file is saved in the Git directory, it is committed; if it has been modified and placed in the staging area, it is staged; if it has been modified since it was last checked out but has not been placed in the staging area, it is modified. In Chapter 2, we will go into more detail, learning how to perform subsequent operations based on file state and how to skip staging and commit directly.
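The rule for judging a file's state can be sketched in Python by comparing fingerprints across the three areas. This is a simplified model under stated assumptions: the function file_states and the three dictionaries (head, index, working) are hypothetical, each mapping a file name to a content fingerprint, and the fingerprint strings are placeholders.

```python
def file_states(name, head, index, working):
    """Return the set of states for one file.

    head, index and working map file name -> content fingerprint for the
    last commit, the staging area and the working directory respectively.
    """
    states = set()
    if index.get(name) != head.get(name):
        states.add("staged")      # staging area differs from the last commit
    if working.get(name) != index.get(name):
        states.add("modified")    # working copy differs from the staging area
    if not states:
        states.add("committed")   # all three areas agree
    return states


head = {"a.txt": "f1", "b.txt": "f2", "c.txt": "f3"}
index = {"a.txt": "f1", "b.txt": "f2-new", "c.txt": "f3"}
working = {"a.txt": "f1", "b.txt": "f2-new", "c.txt": "f3-edit"}

for name in sorted(head):
    print(name, sorted(file_states(name, head, index, working)))
```

In this model a file that is staged and then edited again would report both "staged" and "modified", since the staged snapshot and the newer working copy differ from each other.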


