Git (a): What Git Is

Source: Internet
Author: User
Tags: using, git, version control system

Why use Git

Confucius once said: "If names are not correct, speech will not be in order; if speech is not in order, affairs cannot be accomplished."

Before learning a new technology, it helps to work out why it is worth learning. So why learn Git? Let me answer with an if-else statement:

if (you believe me) {
    I recommend that you learn it;
} else if (I admit I am no expert and you need not believe me, but you trust the choice of the majority) {
    More and more people and projects around the world are using Git; the trend is overwhelming;
} else if (you think the truth lies in the hands of a few) {
    You need not trust the crowd, but you should trust Linus Torvalds, who wrote the prototype of the Linux kernel;
    // I said the prototype of the kernel, not the whole kernel, so don't pick on me!
    Git is the second major work he created;
} else {
    What?! You have never heard of Linus Torvalds and don't know what Linux is...
    throw new Exception("Well, you win. You don't have to learn Git; I can't go on any more.");
}


What is Git?

First, Git is a version management tool.

When we write a "Hello World" program, or a small project of only a few hundred lines, we do not need a dedicated code management tool; we can keep track of the code from memory.

A project with a large code base, however, often takes many people weeks or months of joint work to complete. During development the code is constantly changed, added to, deleted, and restored, and developers cannot remember every change clearly. This is when a version management tool is needed to track changes to the code.

A version management tool assigns each file a version number. After every modification, even if only a single letter changes, the tool records the change accurately and updates the file's version number. Each version number therefore corresponds to one change to the file, which makes it possible to compare versions, restore an earlier one, and so on.
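
As a rough illustration of the bookkeeping just described, here is a minimal Python sketch. The FileHistory class and its method names are hypothetical, invented for this example; it keeps full copies in memory purely for simplicity and does not reflect how any real tool is implemented.

```python
class FileHistory:
    """Toy per-file version tracker (hypothetical, for illustration only)."""

    def __init__(self):
        self._versions = []  # element i holds the content of version i + 1

    def save(self, content):
        """Record the content and return the version number that is now current."""
        if self._versions and self._versions[-1] == content:
            return len(self._versions)  # nothing changed, so no new version
        self._versions.append(content)
        return len(self._versions)

    def restore(self, version):
        """Return the content exactly as it was at the given version number."""
        if not 1 <= version <= len(self._versions):
            raise ValueError(f"no such version: {version}")
        return self._versions[version - 1]


history = FileHistory()
v1 = history.save("print('Hello World')")
v2 = history.save("print('Hello, World')")  # a one-character change
print(v1, v2)              # prints: 1 2
print(history.restore(1))  # prints: print('Hello World')
```

Even the one-character edit gets its own version number, and the earlier content can still be recovered by asking for version 1.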

One of the earliest widely used version management tools was CVS (Concurrent Versions System), which was later superseded by SVN (Subversion). The two work similarly: both manage code centrally, with a single server at the core.


Centralized vs Distributed

SVN is a typical centralized version control system. Such a system has a single, centrally managed server that stores every revision of every file, and the people working together connect to that server through a client to check out the latest files or commit updates.

This approach brings many benefits, especially compared with older local version control systems (VCSs). Everyone can see, to some extent, what other people on the project are doing, and administrators can easily control each developer's permissions.

Everything has two sides. The most obvious drawback is the single point of failure that the central server represents. If the server is down for an hour, then for that hour nobody can commit updates, restore files, or compare versions, and nobody can collaborate. If the central server's disk fails and there are no backups, or the backups are not recent enough, data may be lost. In the worst case the entire change history of the project is lost, except for whatever snapshots happen to have been checked out on client machines; even then, there is no guarantee that everything was checked out.

SVN is concerned with the specific differences in file content: for each commit it records which files were updated and what changed in them.
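
The idea of storing only "what changed" can be sketched in a few lines of Python. This is not how Subversion itself is implemented; it is a simplified illustration built on the standard difflib module, and make_delta and apply_delta are hypothetical helper names.

```python
from difflib import SequenceMatcher


def make_delta(old, new):
    """Return the edits that turn the list of lines `old` into `new`."""
    ops = SequenceMatcher(a=old, b=new, autojunk=False).get_opcodes()
    # Keep only the changed regions: (start, end, replacement lines).
    return [(i1, i2, new[j1:j2]) for tag, i1, i2, j1, j2 in ops if tag != "equal"]


def apply_delta(old, delta):
    """Rebuild the newer version from the older one plus a delta."""
    result, cursor = [], 0
    for start, end, replacement in delta:
        result.extend(old[cursor:start])  # unchanged lines before the edit
        result.extend(replacement)        # lines the edit puts in their place
        cursor = end
    result.extend(old[cursor:])           # unchanged tail
    return result


base = ["line one", "line two", "line three"]
edited = ["line one", "line 2", "line three", "line four"]
delta = make_delta(base, edited)
rebuilt = apply_delta(base, delta)
print(rebuilt == edited)  # prints: True
```

Only the changed regions are kept in the delta, so reconstructing a given version means starting from a base and replaying the recorded differences.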


Git, by contrast, records version history by looking at whether a file's data has changed as a whole; it does not store the differences between the contents of a file before and after a change. In effect, Git takes a snapshot of the changed files and records it in a tiny file system. Each time you commit, Git takes a snapshot of the fingerprint information of all files and stores an index that points to that snapshot. For efficiency, if a file has not changed Git does not store it again; it only stores a link to the previously saved snapshot of that file.
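
The snapshot model can be illustrated with a small Python sketch. It is a deliberate simplification: SnapshotStore is a hypothetical class, the fingerprint is a plain SHA-1 of the file content rather than Git's real object format, and nothing is written to disk.

```python
import hashlib


class SnapshotStore:
    """Toy snapshot store: each distinct file content is kept only once."""

    def __init__(self):
        self.objects = {}    # fingerprint -> file content
        self.snapshots = []  # each snapshot maps file name -> fingerprint

    def commit(self, files):
        """Snapshot a dict of {name: bytes}; return the snapshot's index."""
        snapshot = {}
        for name, content in files.items():
            fingerprint = hashlib.sha1(content).hexdigest()
            # Unchanged content has the same fingerprint, so it is not stored again.
            self.objects.setdefault(fingerprint, content)
            snapshot[name] = fingerprint
        self.snapshots.append(snapshot)
        return len(self.snapshots) - 1

    def checkout(self, index):
        """Return the complete set of files as they were in a snapshot."""
        return {name: self.objects[fp] for name, fp in self.snapshots[index].items()}


store = SnapshotStore()
first = store.commit({"a.txt": b"one", "b.txt": b"two"})
second = store.commit({"a.txt": b"one", "b.txt": b"TWO"})  # only b.txt changed
print(len(store.objects))  # prints: 3
```

Two snapshots of two files result in three stored objects, not four: the unchanged a.txt in the second snapshot is just a link to the content saved the first time.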


In a distributed version control system, the client does not merely check out the latest snapshot of the files; it fully mirrors the original repository. As a result, if any server used for collaboration fails, it can be restored afterwards from any of the mirrored local repositories. Such systems can also be configured to interact with several different remote repositories, which makes it possible to collaborate with people from different working groups on the same project and to set up whatever collaboration workflows you need.

In addition, Git keeps the entire history of the current project on the local disk, and most Git operations only need to access local files and resources, so they need no network and are fast. With SVN, you can do very little without a network connection or when the VPN is disconnected. With Git, even on a plane or a train you can happily commit as often as you like and push to the remote repository once a network is available. With other version control systems this is nearly impossible, or at least very cumbersome.

In short, Git has the following characteristics:

1. Every cloned repository in Git is a peer. You can create your own repository by cloning any repository, and your repository can in turn serve as a source for others, if you wish.

2. Every clone operation is in effect a full backup of the repository.

3. Commits are made entirely locally. Nobody needs to grant you permission; you are in control of your own repository, and a commit always succeeds.

4. Your commits are not interrupted by other people's work. Only when you are fully satisfied with your work do you push it to others, or let others pull from your repository. Merging happens when histories are brought together through pull and push, and any conflicts that cannot be resolved automatically are reported to you to resolve by hand.
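
The characteristics above can be mimicked with a toy model in Python. Everything here is an assumption made for illustration: Repo is a hypothetical class, history is a plain list of commit messages rather than Git's hash-linked commits, and push simply refuses when the two histories have diverged.

```python
import copy


class Repo:
    """Toy repository whose history is a list of commit messages."""

    def __init__(self):
        self.commits = []  # full history, oldest first

    def clone(self):
        """A clone carries the complete history, so it doubles as a backup."""
        return copy.deepcopy(self)

    def commit(self, message):
        """Commits are purely local: no server and no permission needed."""
        self.commits.append(message)

    def push(self, remote):
        """Send the commits the remote lacks; refuse if histories diverged."""
        shared = len(remote.commits)
        if self.commits[:shared] != remote.commits:
            raise RuntimeError("remote has commits you lack: pull and merge first")
        remote.commits.extend(self.commits[shared:])


server = Repo()
server.commit("initial")
alice, bob = server.clone(), server.clone()

alice.commit("a1")  # made offline
alice.commit("a2")  # made offline
alice.push(server)  # uploaded once a network is available

bob.commit("b1")
try:
    bob.push(server)
    rejected = False
except RuntimeError:
    rejected = True  # bob must first merge alice's commits

print(server.commits)  # prints: ['initial', 'a1', 'a2']
print(rejected)        # prints: True
```

If the server were lost at this point, alice.clone() would hand back its full history, which is the recovery scenario described earlier.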


Repository-wide version number vs globally unique version number

Compare SVN's global version number with CVS, where each file maintains its own set of version numbers. Behind the seemingly simple global version number lies SVN's support for transactions: every transaction (that is, every commit) receives a version number that is unique across the entire repository.

Git's version numbers go a step further: for all practical purposes they are unique worldwide. Before anything is stored in Git, a checksum of its content is computed, and the result serves as the unique identifier and index of that data. In other words, you cannot modify a file or directory without Git knowing about it. This property is part of Git's design philosophy and is built into the lowest level of its architecture. So if a file is corrupted in transit, or data goes missing because of disk damage, Git can detect it.

Git computes the checksum with the SHA-1 algorithm: it hashes the content of a file or the structure of a directory and uses the resulting SHA-1 value as a fingerprint string. The string consists of 40 hexadecimal characters (0-9 and a-f) and looks like this:

24b9da6552252987aa493b52f8696cd6d3b00373
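
To make the fingerprint concrete, the following Python sketch computes the identifier Git assigns to a file's content (a "blob"). Git hashes a short header, the word blob, the content length, and a NUL byte, followed by the content itself. The function name git_blob_id is made up for this example; the sketch covers file contents only, not directories or commits.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the 40-character SHA-1 ID Git assigns to this file content."""
    header = b"blob %d\x00" % len(content)  # "blob <size>" followed by a NUL byte
    return hashlib.sha1(header + content).hexdigest()


empty_id = git_blob_id(b"")
print(empty_id)  # prints: e69de29bb2d1d6434b8b29ae775ad8c2e48c5391

original = b"hello world\n"
tampered = b"hello w0rld\n"  # a single character altered
print(git_blob_id(original) == git_blob_id(tampered))  # prints: False
```

The first value is the well-known ID of an empty file in any Git repository. The second comparison shows why corruption or tampering is detectable: changing one character yields a different fingerprint, so the stored ID no longer matches the content.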

1. Everything stored in the Git database is indexed by this hash value, not by file name.

2. One advantage of using hash values as version numbers is that, in a distributed version control system, the version numbers produced by independent commits do not collide. Another is data integrity: because the hash is computed from the content or directory structure, it also lets us determine whether the data has been tampered with.

3. SVN version numbers are consecutive, so the next version number can be predicted; Git version numbers are not. Because Subversion is centralized, consecutive numbering is easy to achieve. Git is distributed and uses a 40-character hash as the version number. Each person's commits are made independently, so there is no global ordering among them (and even where commits do end up in some order, that order depends on the direction and timing of pushes and pulls). Although Git version numbers are not consecutive, there is still a thread to follow: every version, the initial commit aside, records its parent version or versions (usually one, or two for a merge), and these links can form a complex commit graph.

4. Abbreviated version numbers: Git accepts any leading portion of the hash as a shortened version number, as long as it is unambiguous. A 7-character abbreviation is commonly used (a shorter one also works provided it is still unique).
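
The abbreviation rule in point 4 amounts to a prefix lookup, sketched below in Python. The resolve function is a hypothetical helper, not Git's implementation; the first ID is the example hash from this article, the second is invented to force an ambiguity, and the four-character minimum is an assumption of this sketch.

```python
def resolve(prefix, known_ids, min_length=4):
    """Expand an abbreviated ID to the single full ID that starts with it."""
    prefix = prefix.lower()
    if len(prefix) < min_length:  # minimum length is an assumption of this sketch
        raise ValueError(f"abbreviation too short: {prefix!r}")
    matches = [full for full in known_ids if full.startswith(prefix)]
    if not matches:
        raise KeyError(f"unknown version: {prefix!r}")
    if len(matches) > 1:
        raise ValueError(f"ambiguous abbreviation: {prefix!r}")
    return matches[0]


ids = [
    "24b9da6552252987aa493b52f8696cd6d3b00373",  # example hash from the text
    "24b9da7" + "0" * 33,                        # invented, shares six characters
    "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391",
]

print(resolve("24b9da6", ids))  # prints: 24b9da6552252987aa493b52f8696cd6d3b00373
print(resolve("e69d", ids))     # prints: e69de29bb2d1d6434b8b29ae775ad8c2e48c5391

try:
    resolve("24b9da", ids)      # six characters match two IDs
    ambiguous = False
except ValueError:
    ambiguous = True
print(ambiguous)                # prints: True
```

Seven characters are enough to tell the first two IDs apart here, while six are not, which is why the usable length of an abbreviation depends on what else is in the repository.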
