Version control tool comparison: SVN vs. CVS

Source: Internet
Author: User

In early 2000, CollabNet, Inc. (http://www.collab.net) began looking for developers to write a replacement for CVS. CollabNet offered a collaboration software suite, CEE (CollabNet Enterprise Edition), one component of which was a version control system. Although CEE initially used CVS as its version control system, the limitations of CVS were obvious from the outset, and CollabNet knew that sooner or later it would have to find a better alternative. Unfortunately, CVS had become the de facto standard in the open-source world because there was no better product, at least none that was free to use. So CollabNet decided to write a new version control system based on the ideas of CVS, but without its bugs and misfeatures.


In February 2000, they contacted Karl Fogel, author of Open Source Development with CVS (Coriolis, 1999), and asked whether he would like to work on the new project. Coincidentally, Karl was at that time already discussing the design of a new version control system with his friend Jim Blandy. In 1995 the two had started Cyclic Software, a company that offered CVS support; although they eventually sold the company, they still used CVS every day in their work. Their frustration with CVS had prompted Jim to think seriously about how to manage versioned data, and he had already come up with not only the name "Subversion" but also the basic design of the Subversion repository. So when CollabNet extended its invitation, Karl immediately agreed to work on the project, and Jim got his employer, Red Hat Software, to lend him to the project for an indefinite period. CollabNet hired Karl and Ben Collins-Sussman, and detailed design work began in March. With the help of Brian Behlendorf and Jason Robbins of CollabNet and of Greg Stein (then an independent developer active in the WebDAV/DeltaV specification process), Subversion soon attracted a community of active developers. It turned out that many people with CVS experience welcomed the chance to finally do something about its problems.

The original design team settled on a few simple goals. They did not want to break new ground in version control methodology; they just wanted to fix CVS. They decided that Subversion would match CVS's features and keep the same development model, but without replicating CVS's obvious flaws. And although it did not need to be a drop-in replacement for CVS, it should remain similar enough that CVS users could switch with little effort.

After 14 months of coding, Subversion became "self-hosting" on August 31, 2001: the developers stopped using CVS to manage Subversion's own source code and started using Subversion itself.

Although CollabNet started the project and has funded a large share of the work (it pays the salaries of several full-time Subversion developers), Subversion is run like most open-source projects, governed by a loose, transparent set of rules that encourage meritocracy. CollabNet's copyright license is fully compliant with the Debian Free Software Guidelines, which means that anyone is free to download, modify, and redistribute Subversion without permission from CollabNet or anyone else.

First, Subversion covers most CVS functionality

As a rewrite and improvement of CVS, Subversion aims to replace the currently popular CVS as a better version control tool. Its main developers are well-known CVS experts in the industry. Subversion supports most CVS functions and commands, and its command style and interface are very close to those of CVS. Where the two differ, the differences are improvements over CVS.

Second, global version numbers

Each commit creates a new version of the repository and receives an automatically incremented version number N+1. This number does not belong to a particular file; it is global and applies to the entire repository. We can therefore think of a Subversion repository as an array of file trees, one per version number.

Technically, in Subversion it is wrong to speak of "version 5 of file foo.c". The correct statement is "the file foo.c as it appears after the repository has been modified 5 times, that is, after 5 commits". In Subversion, the content of foo.c after the 5th repository modification and after the 6th may well be identical, because the 6th modification may have changed only other parts of the repository and left foo.c untouched. In CVS, by contrast, versions 1.1 and 1.2 of foo.c are always different.

The global version number has many advantages. For example, when copying a directory or file, no matter how many files are involved, Subversion does not need to run a copy operation on each file in turn; it only creates a pointer to the corresponding global version number.
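The idea of a repository-wide version number can be illustrated with a small Python sketch. It is only a toy model, not how Subversion stores data: the class name `Repository` and its methods are made up for illustration, and every version simply holds a full dictionary of paths.

```python
# Hypothetical toy model: the repository is a list of tree snapshots,
# and the list index is the global version number.
class Repository:
    def __init__(self):
        self.revisions = [{}]  # version 0 is the empty tree

    def commit(self, changes):
        """Apply {path: content} changes and return the new global version."""
        tree = dict(self.revisions[-1])  # start from the latest tree
        tree.update(changes)
        self.revisions.append(tree)
        return len(self.revisions) - 1

    def cat(self, path, rev):
        """Return the content of `path` as it appears in version `rev`."""
        return self.revisions[rev][path]


repo = Repository()
for n in range(1, 6):                     # versions 1..5 all touch foo.c
    repo.commit({"foo.c": f"edit {n}"})
r6 = repo.commit({"bar.c": "new file"})   # version 6 leaves foo.c alone
same = repo.cat("foo.c", 5) == repo.cat("foo.c", r6)
print(r6, same)
```

Here foo.c "in version 6" is byte-for-byte the same as foo.c "in version 5", which is exactly why the number describes the repository rather than the file.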

Third, version control of directories

CVS can version only files, not directories, so it has no concept of a file move. When a file is moved manually, CVS sees only that a file was deleted in one place and another file was created somewhere else. Because it does not connect the two operations, the file's history is easily lost. When setting up a CVS repository, you therefore have to choose the location of every file very carefully, because that location will stay in use more or less permanently.

Likewise, because CVS does not record the version history of directories, it does not support renaming files. Renaming a file by hand breaks the historical link between the file before and after the rename, yet recording history is the primary purpose of version management.

CVS also does not support copying files. After a manual copy, CVS sees only that a new file was added and cannot record the link between the source and the destination of the copy.

In short, the missing support for moving, renaming, and copying files is rooted in the fact that CVS cannot record the version history of directories. These operations are common in today's software development, which is one of the main reasons Subversion was developed to replace CVS.

Subversion treats a directory as a special kind of file. (From a file system perspective a directory really is a special kind of file: its contents change whenever subdirectories or files in it are deleted, renamed, or created.) Subversion therefore records the change history of directories just as it records the change history of ordinary files, and it can accurately record the historical link before and after a file or directory is moved, renamed, or copied. Subversion can also compare different historical versions of a directory, just as it compares different versions of a file, so the history of a directory's changes is clearly visible.
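How a recorded link between the source and the destination of a move preserves history can be sketched in Python as follows. This is a simplified model under stated assumptions: the class `Repo`, the action codes "M", "A+" and "D", and the method names are all hypothetical, and file contents are left out entirely.

```python
# Hypothetical toy model: log[r] maps each path changed in version r to
# (action, copy_source), where copy_source is (old_path, old_version) or None.
class Repo:
    def __init__(self):
        self.log = [{}]  # version 0: nothing happened yet

    def modify(self, path):
        """Commit an add or edit of `path`; return the new version number."""
        self.log.append({path: ("M", None)})
        return len(self.log) - 1

    def move(self, src, dst):
        """Commit a rename; dst remembers where it came from."""
        src_rev = len(self.log) - 1          # latest version before the move
        self.log.append({dst: ("A+", (src, src_rev)), src: ("D", None)})
        return len(self.log) - 1

    def history(self, path, rev):
        """List (version, path) pairs that changed this file, newest first,
        following moves back to the original name."""
        trail = []
        while rev > 0:
            entry = self.log[rev].get(path)
            if entry is not None and entry[0] != "D":
                trail.append((rev, path))
                if entry[1] is not None:     # moved or copied: jump to source
                    path, rev = entry[1]
                    continue
            rev -= 1
        return trail


repo = Repo()
repo.modify("src/util.c")                    # version 1
repo.modify("src/util.c")                    # version 2
repo.move("src/util.c", "lib/util.c")        # version 3
repo.modify("lib/util.c")                    # version 4
trail = repo.history("lib/util.c", 4)
print(trail)
```

Without the stored copy source, the walk would stop at version 3, which corresponds to the CVS situation described above: the new file appears to have no past.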


Fourth, atomic commits

From the user's point of view, both CVS and Subversion can commit modifications to multiple files in one batch, but the two implement this in fundamentally different ways.

CVS processes a batch commit linearly and serially, one file after another. Each time a file is committed successfully, a new version of that file is recorded in the repository, and the log message supplied by the user is stored again in the version history of every modified file.

The drawback of this serial model is that when the batch operation is interrupted for any reason (typically a network outage or a client crash), the repository is often left in an inconsistent state: only some of the files that should have been committed together are actually stored. The latest version in the repository may then fail to compile. Worse, as other users run cvs update, the inconsistency quickly spreads through the development team, hurting its efficiency and putting quality at risk. And if the interrupted batch commit is not noticed in time, the team often loses additional time debugging and troubleshooting.

CVS can produce inconsistencies even when a batch commit is not interrupted. Suppose user A starts a batch commit that takes a long time to complete, and meanwhile user B runs cvs update. User B is then very likely to receive an inconsistent update, containing only some of the files modified by user A.


Subversion eliminates these drawbacks. No matter how many file modifications a batch commit contains, the commit takes effect and becomes visible to other users only when all of the modifications have been stored successfully; if it fails for any reason, Subversion automatically rolls it back. In other words, Subversion guarantees that either all changes are stored or none are, in which case the repository is not modified at all. This is Subversion's atomic commit.

Thanks to atomic commits and global version numbers, a unique new global version number is generated when a commit completes successfully. The log message supplied by the user is associated with that version number and stored only once (unlike CVS, which duplicates it for every file).
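The difference between the two commit models can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of either tool: the repository is a plain dictionary, and `validate`, `serial_commit`, and `atomic_commit` are hypothetical names, with a `None` content standing in for whatever interrupts the commit.

```python
class CommitError(Exception):
    pass


def validate(path, content):
    # Hypothetical failure condition standing in for a network drop,
    # a conflict, or a rejected file.
    if content is None:
        raise CommitError(f"cannot store {path}")


def serial_commit(repo, changes):
    """CVS-like: store files one by one; a failure leaves earlier ones in."""
    for path, content in changes.items():
        validate(path, content)
        repo[path] = content


def atomic_commit(repo, changes):
    """Subversion-like: stage everything, publish only if all files succeed."""
    staged = dict(repo)            # private copy acting as the transaction
    for path, content in changes.items():
        validate(path, content)
        staged[path] = content
    repo.clear()                   # reached only when every file was accepted;
    repo.update(staged)            # treated as one indivisible step in this model


changes = {"a.c": "v2", "b.c": None, "c.c": "v2"}   # b.c will fail
cvs_like = {"a.c": "v1", "b.c": "v1", "c.c": "v1"}
svn_like = dict(cvs_like)

for commit, repo in ((serial_commit, cvs_like), (atomic_commit, svn_like)):
    try:
        commit(repo, changes)
    except CommitError:
        pass

print(cvs_like)   # a.c already changed, c.c not: a half-applied commit
print(svn_like)   # untouched
```

In the serial model the failure on the second file leaves the first one already changed, so anyone updating at that moment receives a mixed state; in the staged model other users see either the old tree or the complete new one.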

Fifth, support for the changeset concept

Since every Subversion commit is atomic, the unique global version number produced by each successful commit corresponds to all the file modifications in that batch; in other words, a Subversion version number corresponds to a logical changeset. A changeset may be the fix for a bug, an improvement to an existing feature, or the implementation of a new feature. A changeset can thus be seen as the logical result of one software development activity, and other development processes (such as integration, release management, change management systems, and defect tracking systems) can refer to it by its version number. As a result, Subversion raises version management from the level of simple, individual file modifications to a logical abstraction, the level of development activities, which is easier to understand and communicate.

Sixth, differential handling of binary files

For historical reasons, CVS was designed mainly for programmers of an earlier era. It handles text files (ASCII files, source code) effectively: they can be stored as differences, old and new versions can be compared, files can be merged, and so on. With binary files, however, CVS is helpless. For the historical versions of a binary file, all CVS can do is store each version separately and redundantly, even when the versions differ only slightly. For example, if a 10 MB binary file (a photo, graphic, mechanical design file, or electronic design file) is modified once a week, then after a year that file alone consumes more than 500 MB of storage (52 versions of 10 MB each), however small each modification was. In addition, every time a client fetches a new version of the file, 10 MB of network traffic is consumed.

Today's development teams, whether they work on software, web sites, or mobile phones and other electronic products, need to version not only source code and other text files but also requirements documents, design documents, test documents, user manuals, images, mechanical and electronic design files, and many other binary files. For them CVS is clearly not a good choice.


Unlike CVS, Subversion uses a single binary differencing algorithm, applying the same difference computation to text files and binary files, and stores both in the repository in the same way: after each commit, only the differences from the previous version are stored, which saves a great deal of space.


The binary differencing algorithm applies not only to storage. More importantly, for both binary and text files, when a client needs a new version (for example when running svn update), only the differences between versions are sent over the network, which greatly reduces bandwidth consumption. The section on two-way, compressed differential network transfer gives more detail.
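To make "store only the differences" concrete, here is a minimal Python sketch of delta encoding on raw bytes. It uses `difflib.SequenceMatcher` from the standard library as a stand-in; it is not the algorithm Subversion actually uses, and the function names and the ("copy"/"data") instruction format are invented for this example.

```python
import hashlib
from difflib import SequenceMatcher


def make_delta(old: bytes, new: bytes):
    """Describe `new` as copies from `old` plus literal inserted bytes."""
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))        # reuse bytes old[i1:i2]
        elif tag in ("replace", "insert"):
            ops.append(("data", new[j1:j2]))    # bytes that must be stored
        # "delete" needs no instruction: those old bytes are just not copied
    return ops


def apply_delta(old: bytes, ops) -> bytes:
    """Rebuild the new version from the old version and the delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


def stored_bytes(ops) -> int:
    """Literal bytes carried by the delta (copy instructions are tiny)."""
    return sum(len(op[1]) for op in ops if op[0] == "data")


# A 2048-byte stand-in for a binary file, then a small edit in the middle.
old = b"".join(hashlib.sha256(bytes([i])).digest() for i in range(64))
new = old[:1000] + b"\xff\xfe\xfd" + old[1000:]
delta = make_delta(old, new)
print(len(new), stored_bytes(delta))
```

The point of the sketch is the shape of the data, not the matcher: the repository keeps `old` plus a short list of instructions instead of a second full copy, and the same delta is what would travel over the network.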

Seventh, two-way differential, compressed network transfer

As mentioned above, CVS cannot handle binary files differentially. For text files, CVS supports differential transfer in one direction only, from the server to the client: when cvs update runs, only the differences are sent from the server to the client, but when cvs commit runs, the entire content of each modified file has to be sent from the client to the server.

Subversion, by contrast, transfers differences in both directions for text and binary files alike, and compresses the differences on one side and decompresses them on the other. How the server obtains differences is obvious and similar to CVS. The secret of how the client obtains them is that Subversion keeps a read-only, pristine copy of every file in the client's working copy, stored in the hidden .svn directory (it is normally invisible, and it also makes more local and offline operations possible). By comparing the user's modifications with this pristine copy, Subversion determines the differences that actually need to go to the server, and compresses them before sending them over the network.

For CVS, the cost of an operation (network bandwidth being the largest cost) is proportional to the size of the modified file, regardless of the size of the modification. For Subversion, the cost is proportional only to the size of the modification, regardless of the size of the file. Subversion therefore consumes less network bandwidth than CVS (trading client storage space for lower bandwidth consumption is a reasonable choice in today's computing environment). This makes Subversion better suited to geographically distributed teams collaborating over the Internet or a WAN, where the version server is single and centralized and the clients are widely distributed.
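The client-side half of this scheme can be sketched in Python: compare the working file with the clean local copy, send only the changed region, and compress it with `zlib`. This is an illustrative model only. The "start:stop:" header, the function names, and the prefix/suffix trimming are assumptions made for this example and have nothing to do with Subversion's real wire protocol.

```python
import zlib


def changed_region(pristine: bytes, working: bytes):
    """Trim the common prefix and suffix; what remains is the local change."""
    limit = min(len(pristine), len(working))
    start = 0
    while start < limit and pristine[start] == working[start]:
        start += 1
    end = 0
    while end < limit - start and pristine[-1 - end] == working[-1 - end]:
        end += 1
    return start, len(pristine) - end, working[start:len(working) - end]


def commit_payload(pristine: bytes, working: bytes) -> bytes:
    """Bytes a differential, compressed commit would put on the wire."""
    start, stop, data = changed_region(pristine, working)
    header = f"{start}:{stop}:".encode()
    return header + zlib.compress(data)


def apply_payload(pristine: bytes, payload: bytes) -> bytes:
    """Server side: rebuild the new file from its own copy plus the payload."""
    start, stop, packed = payload.split(b":", 2)
    return pristine[:int(start)] + zlib.decompress(packed) + pristine[int(stop):]


# A 100 KB file with an 8-byte edit; the clean copy never leaves the client.
pristine = bytes(i % 251 for i in range(100_000))
working = pristine[:40_000] + b"patched!" + pristine[40_008:]
payload = commit_payload(pristine, working)
print(len(working), len(payload))
```

A whole-file commit in this model would send all of `working`, so its cost follows the file size, while the payload above follows the size of the edit. The price is the extra clean copy held on the client.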


Eighth, fast and efficient creation of branches and baselines

Both CVS and Subversion support branches and baselines (tags). Branching and merging support the parallel development of large projects, while baselines make it possible to identify the exact versions of a set of files, manage software releases, and go back in history when necessary.

However, CVS and Subversion implement branches and baselines very differently. When CVS creates a branch, it has to operate on every file in the branch, so the cost of creating it (mainly the time or computing resources required) is proportional to the number of files involved: the larger the project and the repository, the more files there are and the more expensive branching becomes. Creating a baseline (tag) works the same way.

Subversion creates branches and baselines by performing a "copy". Recall how we handled "branches" and "baselines" before version management tools existed. The answer is obviously copying: we made a baseline by copying or backing up, and to let several developers work at the same time we made a copy for each of them. Creating branches and baselines by copying is therefore very natural for Subversion, something of a return to basics.

Thanks to the global version number, creating a branch or baseline in Subversion, that is, performing a copy, really only creates a pointer to a global version number in the repository; there is no need to operate on many individual files in turn. The cost of the operation is therefore a very small constant, independent of the size of the project, the size of the repository, and the number of files. Creating a branch or baseline also requires no redundant storage: a new branch or baseline takes up almost no repository space, and the space a branch uses later depends only on the size of the modifications made on it.
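Why a copy implemented as a pointer costs the same for 10 files as for 10,000 can be shown with a small Python sketch. It is a toy model under stated assumptions: the class `Dir` and both functions are hypothetical, and directories are treated as immutable nodes that can be shared between trees.

```python
# Hypothetical toy model: a directory is an immutable node. Copying a
# subtree stores a reference to the existing node, not the files themselves.
class Dir:
    def __init__(self, entries):
        self.entries = dict(entries)   # name -> Dir or file content

    def with_entry(self, name, node):
        """Return a new directory sharing every other entry with this one."""
        entries = dict(self.entries)
        entries[name] = node
        return Dir(entries)


def svn_style_copy(root, src, dst):
    """Branch or tag: one new directory entry pointing at the existing subtree."""
    return root.with_entry(dst, root.entries[src])


def cvs_style_tag(files):
    """CVS-like: every file is visited, so the work grows with the file count."""
    operations = 0
    for _ in files:
        operations += 1
    return operations


trunk = Dir({f"file{i}.c": b"..." for i in range(10_000)})
root = Dir({"trunk": trunk})
root2 = svn_style_copy(root, "trunk", "branches-1.0")
shared = root2.entries["branches-1.0"] is root2.entries["trunk"]
print(shared, cvs_style_tag(trunk.entries))
```

The branch and the trunk are the same object until someone modifies the branch, at which point only the changed path needs new nodes. That is the sense in which a new branch occupies almost no space.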


Ninth, integration with the Apache web server for additional features

By integrating with the Apache web server, Subversion can offer repository access over HTTP/HTTPS, which allows secure access to Subversion across firewalls. Subversion can also take advantage of further Apache features, including but not limited to: Apache's rich user authentication mechanisms (including authentication against LDAP servers such as Windows Active Directory), fine-grained access control by directory path, compression and decompression of network traffic, and browsing of the repository directory structure.

Some time ago, while I was sharing SVN code management practices at an internal department PCM, someone asked what exactly the advantages of SVN are compared with CVS. The company adopted SVN very early, so many colleagues never experienced the CVS era.

I used CVS at my previous company. From a developer's point of view the differences are not obvious; I could think of only two or three: 1. CVS handles directories poorly and cannot track directory changes; 2. file renames cannot be committed; 3. binary files (such as pictures) are managed badly, and code merges frequently cause problems.

So what are the differences between the two? Over the past few days I collected some material from the Internet and am posting it here for reference:

Comparison of CVS and SVN

Version numbers
CVS: A self-incrementing sequence number per file.
SVN: A global, self-incrementing sequence number that covers directories as well as files.

Storage format
CVS: CVS is a version control system based on RCS files. Each CVS file is just an ordinary file plus some additional information, and the files simply mirror the tree structure of the local files. You therefore do not have to worry about losing data, and you can edit the RCS files by hand if necessary.
SVN: SVN is based on an embedded database (Berkeley DB) or a set of binary files (FSFS). On the one hand this solves many problems (for example, concurrent reads and writes of shared files) and adds new capabilities such as transactions. On the other hand, data storage becomes opaque and less user-friendly, which is why tooling matters so much for the repository (database).

Access speed
CVS: Slower, because differential file transfer works in one direction only (server to client).
SVN: Overall much faster than CVS, because of its different architecture and its bidirectional differential file transfer. It sends very little information over the network and supports more offline operations. This comes at a price, though: the speed is paid for with a lot of storage (a full backup copy of every working file).

Metadata
CVS: Only files can be stored.
SVN: Any file can carry arbitrary named properties. The feature is complete, but I don't know what it is for.

File types
CVS: Originally designed for storing text files, so support for other file types (binary, Unicode) is almost absent; if it is needed, extra information has to be supplied and the client and server adjusted.
SVN: Handles all file types without any manual steps, because its storage is binary based.

Rollback
CVS: Any committed version can be rolled back, although this takes some time (all files are processed separately).
SVN: A commit cannot be rolled back. The recommended approach is to commit a known-good state on top, overriding the damaged version. The damaged version remains in the database regardless.

Transactions
CVS: The "all or nothing" transaction principle is not implemented at all. If you check in several files, it is quite possible that some are stored completely and others are not. The usual practice is to correct the problem manually and check in the remaining files (not all files) again one by one, so the files end up checked in in two phases. So far, however, no repository corruption caused by this missing feature has been reported.
SVN: Supports the "all or nothing" transaction principle, one of SVN's great advantages.

Architecture, code, extensibility
CVS: CVS is an old system. It started out as a set of scripts around RCS, which were later turned into a single application, but the internal structure still leaves much to be desired. To this day people try to start over and rewrite CVS, without success. We tried to rewrite the client code ourselves for better integration, also without success. We do not see how much further CVS can go in functionality.
SVN: The Subversion developers spent a lot of time on the internal architecture. We do not yet know how well those decisions will hold up, but one thing is certain: the code is highly extensible, and enhancement work is ongoing.

Network layer
CVS: Cannot be integrated with the Apache web server.
SVN: Has an abstract repository access layer, which makes it easy to implement new network mechanisms. Subversion's "advanced" network server is a module for the Apache web server that speaks a variant of HTTP called WebDAV/DeltaV. This helps Subversion's stability and interoperability and provides important additional features such as authentication, authorization, wire compression, and repository browsing. There is also a small standalone Subversion server program that uses a custom protocol and can easily be tunneled over SSH.

Rename and delete
CVS: Committing a local file rename is not supported. Deletion comes in two forms, remove and erase: the former deletes both the local file and the repository file, the latter only the local file. Folders cannot be deleted.
SVN: Committing a file rename is supported; the system reports it as deleting the old file and creating a new one. When a local file is deleted and the change committed, the file is deleted in the repository as well.

User access rights
CVS: Four permissions: read, write, create, and none. Nobody can delete a folder (the administrator has to go to the server and delete the folder there, which is harsh; it is the only way I know of so far).
SVN: Only three permissions: read, write, and none. Create and delete seem to be bundled with write.

Creating branches and baselines
CVS: When CVS creates a branch, it has to operate on every file in the branch, so the cost of creating it (mainly the time or computing resources required) is proportional to the number of files involved: the larger the project and the repository, the more files there are and the more expensive branching becomes. Creating a baseline (tag) works the same way.
SVN: Branches and baselines are created by performing a "copy". Recall how we handled "branches" and "baselines" before version management tools existed. The answer is obviously copying: we made a baseline by copying or backing up, and to let several developers work at the same time we made a copy for each of them.

Reprinted from the Old Tang's blog: http://blog.csdn.net/sfdev/archive/2008/08/26/2835073.aspx
