Chapter 1. What Is CVS?

CVS is a version tracking system. It maintains records of files throughout their development, allows retrieval of any stored version of a file, and supports production of multiple versions of a file. CVS enables multiple developers to work simultaneously on a single file without loss of data. Each developer works on her own copy of a file, and all changes are later merged into a single master copy. CVS can be integrated with bug-tracking and feature-tracking systems, and it provides features that can assist a project manager by tracking changes to a project over time.

CVS can be used in many environments for many purposes: maintaining configuration files, mail aliases, source code, FAQ files, art, music, articles, essays, and books. Some system administrators keep the contents of their /etc directory under CVS in order to track system configuration changes over time. CVS is also used to store and automatically publish content to web sites and FTP servers.

CVS follows the Unix ethos of small programs doing what they do well. The RCS (Revision Control System) program handles revision control of single files, so CVS uses RCS to store file data. CVS adds features to RCS—most notably, the abilities to work on collections of files and to work out of a repository that may be local or remote.

What Is a Versioning System?

Version control is the process of recording changes to a project and being able to retrieve them. Computer scientists define version control, source control, and change management as different but overlapping tasks; version control is the most accurate term for the aspects of the field that apply to CVS. A version control system lets you retrieve an old version to fix bugs or update features, branch development so that the project can progress along multiple tracks simultaneously, and generate reports that show the differences between any two arbitrary stages of a project.

Most version control systems store notes with each change, and many provide tools that allow a project leader to analyze the changes. Most also include the ability to retrieve differences between arbitrary versions of files, which makes it easier to create patches or locate bugs.

The benefits of a version control system such as CVS include:

  • Any stored revision of a file can be retrieved, viewed, and changed.

  • The differences between any two revisions can be displayed.

  • Patches can be created automatically.

  • Multiple developers can work simultaneously on the same project or file without overwriting one another’s changes.

  • The project can be branched to allow simultaneous development along varied tracks. These branches can be merged back into the main line of development.

  • Distributed development is supported across large or small networks. (CVS offers a variety of authentication mechanisms.)

Using version control for a project requires some extra work on an ongoing basis. In addition, previous versions of files, or records of changes to the various files in a project, occupy disk space that you might otherwise use for something else. However, the features that a good version control system makes available are well worth the investment of time and disk space. For example, without version control, project backups typically are timestamped copies of an entire project, hopefully stored together in a logical fashion. Version control provides organized storage and retrieval of the complete record of project changes, rather than whichever copies someone might remember to make.

Version control systems store files as the files are created and updated, in a way that allows any saved version of a file, or related versions of a set of files, to be retrieved at any given time. Many version control systems, including CVS, encourage a project’s files to be stored together, which means that backups are easy to produce.

The ability to recover any earlier version of a given file allows you to roll back to support feature requests for previous releases and is critical when you create a bugfix branch of the project. CVS, like some other version control systems, allows simultaneous storage of the bugfix branch of a project and its main trunk code.

Many version control systems, including CVS, can display the differences between versions in a computer-readable format. The format CVS uses produces a file that allows the Unix patch program to automatically convert one version of a file (or set of files) to another.
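
To give a feel for what a computer-readable difference looks like, here is a minimal Python sketch that uses the standard difflib module to produce a diff in one common patch-style format (unified diff). It is only an illustration: it does not show how CVS computes differences internally, and the file name and contents are made up for the example.

```python
import difflib

# Two hypothetical versions of the same file, as lists of lines.
old_version = ["apples\n", "bread\n", "milk\n"]
new_version = ["apples\n", "bread\n", "cheese\n", "milk\n"]

# unified_diff yields the lines of a patch-style diff, one string at a time.
diff_lines = list(
    difflib.unified_diff(
        old_version,
        new_version,
        fromfile="shopping.txt.orig",  # hypothetical file names
        tofile="shopping.txt",
    )
)

# Header lines start with "---" and "+++"; added lines start with "+".
print("".join(diff_lines), end="")
```

The output records only the line that was added, plus a little surrounding context, which is why a diff is usually far smaller than either version of the file.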

Version control often allows multiple projects to use the same files, which helps divide a larger project among smaller teams. You can give each team a version-controlled copy of the section it’s working on, as well as the latest stable version of the files it needs to use to test its section.

Sometimes, two or more developers may make changes to the same file. Those changes may be in different parts of the file, and they may be made in such a way as not to conflict with each other. Other times, two or more developers may make conflicting changes to the same portion of a file. In such a case, CVS does its best to merge conflicting changes, but it only knows which lines of a file have been changed. It doesn’t know what the changes mean. CVS provides tools to display changes between arbitrary revisions, which can help locate and resolve problems.
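
The idea of a line-based merge can be sketched in a few lines of Python. This is a deliberately simplified model, not CVS's actual merge algorithm: it assumes that every version has the same number of lines, so that lines can be compared position by position, and the function name and sample data are invented for the illustration.

```python
def merge_lines(base, mine, theirs):
    """Merge two edited copies of a file against their common ancestor.

    Simplifying assumption: all three versions have the same number of
    lines. Real merge tools also handle inserted and deleted lines.
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")

    merged, conflicts = [], []
    for number, (b, m, t) in enumerate(zip(base, mine, theirs), start=1):
        if m == t:      # both copies agree on this line
            merged.append(m)
        elif m == b:    # only the other developer changed this line
            merged.append(t)
        elif t == b:    # only this developer changed this line
            merged.append(m)
        else:           # both changed the same line in different ways
            merged.append(b)          # keep the ancestor's line for now
            conflicts.append(number)  # and report it for a person to resolve
    return merged, conflicts


base = ["host = alpha", "port = 25", "relay = no"]
mine = ["host = beta", "port = 25", "relay = no"]
theirs = ["host = alpha", "port = 25", "relay = yes"]

merged, conflicts = merge_lines(base, mine, theirs)
print(merged)
print(conflicts)
```

Here the two developers changed different lines, so both changes survive and no conflict is reported. Notice that the function compares text only. It has no idea whether the combined settings make sense together, which is exactly the limitation described above.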

Version control is not a substitute for communication between team members. File updates should be passed through CVS, but the meaning of the changes must be passed to other team members by actually discussing them. If one developer needs to change the arguments to a function or the chapter numbering of a book, that must somehow be communicated to the other developers.

Versioning systems are most commonly used for programming, but they are also useful for writing (I used CVS to write this book), system administration (configuration files), and anything else that involves files that change, where you might want to retrieve older versions of those files, or for situations where several people may be working on the same files. One family I know uses CVS to store its shopping list, to keep family members from overwriting each other’s entries.

Why CVS?

With all the version control systems available, why choose CVS? If you work with files that change over time, the most important thing is to have some kind of version control: after the first time it’s saved your bacon, you’ll never want to work without it. If it’s your first time with a version control system, choose one with good documentation and the features you think you’ll need for the first year or two. Read the available tutorials and quickstart guides, and use a system that you feel you can understand. As you’re already taking a look at this book, CVS is a fine choice: it will get you through that first year or two, and beyond.

Tip

Any version control is better than no version control.

Once you’re used to working with version control, you can make a more informed decision about the features you’ll want long term, and choose a system with those features in mind. Changing systems is not something to fear, as there are tools for conversion between many version control systems. At worst, you can use one system for the old data and another for the new, or write a script that checks out each revision in turn from the old system and commits it to the new.

Features of Version Control Systems

Every version control system that I’m familiar with records the changes in a document, enables you to retrieve older versions of the document, and enables you to display the differences between versions. (The first two features are what define a version control system.) Some systems do little more than that, but most have additional features. You should choose a system based on the features that are most important to your project and working style.

One of the most important features a version control system can provide is support for multiple developers. There are two common models for this support; choose the model that works best for you. The easiest to understand is exclusive development: the version control system permits only one developer at a time to work on any individual file. A system may have strict exclusive development and make it nearly impossible for a developer to work on a file he hasn’t reserved, or it may have advisory exclusive development and simply warn the developer before allowing access to a file that another developer is editing. The alternative to exclusive development is simultaneous development, where multiple developers can work on the same file and the version control system attempts to merge the respective changes seamlessly. CVS supports both modes, which are more fully explained in Chapter 5.
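
The difference between strict and advisory exclusive development can be modeled with a small Python sketch. The class and method names below are hypothetical and exist only for this illustration; they do not correspond to any CVS command or file.

```python
class ReservationTable:
    """Tracks which developer has reserved which file (hypothetical model)."""

    def __init__(self, strict):
        self.strict = strict
        self.holders = {}  # file name -> developer who reserved it

    def reserve(self, filename, developer):
        """Record that a developer intends to edit a file.

        Returns None if the file was free, or a warning string in advisory
        mode. In strict mode, raises PermissionError if someone else holds it.
        """
        holder = self.holders.get(filename)
        if holder is None or holder == developer:
            self.holders[filename] = developer
            return None
        if self.strict:
            raise PermissionError(f"{filename} is reserved by {holder}")
        return f"warning: {filename} is being edited by {holder}"

    def release(self, filename, developer):
        """Give up a reservation so that someone else can take the file."""
        if self.holders.get(filename) == developer:
            del self.holders[filename]


advisory = ReservationTable(strict=False)
advisory.reserve("main.c", "bill")
print(advisory.reserve("main.c", "sally"))  # Sally is warned but may proceed

strict = ReservationTable(strict=True)
strict.reserve("main.c", "bill")
try:
    strict.reserve("main.c", "sally")       # Sally is refused outright
except PermissionError as error:
    print(error)
```

Simultaneous development has no such table at all: everyone edits freely, and the system reconciles the changes afterward.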

Traditional version control systems have central repositories, but some of the newer ones have distributed repositories, and many systems have central repositories but support proxying. If you are a single developer, this issue won’t affect you, but managers of projects with multiple developers should think about whether a central or distributed repository better suits their needs. See the sidebar "Central, Distributed, or Proxied" later in this chapter for more information.

Systems should support remote access to the repository, or remote connection among distributed repositories. Most projects will need the ability to access the repository from outside the local network. The remote repository support in CVS is explained in Chapter 8.

Another important feature is data export. An exported set of data should not contain administrative files or data for the version control system. Most version control systems provide some form of export. CVS’s export command is explained in Chapter 7.

Customization is also important to many users. For example, some project managers want every change exported immediately to a test area. Some want committed changes reviewed to ensure they meet project standards. A proxy repository may need to initialize or finalize a connection when it communicates with the master repository. To achieve all these (and other) special tasks, a version control system needs hooks. A hook is a place to hang a script, such that the script is run at a specific stage of the version control process—perhaps before or after a change is committed, when a log message is stored, or when a file is checked out for exclusive development. CVS has hooks in the form of scripting files, explained in Chapters 6 and 7.
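
As a rough picture of what a hook is, the following Python sketch keeps a list of scripts for each stage and runs them when that stage is reached. The stage names ("pre-commit" and "post-commit"), the class, and the functions are all assumptions made for this example; CVS's own scripting files work differently.

```python
from collections import defaultdict


class HookRegistry:
    """Maps a stage name to the scripts that should run at that stage."""

    def __init__(self):
        self._hooks = defaultdict(list)

    def register(self, stage, script):
        self._hooks[stage].append(script)

    def run(self, stage, **details):
        # Run every script hung on this stage and collect the results.
        return [script(**details) for script in self._hooks[stage]]


def require_log_message(message, **_):
    """Example pre-commit check: refuse commits with an empty log message."""
    return bool(message.strip())


def commit(registry, files, message):
    """Pretend to commit: run the checks first, then the follow-up scripts."""
    if not all(registry.run("pre-commit", files=files, message=message)):
        return False  # a check refused the commit
    registry.run("post-commit", files=files, message=message)
    return True


registry = HookRegistry()
registry.register("pre-commit", require_log_message)

exported = []  # stands in for a test area that receives every change
registry.register("post-commit", lambda files, **_: exported.extend(files))

print(commit(registry, ["index.html"], "Fix broken link"))
print(commit(registry, ["about.html"], "   "))
print(exported)
```

The first commit passes the check and is copied to the stand-in test area; the second has a blank log message, so it is refused and nothing further runs.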

Internally, revision control systems usually keep numeric identifiers for each revision. One very useful feature is the tag, a symbolic name applied by the user. It’s much easier to remember that alpha_0-1 is your first alpha test than to remember that it’s revision 1.17. Most systems that provide tags allow you to apply the same tag either to the files you specify or to a full project with one command. Tags are explained in Chapter 4.
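
One way to think of a tag is as a lookup table from a symbolic name to the numeric revision of each file at the moment the tag was applied. The Python sketch below models that idea under the assumption that each file carries its own revision number. Apart from alpha_0-1 and 1.17, the file names and numbers are invented, and the function is not part of CVS.

```python
def apply_tag(tags, tag, current_revisions):
    """Record the current revision of every listed file under one name."""
    # Copy the mapping so that later commits do not alter the tag.
    tags[tag] = dict(current_revisions)


tags = {}
current = {"README": "1.4", "main.c": "1.17", "Makefile": "1.3"}

apply_tag(tags, "alpha_0-1", current)

# Development continues and main.c moves on to a new revision...
current["main.c"] = "1.18"

# ...but the tag still points at the revision used for the alpha test.
print(tags["alpha_0-1"]["main.c"])
```

You remember one name, and the system remembers which revision of each file that name stands for.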

Branches enable you to run two simultaneous lines of development on the same project, stored in the same repository. The most common use is to separate out a bugfix branch independent of the main development branch. Some version control systems also have a vendor branch feature, a special branch that lets you store data provided by a third party. Branches are explained in Chapter 4, and vendor branches in Chapter 7.

A feature that many teams like, and that CVS provides only in part, is atomic commits. This feature ensures that while one person is committing changes to the repository, no one else can. Thus, each commit is a separate process, and the repository is never in a state where it has mismatched files (for example, one directory with Bill’s latest changes and one with Sally’s). CVS has atomic commits on a directory-by-directory basis, but not true project-by-project or repository-by-repository atomic committing.

You may need data encryption, user authentication, and other security features. You need to be able to back up the repository and restore it to a fully working repository easily. To learn about CVS’s security and backup capabilities, see Chapter 6.

If you use a particular development suite, your choice of version control may be affected by whether the suite supports the version control system. Another factor might be the availability of programs such as web (HTTP/HTML) viewers or project management tools. Lists of some of the tools available to CVS and development suites that support CVS are in Appendixes A and B.

Comparing Version Control Systems

Table 1-1 compares the features of various version control systems. This table is based on the information provided on their web sites and in other available documentation at the time of writing, and on discussion with friends who have used the systems (in cases where I haven’t).

Table 1-1. Version control systems compared

VCS                 Development model          Repository type         Atomic commits  URL
CVS                 Simultaneous or exclusive  Central and proxied     No              http://cvs.nongnu.org
Bitkeeper           Simultaneous               Distributed             Yes             http://www.bitkeeper.com
ClearCase           Simultaneous or exclusive  Central or distributed  N/A             http://www-306.ibm.com/software/awdtools/clearcase
Git                 Simultaneous               Distributed             Yes             http://git.or.cz
GNU Arch            Simultaneous               Distributed             Yes             http://www.gnu.org/software/gnu-arch/
Perforce            Simultaneous or exclusive  Central                 Yes             http://www.perforce.com
Subversion          Simultaneous or exclusive  Central                 Yes             http://subversion.tigris.org
Visual Source Safe  Exclusive                  Central                 No              http://msdn.microsoft.com/ssafe/

It’s well worth reading the Wikipedia articles on revision control systems. Wikipedia has a more complete list of available programs, as well as links to their home pages and to other people’s comparisons of systems they have worked with. The main article is available at http://en.wikipedia.org/wiki/Revision_control.

CVS Versus Subversion

Subversion is designed to be “CVS, only better.” It has support for binary files built into the design, permits versioning of directories as well as files, allows you to rename files with a simple command and to maintain their history, and has atomic commits. It works just like CVS does. So why not change to Subversion?

I’ve been looking at Subversion, and it’s not quite true that it’s “CVS, only better.” It’s “CVS, only different.” It uses a different repository structure, different methods of remote access, different administrative commands, and different hooks. It currently doesn’t support proxying or GSSAPI (Generic Security Services API, a client-server authentication system), and many of its commands, such as diff, have fewer features than CVS’s versions of those commands.

I like Subversion, and I think it has potential, but it’s not a direct replacement for CVS. It’s similar, but it’s different enough that some people prefer to use CVS for its feature set, and others prefer Subversion for its feature set.

CVS and CVSNT

CVSNT is a different system; it split off from CVS around version 1.10. Both projects have changed significantly since then. CVSNT can merge binary files, and has several Windows-specific bonuses such as use of DLLs (dynamic link libraries) for sophisticated trigger actions when a change is committed. CVS has gone in a different direction, and has added support for PAM (pluggable authentication modules), created proxying, and provided more scripting file hooks.

Why I Prefer CVS

CVS is free. That’s a really good start for small projects, free software projects, and small businesses, and most of my work has been in at least one of those fields. I’ve also been able to use existing servers for all the CVS repositories I’ve administered, and to use sites like SourceForge or other open source hosting sites for CVS repositories for open source work. The total cost of the CVS system has been trivial for me.

CVS is a mature system. It’s solid and reliable, and I’ve seen it fail only when I misunderstood how it worked—which happened only when I was first learning it. That’s one reason I wrote this book: by explaining the things I’ve learned, hopefully I can prevent other people from facing the problems I faced.

CVS has all the functionality I need. Most of that is built in, but because it’s such a mature system, there are third-party tools that do most of the things it doesn’t. For instance, if you want strict exclusive development, you can use the rcslock program. If you need finer-grained security, there’s the cvs_acls script. Web browsing of your repository? ViewCVS.

CVS is familiar. It’s widespread and has a huge user base. In my experience, members of my team have usually used CVS before, which saves time spent on training.

CVS in the Field

CVS records file changes during a project’s development. Project files are added to the repository as they are created, and developers check out a personal sandbox—a personal copy of the project’s files—to work from. Each developer works in his own sandbox and regularly commits his changes to the repository. Developers also update the contents of their sandboxes regularly to ensure that changes to the repository are reflected in each sandbox.

The term project can take on many different meanings. The stereotypical CVS project is a programming project in which files contain source code for the various programs written as part of the project. But that’s a narrow view of what a CVS project can be. CVS can be used in many other settings as well, as the next few sections demonstrate.

System Administration

CVS can store configuration files, mail aliases, domain records, and other files for which changes should be tracked. Import the files (or all of /etc) into a repository and require administrators to check them out into a sandbox to make changes. Commit the files back to the repository and export the changes to the server. If the changes fail, rolling back to the previous state is easy.

Multiple servers with varied but similar configurations can be maintained using different branches of the same files. Changes to any given branch can be merged into other branches selectively.

Every change made through CVS is recorded in a file history, along with the username of the person making the change, the date the change was made, and any notes recorded with the change. All this information can help, for example, when trying to spot which change to which configuration file broke the mail server.

Both the CVS server and the client run on all Unix and Linux operating systems, including modern Macintosh environments. Third-party graphical clients are available for Unix, Linux, Windows, and Macintosh systems, and for the Java runtime environment. The CVSNT CVS server is available for Windows NT or later. This makes CVS particularly useful for cross-platform environments. See Appendix A for more information.

Software Development

Program development is perhaps the most common use for version control systems. After the initial release of a program, two versions usually need to be maintained: the new version that will eventually be the next release and the bugfix version for the current release. CVS allows you to split the development into two or more parts, called a trunk and a branch. Typically, the branch is used for bug fixes, and the main trunk is used for new feature development. Both versions of the program, the bugfix branch and the main trunk, are stored in the same repository. This approach allows the changes from the bugfix branch to ultimately be merged into the main trunk, ensuring that all bugfixes get rolled into the next release of the program.

A CVS repository can be hosted on the machine that most developers will be using, or you can host the repository on a machine that developers access via a local or wide-area network. A repository can be accessed simultaneously by multiple computers. If you need to authenticate your CVS users, there are a variety of authentication mechanisms available.

When multiple developers are trying to work on the same project, it’s likely that two or more developers will eventually want to work with the same file at the same time. Without version control, this would lead to problems, as developers would soon find themselves overwriting one another’s changes. One way that some version control systems prevent such conflicts is to enable developers to lock whatever files they are working on, so that no one else can make changes at the same time.

Instead of locking files to prevent conflicts, CVS allows multiple developers to work on the same file. CVS’s file-merging feature then allows you to merge all changes into one file. File merging aids development across remote time zones, as developers can work on different sections of the same file, regardless of the lock status. In fact, there is no lock status, because there is no locking. With a file-locking system, a developer may have to email someone and then wait until that person wakes up, reads her email, and unlocks the needed file. The CVS approach prevents one developer from blocking another, thus increasing productivity. However, if your project team needs file locking, you can use the cvs watch command (discussed in Chapter 5) to emulate it.

The cvs diff command (also discussed in Chapter 5) displays the differences between any two revisions of a file (or set of files) in the repository. A variation of the command creates a Unix standard patch file to update one revision to another, which is useful when sending patches or updates to customers.

CVS can be configured to record commit messages in bug-tracking systems. Chapter 7 explains how to use the administrative files to provide message templates and run scripts automatically during commits. This step does not necessarily record the stage of each change (completed, tested, etc.). Unless you rigorously enforce a requirement to write meaningful commit messages, you should maintain a separate change log.

Store your build and installation scripts in CVS to maintain a record of changes and to ensure that such scripts are kept with the project files. Releases should always be built from a freshly checked-out sandbox and tagged with a human-friendly name.

CVS does not include build or installation tools, though cvs export (see Chapter 7) should be part of the installation process for your project. I use make for build and installation scripts.

Content-Controlled Publishing

Many people use CVS to maintain web sites and other file servers, and use scripts to automatically publish updates to those servers. Some people use scripts to distribute and apply patch files on remote machines, saving bandwidth by distributing only the changes. A variety of such scripts are available at http://www.cvshome.org, and some are discussed in Appendix B.

Other Uses for CVS

CVS is also useful for managing any other type of file, from book chapters or blueprints, to music, art, mailing lists, or shopping lists. The features that make it useful to programmers are also useful to anyone who produces something that can be stored as a computer file, or that can be generated from a computer file. A friend of mine, Maria Blackmore, uses CVS to store router configurations by using a script to retrieve and commit copies of the configuration.
