Each SCM tool has a different name for the collection of files that it tracks. In the rest of this chapter, I'll use the CVS term repository for these files, simply because it is familiar to many people. The set of files in which a developer makes changes is called the working copy (CVS calls this a sandbox). In CVS terms, obtaining a working copy is known as checking out a copy. Publishing changes to a repository is known as committing or checking in the changes.
A typical session with an SCM tool involves the following activities:
A developer decides to work on some part of the project. He checks out copies of the necessary files onto his machine. This is his personal working copy. Checking out the files has not changed anything in the repository, and all changes he makes are local to his machine. No one else is affected by his work yet.
Probably the most common mistake people make when they use SCM tools is to forget to add newly created files to the SCM tool. Even though your own builds and tests work just fine, this mistake breaks the build when your changes are committed, leading to self-defensive comments such as "But it works for me!" and "I ran all the tests." Some SCM tools will alert you to the presence of local files that they don't know anything about, but it's still good practice to get used to adding new files to your SCM tool right after you create them, while you still remember that they are new.
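The check for forgotten files can be sketched in a few lines. This is a minimal illustration, not how any particular SCM tool does it: the function name untracked_files and the idea of passing the tracked files in as a set of relative paths are assumptions made for the example.

```python
from pathlib import Path


def untracked_files(working_copy: Path, tracked: set) -> list:
    """Return files under working_copy that are not in the tracked set.

    `tracked` holds paths relative to the working copy, written with
    forward slashes (a convention chosen for this sketch).
    """
    found = []
    for path in working_copy.rglob("*"):
        if path.is_file():
            relative = path.relative_to(working_copy).as_posix()
            if relative not in tracked:
                found.append(relative)
    return sorted(found)
```

Running a check like this just before committing would list every new file that still needs to be added.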
One common thing to do with an SCM tool is to see what changes have been made in the working copy, compared with the versions of the files in the repository; this comparison is usually called a diff. A related activity is to see who last changed a particular file and exactly what those changes were.
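To make the comparison concrete, here is a small sketch that uses Python's standard difflib module to compare the repository's version of a file with the working copy's version. The function name working_copy_diff and the "repository/" and "working/" labels are choices made for this example, not the output format of any specific SCM tool.

```python
import difflib


def working_copy_diff(repo_text: str, working_text: str, filename: str) -> str:
    """Return a unified diff from the repository version to the working copy."""
    repo_lines = repo_text.splitlines(keepends=True)
    working_lines = working_text.splitlines(keepends=True)
    diff = difflib.unified_diff(
        repo_lines,
        working_lines,
        fromfile=f"repository/{filename}",
        tofile=f"working/{filename}",
    )
    # An empty string means the two versions are identical.
    return "".join(diff)
```

Lines that start with a minus sign exist only in the repository version, and lines that start with a plus sign exist only in the working copy.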
While the developer was working on the files in his working copy, someone else may have changed those same files in the repository. The developer has to get the latest versions of those files and make sure that his changes still work correctly with the changes from other people.
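The decision an update has to make for each file can be sketched as a three-way comparison between the version originally checked out (the base), the developer's version, and the latest repository version. This is a deliberately simplified, whole-file illustration with hypothetical names; real SCM tools generally merge at a finer granularity, such as individual lines, and report conflicts only where the same region changed on both sides.

```python
def update_file(base: str, mine: str, theirs: str):
    """Decide what an update does to one file.

    base   -- contents when the file was checked out
    mine   -- contents in the working copy now
    theirs -- latest contents in the repository

    Returns (status, contents); contents is None when a person must decide.
    """
    if theirs == base:
        # Nobody else changed the file, so local work is kept as it is.
        return ("no upstream change", mine)
    if mine == base:
        # Only other people changed the file, so take their version.
        return ("updated", theirs)
    if mine == theirs:
        # Both sides made exactly the same change.
        return ("same change", mine)
    # Both sides changed the file differently.
    return ("conflict", None)
```

Whatever the outcome, the developer still has to rebuild and retest, because changes that merge cleanly can still fail to work together.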
Finally, the developer has resolved all these changes, added all his new files, tested a version of the product created from his working copy, and is now ready to let other people in the project see his changes. This happens by committing the changes to the repository. It's helpful if you commit related changes all together, along with a descriptive comment about what the changes were for.
The SCM tool can require various checks to pass before it accepts the changes. For instance, is there a bug (possibly a required one) associated with these changes? Were the unit tests run, and did they behave as expected? Have the changes been reviewed or checked for security or copyright problems?
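These checks can be pictured as a gate that every proposed commit has to pass. The sketch below is hypothetical: the ProposedCommit fields, the rejection_reasons function, and the wording of the messages are all invented for illustration, and real SCM tools expose such checks through their own hook or trigger mechanisms.

```python
from dataclasses import dataclass


@dataclass
class ProposedCommit:
    comment: str
    bug_id: str = ""            # empty string means no bug was given
    tests_passed: bool = False  # were the unit tests run, and did they pass?
    reviewed: bool = False      # has someone reviewed the change?


def rejection_reasons(commit: ProposedCommit, require_bug: bool = True) -> list:
    """Return the reasons to refuse the commit; an empty list means accept."""
    reasons = []
    if require_bug and not commit.bug_id:
        reasons.append("no bug associated with the change")
    if not commit.tests_passed:
        reasons.append("unit tests did not run or did not pass")
    if not commit.reviewed:
        reasons.append("change has not been reviewed")
    return reasons
```

Returning every reason at once, rather than stopping at the first failure, lets the developer fix all the problems in one pass.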
Last, when the changes are accepted by the SCM tool, some notification (such as an email) is sent to the group, describing the changes and who made them. A change log may also be updated. If the files are tagged, then information about the tag should appear in the change log as well.
Figure 4-1 shows a centralized repository being used by three users: Alice, Bert, and Cuthbert. Alice is checking out her own working copy of some of the files in the repository. Bert is updating his working copy, merging in the changes that other people have made to the files in the repository. Cuthbert is committing the changes to the files that he has made in his working copy to the repository, thus making them available to other people.
Using a distributed SCM tool is similar to the process just described, except that there are now many repositories. In addition to the usual checkout, update, and commit operations on a repository, there are equivalents for repositories themselves, at the next level of abstraction. You can:
Create your repository by copying one of the existing ones, which is similar to checking out a working copy
Merge in changes from another repository, which is similar to updating a working copy
Merge your changes into another repository, which is similar to committing changes from a working copy
Figure 4-2 shows distributed repositories being used in the same way as shown in Figure 4-1 for centralized repositories. One way to think about distributed repositories is that each person has her own repository on her machine, and she can commit files to it while disconnected from a network. Then when she is reconnected to a network, she can synchronize her repository with the other repositories.
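The three repository-level operations can be modeled with a toy class. This is only a sketch under strong assumptions: a repository is reduced to an ordered list of change identifiers, merging is reduced to adding the changes that are missing, and the names Repo, clone, pull, and push are illustrative rather than the commands of any specific distributed SCM tool.

```python
class Repo:
    """A toy repository: just an ordered list of change identifiers."""

    def __init__(self, changes=None):
        self.changes = list(changes or [])

    def clone(self):
        # Create a new repository by copying an existing one.
        return Repo(self.changes)

    def commit(self, change_id):
        # Record a change locally; no network or other repository is involved.
        self.changes.append(change_id)

    def pull(self, other):
        # Merge in the changes from another repository that are missing here.
        for change_id in other.changes:
            if change_id not in self.changes:
                self.changes.append(change_id)

    def push(self, other):
        # Merge this repository's changes into another repository.
        other.pull(self)


shared = Repo(["c1"])
alice = shared.clone()   # Alice gets her own repository
alice.commit("a1")       # she commits while disconnected
shared.commit("c2")      # meanwhile, someone else's change arrives
alice.pull(shared)       # reconnected: merge in the others' changes
alice.push(shared)       # then publish her own
```

After the final push, both repositories contain the same three changes, although each recorded them in a different order.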