Chapter 7. Merging

Merging is the process of combining the recent changes from several branches into a single new commit on the current branch, one that has the tips of all those branches as its parents. Most often there are only two branches involved, but in fact, there can be any number; if there are more than two, it is called an “octopus merge.” When there are only two branches, the current branch is called “our” side of the merge, while the other branch is called “their” side. Since the octopus merge is unusual, we will generally assume a two-branch merge in this discussion.

We described how Git may start a merge for you as part of git pull (see Pulling), but you can also perform merges explicitly. Here’s a typical scenario: you’re working on a software project, and you have an idea for a new feature, but you don’t want your experimental work on that feature to disturb your main development. So you create a branch named feature to contain the work:

$ git checkout -b feature
Switched to a new branch 'feature'
(explore brilliant idea...)

When you need to go back to work on the main part of your project, you commit your work on the feature branch and switch back to master (or whichever branch you need to work on):

$ git commit -am "must save brilliant thoughts"
[feature c6dbf36e] must save brilliant thoughts
 0 files changed
 create mode 100644 effulgent.c
 create mode 100644 epiphany.h
$ git checkout master
Switched to branch 'master'
(perform mundane chores...)

You continue like this for some time. Eventually, if you decide you don’t like your feature idea, you can discard the work by deleting the branch with git branch -D feature. If you decide to keep it, however, at some point you’ll want to incorporate it into the main project code, and you do this with a merge:

$ git checkout master
Switched to branch 'master'
$ git merge feature
Auto-merging main.c
Merge made by the 'recursive' strategy.
 effulgent.c | 452 ++++++++++++++++++++++++
 epiphany.h  |  45 ++++++++++
 main.c      |  18 ++--
 3 files changed, 506 insertions(+), 9 deletions(-)
 create mode 100644 effulgent.c
 create mode 100644 epiphany.h

Warning

It’s best to have all your work committed before running git merge; that is, git status should show no outstanding changes or untracked files. Backing out of a merge to your initial state may prove difficult otherwise. You can use git stash as a quick way to save working changes and restore them later (see git stash).

This merge was simple. You had added the files effulgent.c and epiphany.h on the feature branch, and they did not exist on master, so Git just added them. You had made minor changes to main.c on both branches, but those changes did not conflict, so Git combined them automatically and committed its merged version. The ASCII graph with filenames on the left is called a “diffstat”; it is a summary of the changes made by this commit. The lines of plus and minus signs represent the relative number of line additions (“insertions”) and deletions made in the corresponding file.

Both aspects of merging have occurred here: content and structure. First, Git combined the content of both branches by adding and merging changes to files; then, it recorded the fact of the merge structurally by creating a merge commit tying both branches together in the commit graph. This indicates in the history those commits whose contents were combined to produce the new one, by making them its parents. A “merge commit” is defined simply as a commit having more than one parent.
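
The structural side can be checked directly. The following is a minimal sketch, assuming a POSIX shell with git on the PATH; the scratch repository, the placeholder identity, and the file chores.txt are invented for illustration. It builds two diverged branches, merges them, and counts the parent lines recorded in the resulting commit object:

```shell
set -e
repo=$(mktemp -d)                           # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master     # name the first branch "master" whatever the local default is
git config user.name "Example Author"       # placeholder identity
git config user.email "author@example.com"

echo base > main.c
git add main.c
git commit -q -m "initial"

git checkout -q -b feature
echo idea > effulgent.c
git add effulgent.c
git commit -q -m "feature work"

git checkout -q master
echo chore > chores.txt                     # hypothetical file, so that master diverges from feature
git add chores.txt
git commit -q -m "mundane chores"

git merge -q --no-edit feature              # both sides have new commits, so this makes a merge commit

# A commit object lists one "parent" header per parent.
parent_count=$(git cat-file -p HEAD | grep -c '^parent ')
echo "parents of HEAD: $parent_count"
```

Because both branches gained a commit after they diverged, the merge cannot be a fast-forward, and HEAD ends up with two parents.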

You can continue this process as long as you like, working on the feature branch separately and periodically merging its work into master. If you do, you will probably also need to merge the other way as well, updating the feature branch with the latest work on master, so that you’re not working on outdated code; for this, just do the reverse: switch to feature and run git merge master.

When your new feature is fully incorporated into the main development, and you no longer need to work on it separately, you can delete the feature branch with git branch -d feature; as discussed in Deleting a Branch, Git will complain if you haven’t fully merged feature into master, to prevent you from accidentally losing work. Deleting feature doesn’t delete any of its content or history; it just removes the name “feature” as a reference point (a place at which you intend to add independent commits later on), since you no longer need it. You can reuse “feature” as a branch name in the future if you want, and there will be no collision with the earlier usage; in fact, aside from possibly in commit messages or reflogs, once you delete a branch, there is no record in the repository proper that it ever existed!

Branch names serve to indicate what parts of the object database are still of interest, and where development is still occurring; if a branch’s content is merged into other branches, and you no longer need a line of development with that name, then you can just delete it. Similarly, looking back in the commit graph, it is not possible to know on which branch name a particular commit was made; even in a linear history, the current branch name might have been changed at some point in the past. It might be interesting or useful to know this in some situations, but Git just doesn’t keep this information. Git branches are ephemeral in a sense, just tools for building the commit graph, which is what matters.
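
To see that deleting a merged branch removes only the name, here is a small sketch under the same assumptions as before (POSIX shell, git on the PATH, a scratch repository with a placeholder identity). It remembers the commit the branch pointed to, deletes the branch, and then asks whether that commit is still part of master:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master     # first branch is "master" regardless of local defaults
git config user.name "Example Author"       # placeholder identity
git config user.email "author@example.com"

echo base > main.c
git add main.c
git commit -q -m "initial"

git checkout -q -b feature
echo idea > effulgent.c
git add effulgent.c
git commit -q -m "feature work"
feature_tip=$(git rev-parse feature)        # remember the commit the name pointed to

git checkout -q master
git merge -q --no-edit feature
git branch -d feature >/dev/null            # -d succeeds because feature is fully merged

# The name is gone...
if git rev-parse --verify -q refs/heads/feature >/dev/null; then
    branch_exists=yes
else
    branch_exists=no
fi
# ...but the commit it named is still reachable from master.
if git merge-base --is-ancestor "$feature_tip" master; then
    still_reachable=yes
else
    still_reachable=no
fi
echo "branch exists: $branch_exists, commit reachable: $still_reachable"
```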

Merge Conflicts

The previous merge went smoothly, but what if you had made changes in the two branches that Git could not combine on its own? These are called “merge conflicts,” and Git would stop and ask you to resolve them before committing. This process can range from simple to very complex, depending on the content and changes involved; fortunately, there are tools available to help, both built into Git and external ones that Git can work with. Let’s walk through a simple example. Suppose you have a file moebius with the following contents:

hello
doctor
name
continue
yesterday
tomorrow

and you make commits on branches chandra and floyd changing it thus:

chandra      floyd
-------      -----
hello        hello
doctor       doctor
Jupiter      Europa
dolphin      monoliths
yesterday    yesterday
tomorrow     tomorrow
You have changed the same two lines on each side in different ways, and Git’s line-oriented merge approach will not attempt to guess at your intent or combine the lines (e.g., form a single line dolphin monoliths, interesting as those might be); it will signal a merge conflict:

$ git checkout chandra
Switched to branch 'chandra'
$ git merge floyd
Auto-merging moebius
CONFLICT (content): Merge conflict in moebius
Automatic merge failed; fix conflicts and then commit the result.

The phrase CONFLICT (content) indicates that the conflict is due to irreconcilable content changes in this file. Git might indicate other reasons as well, such as an add/add conflict, in which the same filename is added to both branches but with different contents.

Tip

If you start a merge and then want to cancel it—perhaps you weren’t expecting so many conflicts and you don’t have time to deal with them now—just use git merge --abort.
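
The moebius scenario can be reproduced in a scratch repository. This is a sketch, assuming a POSIX shell and git on the PATH; the identity is a placeholder, and the branch and file names follow the example above. It triggers the conflict, lists the unmerged paths, and then backs out with git merge --abort:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"       # placeholder identity
git config user.email "author@example.com"

printf '%s\n' hello doctor name continue yesterday tomorrow > moebius
git add moebius
git commit -q -m "star child"

git checkout -q -b chandra
printf '%s\n' hello doctor Jupiter dolphin yesterday tomorrow > moebius
git commit -q -am "thanks for all the fish!"

git checkout -q -b floyd HEAD~1             # start floyd from the shared first commit
printf '%s\n' hello doctor Europa monoliths yesterday tomorrow > moebius
git commit -q -am "back in black"

git checkout -q chandra
# git merge exits nonzero on conflict, so test it rather than letting set -e stop the script.
if git merge floyd >/dev/null 2>&1; then
    merge_status=clean
else
    merge_status=conflict
fi
conflicted=$(git diff --name-only --diff-filter=U)   # paths still unmerged
echo "merge: $merge_status, unmerged: $conflicted"

git merge --abort                           # back to the state before the merge
after_abort=$(git status --porcelain)       # empty when nothing is outstanding
```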

To get an overview of the merge state, use git status. Any changes Git resolved on its own will be shown as already staged for commit, and there is a separate section at the end for merge conflicts:

$ git status
...
# Unmerged paths:
#   (use "git add <file>..." to mark resolution)
#
#       both modified:      moebius

“Unmerged paths” are files with conflicts Git could not resolve. To find out what went wrong in detail, use git diff. This command not only shows the differences between various combinations of working tree, index, and commits; it also has a special mode for helping with merge conflicts:

$ git diff
diff --cc moebius
index 1fcbe134,08dbe186..00000000
--- a/moebius
+++ b/moebius
@@@ -1,6 -1,6 +1,11 @@@
  hello
  doctor
++<<<<<<< ours
 +Jupiter
 +dolphin
++=======
+ Europa
+ monoliths
++>>>>>>> theirs
  yesterday
  tomorrow

This display shows the alternative versions of the section in conflict, separated by ======= and marked with the corresponding branch: ours (the current branch) and theirs (in this case floyd, the branch we are merging into ours). As usual, git diff shows differences between the working tree and the index, which in this case are the conflicts yet to be resolved; changes already made and staged are not shown. You can use git diff --staged to see those; add --stat for an overview. You’ll find that Git has updated the working file with similar markup:

hello
doctor
<<<<<<< ours
Jupiter
dolphin
=======
Europa
monoliths
>>>>>>> theirs
yesterday
tomorrow

Once you’ve edited the file to resolve the conflict, use git add to stage your fixed version for commit and remove it from the list of conflicted paths (if the resolution is actually to delete the file, use git rm). Once you’ve addressed all the conflicts and git status no longer reports any unmerged paths, you can use git commit to complete the merge. Git will present a commit message containing details about this merge including its branches and conflicts, which you can edit as you see fit; in this case:

Merge branch 'floyd' into chandra

Conflicts:
        moebius

and you can see you’ve created a “merge commit” having more than one parent:

$ git log --graph --oneline --decorate
*   aeba9d85  (HEAD, chandra) Merge branch 'floyd' in…
|\
| * a5374035  (floyd) back in black
* | e355785d  thanks for all the fish!
|/
* 50769fc9    star child

The other branch, floyd, has stayed where it was, while the current branch, chandra, has advanced one commit from e355785d to aeba9d85, and that last commit unifies the two branches. A new commit on floyd will cause them to diverge again, and you can merge again in the future if you need to (in either direction). Note that at this point, a simple git log will show commits from both branches, not just those made while on chandra:

$ git log --oneline --decorate
aeba9d85 (HEAD, chandra) Merge branch 'floyd' into ch…
a5374035 (floyd) back in black
e355785d thanks for all the fish!
50769fc9 star child

You might have expected to see only commits aeba9d85, e355785d, and 50769fc9. This presentation may seem odd at first, but it’s just a different way of looking at the notion of “branch.” A Git branch is defined as the set of all commits reachable in the commit graph from the branch tip; think of it as all commits that contributed content to the tip commit (which, after a merge, includes all commits prior to that one on both branches).

Tip

In simple cases, you may get what you think of as the history of “this branch” with git log --first-parent, which just follows the first parent of merge commits rather than all of them. However, this isn’t guaranteed, and in more complex histories it won’t mean much. Since Git allows nonlinear history, a simple list of commits is often not very useful, and you need visualization tools to help you interpret it (see Visual Tools).
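
The difference between the two views of history can be counted. In this sketch (same assumptions: POSIX shell, git on the PATH, scratch repository, placeholder identity) the conflict is resolved with an arbitrary hand-made resolution, the merge is committed, and the commits reachable from chandra are counted with and without --first-parent:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"       # placeholder identity
git config user.email "author@example.com"

printf '%s\n' hello doctor name continue yesterday tomorrow > moebius
git add moebius
git commit -q -m "star child"
git checkout -q -b chandra
printf '%s\n' hello doctor Jupiter dolphin yesterday tomorrow > moebius
git commit -q -am "thanks for all the fish!"
git checkout -q -b floyd HEAD~1
printf '%s\n' hello doctor Europa monoliths yesterday tomorrow > moebius
git commit -q -am "back in black"

git checkout -q chandra
git merge floyd >/dev/null 2>&1 || true     # stops with a conflict in moebius

# Resolve by hand (this particular mix is just an example), stage, and conclude the merge.
printf '%s\n' hello doctor Jupiter monoliths yesterday tomorrow > moebius
git add moebius
git commit -q --no-edit                     # keep the message Git prepared for the merge

all_commits=$(git rev-list --count HEAD)                    # everything reachable from chandra
first_parent=$(git rev-list --count --first-parent HEAD)    # follow only first parents
echo "reachable: $all_commits, first-parent only: $first_parent"
```

The first count includes the commit made on floyd; the second leaves it out, matching the three commits you might have expected to see.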

Resolving Merge Conflicts

Git doesn’t have built-in tools to interactively address merge conflicts directly; that’s what external merge tools are for, which we’ll consider shortly in Merge Tools. However, here are some tips for use in simple cases.

  1. git log -p --merge shows all commits containing changes relevant to any unmerged files, on either branch, together with their diffs. This can help you identify the changes in the history that led to the conflicts.
  2. If you want to discard all the changes from one side of the merge, use git checkout --{ours,theirs} file to update the working file with the copy from the current or other branch, followed by git add file to stage the change and mark the conflict as resolved.
  3. Having done that, if you would like to apply some of the changes from the opposite side, use git checkout -p branch file. This starts an interactive loop that allows you to selectively apply or edit differing sections (see the “patch” item in the “Interactive Mode” section of git-add(1) for details).

In our example, if you decided to keep your version as a default, but selectively apply changes from the other branch, you could do:

$ git checkout --ours moebius
$ git add moebius
$ git checkout -p floyd moebius
diff --git b/moebius a/moebius
index 1fcbe134..08dbe186 100644
--- b/moebius
+++ a/moebius
@@ -1,6 +1,6 @@
hello
doctor
-Jupiter
-dolphin
+Europa
+monoliths
yesterday
tomorrow
Apply this hunk to index and worktree [y,n,q,a,d,/,e,…
y - apply this hunk to index and worktree
n - do not apply this hunk to index and worktree
q - quit; do not apply this hunk nor any of the remai…
a - apply this hunk and all later hunks in the file
...
$ git add moebius
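
Because git checkout -p is interactive, it does not lend itself to a script, but the take-one-side step from tip 2 does. The sketch below (POSIX shell, git on the PATH, scratch repository, placeholder identity) resolves the moebius conflict entirely in favor of the other branch:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"       # placeholder identity
git config user.email "author@example.com"

printf '%s\n' hello doctor name continue yesterday tomorrow > moebius
git add moebius
git commit -q -m "star child"
git checkout -q -b chandra
printf '%s\n' hello doctor Jupiter dolphin yesterday tomorrow > moebius
git commit -q -am "thanks for all the fish!"
git checkout -q -b floyd HEAD~1
printf '%s\n' hello doctor Europa monoliths yesterday tomorrow > moebius
git commit -q -am "back in black"

git checkout -q chandra
git merge floyd >/dev/null 2>&1 || true     # conflict in moebius

git checkout --theirs moebius               # working file := floyd's version
git add moebius                             # stage it and mark the conflict resolved
git commit -q --no-edit                     # conclude the merge

third_line=$(sed -n 3p moebius)
echo "line 3 after resolution: $third_line"
```

Using --ours instead would keep chandra's version of the file.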

Notes

  1. If the current branch is already contained in the other (that is, HEAD is an ancestor of the other branch tip), then git merge will just move the current branch up to meet the other in a “fast-forward” update, and not make a new commit at all. You can force a merge commit anyway with git merge --no-ff (“no fast-forward”), if you have some reason to do so.
  2. If the converse is true, and the other branch is already contained in this one, then Git will simply say that the current branch is “already up-to-date,” and do nothing. The goal of the merge is to incorporate into the current branch any changes on the other branch since the two diverged—but they haven’t diverged.
  3. If you want to use Git’s content-merging and conflict-resolution machinery, but do not want to create a merge commit, use git merge --squash. This operates like a normal merge with regard to content, but it stops before committing, and the commit you then make with git commit is just on the current branch (that is, has a single parent and does not connect to the other branch in the commit graph).
  4. You can use git merge -m to specify a commit message just as with git commit, although remember that Git provides useful information in its supplied message, which you may prefer to start with and edit instead (which happens by default).
  5. Use git merge --no-commit to stop Git from committing when an automatic merge succeeds, in case you want to have a look first. This isn’t strictly necessary, since you could always abort the commit by giving a blank commit message, or make any changes you want afterward and use git commit --amend.
  6. Git records that a merge is in progress by setting the ref MERGE_HEAD to point to the other branch; this is how it knows to make a merge commit (as opposed to a simple commit on the current branch) even when there are intervening commands while you resolve conflicts.
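
Notes 1 and 3 can be compared side by side. This is a sketch under the usual assumptions (POSIX shell, git on the PATH, scratch repository, placeholder identity, default merge configuration). It merges the same branch three ways, resetting in between, and counts the parents of the resulting commit each time:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master     # first branch is "master" regardless of local defaults
git config user.name "Example Author"       # placeholder identity
git config user.email "author@example.com"

echo base > main.c
git add main.c
git commit -q -m "initial"
start=$(git rev-parse HEAD)

git checkout -q -b feature
echo idea > effulgent.c
git add effulgent.c
git commit -q -m "feature work"
git checkout -q master                      # master is now strictly behind feature

count_parents() { git cat-file -p HEAD | grep -c '^parent '; }

git merge -q feature                        # fast-forward: master just moves to feature's tip
ff_parents=$(count_parents)
ff_tip=$(git rev-parse HEAD)

git reset -q --hard "$start"
git merge -q --no-ff --no-edit feature      # force a real merge commit
noff_parents=$(count_parents)

git reset -q --hard "$start"
git merge --squash feature >/dev/null       # stages the combined content, does not commit
git commit -q -m "squashed feature work"    # ordinary single-parent commit
squash_parents=$(count_parents)

echo "ff: $ff_parents, no-ff: $noff_parents, squash: $squash_parents"
```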

Details on Merging

When merging, Git considers the changes that have occurred on the branches in question since they last diverged. In the previous example, the branches chandra and floyd last diverged at commit 50769fc9, so the changes to be reconciled were those in commits e355785d and a5374035. These branches might have diverged and been merged several times previously, but you will only be asked to deal with new changes since that last happened. Some other version control systems do not have this feature, so that merging branches repeatedly is a problem: you end up resolving the same conflicts over and over.

More precisely, when merging several branches, Git seeks a “merge base”: a recent common ancestor of all the branch tips, to use as a reference point for arbitrating changes. Although in complicated situations there might be multiple possibilities for a merge base (see git-merge-base(1)), in the common case of our example, there is a single obvious choice, which Git finds automatically. Since our merge now involves three commits—two branch tips and the merge base—it is called a “three-way merge.”
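
You can ask Git for the merge base yourself with git merge-base. A minimal sketch, assuming a POSIX shell, git on the PATH, and a scratch repository with a placeholder identity, rebuilding the chandra and floyd branches from the example:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"       # placeholder identity
git config user.email "author@example.com"

printf '%s\n' hello doctor name continue yesterday tomorrow > moebius
git add moebius
git commit -q -m "star child"
root=$(git rev-parse HEAD)                  # the commit both branches will start from

git checkout -q -b chandra
printf '%s\n' hello doctor Jupiter dolphin yesterday tomorrow > moebius
git commit -q -am "thanks for all the fish!"
git checkout -q -b floyd "$root"
printf '%s\n' hello doctor Europa monoliths yesterday tomorrow > moebius
git commit -q -am "back in black"

base=$(git merge-base chandra floyd)        # common ancestor Git would use for the merge
echo "merge base: $base"
```

Here the merge base is the “star child” commit, the point at which the two branches diverged.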

Recall that git status showed our conflicts, the “unmerged paths.” Where does it keep this information? There are conflict markers in the working files, but it would be slow to read all the files for this purpose, and in any case that wouldn’t help for a modify/delete conflict. The answer demonstrates yet again the usefulness of the index. When there is a merge conflict for a file, Git simply stores not one version of the file in the index, but three: those belonging to the merge base and to the current and “other” branches, numbered 1, 2, and 3, respectively. The number is called the “stage” of the file and is a distinct property of an index entry along with the filename, mode bits, and so on. In fact, there is a fourth stage, 0, which is the usual state of an entry that has no associated merge conflict. We can see this using git ls-files, which shows the contents of the index. Prior to the merge, we see:

$ git ls-files -s --abbrev
100644 1fcbe134 0       moebius

The fields here are the mode bits, ID of the blob object holding the file’s contents, the stage number, and the filename. After running git merge floyd and getting a conflict for this file, we see something very different (using -u instead of -s would show only unmerged paths; here we have only one file anyway):

$ git ls-files -s --abbrev
100644 30b7cdab 1       moebius
100644 1fcbe134 2       moebius
100644 08dbe186 3       moebius

Note that the ID of stage 2 matches what was previously stage 0, since stage 2 is the version on the current branch. You can use git cat-file to see the contents of the different stages, here the stage 1 merge-base version:

$ git cat-file -p 30b7cdab
hello
doctor
name
continue
yesterday
tomorrow

You can refer to a specific stage of a file with the syntax :n:path; so git show :1:moebius is an easier equivalent for this.
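
The three stages can be inspected in a script as well. This sketch (POSIX shell, git on the PATH, scratch repository, placeholder identity) recreates the conflict, counts the unmerged index entries, and reads the third line of each stage with the :n:path syntax:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"       # placeholder identity
git config user.email "author@example.com"

printf '%s\n' hello doctor name continue yesterday tomorrow > moebius
git add moebius
git commit -q -m "star child"
git checkout -q -b chandra
printf '%s\n' hello doctor Jupiter dolphin yesterday tomorrow > moebius
git commit -q -am "thanks for all the fish!"
git checkout -q -b floyd HEAD~1
printf '%s\n' hello doctor Europa monoliths yesterday tomorrow > moebius
git commit -q -am "back in black"

git checkout -q chandra
git merge floyd >/dev/null 2>&1 || true     # leaves moebius unmerged

unmerged_entries=$(git ls-files -u | grep -c 'moebius$')   # one index entry per stage
stage1=$(git show :1:moebius | sed -n 3p)   # merge base
stage2=$(git show :2:moebius | sed -n 3p)   # current branch (ours)
stage3=$(git show :3:moebius | sed -n 3p)   # other branch (theirs)
echo "entries: $unmerged_entries; line 3 per stage: $stage1 / $stage2 / $stage3"
```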

Git records the three commits into the index in this way at the start of the merge. It then follows a set of simple rules to quickly resolve the easy cases; for example:

  • If all three stages match, reduce to a single stage 0.
  • If stage 1 matches stage 2, then reduce to a single stage 0 matching stage 3 (or vice versa): one side made a change while the other did nothing.
  • If stage 1 matches stage 2, but there is no stage 3, then remove the file: we made no change, while the other branch deleted it, so accept the other branch’s deletion.
  • If stages 1 and 2 differ, and there is no stage 3, then report a “modify/delete” conflict: we changed the file, while the other branch deleted it; the user must decide what to do.

…and so forth. Note that for matching, Git doesn’t need to fetch the actual files; it can just compare the blob object IDs already in the index, since they are hashes of the files’ contents. This is very fast; content-based addressing wins again. You can read about this process in more detail in git-read-tree(1). Any files that can’t be easily resolved this way must then actually be examined to attempt merging their contents.
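
The four rules listed above can be written out as a small decision function. This is only an illustration of those rules, not Git’s implementation: the function name resolve_stages is made up, the arguments stand in for the blob IDs at stages 1, 2, and 3, and a lone “-” is used here to mean “no entry at stage 3.” Cases the list does not cover fall through to “examine contents”:

```shell
# resolve_stages BASE OURS THEIRS
# Hypothetical helper: compares IDs only, never file contents.
# THEIRS may be "-" to model a missing stage 3; other missing stages are not modeled.
resolve_stages() {
    rs_base=$1
    rs_ours=$2
    rs_theirs=$3
    if [ "$rs_base" = "$rs_ours" ] && [ "$rs_base" = "$rs_theirs" ]; then
        echo "stage 0: $rs_ours"            # all three match
    elif [ "$rs_base" = "$rs_ours" ] && [ "$rs_theirs" = "-" ]; then
        echo "remove"                       # we did nothing, they deleted the file
    elif [ "$rs_base" = "$rs_ours" ]; then
        echo "stage 0: $rs_theirs"          # only their side changed
    elif [ "$rs_base" = "$rs_theirs" ]; then
        echo "stage 0: $rs_ours"            # only our side changed (the "vice versa" case)
    elif [ "$rs_theirs" = "-" ]; then
        echo "conflict: modify/delete"      # we changed it, they deleted it
    else
        echo "examine contents"             # both sides changed; IDs alone cannot decide
    fi
}

resolve_stages 30b7cdab 30b7cdab 30b7cdab
resolve_stages 30b7cdab 30b7cdab 08dbe186
resolve_stages 30b7cdab 30b7cdab -
resolve_stages 30b7cdab 1fcbe134 -
resolve_stages 30b7cdab 1fcbe134 08dbe186   # the moebius conflict from the example
```

The last call uses the three IDs from the ls-files listing: all differ, so the file’s contents have to be merged, which is where the conflict was found.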

Merge Tools

Merging can be a complex job, with you staring at scores of conflicting sections of source code changes from yourself and other people, and trying to combine them into a single working whole. There are tools available that go far beyond the simple text output of git diff in helping you to visualize and resolve such conflicts. Git integrates smoothly with these external “merge tools,” to help you get the job done more easily. Git supports over a dozen free and commercial merge tools out of the box, including araxis, emerge, opendiff, kdiff3, and gvimdiff. It also defines an interface with which you can use most any such tool, usually requiring only a simple wrapper script to connect it to Git.

We can’t delve into the details of the individual merge tools; many of them are complex programs in their own right and would require another small book each to describe. Here, we’ll just describe how they work with Git generally.

The driver for using a merge tool is git mergetool. Once invoked, this command runs over all the files with merge conflicts, asking for each if you want to invoke the selected merge tool on the file. The default merge tool is opendiff, but you can set a different default with the merge.tool Git configuration variable. The tool will usually present you with a view of the “ours” and “theirs” versions of the file, along with the merge base, and provide ways to move from one change or conflict to the next, select which side’s change to use (or combine them), etc. When you quit the merge tool indicating success, Git will add your merged version to the index (thus marking this conflict as resolved), and go on to the next unmerged file.

Notes

  • The -y switch to git mergetool tells it to run the tool on all unmerged files, without pausing to prompt yes or no for each one.
  • git mergetool leaves a backup foo.orig for each file foo it processes, since you might have modified it yourself before running the merge tool. You can set mergetool.keepBackup no to turn off this feature. Actually, Git still makes the backup; it just deletes it when the merge tool exits successfully, so that the backup is still there in case the tool were to crash.
  • If a merge tool exits unexpectedly or doesn’t work properly, you may see files like these left behind (for the file main.c):

    main.c.BACKUP.62981.c
    main.c.BASE.62981.c
    main.c.LOCAL.62981.c
    main.c.REMOTE.62981.c

    These are the temporary files that Git uses to pass the various file versions to the merge tool.

Custom Merge Tools

If you want to use a merge tool not directly supported by Git, it need only obey some simple conventions; usually, you’ll write a glue script to accommodate them. Git passes four filenames to the tool as environment variables:

LOCAL
    The version from the current branch
REMOTE
    The version from the other branch
BASE
    The version from the merge base (common ancestor)
MERGED
    File to which the merged version should be written

The tool should exit with a code of zero to indicate that the user is happy with the merged version, saved to the filename in the MERGED environment variable. A nonzero exit code means that Git should ignore that file and not mark this conflict resolved. To define a new Git merge tool named “foo” with your own program named newtool:

[mergetool "foo"]
        cmd = newtool $LOCAL $REMOTE $MERGED $BASE
        trustExitCode = true

This shows the files being passed on the command line to newtool; if your program reads the environment variables itself, then of course that’s not required. The trustExitCode setting means that Git will interpret the tool’s exit code as previously described; if this setting is false, Git will prompt the user for what to do anyway.
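
The same definition can be made from the command line with git config rather than by editing the configuration file. A sketch, assuming a POSIX shell and git on the PATH; “foo” and newtool are the placeholder names from the example above, and the settings are written to a scratch repository’s own configuration:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Single quotes keep the shell from expanding $LOCAL and friends;
# they are expanded later, when git mergetool runs the command.
git config mergetool.foo.cmd 'newtool $LOCAL $REMOTE $MERGED $BASE'
git config mergetool.foo.trustExitCode true
git config merge.tool foo                   # make "foo" the tool git mergetool picks by default

configured_cmd=$(git config --get mergetool.foo.cmd)
echo "merge.tool = $(git config --get merge.tool)"
echo "cmd        = $configured_cmd"
```

Add --global to the git config commands if you want the tool available in all your repositories.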

Merge Strategies

Git has a number of approaches it can take to automatically merge files that have been changed by both sides of a merge; that is to say, exactly what it does in analyzing text to determine the boundaries of changed blocks, when blocks have been moved, when changes can be safely merged, and when they should be punted to the user. These approaches are called “merge strategies,” and each may in turn have various options; Git can even be extended with new strategies by writing custom “merge drivers,” without having to touch Git proper.

The built-in merge strategies are described in git-merge(1). The many options are quite technical and involved, and Git’s default choice of strategy is usually sufficient; we will not cover them in depth here. However, here are a few tips involving merge strategies that are generally useful:

git merge -s ours
    The ours strategy is simple: it discards all changes from the other branch. This leaves the content on your branch unchanged, and when you next merge from the other branch, Git will only consider changes made from this point forward. You might use this to retain the history of a branch, without incorporating its effects. (This strategy works with more than two branches as well.)
git merge -s recursive -X ours
    This is the ours option to the recursive strategy, not to be confused with the ours strategy. The recursive strategy is often the default, and so you might not have to use -s, but we’ll be explicit here. This option directs Git to resolve conflicting changes in favor of the current branch. This is different from the ours strategy, in that nonconflicting changes can still be resolved in favor of either branch. You can use -X theirs as well, to resolve in favor of the other branch instead.
ignore-space-change, ignore-all-space, ignore-space-at-eol
    These options to the recursive strategy automatically resolve conflicts differing only in certain types of whitespace; see git-merge(1) for details.
merge.verbosity
    This configuration variable (or the GIT_MERGE_VERBOSITY environment variable, which takes precedence), holds a natural number indicating the level of information printed by the recursive strategy. Zero prints only a final error message on conflict, 2 is the default, and 5 and above show debugging information.
The “octopus” strategy
    The octopus strategy can merge any number of branches, but only if all changes can be resolved automatically. If not, the strategy will abort in the middle of the merge attempt, possibly leaving your index and working tree in a not terribly meaningful state. Unlike when merging two branches, git merge --abort doesn’t work in this case (it says no merge is in progress); this may be a limitation that will be addressed in future versions of Git. You can use git reset to discard the index changes, adding --hard to reset the working tree as well, if you had no uncommitted changes to lose. “Octopus” is the default strategy when merging more than two branches, e.g., git merge bert ernie oscar.
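
The difference between the ours strategy and the ours option shows up as soon as the other branch carries a nonconflicting change. In this sketch (POSIX shell, git on the PATH, scratch repository, placeholder identity) floyd also adds a file, europa.txt, which is invented for the illustration. The option is given as plain -X ours, relying on whatever two-branch strategy your Git uses by default:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"       # placeholder identity
git config user.email "author@example.com"

printf '%s\n' hello doctor name continue yesterday tomorrow > moebius
git add moebius
git commit -q -m "star child"
git checkout -q -b chandra
printf '%s\n' hello doctor Jupiter dolphin yesterday tomorrow > moebius
git commit -q -am "thanks for all the fish!"
git checkout -q -b floyd HEAD~1
printf '%s\n' hello doctor Europa monoliths yesterday tomorrow > moebius
echo "a nonconflicting addition" > europa.txt   # hypothetical extra file on floyd
git add europa.txt
git commit -q -am "back in black"
git checkout -q chandra

# Option: conflicts go our way, but floyd's nonconflicting changes still come in.
git merge -q --no-edit -X ours floyd
x_ours_line=$(sed -n 3p moebius)
if [ -f europa.txt ]; then x_ours_file=present; else x_ours_file=absent; fi

git reset -q --hard HEAD~1                  # undo that merge (first parent = old chandra tip)

# Strategy: nothing from floyd is taken; only the merge is recorded.
git merge -q --no-edit -s ours floyd
s_ours_line=$(sed -n 3p moebius)
if [ -f europa.txt ]; then s_ours_file=present; else s_ours_file=absent; fi

echo "-X ours: line 3 = $x_ours_line, europa.txt $x_ours_file"
echo "-s ours: line 3 = $s_ours_line, europa.txt $s_ours_file"
```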

Why the Octopus?

An octopus merge is generally used to tie together several topic branches with the master branch in preparation for a new release of a project, bringing in all their separate contributions. The individual branches should already have been reconciled with the master and have no conflicts amongst them, or else as mentioned, the octopus merge will not work. The octopus merge does not have any inherent advantage over simply merging all the topic branches into the master pairwise and reconciling the conflicts there; it accomplishes the same goal and incorporates the same history. However, with a large number of branches, it can make for a cleaner and more easily understood commit graph, and so some people prefer it. See Figure 7-1.

Figure 7-1. Pairwise and octopus merges

Reusing Previous Merge Decisions

Git can remember merge conflict resolutions you made in the past, and reuse them automatically if you run into similar conflicts later. This feature goes by the name git rerere, for “reuse recorded resolution.” This is useful if you’re working on a particularly difficult merge. You may abort a merge and retry it in various ways, but having resolved some conflicts in the meantime, this feature can remember and reapply those decisions. It can also be helpful if you’re rewriting repository history, or in maintaining branches on which you end up resolving the same conflicts repeatedly until the branch’s content can finally be merged appropriately.

Setting rerere.enabled in a repository turns on the feature, which is then automatically used by both git merge and git rebase. Since this is an advanced feature, we just note its existence here and refer the reader to git-rerere(1) for details.
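
As a small taste of the feature, the sketch below (POSIX shell, git on the PATH, scratch repository, placeholder identity) enables rerere, resolves the moebius conflict once, throws the merge away, and repeats it. The resolution text is arbitrary. Note that the repeated merge still stops and reports a conflict; rerere fills in the working file, and staging and committing remain up to you:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"       # placeholder identity
git config user.email "author@example.com"
git config rerere.enabled true              # turn the feature on for this repository

printf '%s\n' hello doctor name continue yesterday tomorrow > moebius
git add moebius
git commit -q -m "star child"
git checkout -q -b chandra
printf '%s\n' hello doctor Jupiter dolphin yesterday tomorrow > moebius
git commit -q -am "thanks for all the fish!"
git checkout -q -b floyd HEAD~1
printf '%s\n' hello doctor Europa monoliths yesterday tomorrow > moebius
git commit -q -am "back in black"
git checkout -q chandra

git merge floyd >/dev/null 2>&1 || true     # conflict; rerere notes the conflicted state
printf '%s\n' hello doctor Jupiter monoliths yesterday tomorrow > moebius
git add moebius
git commit -q --no-edit                     # rerere records how the conflict was resolved
resolved=$(cat moebius)

git reset -q --hard HEAD~1                  # discard the merge commit
git merge floyd >/dev/null 2>&1 || true     # same conflict; rerere reapplies the resolution
replayed=$(cat moebius)

if [ "$replayed" = "$resolved" ]; then
    echo "resolution was reused"
fi
```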
