Chapter 13. Miscellaneous

In this chapter, we cover some Git commands and topics that don’t fit easily into any of the foregoing discussions.

git cherry-pick

git cherry-pick allows you to apply the changeset of a given commit as a new commit on the current branch, preserving the original author information and commit message. As a very general rule, it’s best to avoid this in favor of factoring your work so that a commit appears in one place and is incorporated in multiple branches via merging instead, but that isn’t always possible or practical. Any arrangement of branches and merge discipline favors a certain flow of changes, and sometimes you need to buck that flow. For example, you might discover that a bug fix applied to a certain version actually needs to be applied to an earlier one as well, and merging in that direction is not desirable. Or, suppose you have your own repository for holding local changes made to your Unix distribution’s derivative of some open source project, such as Apache or OpenLDAP as modified and repackaged by Red Hat or Debian. If there is an upstream feature you need that the distribution does not provide (and they use Git), you can’t just merge it in, as your repository is not a clone of theirs—but you may be able to apply the relevant commits individually by cherry-picking.

The argument to git cherry-pick is a set of commits to apply, using the syntax described in Chapter 8. Some options:

--edit (-e)
Edit the commit message before committing.
-x
Append to the commit message a line indicating the original commit. Only use this if that commit is publicly available; if you’re cherry-picking from a private branch, then this information is not useful to others.
--mainline n (-m)
For merge commits, compute the changeset for the new commit relative to the nth parent of the original. This is required to cherry-pick merge commits at all, since otherwise it is not clear what set of changes should be replicated.
--no-commit (-n)
Apply the patch to the working tree and index, but do not commit. You can use this to take the commit’s changes as a starting point for further work, or to squash the effect of several cherry-picked commits into a single one.
--stdin
Take the commit list from standard input.

As with other commands that apply patches, git cherry-pick can fail if a patch does not apply cleanly; in that case it uses the merge machinery, recording conflicts in the index and working files in the usual way. It then prompts you to use --continue to proceed after resolving the conflicts, --quit to forget the operation in progress while leaving the working tree and index as they are, or --abort to cancel the whole cherry-pick and return to the original state, similar to git rebase. Newer Git versions also accept --skip to skip the current commit and carry on with the rest.
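As a minimal sketch of the basic workflow, the following script builds a throwaway repository, commits a fix on a side branch, and cherry-picks it onto the starting branch with -x. The repository location, file names, branch name (release-2), and commit messages are all hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Sketch: cherry-pick one commit from a side branch onto the starting branch.
# All names here (release-2, fix.txt, the messages) are illustrative assumptions.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "initial commit"
base=$(git symbolic-ref --short HEAD)   # name of the starting branch

# Commit a fix on a separate branch.
git checkout -q -b release-2
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "fix: handle empty input"
fix_id=$(git rev-parse HEAD)

# Replay that fix on the starting branch as a new commit;
# -x appends a line naming the original commit.
git checkout -q "$base"
git cherry-pick -x "$fix_id" >/dev/null

picked_subject=$(git log -1 --format=%s)
picked_body=$(git log -1 --format=%b)
echo "$picked_subject"
echo "$picked_body"
```

The new commit has a different ID from the original, since its parent differs, but it keeps the author and message.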

git notes

Since commits are immutable, you can’t add to a commit message once you’ve made it (and you can’t replace a commit you’ve pushed without causing woe for others). git notes provides a way to annotate commits for yourself later on while avoiding this difficulty.

The set of notes for your repository is maintained on a branch named refs/notes/commits, in the following fashion: to find the notes for a commit, Git looks up its 40-digit hex commit ID as a pathname in the tree of the current notes commit (tip of the notes/commits branch); if present, that points to a blob that contains the text of the note. When you add or remove a note, Git simply commits the corresponding change to the notes branch (so you can see the history of your notes with git log notes/commits).

Though normally used to annotate commits, notes can in fact be attached to any Git object.

git notes Subcommands

You can use the -f option generally to override a complaint, such as to replace existing notes. A missing object argument defaults to HEAD, except where noted otherwise:

git notes list [object]
List the notes for object by ID, or all notes with no object. A plain git notes invokes this subcommand.
git notes {add,append,edit} [object]
Add a note for object, or edit or append to an existing note.
git notes copy first second
Copy the note from one object to another.
git notes show [object]
Display the note for object.
git notes remove [object]
Delete the note for object.

You can specify a notes ref other than notes/commits with the --ref option; the argument is taken to be in refs/notes if unqualified. You can use this feature to have different categories of notes; perhaps notes on different subjects, or from different people (e.g., git notes --ref=bugs).

Initially, git notes seemed mostly geared toward private use; there was no explicit support for merging notes from other sources. Recent Git versions have added a git notes merge command, and this is improving; see git-notes(1) for the current status of that as well as other options.
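The storage model described above can be seen in a small experiment. This sketch adds and extends a note on HEAD, adds a second note under a separate ref, and then counts the commits on the notes branch. The note texts and the "bugs" category are made-up examples.

```shell
#!/bin/sh
# Sketch: notes are stored as commits on refs/notes/<name>.
# Note texts and the "bugs" category are illustrative assumptions.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello" > file.txt
git add file.txt
git commit -q -m "initial commit"

git notes add -m "reviewed by QA"                   # note on HEAD, in refs/notes/commits
git notes append -m "deployed to staging"           # extend the same note
git notes --ref=bugs add -m "see tracker entry 42"  # separate category: refs/notes/bugs

note_text=$(git notes show)
bug_note=$(git notes --ref=bugs show)
# Each change to the notes is one commit on the notes branch.
note_commits=$(git rev-list --count refs/notes/commits)

echo "$note_text"
echo "$bug_note"
echo "commits on the notes branch: $note_commits"
```

The commit being annotated is untouched throughout; only the notes refs move.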

git grep

git grep lets you search your repository content using regular expressions: not only the working tree, but also the index or any commit in the history without having to check it out. You can even use it outside a Git repository, as a more powerful version of the usual Unix grep command.

Combining Regular Expressions

Instead of a single regular expression, git grep can handle Boolean combinations of expressions, combined with the options --{and,or,not} in infix notation (“or” is the default connective; “and” binds more tightly than “or”; use parentheses for grouping, which you may have to escape to protect from your shell). In this usage, patterns are preceded by -e. For example:

$ git grep -e '^#define' --and \( -e AGE_MAX -e MAX_AGE \)

This finds lines that begin with #define and contain either AGE_MAX or MAX_AGE; thus, it finds both #define AGE_MAX and #define MAX_AGE.

Note

“Infix notation” means placing binary connectives between their arguments, rather than in front of them in function-call style; thus foo --and bar --or baz, rather than --or (--and foo bar) baz.

By default, git grep searches tracked files in the working tree, or given commit or tree objects. The given objects must be listed individually; you cannot use range expressions such as master..topic. You can add path limiters to restrict the files searched to those matching at least one glob-style pattern. For example:

$ git grep pattern HEAD~5 master -- '*.[ch]' README

Other options:

--untracked
Include untracked files; add --no-exclude-standard to skip the usual “ignore” rules
--cached
Search the index (that is, all blobs registered as files in the index)
--no-index
Search the current directory even if it’s not part of a Git repository; add --exclude-standard to honor the usual “ignore” rules
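To see the difference between searching the working tree and searching a commit, here is a small sketch using the Boolean expression from the earlier example. The file limits.h and its contents are invented for the demonstration; --count is used only to keep the results compact.

```shell
#!/bin/sh
# Sketch: the same Boolean pattern searched in the working tree and in a commit.
# limits.h and its contents are illustrative assumptions.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

cat > limits.h <<'EOF'
#define AGE_MAX 120
#define MAX_AGE 120
#define NAME_LEN 64
int age_max_lookup(void);
EOF
git add limits.h
git commit -q -m "add limits"
first=$(git rev-parse HEAD)

# An uncommitted change: visible in the working tree, not in the commit.
echo "#define MAX_AGE 130" >> limits.h

# Lines that start with #define AND contain AGE_MAX OR MAX_AGE.
worktree_count=$(git grep -c -e '^#define' --and \( -e AGE_MAX -e MAX_AGE \) -- '*.h')
commit_count=$(git grep -c -e '^#define' --and \( -e AGE_MAX -e MAX_AGE \) "$first" -- '*.h')

echo "$worktree_count"
echo "$commit_count"
```

When a commit is given, each result is prefixed with the commit name as you typed it, so you can tell which object a match came from.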

What to Show

By default, git grep shows all matching lines, annotated with filename and object as appropriate. Other options include:

--invert-match (-v)
Show nonmatching lines instead
-n
Show line numbers
-h
Omit filenames
--count (-c)
Show the number of lines that match, rather than the matching lines themselves
--files-with-matches (-l)
Just list the files containing matches
--files-without-matches (-L)
Just list the files containing no matches
--full-name
Show filenames relative to the working tree top, rather than the current directory
--break
Collate matches from the same file and print blank lines between resulting sets
--heading
Show the filename once before the matches in that file, rather than on each line
--all-match
With multiple patterns combined with “or,” only show files that contain at least one line matching each pattern
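A few of these display options side by side, as a sketch on two invented files (a.txt and b.txt):

```shell
#!/bin/sh
# Sketch: how -l, -c, -n, and -h change what git grep prints.
# The files and their contents are illustrative assumptions.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'alpha\nbeta\nalpha beta\n' > a.txt
printf 'gamma\n' > b.txt
git add a.txt b.txt
git commit -q -m "add sample files"

files=$(git grep -l alpha)      # just the names of files with matches
counts=$(git grep -c alpha)     # file name and number of matching lines
numbered=$(git grep -n beta)    # file name, line number, matching line
bare=$(git grep -h gamma)       # matching line only, no file name

echo "$files"
echo "$counts"
echo "$numbered"
echo "$bare"
```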

How to Match

-i (--ignore-case)
Ignore case differences (e.g., hello and HELLO will both match “Hello”).
-E (--extended-regexp)
Use extended regular expressions; the default type is basic.
-F (--fixed-strings)
Consider the limiting patterns as literal strings to be matched; that is, don’t interpret them as regular expressions at all.
--perl-regexp
Use Perl-style regular expressions. This will not be available if Git is not built with the --with-libpcre option, which is not on by default.

git rev-parse

git rev-parse is a plumbing command, meant mainly for use by other Git programs to parse and interpret portions of Git command lines that use common options for specifying revisions. You can use it directly, though, and we’ve mentioned it before as a tool for showing what a given commit name spelling translates to. However, it also has several useful options for showing various properties of a repository, including:

--git-dir
Show the Git directory for the current repository
--show-toplevel
Show the top of the working tree
--is-inside-git-dir
Indicate whether the current directory is inside the Git directory
--is-inside-work-tree
Indicate whether the current directory is inside the working tree of a repository
--is-bare-repository
Indicate whether the current repository is bare
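These options are handy in scripts that may be run from anywhere inside a repository. The following sketch runs them from a nested subdirectory of a scratch repository; the directory names are hypothetical.

```shell
#!/bin/sh
# Sketch: query repository properties from a nested directory.
# The src/lib path is an illustrative assumption.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir -p src/lib
cd src/lib

toplevel=$(git rev-parse --show-toplevel)          # top of the working tree
git_dir=$(git rev-parse --git-dir)                 # location of the Git directory
inside_tree=$(git rev-parse --is-inside-work-tree) # true here
inside_git=$(git rev-parse --is-inside-git-dir)    # false here
bare=$(git rev-parse --is-bare-repository)         # false here

echo "toplevel: $toplevel"
echo "git dir:  $git_dir"
echo "inside work tree: $inside_tree, inside git dir: $inside_git, bare: $bare"
```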

git clean

git clean removes untracked files from the working tree, optionally limited by a glob pattern (e.g., git clean '*~' to remove backup files). Options include:

--force (-f)
Really do something. git clean will make no changes without this flag, unless you set clean.requireForce to false.
--dry-run (-n)
Show what would be done, but remove no files.
--quiet (-q)
Report only errors, not the files removed.
--exclude=pattern (-e)
Add pattern to the “ignore” rules in effect.
-d
Remove untracked directories as well as files. Directories that are in turn other Git repositories will not be removed unless you add -f -f (two “force” flags).
-x
Skip the normal “ignore” rules (but still obey rules given with -e).
-X
Remove only ignored files.

There is no single git clean command that is most common, really; it depends on what you’re trying to do. For example, often ignored files include compiled objects that are expensive to rebuild, so you don’t want to remove them while cleaning up other untracked cruft that has accumulated in your working tree. On the other hand, after you switch branches, you may want to remove all object files to ensure a correct new build, as the dependencies in a complex project as expressed by tools like make or ant may not handle such wholesale rearranging of files correctly.
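One cautious pattern is to preview with --dry-run and then remove in stages. This sketch uses made-up file names and an invented ignore rule (*.o): it first removes only backup files, leaving other untracked and ignored files alone, and then removes only the ignored files with -X.

```shell
#!/bin/sh
# Sketch: preview, then clean selectively.
# File names and the *.o ignore rule are illustrative assumptions.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo '*.o' > .gitignore
echo 'int main(void) { return 0; }' > main.c
git add .gitignore main.c
git commit -q -m "initial commit"

touch main.o 'notes.txt~' scratch.txt   # one ignored file, two untracked files

preview=$(git clean -n)     # lists untracked files; ignored ones are skipped
echo "$preview"

git clean -f -q '*~'        # remove only backup files
git clean -f -q -X          # remove only ignored files (the object file)
ls
```

After both steps, scratch.txt is still present, since neither command was asked to remove it.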

git stash

git stash saves your current index and working tree, then resets the working tree to match the HEAD commit as git reset --hard would do. This allows you to conveniently set aside and later restore your working state so that you can change branches, pull, or perform other operations that would be blocked by your current changes.

The saved states are arranged in a “stack,” meaning that the last state you put into it is the first one you take out. That is: if you stash a state, make more changes, then stash again—when you next restore a state, it is the second state that is restored, not the first one. The terms “push” and “pop” used in the commands below are traditional in computer science for the operations of adding and removing something to and from a stack. Unlike a pure stack, however, the commands do generally allow you to bypass the stack order and directly address previous states, if you want to.

Subcommands

save

This is the default subcommand, saving the current working state as described. Options include:

--patch (-p)
Interactively select hunks to save, rather than the complete diff between HEAD and the working tree. This works the same way as the patch mode of git add.
--keep-index
Do not revert changes already applied to the index.
--include-untracked (-u)
Save untracked files as well (normally only tracked files are saved). Ignored files are still left out; add --all (-a) to save those too, which is useful for compilation artifacts such as object files, normally ignored and untracked but costly to recreate.

You can also give a comment as an argument, to be saved as the message on the commit representing the stash (e.g., git stash save "bugfix in progress"). Otherwise, Git generates a default message like:

WIP on master: 72e25df0 'commit subject'

The --keep-index option is useful for testing partially staged changes before you commit them. If you use git add -p to split your current worktree changes into multiple commits (Adding Partial Changes), you may want to test those commits first. git stash save --keep-index preserves your staged changes and reverts the rest, so that you can test this intermediate state. You then commit, restore the remaining changes with git stash pop, and repeat.
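The cycle just described might look like the following sketch, in which two files are modified but only one change is staged. The file names and messages are hypothetical, and the "test" step is left as a comment.

```shell
#!/bin/sh
# Sketch: test a partially staged change with git stash save --keep-index.
# File names and commit messages are illustrative assumptions.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "a" > a.txt
echo "b" > b.txt
git add a.txt b.txt
git commit -q -m "initial commit"

echo "a2" >> a.txt
echo "b2" >> b.txt
git add a.txt                    # stage only the change to a.txt

git stash save -q --keep-index   # set aside everything, keep the staged state
b_during=$(cat b.txt)            # unstaged change to b.txt is gone for now
a_during=$(cat a.txt)            # staged change to a.txt is still in place

# ... build and test the staged state here ...

git commit -q -m "change a"
git stash pop -q                 # bring back the remaining change
status_after=$(git status --porcelain)
echo "$status_after"
```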

list

List the stack of stashes, which can be referred to symbolically as stash@{0}, stash@{1}, and so on (most recent first). You can add options as with git log.

show

Show the changes recorded in a given stash, as the diff between the stashed state and the commit it was originally based on. The default is the latest stash (stash@{0}), and you can add options as with git diff.

pop

The inverse of git stash: restore a stashed state and remove it from the stash list; the default state to use is stash@{0}, or you can supply a different stash. If the stash does not apply cleanly, this does not remove the stash; use git stash drop after resolving the conflicts. With --index, restores the saved index as well (which is otherwise discarded).

apply

Like git stash pop, but does not remove the restored state from the stash list.

branch <branchname> [stash]

Switches to new branch starting at the original commit for stash, and restores the stash there. This is useful when the working tree has changed such that the stash no longer applies cleanly.

drop [stash]

Remove stash from the stash list (default stash@{0}).

clear

Deletes the entire stash list.
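A short round trip through save, list, and pop, as a sketch. The file and message are invented; note that newer Git versions document git stash push as the preferred spelling of save, which is used here only to match the text above.

```shell
#!/bin/sh
# Sketch: stash a change, inspect the stash list, and restore the change.
# The file name and stash message are illustrative assumptions.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "one" > file.txt
git add file.txt
git commit -q -m "initial commit"

echo "two" >> file.txt                  # work in progress
git stash save -q "bugfix in progress"
clean=$(git status --porcelain)         # empty: working tree matches HEAD again
listing=$(git stash list)               # one entry, stash@{0}

git stash pop -q                        # restore and drop stash@{0}
restored=$(cat file.txt)
remaining=$(git stash list)             # empty again

echo "$listing"
echo "$restored"
```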

git show

git show displays a given object (default HEAD) in a manner appropriate to the object type:

commit
Commit ID, author, date, message, and diff
tag
Tag message and tagged object
tree
Pathnames in (one level of) the tree
blob
Contents

For example, to see the diff from one commit to the next, you could use git diff foo~ foo, but git show foo is just simpler. The command takes any options valid for git diff-tree to control display of the diff, including -s to suppress the diff and just show the commit metadata. You can also use --format as described in Defining Your Own Formats to customize the output.
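For instance, the following sketch shows a one-line summary of a commit using -s with --format, and then the contents of a blob named by commit and path. The file name and commit message are made up.

```shell
#!/bin/sh
# Sketch: git show on a commit (metadata only) and on a blob.
# greeting.txt and the commit message are illustrative assumptions.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "hello" > greeting.txt
git add greeting.txt
git commit -q -m "add greeting"

summary=$(git show -s --format='%an: %s' HEAD)   # no diff, custom format
contents=$(git show HEAD:greeting.txt)           # blob: just its contents

echo "$summary"
echo "$contents"
```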

git tag

A Git tag gives a stable, human-readable name to a commit, such as “version-1.0” or “release/2012-08-01”. There are two kinds of tags:

  • A “lightweight tag” is just a ref in refs/tags pointing to the tagged commit.
  • An “annotated tag” is also a ref in refs/tags, but pointing to a tag-type object instead, which in turn not only points to the tagged commit, but records other information as well: the tag author, timestamp, a tag message, and an optional GnuPG cryptographic signature.

git tag tagname commit creates a new lightweight tag pointing to the given commit (default HEAD). Options include:

--annotate (-a)
Make an annotated tag instead
--sign (-s)
Make a signed tag (implies -a), using the GnuPG key for the committer’s email address or the value of user.signingkey
--local-user=key-ID (-u)
Make a signed tag (implies -a), using the specified GnuPG key
--force (-f)
Be willing to replace existing tags (this normally fails)
--delete (-d)
Delete a tag
--verify (-v)
Verify the GnuPG signature on a tag
--list pattern (-l)
List tags with names matching pattern. No pattern means list all tags, and this is the default for a plain git tag command without arguments. Multiple patterns means to list tags matching at least one pattern.
--contains commit
List tags containing the given commit; that is, tags whose tagged commit is commit itself or has it as an ancestor
--points-at object
List tags that point to the given object
--message="text" (-m)
Use text as the tag message (instead of invoking the editor). Multiple -m options are concatenated as paragraphs. This implies an annotated tag.
--file=filename (-F)
Use the contents of filename as the tag message (instead of invoking the editor); “-” means standard input. This implies an annotated tag.
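The difference between the two kinds of tags is easy to observe with git cat-file -t, which reports the type of the object a name refers to. In this sketch the tag names and message are hypothetical.

```shell
#!/bin/sh
# Sketch: a lightweight tag names a commit directly; an annotated tag
# names a tag object. Tag names and the message are illustrative assumptions.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello" > file.txt
git add file.txt
git commit -q -m "initial commit"

git tag version-1.0                          # lightweight
git tag -a -m "First release" release-1.0    # annotated

light_type=$(git cat-file -t version-1.0)    # commit
annot_type=$(git cat-file -t release-1.0)    # tag
matching=$(git tag --list 'release*')        # tags matching a pattern
containing=$(git tag --contains HEAD)        # both tags contain HEAD

echo "$light_type $annot_type"
echo "$matching"
echo "$containing"
```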

Deleting a Tag from a Remote

Deleting a tag from your repository will not automatically delete it from the origin when pushing; you have to do that explicitly:

$ git push origin :tagname
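Here is a sketch of that behavior, using a local bare repository to stand in for the origin. The paths and the tag name "botched" are invented for the example.

```shell
#!/bin/sh
# Sketch: deleting a tag locally does not delete it on the remote.
# A local bare repository plays the role of origin; names are assumptions.
set -e
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello" > file.txt
git add file.txt
git commit -q -m "initial commit"
git remote add origin "$work/remote.git"

git push -q origin HEAD        # publish the current branch
git tag botched
git push -q origin botched     # publish the tag

git tag -d botched >/dev/null                  # delete it locally...
still_there=$(git ls-remote --tags origin)     # ...the remote still has it

git push -q origin :botched                    # delete it on the remote
after=$(git ls-remote --tags origin)

echo "before remote deletion: $still_there"
echo "after remote deletion:  $after"
```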

Following Tags

When you pull (or fetch) from a configured remote, Git will automatically fetch new tags, but a “one-shot” pull specifying the remote repository (git pull URL branch) will not do this. This rule tries to match the likely desires of people in the given situation. If you are collaborating closely with a set of people on a project, you are likely to want to share tags with them, and also likely to be using the push/pull mechanism with a configured remote. On the other hand, if you have to specify the other repository, then you probably aren’t collaborating closely over that particular content, and so you probably don’t want to automatically pull in the other group’s tags.

In any case, git pull never automatically overwrites tags. A tag can represent sensitive assertions about the tagged commit, such as its being a certain official release of a product, or containing an important security fix. Once accepted, a tag should not silently change without the user knowing. If you push out a botched tag, the preferred way to fix it is to simply use a new tag name. Actually updating an already pushed tag is awkward, by design. See the “DISCUSSION” section of git-tag(1) for more detail.

For new tags you create, use git push --tags to send them when pushing.

Backdating Tags

You can set the tag date with the GIT_COMMITTER_DATE environment variable. For example:

$ GIT_COMMITTER_DATE="2013-02-04 07:37" git tag…
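As a sketch of one complete invocation: the date is recorded in the tagger field of an annotated tag, so the example creates one with -a and reads the date back with git for-each-ref. The tag name and message are hypothetical. A lightweight tag carries no date of its own, so the variable has no effect there.

```shell
#!/bin/sh
# Sketch: create an annotated tag with a chosen date and read the date back.
# The tag name and message are illustrative assumptions.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello" > file.txt
git add file.txt
git commit -q -m "initial commit"

GIT_COMMITTER_DATE="2013-02-04 07:37" git tag -a -m "Backdated release" release-old

tag_date=$(git for-each-ref --format='%(taggerdate:short)' refs/tags/release-old)
echo "$tag_date"
```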

git diff

git diff is a versatile command, showing the difference between content pairs in the working tree, commits, or index. The following are some common forms.

git diff

This shows your unstaged changes; that is, the difference between the working tree and the index.

git diff --staged

This shows your staged changes; that is, the difference between the latest commit and the index. These are the changes that will be included in the next commit. --cached is a synonym for --staged. You can give an alternate commit to compare as an argument; the default is HEAD.

git diff <commit>

This shows the difference between the working tree and the named commit.

git diff <A> <B>

This shows the difference between two commits, trees, or blobs A and B. A..B is a synonym for A B; note that this has no connection to the meaning of that syntax when naming sets of commits (see Naming Sets of Commits). If either A or B is omitted in A..B, the default is HEAD; this syntax is thus useful for specifying HEAD for one of these by just typing two dots, which is easier and faster than typing in all caps.

Options and Arguments

You can limit the comparison to specific files with trailing patterns; for example, this shows the unstaged changes only in Java and C source files:

$ git diff -- '*.java' '*.[ch]'

git diff accepts quite a few options controlling how Git computes or displays differences, most of which it has in common with git log, which we discuss in Chapter 9. For example, this summarizes the differences instead of displaying them:

$ git diff --stat
 foo.c     | 1 +
 icky.java | 1 +
 2 files changed, 2 insertions(+)

and this just lists the files that contain differences:

$ git diff --name-only
foo.c
icky.java
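To tie the common forms together, this sketch modifies two files, stages only one of them, and compares what each form of git diff reports. It reuses the file names from the sample output; the contents are made up.

```shell
#!/bin/sh
# Sketch: unstaged vs. staged vs. everything since HEAD.
# File contents are illustrative assumptions.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "one" > foo.c
echo "one" > icky.java
git add foo.c icky.java
git commit -q -m "initial commit"

echo "two" >> foo.c
echo "two" >> icky.java
git add icky.java                          # stage only one of the changes

unstaged=$(git diff --name-only)           # working tree vs. index
staged=$(git diff --staged --name-only)    # index vs. HEAD
both=$(git diff --name-only HEAD)          # working tree vs. HEAD

echo "unstaged: $unstaged"
echo "staged:   $staged"
echo "since HEAD:"
echo "$both"
```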

git instaweb

Git comes with a web-based repository browser called “gitweb.” Setting up a standalone web server to provide general access to a set of Git repositories is outside our scope here; however, Git has a convenience command git instaweb that starts a special-purpose web server giving gitweb access to the current repository. Just start it with:

$ git instaweb --start

and point your browser at http://localhost:1234/ (assuming your browser is running on the same host; otherwise, use the right hostname). Use --port to select a different TCP port, and --stop to stop the gitweb server when you’re done.

If you type just git instaweb, it will start or restart the gitweb server, and then launch a browser from the command line on the same host. This may not be what you want; you might be logged into that host remotely without any way to display graphics from it (e.g., a local X Windows server combined with SSH X forwarding), and so Git will end up starting a character-based browser such as lynx.

By default, this command uses the lighttpd web server, which must also be installed. It supports several other web servers as well, including Apache, which you can select with --httpd; see git-instaweb(1) for details.

Git Hooks

In computer jargon, a “hook” is a general means of inserting custom actions at a certain point in a program’s behavior, without having to modify the source code of the program itself. For example, the text editor Emacs has many “hooks” that allow you to supply your own code to be run whenever Emacs opens a file, saves a buffer, begins writing an email message, etc. Similarly, Git provides hooks that let you add your own actions to be run at key points. Each repository has its own set of hooks, implemented as programs in .git/hooks; a hook is run if the corresponding program file exists and is executable. Hooks are often shell scripts, but they can be any executable file. git init automatically copies a number of sample hooks into the new repository it creates, which you can use as a starting point. These are named hook-name.sample; rename one removing the .sample extension to enable it. The sample hooks themselves are part of your Git installation, typically under /usr/share/git-core/templates/hooks. The templates directory also contains a few other things copied into new repositories, such as the default .git/info/exclude file.

For example, there is a hook named commit-msg, which is run by git commit after the user edits his commit message but before actually making the commit. The hook gets the commit message in a file as an argument, and can edit the file in place to vet or alter the message. If the hook exits with a nonzero status, Git cancels the commit, so you can use this to suggest a certain style of commit message. It’s only a suggestion though, because the user can avoid the hook with git commit --no-verify; it’s his repository, after all. You’d need a different kind of hook on the receiving end of a push to enforce your style on a shared repository.

The githooks(5) man page describes in detail all the different hooks you can use, and how they work.
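A minimal sketch of a commit-msg hook follows. The rule it checks (the subject line must begin with a capital letter) is an arbitrary example, not a Git convention; the script installs the hook in a scratch repository, shows a commit being rejected, and then bypasses the hook with --no-verify.

```shell
#!/bin/sh
# Sketch: install a commit-msg hook and see it reject a commit.
# The capital-letter rule and all names are illustrative assumptions.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

mkdir -p .git/hooks
cat > .git/hooks/commit-msg <<'EOF'
#!/bin/sh
# $1 is the path of the file holding the proposed commit message.
if ! head -n 1 "$1" | grep -q '^[A-Z]'; then
    echo "commit-msg: subject should start with a capital letter" >&2
    exit 1
fi
EOF
chmod +x .git/hooks/commit-msg    # the hook runs only if it is executable

echo "x" > file.txt
git add file.txt

# The hook exits nonzero, so Git cancels this commit.
if git commit -q -m "lowercase subject" 2>/dev/null; then
    rejected=no
else
    rejected=yes
fi

# The user can still skip the hook.
git commit -q --no-verify -m "lowercase subject"
bypassed=$(git log -1 --format=%s)

echo "rejected: $rejected"
echo "committed with --no-verify: $bypassed"
```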

Visual Tools

Complex commit graphs, file differences, and merge conflicts are best viewed graphically, and there are a number of tools available for this. Git itself includes gitk, which is written with the Tcl/Tk language and graphics toolkit, as well as the simple git log --graph. Here are some other useful tools in this category:

tig
A terminal-based tool using the “curses” library.
QGit
Using the Qt 4 GUI framework, QGit builds and runs essentially identically on multiple platforms, including Linux, OS X, and Windows.
GitHub
There is a GitHub application for OS X, Windows, and the Eclipse programming environment. It can work with your own repositories as well as with ones hosted on the GitHub service.
SmartGit
SmartGit runs on Linux, OS X, and Windows, and works with the Mercurial version control system as well.
Gitbox
Specific to OS X with a very nice, native Mac look and feel.

Submodules

Sometimes, you need to use the source to another project in yours, but it is not possible or appropriate to combine the two into a single repository. This situation can be awkward to handle. You may not want to keep merging the entire history of another project into yours, where it will clutter up your own history (though the “subtree” merge strategy can be helpful if you decide to do this).

Git has a feature called “submodules” to address this: it allows you to maintain another Git repository as a tracked object within a subdirectory of yours. In the tree of a commit in your repository, the submodule reference includes a commit ID in the foreign repository, indicating a particular state of that repository. This defines the content of the corresponding directory for your commit, while still leaving all its refs and objects out of your repository proper.

As an advanced feature, we do not discuss submodules further here; see git-submodule(1) for details.
