Chapter 4. Searching and Modifying Buffers
There will be lots of times when you want to search through a buffer for a string, perhaps replacing it with something else. In this chapter we’ll show a lot of powerful ways to do this. We’ll cover the functions that perform searches and also show you how to form regular expressions, which add great flexibility to the kinds of searches you can do.
Inserting the Current Time
It is sometimes useful to insert the current date or time into a file as you edit it. For instance, right now, as I’m writing this, it’s 10:30pm on Sunday, 18 August, 1996. A few days ago, I was editing a file of Emacs Lisp code and I changed a comment that read
;; Each element of ENTRIES has the form
;; (NAME (VALUE-HIGH . VALUE-LOW))
to
;; Each element of ENTRIES has the form
;; (NAME (VALUE-HIGH . VALUE-LOW))
;; [14 Aug 96] I changed this so NAME can now be a symbol,
;; a string, or a list of the form (NAME . PREFIX) [bg]
I placed a timestamp in the comment because it could be useful when editing that code in the future to look back and see when this change was made.
A command that merely inserts the current time is simple, once you know that the function current-time-string yields today’s date and time as a string.
(defun insert-current-time ()
  "Insert the current time"
  (interactive "*")
  (insert (current-time-string)))
The section More Asterisk Magic later in this chapter explains the meaning of (interactive "*") and insert.
The simple function above is pretty inflexible, as it always results in inserting a string of the form “Sun Aug 18 22:34:53 1996” (in the style of the standard C library functions ctime and asctime). That’s cumbersome if all you want is the date, or just the time, or if you prefer 12-hour time instead of 24-hour time, or dates in the form “18 Aug 1996” or “8/18/96” or “18/8/96”.
Happily, we can get finer control if we’re willing to do a little extra work. Emacs includes a few other time-related functions, notably current-time, which yields the current time in a raw form, and format-time-string, which can take such a time and format it in a wide variety of ways (in the style of C’s strftime). For instance,
(format-time-string "%l.%M %p" (current-time))
returns the current time as the hour, a period, the minute, and “AM” or “PM”. (The format codes used here are %l, “hour from 1-12,” %M, “minute from 00-59,” and %p, “the string ‘AM’ or ‘PM’.” For a complete list of format codes, use describe-function on format-time-string.)
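To make the effect of individual format codes easier to see, here is a minimal sketch that formats one fixed moment instead of the current time. The helper name format-epoch-demo is hypothetical and not part of this chapter's code; it assumes that the raw time value '(0 0) denotes the Unix epoch and that a non-nil third argument to format-time-string requests Universal Time.

```elisp
;; Hypothetical helper for illustration; not one of the chapter's functions.
(defun format-epoch-demo (format)
  "Format the Unix epoch (1 Jan 1970, 00:00 UTC) according to FORMAT."
  ;; '(0 0) is a raw time value in the same style that `current-time'
  ;; returns; the final t asks for Universal Time rather than local time,
  ;; so the result does not depend on the machine's time zone.
  (format-time-string format '(0 0) t))

(format-epoch-demo "%Y-%m-%d")   ; just the date
(format-epoch-demo "%H:%M")      ; just the time, 24-hour style
(format-epoch-demo "%l.%M %p")   ; 12-hour style, as in the example above
```

Evaluating each call with C-x C-e shows how the same moment looks under different format strings.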
From here it’s a short leap to providing two commands, one for inserting the current time and one for inserting the current date. We can also easily permit the format used by each to be user-configurable, based on a configuration variable the user can set. Let’s call the two functions insert-time and insert-date. The corresponding configuration variables will be insert-time-format and insert-date-format.
User Options and Docstrings
First we’ll define the variables.
(defvar insert-time-format "%X"
  "*Format for \\[insert-time] (c.f. 'format-time-string').")
(defvar insert-date-format "%x"
  "*Format for \\[insert-date] (c.f. 'format-time-string').")
There are two new things to note about these docstrings.
First, each begins with an asterisk (*). A leading asterisk has special meaning in defvar docstrings. It means that the variable in question is a user option. A user option is just like any other Lisp variable except that it’s treated specially in two cases:
User options can be set interactively using set-variable, which prompts the user for a variable name (with completion of partially typed names) and a value. In some cases, the value can be entered in an intuitive way without having to dress it up in Lisp syntax; e.g., strings can be entered without their surrounding double-quotes.
To set variables interactively when they aren’t user options, you must do something like
M-: (setq variable value) RET
(using Lisp syntax for value).
User options, but not other variables, can be edited en masse using the option-editing mode available as M-x edit-options RET.
The second new thing about these docstrings is that each contains the special construct \\[command]. (Yes, it’s \[…], but since it’s written inside a Lisp string, the backslash has to be doubled: \\[…].) This syntax is magic. When the docstring is displayed to the user (such as when the user uses describe-variable), \[command] is replaced with a representation of a keybinding that invokes command. For example, if C-x t invokes insert-time, then the docstring
"*Format for \\[insert-time] (c.f. 'format-time-string')."
is displayed as
*Format for C-x t (c.f. 'format-time-string').
If there is no keybinding for insert-time, then M-x insert-time is used. If there are two or more keybindings for insert-time, Emacs chooses one.
Suppose you want the string \[insert-time] to appear literally in a docstring. How could you prevent its keybinding being substituted? For this purpose there is a special escape sequence: \=. When \= precedes \[…], the magic replacement of \[…] doesn’t happen. Of course, Lisp string syntax dictates that this be written as "…\\=\\[…]…".
\= is also useful for escaping the asterisk at the beginning of a defvar docstring, if you don’t want the variable to be a user option but you absolutely must have a docstring that begins with an asterisk.
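The substitution can be tried directly. The sketch below assumes that the standard function substitute-command-keys performs the same replacement Emacs applies when displaying a docstring; the command name keybinding-demo-command is hypothetical and deliberately left without a keybinding, so the M-x form is what should appear.

```elisp
;; Hypothetical command with no keybinding, defined only for this demo.
(defun keybinding-demo-command ()
  "Do nothing; exists only so that it can be named in a docstring."
  (interactive)
  nil)

;; With no keybinding available, the construct should become "M-x ...".
(substitute-command-keys "Format for \\[keybinding-demo-command].")

;; Preceding the construct with \= keeps it literal and drops the \= itself.
(substitute-command-keys "Format for \\=\\[keybinding-demo-command].")
```

Binding the command to a key with global-set-key and evaluating the first expression again should show the key sequence instead.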
All variables that are shared between two or more functions should be declared with defvar. Which of those should be user options? A rule of thumb is that if the variable directly controls a user-visible feature that a user might want to change, and if setting that variable is straightforward (i.e., no complex data structures or specially coded values), then it should be a user option.
More Asterisk Magic
Now that we’ve defined the variables that control insert-time and insert-date, here are the definitions of those simple functions.
(defun insert-time ()
  "Insert the current time according to insert-time-format."
  (interactive "*")
  (insert (format-time-string insert-time-format
                              (current-time))))
(defun insert-date ()
  "Insert the current date according to insert-date-format."
  (interactive "*")
  (insert (format-time-string insert-date-format
                              (current-time))))
The two functions are identical except that one uses insert-time-format where the other uses insert-date-format. The insert function takes any number of arguments (which must all be strings or characters) and inserts them one after another in the current buffer at the present location of point, moving point forward.
The main thing to notice about these functions is that each begins with (interactive "*"). By now you know that interactive turns a function into a command and specifies how to obtain the function’s arguments when invoked interactively. But we haven’t seen * in the argument of interactive before, and besides, these functions take no arguments, so why does interactive have one?
The asterisk, when it is the first character in an interactive argument, means “abort this function if the current buffer is read-only.” It is better to detect a read-only buffer before a function begins its work than to let it get halfway through and then die from a “Buffer is read-only” error. In this case, if we omitted to check for read-onlyness, the call to insert would trigger its own “Buffer is read-only” error almost right away and no harm done. A more complicated function, though, might cause irreversible side effects (such as changing global variables), expecting to be able to finish, before discovering that it can’t.
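Here is a minimal sketch of that difference. All names beginning with readonly-demo- are hypothetical, and condition-case (used only to catch the error so the result can be inspected) is not covered in this chapter. The command changes a global counter before it inserts anything; because of the asterisk, that side effect should never happen in a read-only buffer.

```elisp
(defvar readonly-demo-counter 0
  "Hypothetical global state that the demo command changes before inserting.")

(defun readonly-demo-command ()
  "Bump the counter, then insert an x (for illustration only)."
  (interactive "*")                       ; refuse to start if read-only
  (setq readonly-demo-counter (1+ readonly-demo-counter))
  (insert "x"))

(defun readonly-demo-try ()
  "Invoke the command interactively in a read-only scratch buffer.
Return the symbol `refused' if Emacs stopped it, else `ran'."
  (with-temp-buffer
    (setq buffer-read-only t)
    (condition-case nil
        (progn (call-interactively 'readonly-demo-command) 'ran)
      (buffer-read-only 'refused))))
```

After (readonly-demo-try), the value of readonly-demo-counter should be unchanged. Removing the asterisk from the interactive form would let the counter be incremented before insert fails.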
Inserting the current date and time automatically and in such a configurable format is pretty neat and probably beyond the ken of most text editors, but its usefulness is limited. Undoubtedly more useful would be the ability to store a writestamp in a file; that is, the date and/or time the file was last written to disk. A writestamp updates itself each time the file is saved anew.
The first thing we’ll need is a way to run our writestamp-updating code each time the file is saved. As we discovered in the section Hooks in Chapter 2, the best way to associate some code with a common action (such as saving a file) is by adding a function to a hook variable, provided that a suitable hook variable exists. Using M-x apropos RET hook RET, we discover four promising hook variables: after-save-hook, local-write-file-hooks, write-contents-hooks, and write-file-hooks.
We can discard after-save-hook right away. We don’t want our code executed, modifying writestamps, after the file is saved, because then it will be impossible to save an up-to-date version of the file!
The differences between the remaining candidates are subtle:
write-file-hooks: Code to execute for any buffer each time it is saved.
local-write-file-hooks: A buffer-local version of write-file-hooks. Recall from the Hooks section of Chapter 2 that a buffer-local variable is one that can have different values in different buffers. Whereas write-file-hooks pertains to every buffer, local-write-file-hooks can pertain to individual buffers. Thus, if you want to run one function while saving a Lisp file and another one when saving a text file, local-write-file-hooks is the one to use.
write-contents-hooks: Like local-write-file-hooks in that it’s buffer-local and it contains functions to execute each time the buffer is saved to a file. However (and I warned you this was subtle), the functions in write-contents-hooks pertain to the buffer’s contents, while the functions in the other two hooks pertain to the files being edited. In practice, this means that if you change the major mode of the buffer, you’re changing the way the contents should be considered, and therefore write-contents-hooks reverts to nil but local-write-file-hooks doesn’t. On the other hand, if you change Emacs’s idea of which file is being edited, e.g. by invoking set-visited-file-name, then local-write-file-hooks reverts to nil and write-contents-hooks doesn’t.
We’ll rule out write-file-hooks because we’ll want to invoke our writestamp-updater only in buffers that have writestamps, not every time any buffer is saved. And, hair-splitting semantics aside, we’ll rule out write-contents-hooks because we want our chosen hook variable to be immune to changes in the buffer’s major mode. That leaves local-write-file-hooks.
Now, what should the writestamp updater that we’ll put in local-write-file-hooks do? It must locate each writestamp, delete it, and replace it with an updated one. The most straightforward approach is to surround each writestamp with a distinguishing string of characters that we can search for. Let’s say that each writestamp is surrounded by the strings "WRITESTAMP((" on the left and "))" on the right, so that in a file it looks something like this:
... went into the castle and lived happily ever after. The end. WRITESTAMP((12:19pm 7 Jul 96))
Let’s say that the stuff inside the WRITESTAMP((…)) is put there by insert-date (which we defined earlier) and so its format can be controlled with insert-date-format.
Now, supposing we have some writestamps in the file to begin with, we can update it at file-writing time like so:
(add-hook 'local-write-file-hooks 'update-writestamps)
(defun update-writestamps ()
  "Find writestamps and replace them with the current time."
  (save-excursion
    (save-restriction
      (save-match-data
        (widen)
        (goto-char (point-min))
        (while (search-forward "WRITESTAMP((" nil t)
          (let ((start (point)))
            (search-forward "))")
            (delete-region start (- (point) 2))
            (goto-char start)
            (insert-date))))))
  nil)
There’s a lot here that’s new. Let’s go through this function a line at a time.
First we notice that the body of the function is wrapped inside a call to save-excursion. What save-excursion does is memorize the position of the cursor, execute the subexpressions it’s given as arguments, then restore the cursor to its original position. It’s useful in this case because the body of the function is going to move the cursor all over the buffer, but by the time the function finishes we’d like the caller of this function to perceive no cursor motion. There’ll be much more about save-excursion in Chapter 8.
Next is a call to save-restriction. This is like save-excursion in that it memorizes some information, then executes its arguments, then restores the information. The information in this case is the buffer’s restriction, which is the result of narrowing. Narrowing is covered in Chapter 9. For now let’s just say that narrowing refers to Emacs’s ability to show only a portion of a buffer. Since update-writestamps is going to call widen, which undoes the effect of any narrowing, we need save-restriction in order to clean up after ourselves.
Next is a call to save-match-data that, like save-excursion and save-restriction, memorizes some information, executes its arguments, then restores the information. This time the information in question is the result of the latest search. Each time a search occurs, information about the result of the search is stored in some global variables (as we will see shortly). Each search wipes out the result of the previous search. Our function will perform a search, but for the sake of other functions that might be calling ours, we don’t want to disrupt the global match data.
Next is a call to widen. As previously mentioned, this undoes any narrowing in effect. It makes the entire buffer accessible, which is necessary if every writestamp is to be found and updated.
Next we move the cursor to the beginning of the buffer with (goto-char (point-min)) in preparation for the function’s main loop, which is going to search for each successive writestamp and rewrite it in place. The function point-min returns the minimum value for point, normally 1. (The only time (point-min) might not be 1 is when there’s narrowing in effect. Since we’ve called widen, we know narrowing is not in effect, so we could write (goto-char 1) instead. But it’s good practice to use point-min where appropriate.)
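The interplay of narrowing, widen, and save-restriction can be observed in a scratch buffer. This is only a sketch: the function name narrowing-demo is hypothetical, and with-temp-buffer and narrow-to-region are standard functions that this chapter has not introduced.

```elisp
;; Hypothetical demo function; not part of the writestamp code.
(defun narrowing-demo ()
  "Return (point-min) before, during, and after a temporary `widen'."
  (with-temp-buffer
    (insert "0123456789")
    (narrow-to-region 4 8)                 ; only positions 4 to 8 are visible
    (let ((before (point-min))
          (during (save-restriction
                    (widen)                ; whole buffer accessible again
                    (point-min))))
      ;; save-restriction has put the narrowing back by now.
      (list before during (point-min)))))

(narrowing-demo)   ; → (4 1 4)
```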
The main loop looks like this:
(while (search-forward "WRITESTAMP((" nil t)
  ...)
This is a while loop, which works very much like while loops in other languages. Its first argument is an expression that is tested each time around the loop. If the expression evaluates to true, the remaining arguments are executed and the whole cycle repeats.
(search-forward "WRITESTAMP((" nil t) searches for the first occurrence of the given string, starting from the current location of point. The nil means the search is not bounded except by the end of the buffer. This is explained in more detail later. The t means that if no match is found, search-forward should simply return nil. (Without the t, search-forward signals an error, aborting the current command, if no match is found.) If the search is successful, point is moved to the first character after the matched text, and search-forward returns that position. (It’s possible to find where the match began using match-beginning, which is shown in Figure 4-1.)
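The return values described above can be checked in a scratch buffer. This is a small sketch; the function name search-demo is hypothetical.

```elisp
;; Hypothetical demo function; not part of the writestamp code.
(defun search-demo ()
  "Return what two searches report in a small scratch buffer."
  (with-temp-buffer
    (insert "abc WRITESTAMP((x))")
    (goto-char (point-min))
    (list (search-forward "WRITESTAMP((" nil t)  ; position just after the match
          (match-beginning 0)                    ; position where the match began
          (search-forward "no such text" nil t)))) ; nil rather than an error

(search-demo)   ; → (17 5 nil)
```

The prefix starts at position 5, after the four characters "abc ", and is twelve characters long, so point ends up at 17.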
(let ((start (point))) ...)
This creates a temporary variable, start, that holds the location of point, which is the beginning of the date string inside the WRITESTAMP((…)) delimiters. With start defined, the body of the let contains:
(search-forward "))")
(delete-region start (- (point) 2))
(goto-char start)
(insert-date)
This call to search-forward places point after the two closing parentheses. We still know the beginning of the timestamp, because this location is in start, as shown in Figure 4-2.
This time, only the first argument to search-forward, the search string, is given. Earlier we saw two additional arguments: the search bound, and whether to signal an error. When omitted, they default to nil (unbounded search) and nil (signal an error if the search fails).
If search-forward succeeds (and if it fails, an error is signaled and execution of the function never gets past it), delete-region deletes the text region that is the date in the writestamp, starting at position start and ending before position (- (point) 2) (two characters to the left of point), leaving the results shown in Figure 4-3.
(goto-char start) positions the cursor inside the writestamp delimiters and, finally, (insert-date) inserts the current date.
The while loop executes as many times as there are matches for the search string. It’s important that each time a match is found, the cursor remains “to the right” of the place where the match began. Otherwise, the next iteration of the loop will find the same match for the search string!
When the while loop is done, save-match-data returns, restoring the match data; then save-restriction returns, restoring any narrowing that was in effect; then save-excursion returns, restoring point to its original location.
The final expression of update-writestamps, after the call to save-excursion, is nil. This is the function’s return value. The return value of a Lisp function is simply the value of the last expression in the function’s body. (All Lisp functions return a value, but so far every function we’ve written has done its job via “side effects” instead of by returning meaningful values.) In this case we force it to be nil. The reason is that functions in local-write-file-hooks are treated specially. Normally, the return value of a function in a hook variable doesn’t matter. But for functions in local-write-file-hooks (also in write-file-hooks and write-contents-hooks), a non-nil return value means, “This hook function has taken over the job of writing the buffer to a file.” If the hook function returns a non-nil value, the remaining functions in the hook variables are not called, and Emacs does not write the buffer to a file itself after the hook functions run. Since update-writestamps is not taking over the job of writing the buffer to a file, we want to be sure it returns nil.
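The stop-at-the-first-non-nil behavior can be imitated on a hook variable of our own. This is only a sketch under stated assumptions: the hook variable and the functions named hookdemo-… are hypothetical, and the standard function run-hook-with-args-until-success stands in for whatever Emacs does internally when it saves a buffer.

```elisp
(defvar hookdemo-hook nil
  "Hypothetical hook variable, used only for this illustration.")
(defvar hookdemo-log nil
  "Names of the hook functions that have run, most recent first.")

(defun hookdemo-returns-nil ()
  (push 'hookdemo-returns-nil hookdemo-log)
  nil)                                  ; "I did not take over the job"

(defun hookdemo-returns-t ()
  (push 'hookdemo-returns-t hookdemo-log)
  t)                                    ; "I took over the job"

(defun hookdemo-never-reached ()
  (push 'hookdemo-never-reached hookdemo-log)
  nil)

(defun hookdemo-run ()
  "Run the demo hook and return the functions that ran, in order."
  (setq hookdemo-log nil
        hookdemo-hook '(hookdemo-returns-nil
                        hookdemo-returns-t
                        hookdemo-never-reached))
  (run-hook-with-args-until-success 'hookdemo-hook)
  (reverse hookdemo-log))

(hookdemo-run)   ; → (hookdemo-returns-nil hookdemo-returns-t)
```

A writestamp updater that accidentally returned a non-nil value would play the role of hookdemo-returns-t here, and everything after it, including the actual write, would be skipped.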
This approach to implementing writestamps works, but there are a few problems. First, by hardwiring the strings "WRITESTAMP((" and "))" we’ve doomed the user to an unaesthetic and inflexible way to distinguish writestamps in text. Second, the user’s preference might not be to use insert-date for writestamps.
These problems are simple to fix. We can introduce three new variables: one that, like insert-date-format and insert-time-format, describes a time format to use; and two that describe the delimiters surrounding a writestamp.
(defvar writestamp-format "%C"
  "*Format for writestamps (c.f. 'format-time-string').")
(defvar writestamp-prefix "WRITESTAMP(("
  "*Unique string identifying start of writestamp.")
(defvar writestamp-suffix "))"
  "*String that terminates a writestamp.")
Now we can modify update-writestamps to be more configurable.
(defun update-writestamps ()
  "Find writestamps and replace them with the current time."
  (save-excursion
    (save-restriction
      (save-match-data
        (widen)
        (goto-char (point-min))
        (while (search-forward writestamp-prefix nil t)
          (let ((start (point)))
            (search-forward writestamp-suffix)
            (delete-region start (match-beginning 0))
            (goto-char start)
            (insert (format-time-string writestamp-format
                                        (current-time))))))))
  nil)
In this version of update-writestamps, we’ve replaced occurrences of "WRITESTAMP((" and "))" with writestamp-prefix and writestamp-suffix, and we’ve replaced (insert-date) with
(insert (format-time-string writestamp-format (current-time)))
We also changed the call to delete-region. Previously it looked like this:
(delete-region start (- (point) 2))
That was when we had the writestamp suffix hardwired to be "))", which is two characters long. But now that the writestamp suffix is stored in a variable, we don’t know in advance how many characters long it is. We could certainly find out, by calling length:
(delete-region start (- (point) (length writestamp-suffix)))
but a better solution is to use match-beginning. Remember that before the call to delete-region is (search-forward writestamp-suffix). No matter what writestamp-suffix is, search-forward finds the first occurrence of it, if one exists, and returns the first position after the match. But extra data about the match, notably the position where the match begins, is stored in Emacs’s global match-data variables. The way to access this data is with the functions match-beginning and match-end. For reasons that will become clear shortly, match-beginning needs an argument of 0 to tell you the position of the beginning of the match for the latest search. In this case, that happens to be the beginning of the writestamp suffix, which also happens to be the end of the date inside the writestamp, and therefore the end of the region to delete:
(delete-region start (match-beginning 0))
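The search, delete, and insert steps can be tried on a scratch buffer with delimiters of any length. This is a sketch only: the function name delimiter-demo-replace and the "<<" and ">>" delimiters are hypothetical, and the replacement text is passed in directly so the result does not depend on the clock.

```elisp
;; Hypothetical helper mirroring the loop in update-writestamps.
(defun delimiter-demo-replace (prefix suffix new-text)
  "Replace whatever lies between PREFIX and SUFFIX with NEW-TEXT.
Works on every such pair in the current buffer."
  (goto-char (point-min))
  (while (search-forward prefix nil t)
    (let ((start (point)))                      ; just after the prefix
      (search-forward suffix)                   ; error if there is no suffix
      (delete-region start (match-beginning 0)) ; up to where the suffix began
      (goto-char start)
      (insert new-text))))

(with-temp-buffer
  (insert "The end. <<old>> and <<older>>")
  (delimiter-demo-replace "<<" ">>" "new")
  (buffer-string))   ; → "The end. <<new>> and <<new>>"
```

Because the end of the region comes from (match-beginning 0), nothing in the helper needs to know how long the suffix is.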
Suppose the user chooses "Written: " as the writestamp-prefix and "." as the writestamp-suffix, so that writestamps appear like so: “Written: 19 Aug 1996.” This is a perfectly reasonable preference, but the string "Written: " is less likely than "WRITESTAMP((" to be completely unique. In other words, the file may contain occurrences of "Written: " that aren’t writestamps. When update-writestamps searches for writestamp-prefix, it might find one of these occurrences, then search for the next occurrence of a period and delete everything in between. Worse, this unwanted deletion takes place almost undetectably, just as the file is being saved, with the cursor location and other appearances preserved.
One way to solve this problem is to impose tighter constraints on how the writestamp may appear, making mismatches less likely. One natural restriction might be to require writestamps to appear alone on a line: in other words, a string is a writestamp only if writestamp-prefix is the first thing on the line and writestamp-suffix is the last thing on the line.
Now it won’t suffice to use
(search-forward writestamp-prefix ...)
to find writestamps, because this search isn’t constrained to find matches only at the beginnings of lines.
This is where regular expressions come in handy. A regular expression (called a regexp or regex for short) is a search pattern just like the first argument to search-forward. Unlike a normal search pattern, regular expressions have certain syntactic rules that allow more powerful kinds of searches. For example, in the regular expression '^Written: ', the caret (^) is a special character that means, “this pattern must match at the beginning of a line.” The remaining characters in the regexp '^Written: ' don’t have any special meaning in regexp syntax, so they match the same way ordinary search patterns do. Special characters are sometimes called metacharacters or (more poetically) magic.
Many UNIX programs use regular expressions, among them sed, grep, awk, and perl. The syntax of regular expressions tends to vary slightly from one application to another, unfortunately; but in all cases, most characters are non-“magic” (particularly letters and numbers) and can be used to search for occurrences of themselves; and longer regexps can be built up from shorter ones simply by stringing them together. Here is the syntax of regular expressions in Emacs.
A set of characters inside square brackets matches any one of the enclosed characters. So [aeiou] matches any occurrence of a or e or i or o or u. There are some exceptions to this rule: the syntax of square brackets in regular expressions has its own “subsyntax,” as follows:
A range of consecutive characters, such as abcd, can be abbreviated a-d. Any number of such ranges can be included, and ranges can be intermixed with single characters. So [a-dmx-z] matches any a, b, c, d, m, x, y, or z.
To include a right-square-bracket, it must be the first character in the set. So []a] matches ] or a. Also, [^]a] matches any character except ] and a.
To include a hyphen, it must appear where it can’t be interpreted as part of a range; for example, as the first or last character in the set, or following the end of a range. So [a-e-z] matches a, b, c, d, e, -, or z.
To include a caret, it must appear someplace other than as the first character in the set.
Other characters that are normally “magic” in regexps, such as * and ., are not magic inside square brackets.
x*: An asterisk, matching zero or more occurrences of x
x+: A plus sign, matching one or more occurrences of x
x?: A question mark, matching zero or one occurrence of x
Here x is any regular expression. So a* matches a, aa, aaa, and even an empty string (zero as); a+ matches a, aa, aaa, but not an empty string; and a? matches an empty string and a. Note that x+ is equivalent to xx*.
The regexp ^x matches whatever x matches, but only at the beginning of a line.
The regexp x$ matches whatever x matches, but only at the end of a line.
This means that ^x$ matches a line containing nothing but a match for x. In this case, you could leave out x altogether; ^$ matches a line containing no characters.
Two regular expressions x and y separated by \| match whatever x matches or whatever y matches. So ab\|cd matches ab or cd.
A regular expression x enclosed in escaped parentheses, \( and \), matches whatever x matches. This can be used for grouping complicated expressions. So \(ab\)+ matches ab, abab, ababab, and so on. Also, \(ab\|cd\)ef matches abef or cdef.
As a side effect, any text matched by a parenthesized subexpression is called a submatch and is memorized in a numbered register. Submatches are numbered from 1 through 9 by counting occurrences of \( in a regexp from left to right. So if the regexp ab\(cd*e\) matches the text abcddde, then the one and only submatch is the string cddde. If the regexp ab\(cd\|ef\(g+h\)\)j\(k*\) matches the text abefgghjkk, then the first submatch is efggh, the second submatch is ggh, and the third submatch is kk.
Backslash followed by a digit n matches the same text matched by the nth parenthesized subexpression from earlier in the same regexp. So the expression \(a+b\)\1 matches abab, aabaab, and aaabaaab, but not abaab (because ab isn’t the same as aab).
The empty string can be matched in a wide variety of ways.
\` matches the empty string that’s at the beginning of the buffer. So \`hello matches the string hello at the beginning of the buffer, but no other occurrence of hello.
\' matches the empty string that’s at the end of the buffer.
\= matches the empty string that’s at the current location of point.
\b matches the empty string that’s at the beginning or end of a word. So \bgnu\b matches the word “gnu” but not the occurrence of “gnu” inside the word “interregnum”.
\< matches the empty string at the beginning of a word only.
\> matches the empty string at the end of a word only.
As you can see, regular expression syntax uses backslashes for many purposes. So does Emacs Lisp string syntax. Since regexps are written as Lisp strings when programming Emacs, the two sets of rules for using backslashes can cause some confusing results. For example, the regexp ab\|cd, when expressed as a Lisp string, must be written as "ab\\|cd". Even stranger is when you want to match a single \ using the regexp \\ : you must write the string "\\\\". Emacs commands that prompt for regular expressions (such as keep-lines) allow you to type them as regular expressions (not Lisp strings) when used interactively.
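Several of the rules above can be tried without a buffer by matching against Lisp strings. This is a sketch: it assumes the standard functions string-match and match-string, which this chapter has not introduced, and the helper name submatch-demo is hypothetical. Note the doubled backslashes required by Lisp string syntax.

```elisp
;; Hypothetical helper for exploring submatches.
(defun submatch-demo (regexp text)
  "Return submatches 1 through 3 of REGEXP in TEXT, or nil if no match."
  (when (string-match regexp text)
    (list (match-string 1 text)
          (match-string 2 text)
          (match-string 3 text))))

;; The three-submatch example from the text.
(submatch-demo "ab\\(cd\\|ef\\(g+h\\)\\)j\\(k*\\)" "abefgghjkk")
;; → ("efggh" "ggh" "kk")

;; A backreference; \` and \' anchor the match to the whole string.
(string-match "\\`\\(a+b\\)\\1\\'" "aabaab")   ; → 0 (a match at index 0)
(string-match "\\`\\(a+b\\)\\1\\'" "abaab")    ; → nil

;; Word boundaries.
(string-match "\\bgnu\\b" "the gnu project")   ; → 4
(string-match "\\bgnu\\b" "interregnum")       ; → nil
```

When matching against a string, \` and \' refer to the beginning and end of the string rather than of a buffer, and string-match returns the index (counting from 0) where the match starts.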
Now that we know how to assemble regular expressions, it might seem obvious that the way to search for writestamp-prefix at the beginning of a line is to prepend a caret onto writestamp-prefix and append a dollar sign onto writestamp-suffix, like so:
(re-search-forward (concat "^" writestamp-prefix) ...)   ;wrong!
(re-search-forward (concat writestamp-suffix "$") ...)   ;wrong!
The function concat concatenates its string arguments into a single string. The function re-search-forward is the regular expression version of search-forward.
This is almost right. However, it contains a common and subtle error: either writestamp-prefix or writestamp-suffix may contain “magic” characters. In fact, writestamp-suffix does, in our example: it’s ".". Since . matches any character (except newline), this expression:
(re-search-forward (concat writestamp-suffix "$") ...)
which is equivalent to this expression:
(re-search-forward ".$" ...)
matches any character at the end of a line, whereas we only want to match a period (.).
When building up a regular expression as in this example, using pieces such as writestamp-prefix whose content is beyond the programmer’s control, it is necessary to “remove the magic” from strings that are meant to be taken literally. Emacs provides a function for this purpose called regexp-quote, which understands regexp syntax and can turn a possibly-magic string into the corresponding non-magic one. For example, (regexp-quote ".") yields "\\." as a string. You should always use regexp-quote to remove the magic from variable strings that are used to build up regular expressions.
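The difference that regexp-quote makes can be seen by matching against strings. In this sketch the two helper names are hypothetical, and string-match (not introduced in this chapter) is used in place of a buffer search.

```elisp
;; Hypothetical helpers contrasting the unquoted and quoted approaches.
(defun suffix-demo-unquoted-p (suffix text)
  "Non-nil if TEXT ends with a match for SUFFIX treated as a regexp."
  (string-match (concat suffix "$") text))

(defun suffix-demo-quoted-p (suffix text)
  "Non-nil if TEXT ends with SUFFIX taken literally."
  (string-match (concat (regexp-quote suffix) "$") text))

(regexp-quote ".")                                    ; → "\\."
(suffix-demo-unquoted-p "." "no period here")         ; non-nil: wrong!
(suffix-demo-quoted-p "." "no period here")           ; → nil
(suffix-demo-quoted-p "." "Written: 19 Aug 1996.")    ; non-nil
```

The unquoted version accepts any final character, which is exactly the error described above.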
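We now know how to begin a new version of update-writestamps. Its main loop starts like this:
(while (re-search-forward (concat "^" (regexp-quote writestamp-prefix))
                          nil t)
  ...)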
Let’s finish our new version of update-writestamps by filling in the body of the while loop. Just after re-search-forward succeeds, we need to know whether the current line ends with writestamp-suffix. But we can’t simply write
(re-search-forward (concat (regexp-quote writestamp-suffix) "$"))
because that could find a match several lines away. We’re only interested in knowing whether the match is on the current line.
One solution is to limit the search to the current line. The optional second argument to re-search-forward, if non-nil, is a buffer position beyond which the search may not go. If we plug in the buffer position corresponding to the end of the current line like so:
(re-search-forward (concat (regexp-quote writestamp-suffix) "$") end-of-line-position)
then the search is limited to the current line, and we’ll have the answer we need. So how do we come up with end-of-line-position? We simply put the cursor at the end of the current line using end-of-line, then query the value of point. But after we do that and before re-search-forward begins, we must make sure to return the cursor to its original location since the search must start from there. Moving the cursor then restoring it is exactly what save-excursion is designed to do. So we could write:
(let ((end-of-line-position (save-excursion
                              (end-of-line)
                              (point))))
  (re-search-forward (concat (regexp-quote writestamp-suffix) "$")
                     end-of-line-position))
which creates a temporary variable, end-of-line-position, that is used to limit re-search-forward; but it’s simpler not to use a temporary variable if we don’t really need it:
(re-search-forward (concat (regexp-quote writestamp-suffix) "$")
                   (save-excursion
                     (end-of-line)
                     (point)))
Observe that the value of the save-excursion expression is, like so many other Lisp constructs, the value of its last subexpression, (point).
So update-writestamps can be written like this:
(defun update-writestamps ()
  "Find writestamps and replace them with the current time."
  (save-excursion
    (save-restriction
      (save-match-data
        (widen)
        (goto-char (point-min))
        (while (re-search-forward
                (concat "^" (regexp-quote writestamp-prefix))
                nil t)
          (let ((start (point)))
            (if (re-search-forward
                 (concat (regexp-quote writestamp-suffix) "$")
                 (save-excursion (end-of-line) (point))
                 t)
                (progn
                  (delete-region start (match-beginning 0))
                  (goto-char start)
                  (insert (format-time-string writestamp-format
                                              (current-time))))))))))
  nil)
Notice that both calls to re-search-forward have t as the optional third argument, meaning “if the search fails, return nil (as opposed to signaling an error).”
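The effect of bounding the search at the end of the current line can be checked in a scratch buffer. This is a sketch; the function name line-suffix-demo-p is hypothetical, and the outer save-excursion is added only so that the check itself leaves point where it was.

```elisp
;; Hypothetical predicate built from the bounded search shown above.
(defun line-suffix-demo-p (suffix)
  "Non-nil if the current line, from point onward, ends with SUFFIX."
  (save-excursion
    (re-search-forward (concat (regexp-quote suffix) "$")
                       (save-excursion (end-of-line) (point)) ; search bound
                       t)))                                   ; nil on failure

(with-temp-buffer
  (insert "Written: today\nSecond line.\n")
  (goto-char (point-min))
  (list (line-suffix-demo-p ".")        ; line 1 has no final period
        (progn (forward-line 1)
               (line-suffix-demo-p ".")))) ; line 2 does
```

The first element of the result should be nil even though a period at the end of a line exists further down the buffer, because the search may not go past the end of line 1.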
More Regexp Power
We have created a more or less straightforward translation of update-writestamps from its original form to use regular expressions, but it doesn’t really exploit the power of regexps. In particular, the entire sequence of finding a writestamp prefix, checking for a matching writestamp suffix on the same line, and replacing the text in between can be reduced to just these two expressions:
(re-search-forward (concat "^"
                           (regexp-quote writestamp-prefix)
                           "\\(.*\\)"
                           (regexp-quote writestamp-suffix)
                           "$"))
(replace-match (format-time-string writestamp-format
                                   (current-time))
               t t nil 1)
The first expression, the call to re-search-forward, constructs a regexp that looks like this:
^prefix\(.*\)suffix$
where prefix and suffix are regexp-quoted versions of writestamp-prefix and writestamp-suffix. This regexp matches one entire line, beginning with the writestamp prefix, followed by any string (which is made a submatch by the use of \(…\)), and ending with the writestamp suffix.
The second expression is a call to replace-match, which replaces some or all of the matched text from a previous search. It’s used like this:
(replace-match new-string preserve-case literal base-string subexpression)
The first argument is the new string to insert, which in this example is the result of format-time-string. The remaining arguments, which are all optional, have the following meanings:
preserve-case: We set this to t, which tells replace-match to preserve alphabetic case in new-string. If it’s nil, replace-match tries to intelligently match the case of the text being replaced.
literal: We set this to t, which means “treat new-string literally.” If it’s nil, replace-match interprets new-string according to some special syntax rules (for which see describe-function on replace-match).
base-string: We set this to nil, which means “Modify the current buffer.” If this were a string, then replace-match would perform the replacement in the string instead of in a buffer.
subexpression: We set this to 1, which means “Replace submatch 1, not the entire matched string” (which would include the prefix and the suffix).
So after finding the writestamp with re-search-forward and “submatching” the text between the delimiters, our call to replace-match snips out the text between the delimiters and inserts a fresh new string formatted according to writestamp-format.
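Here is the same search-and-replace pair applied to a one-line scratch buffer. It is a sketch with the delimiters hardwired to "Written: " and "." and with a fixed replacement string so the result is predictable; the function name submatch-replace-demo is hypothetical.

```elisp
;; Hypothetical demo of replacing only submatch 1.
(defun submatch-replace-demo (new-text)
  "Replace the text between \"Written: \" and the final period with NEW-TEXT.
Return the resulting buffer contents."
  (with-temp-buffer
    (insert "Written: old date.\n")
    (goto-char (point-min))
    (when (re-search-forward "^Written: \\(.*\\)\\.$" nil t)
      ;; t t: keep NEW-TEXT's case and take it literally;
      ;; nil: operate on the buffer; 1: replace only submatch 1.
      (replace-match new-text t t nil 1))
    (buffer-string)))

(submatch-replace-demo "19 Aug 1996")   ; → "Written: 19 Aug 1996.\n"
```

The prefix and suffix survive because only the parenthesized part of the match is replaced.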
As a final improvement to
update-writestamps, we can observe that if we write
(while (re-search-forward (concat ...) ...) (replace-match ...))
concat function is called each time through the loop, constructing a new string each time even though its arguments never change. This is inefficient. It would be better to compute the desired string once, before the loop, and store it in a temporary variable. The best way to write
update-writestamps is therefore:
(defun update-writestamps ()
  "Find writestamps and replace them with the current time."
  (save-excursion
    (save-restriction
      (save-match-data
        (widen)
        (goto-char (point-min))
        (let ((regexp (concat "^"
                              (regexp-quote writestamp-prefix)
                              "\\(.*\\)"
                              (regexp-quote writestamp-suffix)
                              "$")))
          (while (re-search-forward regexp nil t)
            (replace-match (format-time-string writestamp-format
                                               (current-time))
                           t t nil 1))))))
  nil)
Well, timestamps were marginally useful, and writestamps were somewhat more so, but modifystamps may be even better. A modifystamp is a writestamp that records the time the file was last modified, which may not be the same as the last time it was saved to disk. For instance, if you visit a file and save it under a new name without making any changes to it, you shouldn’t cause the modifystamp to change.
Simple Approach #1
Emacs has a hook variable called first-change-hook. Whenever a buffer is changed for the first time since it was last saved, the functions in first-change-hook get executed. Implementing modifystamps by using this hook merely entails moving our old update-writestamps function from local-write-file-hooks to first-change-hook. Of course, we’ll also want to change its name to update-modifystamps, and introduce new variables (modifystamp-format, modifystamp-prefix, and modifystamp-suffix) that work like their writestamp counterparts without overloading the writestamp variables. Then update-modifystamps should be changed to use the new variables.
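Before any of this happens, first-change-hook, which is normally global, should be made buffer-local. If we add update-modifystamps to first-change-hook while it is still global, update-modifystamps will be called whenever any buffer is changed for the first time since it was last saved. Making it buffer-local in the current buffer causes changes to the variable to be invisible outside that buffer. Other buffers continue to use the default global value.
Although ordinary variables are made buffer-local with either make-local-variable or make-variable-buffer-local (see below), hook variables must be made buffer-local with make-local-hook. (In Emacs 21.1 and later, make-local-hook is obsolete: passing a non-nil final argument to add-hook, as in the code that follows, makes the hook buffer-local by itself.)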
(defvar modifystamp-format "%C"
  "*Format for modifystamps (c.f. 'format-time-string').")

(defvar modifystamp-prefix "MODIFYSTAMP(("
  "*String identifying start of modifystamp.")

(defvar modifystamp-suffix "))"
  "*String that terminates a modifystamp.")

(defun update-modifystamps ()
  "Find modifystamps and replace them with the current time."
  (save-excursion
    (save-restriction
      (save-match-data
        (widen)
        (goto-char (point-min))
        (let ((regexp (concat "^"
                              (regexp-quote modifystamp-prefix)
                              "\\(.*\\)"
                              (regexp-quote modifystamp-suffix)
                              "$")))
          (while (re-search-forward regexp nil t)
            (replace-match (format-time-string modifystamp-format
                                               (current-time))
                           t t nil 1))))))
  nil)

(add-hook 'first-change-hook 'update-modifystamps nil t)
The nil argument to add-hook is just a place holder. We care only about the last argument, t, which means “change only the buffer-local copy of first-change-hook.”
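A quick way to convince yourself that the local argument confines the change to one buffer is the following sketch. The function demo-local-hook and the buffer names are invented for this example, and the built-in function ignore stands in for a real hook function.

```elisp
;; Hypothetical demonstration function, not part of the chapter's code.
(defun demo-local-hook ()
  "Show that a local `add-hook' affects only the buffer it was made in."
  (let ((buf-a (generate-new-buffer "demo-a"))
        (buf-b (generate-new-buffer "demo-b")))
    (unwind-protect
        (progn
          ;; Add a do-nothing function to the hook in buffer A only.
          (with-current-buffer buf-a
            (add-hook 'first-change-hook 'ignore nil t))
          (list
           ;; Buffer A now has its own copy of the hook variable...
           (with-current-buffer buf-a
             (local-variable-p 'first-change-hook))
           ;; ...buffer B does not...
           (with-current-buffer buf-b
             (local-variable-p 'first-change-hook))
           ;; ...and the global value was not touched.
           (memq 'ignore (default-value 'first-change-hook))))
      (kill-buffer buf-a)
      (kill-buffer buf-b))))

(demo-local-hook)   ; => (t nil nil)
```

The result shown assumes that nothing else in your Emacs session has already added ignore to the global value of first-change-hook.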
The problem with this approach is that if you make ten changes to the file before saving it, the modifystamps will contain the time of the first change, not the last change. Close enough for some purposes, but we can do better.
Simple Approach #2
This time we’ll go back to using
local-write-file-hooks, but we’ll call
update-modifystamps from it only if
buffer-modified-p returns true, which tells us that the current buffer has been modified since it was last saved:
(defun maybe-update-modifystamps ()
  "Call 'update-modifystamps' if the buffer has been modified."
  (if (buffer-modified-p)
      (update-modifystamps)))

(add-hook 'local-write-file-hooks 'maybe-update-modifystamps)
Now we have the opposite problem from simple approach #1: the last-modified time isn’t computed until the file is saved, which may be much later than the actual time of the last modification. If you make a change to the file at 2:00 and save at 3:00, the modifystamps will record 3:00 as the last-modified time. This is a closer approximation, but it’s still not perfect.
Theoretically, we could call
update-modifystamps after every change to the buffer, but in practice it’s prohibitively expensive to scan through the whole file and rewrite parts of it after every keystroke. But it’s not too expensive to memorize the current time after each buffer change. Then, when the buffer is saved to a file, the memorized time can be used for computing the time in the modifystamps.
The hook variable after-change-functions contains functions to call after each buffer change. First let’s make it buffer-local:
(make-local-hook 'after-change-functions)
Now we define a buffer-local variable to hold this buffer’s latest modification time:
(defvar last-change-time nil
  "Time of last buffer modification.")
(make-variable-buffer-local 'last-change-time)
make-variable-buffer-local causes the named variable to have a separate, buffer-local value in every buffer. This is subtly different from
make-local-variable, which makes a variable have a buffer-local value in the current buffer while allowing other buffers to share the same global value. In this case, we use
make-variable-buffer-local because there is no meaningful global value of
last-change-time for other buffers to share.
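The difference between the two functions can be sketched with a pair of throwaway variables. The names demo-auto-local, demo-shared, and demo-locality are hypothetical and exist only for this illustration.

```elisp
;; Hypothetical variables, for illustration only.
(defvar demo-auto-local nil
  "Becomes buffer-local automatically in any buffer where it is set.")
(make-variable-buffer-local 'demo-auto-local)

(defvar demo-shared 'global
  "Shared by all buffers except those that make it local explicitly.")

(defun demo-locality ()
  "Return values seen in a buffer that sets both variables, and in another."
  (with-temp-buffer
    ;; Setting this variable makes it local here without further ado.
    (setq demo-auto-local 'first)
    ;; This one must be made local explicitly, in this buffer only.
    (make-local-variable 'demo-shared)
    (setq demo-shared 'mine)
    (list (list demo-auto-local demo-shared)
          ;; A second buffer sees only the default values.
          (with-temp-buffer
            (list demo-auto-local demo-shared)))))

(demo-locality)   ; => ((first mine) (nil global))
```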
Next we add a function called remember-change-time, which we’ll define in a moment, to the buffer-local after-change-functions:
(add-hook 'after-change-functions 'remember-change-time nil t)
Functions in after-change-functions are passed three arguments describing the change that just took place (see the section called Mode Meat in Chapter 7). But remember-change-time doesn’t care what the change was, only that there was a change. So we’ll allow remember-change-time to take arguments, but we’ll ignore them.
(defun remember-change-time (&rest unused)
  "Store the current time in 'last-change-time'."
  (setq last-change-time (current-time)))
&rest, followed by a parameter name, must appear last in a function’s parameter list. It means “collect up any remaining arguments into a list and assign it to the last parameter” (unused in this case). The function may have other parameters, including &optional ones, but these must precede the &rest parameter. After all the other parameters are assigned in the normal fashion, the &rest parameter gets a list of whatever’s left. So if a function is defined as
(defun foo (a b &rest c) ...)
and is called with (foo 1 2 3 4), then a will be 1, b will be 2, and c will be the list (3 4).
In some situations, &rest is very useful, even necessary; but right now we’re only using it out of laziness (or economy, if you prefer), to avoid having to name three separate parameters that we don’t plan to use.
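If you’d like to try this yourself, a function that simply returns its parameters makes the behavior visible. The name demo-rest is hypothetical.

```elisp
;; Hypothetical function, for illustration only: return the
;; parameters so we can see what each one received.
(defun demo-rest (a b &rest c)
  "Return a list of A, B, and the list of remaining arguments C."
  (list a b c))

(demo-rest 1 2 3 4)   ; => (1 2 (3 4))
(demo-rest 1 2)       ; => (1 2 nil)
```

When there are no leftover arguments, the &rest parameter is simply nil, the empty list.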
Now we must revise update-modifystamps: it must use the time stored in last-change-time instead of using (current-time). For efficiency, it should also reset last-change-time to nil when it is done, so if the file is subsequently saved without being modified, we can avoid the overhead of calling update-modifystamps.
(defun update-modifystamps ()
  "Find modifystamps and replace them with the saved time."
  (save-excursion
    (save-restriction
      (save-match-data
        (widen)
        (goto-char (point-min))
        (let ((regexp (concat "^"
                              (regexp-quote modifystamp-prefix)
                              "\\(.*\\)"
                              (regexp-quote modifystamp-suffix)
                              "$")))
          (while (re-search-forward regexp nil t)
            (replace-match (format-time-string modifystamp-format
                                               last-change-time)
                           t t nil 1))))))
  (setq last-change-time nil)
  nil)
Finally, we wish not to call update-modifystamps when last-change-time is nil:
(defun maybe-update-modifystamps ()
  "Call 'update-modifystamps' if the buffer has been modified."
  (if last-change-time   ; instead of testing (buffer-modified-p)
      (update-modifystamps)))
There’s still one important thing missing from
maybe-update-modifystamps. Before reading ahead to the next section, can you figure out what it is?
A Subtle Bug
The problem is that every time a modifystamp gets rewritten by
update-modifystamps, the buffer changes, causing
last-change-time to change! Only the first modifystamp will be correctly rewritten. Subsequent ones will contain a time much closer to when the file was saved than when the last modification was made.
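The root of the trouble, that rewriting a stamp is itself a buffer change, can be checked with a short experiment. In this sketch the names demo-change-count, demo-count-change, and demo-rewrites-trigger-hook-p are hypothetical, as is the STAMP(( … )) delimiter pair.

```elisp
;; Hypothetical names throughout; this is an experiment, not part
;; of the modifystamp code.
(defvar demo-change-count 0
  "Number of buffer changes seen by `demo-count-change'.")

(defun demo-count-change (&rest unused)
  "Count one buffer change, ignoring the details of the change."
  (setq demo-change-count (1+ demo-change-count)))

(defun demo-rewrites-trigger-hook-p ()
  "Return t if rewriting stamps runs `after-change-functions'."
  (with-temp-buffer
    (insert "STAMP((a))\nSTAMP((b))\n")
    (add-hook 'after-change-functions 'demo-count-change nil t)
    ;; Start counting only now, so that the insertion above
    ;; doesn't contribute.
    (setq demo-change-count 0)
    (goto-char (point-min))
    (while (re-search-forward "^STAMP((\\(.*\\)))$" nil t)
      (replace-match "x" t t nil 1))
    (> demo-change-count 0)))

(demo-rewrites-trigger-hook-p)   ; => t
```

Every call to replace-match in the loop runs the functions in after-change-functions, which is exactly how remember-change-time ends up overwriting last-change-time in the middle of update-modifystamps.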
One way around this problem is to temporarily set the value of after-change-functions to nil while executing update-modifystamps as shown below.
(add-hook 'local-write-file-hooks
          '(lambda ()
             (if last-change-time
                 (let ((after-change-functions nil))
                   (update-modifystamps)))))
This use of
let creates a temporary variable,
after-change-functions, that supersedes the global
after-change-functions during the call to
update-modifystamps in the body of the
let. After the
let exits, the temporary
after-change-functions disappears and the global one is again in effect.
This solution has a drawback: if there are other functions in
after-change-functions, they’ll also be disabled during the call to
update-modifystamps, though you might not intend for them to be.
A better solution would be to “capture” the value of last-change-time before any modifystamps are updated. That way, when updating the first modifystamp causes last-change-time to change, the new value of last-change-time won’t affect any remaining modifystamps because update-modifystamps won’t be referring to last-change-time.
The simplest way to “capture” the value of last-change-time is to pass it as an argument to update-modifystamps:
(add-hook 'local-write-file-hooks
          '(lambda ()
             (if last-change-time
                 (update-modifystamps last-change-time))))
This requires changing update-modifystamps to take one argument and use it in the call to format-time-string:
(defun update-modifystamps (time)
  "Find modifystamps and replace them with the given time."
  (save-excursion
    (save-restriction
      (save-match-data
        (widen)
        (goto-char (point-min))
        (let ((regexp (concat "^"
                              (regexp-quote modifystamp-prefix)
                              "\\(.*\\)"
                              (regexp-quote modifystamp-suffix)
                              "$")))
          (while (re-search-forward regexp nil t)
            (replace-match (format-time-string modifystamp-format
                                               time)
                           t t nil 1))))))
  (setq last-change-time nil)
  nil)
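To watch the difference that capturing makes, here is a scaled-down sketch that replaces the real clock with a counter, so that each buffer change is guaranteed to produce a new “time.” Everything in it (demo-clock, demo-last-change, demo-remember-change, demo-stamp-buffer, demo-run, and the STAMP(( … )) delimiters) is hypothetical and stands in for the real modifystamp machinery.

```elisp
;; Hypothetical stand-ins for the real clock and `last-change-time'.
(defvar demo-clock 0
  "Fake clock that ticks once per buffer change.")

(defvar demo-last-change nil
  "Value of `demo-clock' at the latest change to this buffer.")
(make-variable-buffer-local 'demo-last-change)

(defun demo-remember-change (&rest unused)
  "Advance the fake clock and remember its value for this buffer."
  (setq demo-clock (1+ demo-clock))
  (setq demo-last-change demo-clock))

(defun demo-stamp-buffer (captured)
  "Rewrite every stamp in the current buffer.
If CAPTURED is non-nil, use it for every stamp.  Otherwise reread
`demo-last-change' for each stamp, as the buggy version does."
  (goto-char (point-min))
  (while (re-search-forward "^STAMP((\\(.*\\)))$" nil t)
    (replace-match (number-to-string (or captured demo-last-change))
                   t t nil 1)))

(defun demo-run (capture)
  "Stamp a scratch buffer, capturing the change time if CAPTURE is non-nil."
  (with-temp-buffer
    (add-hook 'after-change-functions 'demo-remember-change nil t)
    (insert "STAMP((a))\nSTAMP((b))\n")
    (demo-stamp-buffer (if capture demo-last-change))
    (buffer-string)))

(demo-run t)     ; both stamps contain the same number
(demo-run nil)   ; the second stamp contains a larger number
```

Because the argument is evaluated once, before the loop starts, later changes to demo-last-change can’t leak into the remaining stamps.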
You might be thinking that setting up a buffer to use modifystamps involves evaluating a lot of expressions and setting up a lot of variables, and that it seems hard to keep track of what’s needed to make modifystamps work. If so, you’re right. So in the next chapter, we’ll look at how you can encapsulate a collection of related functions and variables in a Lisp file.
 How do you find this out in the first place? Using M-x apropos RET time RET, of course.
 Emacs 20.1, which was not yet released when this book went to press, will introduce a major new system for editing user options called “customize.” Hooking user options into the “customize” system requires using special functions called defgroup and defcustom.
 Inserting writestamps is similar to inserting the date or the time. A function for doing so is left as an exercise for the reader.
 The * regular expression operator is known among computer scientists as a “Kleene closure.”