Appendix B. The Outer Limits
As you’ve seen, GNU make
can do some pretty incredible things, but I haven’t seen very much that really pushes the limits of make
3.80 with its eval
construct. In this exercise, we’ll see if we can stretch it further than usual.
Data Structures
One of the limitations of make
that occasionally chaffs when writing complex makefiles is make
’s lack of data structures. In a very limited way, you can simulate a data structure by defining variables with embedded periods (or even ->
if you can stand it):
file.path = /foo/bar file.type = unix file.host = oscar
If pressed, you can even “pass” this file
structure to a function by using computed variables:
define remote-file $(if $(filter unix,$($1.type)), \ /net/$($1.host)/$($1.path), \ //$($1.host)/$($1.path)) endef
Nevertheless, this seems an unsatisfying solution for several reasons:
You cannot easily allocate a instance of this “structure.” Creating a new instance involves selecting a new variable name and assigning each element. This also means these pseudo-instances are not guaranteed to have the same fields (called slots).
The structure exists only in the user’s mind, and as a set of different
make
variables, rather than as a unified entity with its own name. And because the structure has no name, it is difficult to create a reference (or pointer) to a structure, so passing them as arguments or storing one in a variable is clumsy.There is no safe way to access a slot of the structure. Any typographical error in either part of the variable name yields the wrong value (or no value) with no warning from
make
.
But the remote-file
function hints at a more comprehensive solution. Suppose we implement structure instances using computed variables. Early Lisp object systems (and even some today) used similar techniques. A structure, say file-info
, can have instances represented by a symbolic name, such as file_info_1
.
Another instance might be called file_info_2
. The slots of this structure can be represented by computed variables:
file_info_1_path file_info_1_type file_info_1_host
Since the instance has a symbolic name, it can be saved in one or more variables (as usual, using recursive or simple variables is the choice of the programmer):
before_foo = file_info_1 another_foo = $(before_foo)
Elements of a file-info
can be accessed using Lisp-like getters and setters:
path := $(call get-value,before_foo,path) $(call set-value,before_foo,path,/usr/tmp/bar)
We can go further than this by creating a template for the file-info
structure to allow the convenient allocation of new instances:
orig_foo := $(call new,file-info) $(call set-value,orig_foo,path,/foo/bar) tmp_foo := $(call new,file-info) $(call set-value,tmp_foo,path,/tmp/bar)
Now, two distinct instances of file-info
exist. As a final convenience, we can add the concept of default values for slots. To declare the file-info
structure, we might use:
$(call defstruct,file-info, \ $(call defslot,path,), \ $(call defslot,type,unix), \ $(call defslot,host,oscar))
Here, the first argument to the defstruct
function is the structure name, followed by a list of defslot
calls. Each defslot
contains a single (name
, default value
) pair. Example B-1 shows the implementation of defstruct
and its supporting code.
# $(next-id) - return a unique number next_id_counter := define next-id $(words $(next_id_counter))$(eval next_id_counter += 1) endef # all_structs - a list of the defined structure names all_structs := value_sep := XxSepxX # $(call defstruct, struct_name, $(call defslot, slot_name, value), ...) define defstruct $(eval all_structs += $1) \ $(eval $1_def_slotnames :=) \ $(foreach v, $2 $3 $4 $5 $6 $7 $8 $9 $(10) $(11), \ $(if $($v_name), \ $(eval $1_def_slotnames += $($v_name)) \ $(eval $1_def_$($v_name)_default := $($v_value)))) endef # $(call defslot,slot_name,slot_value) define defslot $(eval tmp_id := $(next_id)) $(eval $1_$(tmp_id)_name := $1) $(eval $1_$(tmp_id)_value := $2) $1_$(tmp_id) endef # all_instances - a list of all the instances of any structure all_instances := # $(call new, struct_name) define new $(strip \ $(if $(filter $1,$(all_structs)),, \ $(error new on unknown struct '$(strip $1)')) \ $(eval instance := $1@$(next-id)) \ $(eval all_instances += $(instance)) \ $(foreach v, $($(strip $1)_def_slotnames), \ $(eval $(instance)_$v := $($(strip $1)_def_$v_default))) \ $(instance)) endef # $(call delete, variable) define delete $(strip \ $(if $(filter $($(strip $1)),$(all_instances)),, \ $(error Invalid instance '$($(strip $1))')) \ $(eval all_instances := $(filter-out $($(strip $1)),$(all_instances))) \ $(foreach v, $($(strip $1)_def_slotnames), \ $(eval $(instance)_$v := ))) endef # $(call struct-name, instance_id) define struct-name $(firstword $(subst @, ,$($(strip $1)))) endef # $(call check-params, instance_id, slot_name) define check-params $(if $(filter $($(strip $1)),$(all_instances)),, \ $(error Invalid instance '$(strip $1)')) \ $(if $(filter $2,$($(call struct-name,$1)_def_slotnames)),, \ $(error Instance '$($(strip $1))' does not have slot '$(strip $2)')) endef # $(call get-value, instance_id, slot_name) define get-value $(strip \ $(call check-params,$1,$2) \ $($($(strip $1))_$(strip $2))) endef # $(call set-value, instance_id, slot_name, value) define set-value $(call check-params,$1,$2) \ $(eval $($(strip $1))_$(strip $2) := $3) endef # $(call dump-struct, struct_name) define dump-struct { $(strip $1)_def_slotnames "$($(strip $1)_def_slotnames)" \ $(foreach s, \ $($(strip $1)_def_slotnames),$(strip \ $(strip $1)_def_$s_default "$($(strip $1)_def_$s_default)")) } endef # $(call print-struct, struct_name) define print-struct { $(foreach s, \ $($(strip $1)_def_slotnames),$(strip \ { "$s" "$($(strip $1)_def_$s_default)" })) } endef # $(call dump-instance, instance_id) define dump-instance { $(eval tmp_name := $(call struct-name,$1)) \ $(foreach s, \ $($(tmp_name)_def_slotnames),$(strip \ { $($(strip $1))_$s "$($($(strip $1))_$s)" })) } endef # $(call print-instance, instance_id) define print-instance { $(foreach s, \ $($(call struct-name,$1)_def_slotnames),"$(strip \ $(call get-value,$1,$s))") } endef
Examining this code one clause at a time, you can see that it starts by defining the function next-id
. This is a simple counter:
# $(next-id) - return a unique number next_id_counter := define next-id $(words $(next_id_counter))$(eval next_id_counter += 1) endef
It is often said that you cannot perform arithmetic in make
, because the language is too limited. In general, this is true, but for limited cases like this you can often compute what you need. This function uses eval
to redefine the value of a simple variable. The function contains two expressions: the first expression returns the number of words in next_id_counter
; the second expression appends another word to the variable. It isn’t very efficient, but for numbers in the small thousands it is fine.
The next section defines the defstruct
function itself and creates the supporting data structures.
# all_structs - a list of the defined structure names all_structs := value_sep := XxSepxX # $(call defstruct, struct_name, $(call defslot, slot_name, value), ...) define defstruct $(eval all_structs += $1) \ $(eval $1_def_slotnames :=) \ $(foreach v, $2 $3 $4 $5 $6 $7 $8 $9 $(10) $(11), \ $(if $($v_name), \ $(eval $1_def_slotnames += $($v_name)) \ $(eval $1_def_$($v_name)_default := $($v_value)))) endef # $(call defslot,slot_name,slot_value) define defslot $(eval tmp_id := $(next_id)) $(eval $1_$(tmp_id)_name := $1) $(eval $1_$(tmp_id)_value := $2) $1_$(tmp_id) endef
The variable all_structs
is a list of all known structures defined with defstruct
. This list allows the new
function to perform type checking on the structures it allocates. For each structure, S
, the defstruct
function defines a set of variables:
S_def_slotnames S_def_slotn_default
The first variable defines the set of slots for a structure. The other variables define the default value for each slot. The first two lines of the defstruct
function append to all_structs
and initialize the slot names list, respectively. The remainder of the function iterates through the slots, building the slot list and saving the default value.
Each slot definition is handled by defslot
. The function allocates an id, saves the slot name and value in two variables, and returns the prefix. Returning the prefix allows the argument list of defstruct
to be a simple list of symbols, each of which provides access to a slot definition. If more attributes are added to slots later, incorporating them into defslot
is straightforward. This technique also allows default values to have a wider range of values (including spaces) than simpler, alternative implementations.
The foreach
loop in defstruct
determines the maximum number of allowable slots. This version allows for 10 slots. The body of the foreach
processes each argument by appending the slot name to S
_def_slotnames
and assigning the default value to a variable. For example, our file-info
structure would define:
file-info_def_slotnames := path type host file-info_def_path_default := file-info_def_type_default := unix file-info_def_host_default := oscar
This completes the definition of a structure.
Now that we can define structures, we need to be able to instantiate one. The new
function performs this operation:
# $(call new, struct_name) define new $(strip \ $(if $(filter $1,$(all_structs)),, \ $(error new on unknown struct '$(strip $1)')) \ $(eval instance := $1@$(next-id)) \ $(eval all_instances += $(instance)) \ $(foreach v, $($(strip $1)_def_slotnames), \ $(eval $(instance)_$v := $($(strip $1)_def_$v_default))) \ $(instance)) endef
The first if
in the function checks that the name refers to a known structure. If the structure isn’t found in all_structs
, an error is signaled. Next, we construct a unique id for the new instance by concatenating the structure name with a unique integer suffix. We use an at sign to separate the structure name from the suffix so we can easily separate the two later. The new
function then records the new instance name for type checking by accessors later. Then the slots of the structure are initialized with their default values. The initialization code is interesting:
$(foreach v, $($(strip $1)_def_slotnames), \ $(eval $(instance)_$v := $($(strip $1)_def_$v_default)))
The foreach
loop iterates over the slot names of the structure. Using strip
around on the structure name allows the user to add spaces after commas in the call
to new
. Recall that each slot is represented by concatenating the instance name and the slot name (for instance, file_info@1_path
). The righthand side is the default value computed from the structure name and slot name. Finally, the instance name is returned by the function.
Note that I call these constructs functions, but they are actually macros. That is, the symbol new
is recursively expanded to yield a new piece of text that is inserted into the makefile for reparsing. The reason the defstruct
macro does what we want is because all the work is eventually embedded within eval
calls, which collapse to nothing. Similarly, the new
macro performs its significant work within eval
calls. It can reasonably be termed a function, because the expansion of the macro conceptually yields a single value, the symbol representing the new instance.
Next, we need to be able to get and set values within our structures. To do this, we define two new functions:
# $(call get-value, instance_id, slot_name) define get-value $(strip \ $(call check-params,$1,$2) \ $($($(strip $1))_$(strip $2))) endef # $(call set-value, instance_id, slot_name, value) define set-value $(call check-params,$1,$2) \ $(eval $($(strip $1))_$(strip $2) := $3) endef
To get the value of a slot, we simply need to compute the slot variable name from the instance id and the slot name. We can improve safety by first checking that the instance and slot name are valid strings with the check-params
function. To allow more aesthetic formating and to ensure that extraneous spaces do not corrupt the slot value, we wrap most of these parameters in strip
calls.
The set function also checks parameters before setting the value. Again, we strip the two function arguments to allow users the freedom to add spaces in the argument list. Note that we do not strip the slot value, because the user might actually need the spaces.
# $(call check-params, instance_id, slot_name) define check-params $(if $(filter $($(strip $1)),$(all_instances)),, \ $(error Invalid instance '$(strip $1)')) \ $(if $(filter $2,$($(call struct-name,$1)_def_slotnames)),, \ $(error Instance '$($(strip $1))' does not have slot '$(strip $2)')) endef # $(call struct-name, instance_id) define struct-name $(firstword $(subst @, ,$($(strip $1)))) endef
The check-params
function simply checks that the instance id passed to the setter and getter functions is contained within the known instances list. Likewise, it checks that the slot name is contained within the list of slots belonging to this structure. The structure name is computed from the instance name by splitting the symbol on the @
and taking the first word. This means that structure names cannot contain an at sign.
To round out the implementation, we can add a couple of print and debugging functions. The following print function displays a simple user-readable representation of a structure definition and a structure instance, while the dump function displays the implementation details of the two constructs. See Example B-1 for the implementations.
Here’s an example defining and using our file-info
structure:
include defstruct.mk $(call defstruct, file-info, \ $(call defslot, path,), \ $(call defslot, type,unix), \ $(call defslot, host,oscar)) before := $(call new, file-info) $(call set-value, before, path,/etc/password) $(call set-value, before, host,wasatch) after := $(call new,file-info) $(call set-value, after, path,/etc/shadow) $(call set-value, after, host,wasatch) demo: # before = $(before) # before.path = $(call get-value, before, path) # before.type = $(call get-value, before, type) # before.host = $(call get-value, before, host) # print before = $(call print-instance, before) # dump before = $(call dump-instance, before) # # all_instances = $(all_instances) # all_structs = $(all_structs) # print file-info = $(call print-struct, file-info) # dump file-info = $(call dump-struct, file-info)
and the output:
$ make # before = file-info@0 # before.path = /etc/password # before.type = unix # before.host = wasatch # print before = { "/etc/password" "unix" "wasatch" } # dump before = { { file-info@0_path "/etc/password" } { file-info@0_type "unix" } { file-info@0_host "wasatch" } } # # all_instances = file-info@0 file-info@1 # all_structs = file-info # print file-info = { { "path" "" } { "type" "unix" } { "host" "oscar" } } # dump file-info = { file-info_def_slotnames " path type host" file-info_def_path_ default "" file-info_def_type_default "unix" file-info_def_host_default "oscar" }
Also note how illegal structure uses are trapped:
$ cat badstruct.mk include defstruct.mk $(call new, no-such-structure) $ make -f badstruct.mk badstruct.mk:2: *** new on unknown struct 'no-such-structure'. Stop. $ cat badslot.mk include defstruct.mk $(call defstruct, foo, defslot(size, 0)) bar := $(call new, foo) $(call set-value, bar, siz, 10) $ make -f badslot.mk badslot.mk:4: *** Instance 'foo@0' does not have slot 'siz'. Stop.
Of course, there are lots of improvements that can be made to the code, but the basic ideas are sound. Here is a list of possible enhancements:
Add a validation check to the slot assignment. This could be done with a hook function that must yield empty after the assignment has been performed. The hook could be used like this:
# $(call set-value, instance_id, slot_name, value) define set-value $(call check-params,$1,$2) \ $(if $(call $(strip $1)_$(strip $2)_hook, value), \ $(error set-value hook, $(strip $1)_$(strip $2)_hook, failed)) \ $(eval $($(strip $1))_$(strip $2) := $3) endef
Support for inheritance. A
defstruct
could accept anotherdefstruct
name as a superclass, duplicating all the superclass’s members in the subclass.Better support for structure references. With the current implementation, a slot can hold the ID of another structure, but accessing is awkward. A new version of the
get-value
function could be written to check for references (by looking fordefstruct
@number
), and perform automatic dereferencing.
Arithmetic
In the previous section, I noted that it is not possible to perform arithmetic in make
using only its native features. I then showed how you could implement a simple counter by appending words to a list and returning the length of the list. Soon after I discovered the increment trick, Michael Mounteney posted a cool trick for performing a limited form of addition on integers in make
.
His trick manipulates the number line to compute the sum of two integers of size one or greater. To see how this works, imagine the number line:
2 3 4 5 6 7 8 9 10 11 12 13 14 15
Now, notice that (if we could get the subscripts right), we could add, say 4 plus 5, by first taking a subset of the line from the fourth element to the end then selecting the fifth element of the subset. We can do this with native make
functions:
number_line = 2 3 4 5 6 7 8 9 10 11 12 13 14 15 plus = $(word $2, $(wordlist $1, 15, $(number_line))) four+five = $(call plus, 4, 5)
Very clever, Michael! Notice that the number line starts with 2 rather than 0 or 1. You can see that this is necessary if you run the plus
function with 1 and 1. Both subscripts will yield the first element and the answer must be 2, therefore, the first element of the list must be 2. The reason for this is that, for the word
and wordlist
functions, the first element of the list has subscript 1 rather than 0 (but I haven’t bothered to prove it).
Now, given a number line, we can perform addition, but how do we create a number line in make
without typing it in by hand or using a shell program? We can create all numbers between 00 and 99 by combining all possible values in the tens place with all possible values in the ones place. For example:
make -f - <<< '$(warning $(foreach i, 0 1 2, $(addprefix $i, 0 1 2)))' /c/TEMP/Gm002568:1: 00 01 02 10 11 12 20 21 22
By including all digits 0 through 9, we would produce all numbers from 00 to 99. By combining the foreach
again with a hundreds column, we would get the numbers from 000 to 999, etc. All that is left is to strip the leading zeros where necessary.
Here is a modified form of Mr. Mounteney’s code to generate a number line and define the plus
and gt
operations:
# combine - concatentate one sequence of numbers with another combine = $(foreach i, $1, $(addprefix $i, $2)) # stripzero - Remove one leading zero from each word stripzero = $(patsubst 0%,%,$1) # generate - Produce all permutations of three elements from the word list generate = $(call stripzero, \ $(call stripzero, \ $(call combine, $1, \ $(call combine, $1, $1)))) # number_line - Create a number line from 0 to 999 number_line := $(call generate,0 1 2 3 4 5 6 7 8 9) length := $(word $(words $(number_line)), $(number_line)) # plus - Use the number line to add two integers plus = $(word $2, \ $(wordlist $1, $(length), \ $(wordlist 3, $(length), $(number_line)))) # gt - Use the number line to determine if $1 is greater than $2 gt = $(filter $1, \ $(wordlist 3, $(length), \ $(wordlist $2, $(length), $(number_line)))) all: @echo $(call plus,4,7) @echo $(if $(call gt,4,7),is,is not) @echo $(if $(call gt,7,4),is,is not) @echo $(if $(call gt,7,7),is,is not)
When run, the makefile yields:
$ make 11 is not is is not
We can extend this code to include subtraction by noting that subscripting a reversed list is just like counting backwards. For example, to compute 7 minus 4, first create the number line subset 0 to 6, reverse it, then select the fourth element:
number_line := 0 1 2 3 4 5 6 7 8 9... 1through6 := 0 1 2 3 4 5 6 reverse_it := 6 5 4 3 2 1 0 fourth_item := 3
Here is the algorithm in make
syntax:
# backwards - a reverse number line backwards := $(call generate, 9 8 7 6 5 4 3 2 1 0) # reverse - reverse a list of words reverse = $(strip \ $(foreach f, \ $(wordlist 1, $(length), $(backwards)), \ $(word $f, $1))) # minus - compute $1 minus $2 minus = $(word $2, \ $(call reverse, \ $(wordlist 1, $1, $(number_line)))) minus: # $(call minus, 7, 4)
Multiplication and division are left as an exercise for the reader.
Get Managing Projects with GNU Make, 3rd Edition now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.