Chapter 4. Variable Vernacular
It is not uncommon to see an error message or an assignment statement that contains the idiom ${0##*/}, which looks to be some sort of reference to $0, but something more is going on.
Let's take a closer look at variable references and what some of these extra characters do for us.
What we'll find is a whole array of string manipulations that give you quite a bit of power in a few special characters.
Variable Reference
Referencing a variable's value is very straightforward in most programming languages.
You either just use the name of the variable or add a character to the name to explicitly say that you want to retrieve the value.
That's true with bash: you assign to the variable by name, VAR=something, and you retrieve the value with a dollar-sign prefix: $VAR.
If you're wondering why we need the dollar sign, consider that bash deals largely with strings, so:
MSG="Error: FILE not found"
will give you a simple literal string of the four words shown, whereas:
MSG="Error: $FILE not found"
will replace the $FILE with the value of that variable (which, presumably, would hold the name of the file that it was looking for).
Variable Interpolation
Be sure to use double quotes if you want this string substitution to occur. Using single quotes takes all characters literally, and no substitutions happen.
To avoid confusion over where the variable name ends (the spaces make it easy in this example), a more complete syntax for variable reference uses braces around the variable name, ${FILE}, and could have been used in our example.
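The quoting and brace rules can be tried out with a short sketch. The value of FILE and the _backup suffix are made up for illustration; they are not from the chapter's own code.

```bash
#!/usr/bin/env bash
# Illustrative values only: FILE and the "_backup" suffix are hypothetical.
FILE='report.txt'
unset FILE_backup                   # make sure no such variable exists

double="Error: $FILE not found"     # double quotes: $FILE is replaced
single='Error: $FILE not found'     # single quotes: everything is literal
braced="${FILE}_backup"             # braces mark where the name ends
unbraced="$FILE_backup"             # bash looks up a variable named FILE_backup

echo "$double"        # Error: report.txt not found
echo "$single"        # Error: $FILE not found
echo "$braced"        # report.txt_backup
echo "[$unbraced]"    # []
```

Without the braces, the underscore and letters are taken as part of the variable name, so the last line prints an empty pair of brackets.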
This syntax, with the braces, is the foundation for much special syntax around variable references. For example, we can put a hash sign in front of a variable name, ${#VAR}, to return not its value but the string length of the value.
${VAR} | ${#VAR} |
---|---|
oneword | 7 |
/usr/bin/longpath.txt | 21 |
many words in one string | 24 |
3 | 1 |
2356 | 4 |
1427685 | 7 |
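One way to check these lengths is to loop over the same sample values. This is only a sketch; the lengths array is an illustrative name used to collect the results.

```bash
#!/usr/bin/env bash
# Print each sample value next to its length, and collect the lengths.
lengths=()
for VAR in oneword /usr/bin/longpath.txt 'many words in one string' 3 2356 1427685; do
    lengths+=("${#VAR}")
    printf '%-26s %d\n' "$VAR" "${#VAR}"
done
echo "${lengths[*]}"    # 7 21 24 1 4 7
```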
But bash can do more than simply retrieve the value or its length.
Parameter Expansion
When retrieving the value of a variable, certain substitutions or edits can be specified, affecting the value that is returned (though not the value in the variable, except in one case).
The syntax involves special sequences of characters inside the braces used to delineate the variable's name, like the characters inside these braces: ${VAR##*/}.
Here are a few such expansions worth knowing.
Shorthand for basename
When you invoke a script, you might use just its filename as the command to invoke the script, but that assumes that the script has execute permissions and is in one of the directories listed in your PATH variable.
You might invoke the script with ./scriptname if the script is in your current directory.
You might invoke it with a full pathname, /home/smith/utilities/scriptname, or even a relative pathname if your current working directory is nearby.
Whichever way you invoke the script, $0 will contain the sequence of characters that you used to invoke the script: relative path or absolute path, however you expressed it.
When you want to print that script's name out in a usage message, you likely want just the basename, the name of the file itself, not any of the path that got you there:
echo "usage: ${0##*/} namesfile datafile"
You might see it in a usage message, telling the user the correct syntax for running the script, or it might be the righthand side of an assignment to a variable.
In that latter case, we hope that the variable is called something like PROGRAM or SCRIPT because that's what this expression returns: the name of the script that is executing.
Let's take a closer look at this particular parameter expansion on $0, one that you can use to get just the basename without all the other parts of the path.
Path or Prefix Removal
You can remove characters from the front (prefix or lefthand side) or the tail (suffix or righthand side) of a variable's value.
To remove a certain set of characters from the left side of a string, you add a # and a shell pattern onto the parameter reference, a pattern that matches those characters that you want to remove.
The expression ${MYVAL#img_} would remove the characters img_ if they were the first characters of the string in the MYVAL variable.
Using a more complex pattern, we could write ${MYVAL#*_}. This would remove any sequence of characters up to, and including, the first underscore. (If the pattern matches nothing, the full value is returned unaltered.)
A single # says that it will use the shortest match possible (nongreedy). A double ## says to use the longest match possible (greedy).
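A small sketch shows the shortest and longest matches side by side. The filename assigned to MYVAL is a made-up value chosen so that it contains two underscores.

```bash
#!/usr/bin/env bash
# The filename is a hypothetical value for MYVAL.
MYVAL='img_2021_07.png'

fixed=${MYVAL#img_}      # literal prefix removed:            2021_07.png
shortest=${MYVAL#*_}     # through the first underscore:      2021_07.png
longest=${MYVAL##*_}     # through the last underscore:       07.png
nomatch=${MYVAL#*-}      # no hyphen, so nothing is removed:  img_2021_07.png

printf '%s\n' "$fixed" "$shortest" "$longest" "$nomatch"
```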
Now, perhaps, can you see what the expression ${0##*/} will do?
It will start with the value in $0, the pathname used to invoke the script.
Then, from the lefthand side of the value, it will remove the longest match of any number of characters ending in a slash.
Thus, it is removing all the parts of the path used in invoking the script, leaving just the name of the script itself.
Here are some possible values for $0 and this pattern we've discussed, to see how both the short (#) and long (##) match might differ in results:
Value in $0 | Expression | Result returned |
---|---|---|
./ascript | ${0#*/} | ascript |
./ascript | ${0##*/} | ascript |
../bin/ascript | ${0#*/} | bin/ascript |
../bin/ascript | ${0##*/} | ascript |
/home/guy/bin/ascript | ${0#*/} | home/guy/bin/ascript |
/home/guy/bin/ascript | ${0##*/} | ascript |
Notice that the shortest matching pattern for */ can match just the slash by itself.
Shorthand for dirname or Suffix Removal
Similar to how # will remove a prefix, that is, remove from the lefthand side, we can remove a suffix, that is, from the righthand side, by using %.
A double percent sign indicates removing the longest possible match.
Here are some examples that show how to remove a suffix.
The first examples show a variable $FN, which holds the name of an image file.
It might end in .jpg or .jpeg or .png or .gif.
See how the different patterns remove various parts of the righthand side of the string.
The last few examples show how to get something similar to dirname from the $0 parameter:
Value in shell variable | Expression | Result returned |
---|---|---|
img.1231.jpg | ${FN%.*} | img.1231 |
img.1231.jpg | ${FN%%.*} | img |
./ascript | ${0%/*} | . |
./ascript | ${0%%/*} | . |
/home/guy/bin/ascript | ${0%/*} | /home/guy/bin |
/home/guy/bin/ascript | ${0%%/*} | (empty) |
This parameter substitution for dirname isn't an exact replica of the output from the command. It differs in the case where the path is /file because dirname would return just a slash, whereas our parameter substitution would remove it all. You can check for this with some additional logic in your script, you can ignore this case if you don't expect to see it, or you can just add a slash to the end of the parameter, as in ${0%/*}/, so that all results would end in a slash.
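The "additional logic" option might look something like the following sketch. The function name script_dir is hypothetical, and the function only covers the /file case and the no-slash case; it does not try to reproduce everything dirname does (trailing slashes, for instance).

```bash
#!/usr/bin/env bash
# script_dir is a hypothetical helper name, not part of the chapter's code.
script_dir() {
    local path=$1
    local dir=${path%/*}            # drop the last slash and what follows it
    if [[ $path != */* ]]; then
        dir='.'                     # no slash at all: dirname reports "."
    elif [[ -z $dir ]]; then
        dir='/'                     # "/file" would otherwise become empty
    fi
    printf '%s\n' "$dir"
}

script_dir /home/guy/bin/ascript    # /home/guy/bin
script_dir ./ascript                # .
script_dir /file                    # /
script_dir ascript                  # .
```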
Prefix and Suffix Removal
You can remember that # removes the left part and % the right part because, at least on a standard US keyboard, # is shift-3, which is to the left of % at shift-5.
Other Modifiers
More than just # and %, there are a few other modifiers that can alter a value via parameter expansion.
You can convert either the first character or all characters in a string to uppercase via ^ or ^^, respectively, or to lowercase via , or ,, as shown in these examples:
Value in shell variable TXT | Expression | Result returned |
---|---|---|
message to send | ${TXT^} | Message to send |
message to send | ${TXT^^} | MESSAGE TO SEND |
Some Words | ${TXT,} | some Words |
Do Not YELL | ${TXT,,} | do not yell |
You might also consider declare -u UPPER and declare -l lower, which declare these shell variables to have their content converted to upper- or lowercase, respectively, for any text assigned to those variables.
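Here is a brief sketch of both approaches. It assumes bash 4.0 or later, which is where the case modifiers and declare -u/-l are available; the sample text is taken from the table.

```bash
#!/usr/bin/env bash
# Assumes bash 4.0 or later.
TXT='message to send'
first_up=${TXT^}     # Message to send
all_up=${TXT^^}      # MESSAGE TO SEND

declare -u UPPER     # anything assigned to UPPER is stored in uppercase
declare -l lower     # anything assigned to lower is stored in lowercase
UPPER='Do Not YELL'
lower='Do Not YELL'

printf '%s\n' "$first_up" "$all_up" "$UPPER" "$lower"
```

The declare forms convert at assignment time, so every later assignment to UPPER or lower is converted as well.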
The most flexible modifier is the one that does a substitution anywhere in the string, not just at the front or tail of the string.
Similar to the sed command, it uses the slash, /, to indicate what pattern to match and what value to replace it with.
A single slash means a single substitution (of the first occurrence).
Using two slashes means to replace every occurrence.
Here are a few examples:
Value in shell variable FN | Expression | Result returned |
---|---|---|
FN="my filename with spaces.txt" | ${FN/ /_} | my_filename with spaces.txt |
FN="my filename with spaces.txt" | ${FN// /_} | my_filename_with_spaces.txt |
FN="my filename with spaces.txt" | ${FN// /} | myfilenamewithspaces.txt |
FN="/usr/bin/filename" | ${FN//\// } | usr bin filename |
FN="/usr/bin/filename" | ${FN/\// } | usr/bin/filename |
In the last two rows the result actually begins with a space, because the leading slash is replaced as well.
No Trailing Slash
Note that there is no trailing slash like you would find in other similar commands like sed or vi. The closing brace ends the substitution.
Why not always use this substitution mechanism?
Why bother with # or % substitution from the ends of the string?
Consider this filename: frank.gifford.gif, and suppose you wanted to change this filename to a jpg file using ImageMagick's convert command (that's another story).
The plain substitution using / is not anchored to one end of the string or the other; it replaces the first match wherever it occurs. (bash's anchored variants, ${FN/#pattern/string} and ${FN/%pattern/string}, are a separate syntax.)
If you had read in the filename and tried to replace the .gif with .jpg, what you would end up with is frank.jpgford.gif.
For situations like this, the % substitution, which takes from the end of the string, works much better.
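The difference is easy to see with the filename from the text. This is a minimal sketch; the variable names wrong and right are only labels for the two results.

```bash
#!/usr/bin/env bash
FN='frank.gifford.gif'

wrong=${FN/.gif/.jpg}     # replaces the first ".gif" found in the string
right="${FN%.gif}.jpg"    # strips ".gif" from the end, then appends ".jpg"

echo "$wrong"    # frank.jpgford.gif
echo "$right"    # frank.gifford.jpg
```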
Another useful modifier will extract a substring of the variable. After the variable name, put a colon, then the offset to the first character of the substring that you want to extract. Since this is an offset, start at 0 for the first character of the string. Next, put another colon and the length of the substring you want. If you leave off this second colon and a length, then you get the whole rest of the string. Here are a few examples:
Value in shell variable FN | Expression | Result returned |
---|---|---|
/home/bin/util.sh | ${FN:0:1} | / |
/home/bin/util.sh | ${FN:1:1} | h |
/home/bin/util.sh | ${FN:3:2} | me |
/home/bin/util.sh | ${FN:10:4} | util |
/home/bin/util.sh | ${FN:10} | util.sh |
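The same expansions can be run directly. The sketch below also adds one form that is not in the table: a negative offset, which counts from the end of the string. The space before the minus sign matters, since without it bash would read :- as the default-value operator.

```bash
#!/usr/bin/env bash
FN='/home/bin/util.sh'

first=${FN:0:1}     # offset 0, length 1:      /
name=${FN:10:4}     # offset 10, length 4:     util
rest=${FN:10}       # offset 10 to the end:    util.sh
tail=${FN: -3}      # last three characters:   .sh

printf '%s\n' "$first" "$name" "$rest" "$tail"
```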
Example 4-1 shows the use of parameter expansion to parse data out of some input, creating and processing specific fields to use when automatically creating a configuration for firewall rules. We've also included a larger table of bash parameter expansions in the code, as we do a lot in this book, as a "real code readability" example. The output follows in Example 4-2.
Example 4-1. Parsing using parameter expansions: code
#!/usr/bin/env bash
# parameter-expansion.sh: parameter expansion for parsing, and a big list
# Original Author & date: _bash Idioms_ 2022
# bash Idioms filename: examples/ch04/parameter-expansion.sh
#_________________________________________________________________________
# Does not work on Zsh 5.4.2!

customer_subnet_name='Acme Inc subnet 10.11.12.13/24'

echo ''
echo "Say we have this string: $customer_subnet_name"

customer_name=${customer_subnet_name%subnet*}  # Trim from 'subnet' to end
subnet=${customer_subnet_name##* }             # Remove leading '*space'
ipa=${subnet%/*}                               # Remove trailing '/*'
cidr=${subnet#*/}                              # Remove up to and including '/'
fw_object_name=${customer_subnet_name// /_}    # Replace space with '_'
fw_object_name=${fw_object_name////-}          # Replace '/' with '-'
fw_object_name=${fw_object_name,,}             # Lowercase

echo ''
echo 'When the code runs we get:'
echo ''
echo "Customer name: $customer_name"
echo "Subnet: $subnet"
echo "IPA $ipa"
echo "CIDR mask: $cidr"
echo "FW Object: $fw_object_name"

# bash Shell Parameter Expansion: https://oreil.ly/Af8lw
# ${var#pattern}                Remove shortest (nongreedy) leading pattern
# ${var##pattern}               Remove longest (greedy) leading pattern
# ${var%pattern}                Remove shortest (nongreedy) trailing pattern
# ${var%%pattern}               Remove longest (greedy) trailing pattern
# ${var/pattern/replacement}    Replace first +pattern+ with +replacement+
# ${var//pattern/replacement}   Replace all +pattern+ with +replacement+
# ${var^pattern}                Uppercase first matching optional pattern
# ${var^^pattern}               Uppercase all matching optional pattern
# ${var,pattern}                Lowercase first matching optional pattern
# ${var,,pattern}               Lowercase all matching optional pattern
# ${var:offset}                 Substring starting at +offset+
# ${var:offset:length}          Substring starting at +offset+ for +length+
# ${var:-default}               Var if set, otherwise +default+
# ${var:=default}               Assign +default+ to +var+ if +var+ not already set
# ${var:?error_message}         Barf with +error_message+ if +var+ not set
# ${var:+replaced}              Expand to +replaced+ if +var+ _is_ set
# ${#var}                       Length of var
# ${!var[*]}                    Expand to indexes or keys
# ${!var[@]}                    Expand to indexes or keys, quoted
# ${!prefix*}                   Expand to variable names starting with +prefix+
# ${!prefix@}                   Expand to variable names starting with +prefix+, quoted
# ${var@Q}                      Quoted
# ${var@E}                      Expanded (better than `eval`!)
# ${var@P}                      Expanded as prompt
# ${var@A}                      Assign or declare
# ${var@a}                      Return attributes
Example 4-2. Parsing using parameter expansions: output
Say we have this string: Acme Inc subnet 10.11.12.13/24

When the code runs we get:

Customer name: Acme Inc
Subnet: 10.11.12.13/24
IPA 10.11.12.13
CIDR mask: 24
FW Object: acme_inc_subnet_10.11.12.13-24
Conditional Substitutions
Some of these variable substitutions are conditional, that is, they happen only if certain conditions are met.
You could accomplish the same thing using if statements around the assignments, but these idioms make for shorter code for certain common cases.
These conditional substitutions are shown here with a colon and then another special character: a minus, plus, or equal sign.
The condition that they check for is this: is the variable null or unset?
A null variable is a variable whose value is the null string.
An unset variable is one that hasn't yet been assigned or was explicitly unset (think "discarded") with the unset command.
With positional parameters (like $1, $2, etc.), they are unset if the user doesn't supply a parameter in that position.
If you don't include the colon in these conditional substitutions, then they only consider the case of an unset variable; null values are returned as is.
Default Values
A common scenario is a script with a single, optional parameter. If the parameter isn't supplied when the script is invoked, then a default value should be used. In bash, we might write something like this:
LEN=${1:-5}
This will set the variable LEN either to the value of the first parameter ($1), if one was supplied, or else to the value 5.
Here is an example script:
LEN="${1:-5}"
cut -d',' -f2-3 /tmp/megaraid.out | sort | uniq -c | sort -rn | head -n "$LEN"
It takes the second and third fields from a comma-separated values file called /tmp/megaraid.out, sorts those values, provides a count of the number of occurrences of each value pair, then shows the top 5 from the list.
You can override the default value of 5 and show the top 3 or 10 (or however many you want) simply by specifying that count as the sole parameter to the script.
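The difference between the colon and no-colon forms of the default-value substitution can be checked with a short sketch. The variable names missing and empty are hypothetical, chosen to represent the unset and null cases.

```bash
#!/usr/bin/env bash
unset missing           # unset: never assigned, or discarded
empty=''                # null: set, but to the empty string

a=${missing:-default}   # default  (unset counts)
b=${empty:-default}     # default  (with the colon, null counts too)
c=${missing-default}    # default  (unset counts)
d=${empty-default}      # empty    (without the colon, null is returned as is)

printf '[%s] [%s] [%s] [%s]\n' "$a" "$b" "$c" "$d"
# [default] [default] [default] []
```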
Comma-Separated Lists
Another conditional substitution, using the plus sign, also checks whether the variable has a value and, if so, returns a different value. That is, it returns the specified different value only if the variable is not null. Yes, that does sound strange; if it has a value, why return a different value?
A handy use for this seemingly odd logic is to construct a comma-separated list.
You typically construct such a list by repeatedly appending ",value" or "value," for every value.
When doing so, you usually need an if statement to avoid having an extra comma on the front or end of this list, but not when you use this join idiom:
for fn in *; do
    S=${LIST:+,}   # S for separator
    LIST="${LIST}${S}${fn}"
done
See also Example 7-1.
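To watch the join idiom work without depending on the files in the current directory, here is a sketch that uses three placeholder values instead of filenames.

```bash
#!/usr/bin/env bash
# The three values are placeholders; the loop in the text iterates over filenames.
LIST=''
for value in alpha beta gamma; do
    S=${LIST:+,}                # empty on the first pass, a comma afterward
    LIST="${LIST}${S}${value}"
done
echo "$LIST"    # alpha,beta,gamma
```

On the first pass LIST is null, so ${LIST:+,} expands to nothing; on every later pass it expands to a comma.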
Modified Value
Up to now, none of these substitutions have modified the underlying value of the variable.
There is, however, one that does.
If we write ${VAR:=value}, it will act much like our preceding default value idiom, but with one big exception.
If VAR is empty or unset, it will assign that value to the variable (hence, the equal sign) and return that value.
(If VAR already has a non-empty value, it will simply return its existing value.)
Note, however, that this assigning of a value does not work for positional parameters (like $1), which is why you don't see it used nearly as often.
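A short sketch makes the contrast with the default-value idiom visible. The second half shows a commonly seen companion pattern using the : no-op command; OUTPUT_DIR and its /tmp default are hypothetical names chosen for illustration.

```bash
#!/usr/bin/env bash
unset VAR
echo "${VAR:-fallback}"   # fallback, but VAR stays unset
echo "[${VAR}]"           # []
echo "${VAR:=fallback}"   # fallback, and VAR is now assigned
echo "[${VAR}]"           # [fallback]

# The ':' no-op command evaluates the expansion purely for its side effect.
# OUTPUT_DIR keeps any value it already had; otherwise it becomes /tmp.
: "${OUTPUT_DIR:=/tmp}"
echo "$OUTPUT_DIR"
```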
$RANDOM
Bash has a very handy $RANDOM variable. As the "Bash Variables" section in the Bash Reference Manual says:
Each time this parameter is referenced, a random integer between 0 and 32767 is generated. Assigning a value to this variable seeds the random number generator.
While this is not suitable for cryptographic functions, it's useful for rolling the dice or adding a bit of noise into otherwise too-predictable operations. We use this later in "A Simple Word Count Example".
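Both behaviors from the manual's description can be seen in a few lines. This is a sketch only; the seed value 42 is arbitrary.

```bash
#!/usr/bin/env bash
roll=$(( RANDOM % 6 + 1 ))      # a die roll from 1 to 6
echo "rolled $roll"

RANDOM=42; first=$RANDOM        # assigning a value seeds the generator...
RANDOM=42; again=$RANDOM        # ...so the same seed repeats the sequence
echo "$first $again"            # the same number twice
```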
As shown in Example 4-3, you can pick a random element out of a list.
Example 4-3. Pick a random list element
declare -a mylist
mylist=(foo bar baz one two "three four")
range=${#mylist[@]}
random=$(( $RANDOM % $range ))  # 0 to (list length - 1)
echo "range = $range, random = $random, choice = ${mylist[$random]}"
# Shorter but less readable 6 months from now:
# echo "choice = ${mylist[$(( $RANDOM % ${#mylist[@]} ))]}"
You may also see something like this:
TEMP_DIR="$TMP/myscript.$RANDOM"
[ -d "$TEMP_DIR" ] || mkdir "$TEMP_DIR"
However, that is subject to race conditions, and is obviously a simple pattern. It is also partly predictable, but sometimes you want to have a clue as to what code is cluttering up $TMP. Don't forget to set a trap (see "It's a Trap!") to clean up after yourself. We recommend you consider using mktemp, though that's a large issue outside the scope of bash idioms.
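For comparison, a mktemp-based version might look like the sketch below. It is an assumption-laden illustration, not the book's recommendation in detail: the myscript prefix, the cleanup function name, and the use of TMPDIR with a /tmp fallback (rather than $TMP) are all choices made here.

```bash
#!/usr/bin/env bash
# Sketch only: "myscript" and the cleanup function name are placeholders.
# mktemp creates the directory itself, replacing XXXXXX with random characters.
TEMP_DIR=$(mktemp -d "${TMPDIR:-/tmp}/myscript.XXXXXX") || exit 1

cleanup() { rm -rf "$TEMP_DIR"; }
trap cleanup EXIT          # remove the directory when the script exits

echo "working in $TEMP_DIR"
```

The recognizable prefix still gives a clue about which script created the directory, while mktemp handles the creation step.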
$RANDOM and dash
$RANDOM is not available in dash, which is /bin/sh in some Linux distributions. Notably, current versions of Debian and Ubuntu use dash because it is smaller and faster than bash and thus helps to boot faster. But that means that /bin/sh, which used to be a symlink to bash, is now a symlink to dash instead, and various bash-specific features will not work. $RANDOM does work in Zsh, though.
Command Substitution
We've already used command substitution quite a bit in Chapter 2, but we haven't talked about it. The old Bourne way to do it is `` (backticks/backquotes), but we prefer the more readable POSIX $() instead. You will see a lot of both forms, because it's how you pull output into a variable; for example:
unique_lines_in_file="$(sort -u "$my_file" | wc -l)"
Note that these produce the same result, but the second one is handled by bash itself, without running cat, and is faster:
for arg in $(cat /some/file)
for arg in $(< /some/file)   # Faster than shelling out to cat
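To confirm that the two forms yield the same text, here is a sketch that uses a throwaway file so it can run anywhere. The file contents and variable names are made up for the example.

```bash
#!/usr/bin/env bash
# Uses a temporary file so the sketch does not depend on /some/file existing.
tmp_file=$(mktemp) || exit 1
printf 'one\ntwo\nthree\n' > "$tmp_file"

with_cat=$(cat "$tmp_file")       # runs the external cat command
with_redirect=$(< "$tmp_file")    # bash reads the file itself

[[ $with_cat == "$with_redirect" ]] && echo 'same contents'
rm -f "$tmp_file"
```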
Command Substitution
Command substitution is critical to cloud and other DevOps automation because it allows you to gather and use all the IDs and details that only exist at runtime; for example:
instance_id=$(aws ec2 run-instances --image $base_ami_id ... \
    --output text --query 'Instances[*].InstanceId')
state=$(aws ec2 describe-instances --instance-ids $instance_id \
    --output text --query 'Reservations[*].Instances[*].State.Name')
Nesting Command Substitution
Nesting command substitution using `` gets very ugly, very fast, because you must escape the inner backticks in each nesting layer. It's much easier to use $() if you can, as shown:
### Just Works
$ echo $(echo $(echo $(echo inside)))
inside
### Broken
$ echo `echo `echo `echo inside```
echo inside
### "Works" but very ugly
$ echo `echo \`echo \\\`echo inside\\\`\``
inside
Thanks to our reviewer Ian Miell for pointing this out and providing the example.
Style and Readability: Recap
When referencing a variable in bash, you have the opportunity to edit the value as you set or retrieve it. A few special characters at the end of the variable reference can remove characters from the front or end of the string value, alter its characters to upper- or lowercase, substitute characters, or give you just a substring of the original value.
Common use of these handy features results in idioms for default values, basename and dirname substitutes, and the creation of a comma-separated list without using an explicit if statement.
Variable substitutions are a great feature in bash, and we recommend making good use of them. However, we also strongly recommend that you comment those statements to make it clear what sort of substitution you are attempting. The next reader of your code will thank you.