454 lines
18 KiB
XML
454 lines
18 KiB
XML
<?xml version='1.0' encoding='utf-8' ?>
|
|
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
|
|
]>
|
|
<chapter id="chap-Defensive_Coding-Shell">
|
|
<title>Shell Programming and <application>bash</application></title>
|
|
<para>
|
|
This chapter contains advice about shell programming, specifically
|
|
in <application>bash</application>. Most of the advice will apply
|
|
to scripts written for other shells because extensions such as
|
|
integer or array variables have been implemented there as well, with
|
|
comparable syntax.
|
|
</para>
|
|
<section id="sect-Defensive_Coding-Shell-Alternatives">
|
|
<title>Consider alternatives</title>
|
|
<para>
|
|
Once a shell script is so complex that advice in this chapter
|
|
applies, it is time to step back and consider the question: Is
|
|
there a more suitable implementation language available?
|
|
</para>
|
|
<para>
|
|
For example, Python with its <literal>subprocess</literal> module
|
|
can be used to write scripts which are almost as concise as shell
|
|
scripts when it comes to invoking external programs, and Python
|
|
offers richer data structures, with less arcane syntax and more
|
|
consistent behavior.
|
|
</para>
|
|
</section>
|
|
<section id="sect-Defensive_Coding-Shell-Language">
|
|
<title>Shell language features</title>
|
|
<para>
|
|
The following sections cover subtleties concerning the shell
|
|
programming languages. They have been written with the
|
|
<application>bash</application> shell in mind, but some of these
|
|
features apply to other shells as well.
|
|
</para>
|
|
<para>
|
|
Some of the features described may seem like implementation defects,
|
|
but these features have been replicated across multiple independent
|
|
implementations, so they now have to be considered part of the shell
|
|
programming language.
|
|
</para>
|
|
<section id="sect-Defensive_Coding-Shell-Parameter_Expansion">
|
|
<title>Parameter expansion</title>
|
|
<para>
|
|
The mechanism by which named shell variables and parameters are
|
|
expanded is called <emphasis>parameter expansion</emphasis>. The
|
|
most basic syntax is
|
|
“<literal>$</literal><emphasis>variable</emphasis>” or
|
|
“<literal>${</literal><emphasis>variable</emphasis><literal>}</literal>”.
|
|
</para>
|
|
<para>
|
|
In almost all cases, a parameter expansion should be enclosed in
|
|
double quotation marks <literal>"</literal>…<literal>"</literal>.
|
|
</para>
|
|
<informalexample>
|
|
<programlisting language="Bash">
|
|
external-program "$arg1" "$arg2"
|
|
</programlisting>
|
|
</informalexample>
|
|
<para>
|
|
If the double quotation marks are omitted, the value of the
|
|
variable will be split according to the current value of the
|
|
<envar>IFS</envar> variable. This may allow the injection of
|
|
additional options which are then processed by
|
|
<literal>external-program</literal>.
|
|
</para>
|
|
<para>
|
|
Parameter expansion can use special syntax for specific features,
|
|
such as substituting defaults or performing string or array
|
|
operations. These constructs should not be used because they can
|
|
trigger arithmetic evaluation, which can result in code execution.
|
|
See <xref linkend="sect-Defensive_Coding-Shell-Arithmetic"/>.
|
|
</para>
|
|
</section>
|
|
<section id="sect-Defensive_Coding-Shell-Double_Expansion">
|
|
<title>Double expansion</title>
|
|
<para>
|
|
<emphasis>Double expansion</emphasis> occurs when, during the
|
|
expansion of a shell variable, not just the variable is expanded,
|
|
replacing it by its value, but the <emphasis>value</emphasis> of
|
|
the variable is itself is expanded as well. This can trigger
|
|
arbitrary code execution, unless the value of the variable is
|
|
verified against a restrictive pattern.
|
|
</para>
|
|
<para>
|
|
The evaluation process is in fact recursive, so a self-referential
|
|
expression can cause an out-of-memory condition and a shell crash.
|
|
</para>
|
|
<para>
|
|
Double expansion may seem like as a defect, but it is implemented
|
|
by many shells, and has to be considered an integral part of the
|
|
shell programming language. However, it does make writing robust
|
|
shell scripts difficult.
|
|
</para>
|
|
<para>
|
|
Double expansion can be requested explicitly with the
|
|
<literal>eval</literal> built-in command, or by invoking a
|
|
subshell with “<literal>bash -c</literal>”. These constructs
|
|
should not be used.
|
|
</para>
|
|
<para>
|
|
The following sections give examples of places where implicit
|
|
double expansion occurs.
|
|
</para>
|
|
<section id="sect-Defensive_Coding-Shell-Arithmetic">
|
|
<title>Arithmetic evaluation</title>
|
|
<para>
|
|
<emphasis>Arithmetic evaluation</emphasis> is a process by which
|
|
the shell computes the integer value of an expression specified
|
|
as a string. It is highly problematic for two reasons: It
|
|
triggers double expansion (see <xref
|
|
linkend="sect-Defensive_Coding-Shell-Double_Expansion"/>), and the
|
|
language of arithmetic expressions is not self-contained. Some
|
|
constructs in arithmetic expressions (notably array subscripts)
|
|
provide a trapdoor from the restricted language of arithmetic
|
|
expressions to the full shell language, thus paving the way
|
|
towards arbitrary code execution. Due to double expansion,
|
|
input which is (indirectly) referenced from an arithmetic
|
|
expression can trigger execution of arbitrary code, which is
|
|
potentially harmful.
|
|
</para>
|
|
<para>
|
|
Arithmetic evaluation is triggered by the follow constructs:
|
|
</para>
|
|
<!-- The list was constructed by looking at the bash sources and
|
|
search for the string "expand_". -->
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
The <emphasis>expression</emphasis> in
|
|
“<literal>$((</literal><emphasis>expression</emphasis><literal>))</literal>”
|
|
is evaluated. This construct is called <emphasis>arithmetic
|
|
expansion</emphasis>.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
“<literal>$[</literal><emphasis>expression</emphasis><literal>]</literal>”
|
|
is a deprecated syntax with the same effect.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
The arguments to the <literal>let</literal> shell built-in
|
|
are evaluated.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
“<literal>((</literal><emphasis>expression</emphasis><literal>))</literal>”
|
|
is an alternative syntax for “<literal>let
|
|
</literal><emphasis>expression</emphasis>”.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
Conditional expressions surrounded by
|
|
“<literal>[[</literal>…<literal>]]</literal>” can trigger
|
|
arithmetic evaluation if certain operators such as
|
|
<literal>-eq</literal> are used. (The
|
|
<literal>test</literal> built-in does not perform arithmetic
|
|
evaluation, even with integer operators such as
|
|
<literal>-eq</literal>.)
|
|
</para>
|
|
<para>
|
|
The conditional expression
|
|
“<literal>[[ $</literal><emphasis>variable</emphasis><literal> =~ </literal><emphasis>regexp</emphasis><literal> ]]</literal>”
|
|
can be used for input validation, assuming that
|
|
<emphasis>regexp</emphasis> is a constant regular
|
|
expression.
|
|
See <xref linkend="sect-Defensive_Coding-Shell-Input_Validation"/>.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
Certain parameter expansions, for example
|
|
“<literal>${</literal><emphasis>variable</emphasis><literal>[</literal><emphasis>expression</emphasis><literal>]}</literal>”
|
|
(array indexing) or
|
|
“<literal>${</literal><emphasis>variable</emphasis><literal>:</literal><emphasis>expression</emphasis><literal>}</literal>”
|
|
(string slicing), trigger arithmetic evaluation of
|
|
<emphasis>expression</emphasis>.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
Assignment to array elements using
|
|
“<emphasis>array_variable</emphasis><literal>[</literal><emphasis>subscript</emphasis><literal>]=</literal><emphasis>expression</emphasis>”
|
|
triggers evaluation of <emphasis>subscript</emphasis>, but
|
|
not <emphasis>expression</emphasis>.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
The expressions in the arithmetic <literal>for</literal>
|
|
command,
|
|
“<literal>for ((</literal><emphasis>expression1</emphasis><literal>; </literal><emphasis>expression2</emphasis><literal>; </literal><emphasis>expression3</emphasis><literal>)); do </literal><emphasis>commands</emphasis><literal>; done</literal>”
|
|
are evaluated. This does not apply to the regular
|
|
for command,
|
|
“<literal>for </literal><emphasis>variable</emphasis><literal> in </literal><emphasis>list</emphasis><literal>; do </literal><emphasis>commands</emphasis><literal>; done</literal>”.
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
<important>
|
|
<para>
|
|
Depending on the <application>bash</application> version, the
|
|
above list may be incomplete.
|
|
</para>
|
|
<para>
|
|
If faced with a situation where using such shell features
|
|
appears necessary, see <xref
|
|
linkend="sect-Defensive_Coding-Shell-Alternatives"/>.
|
|
</para>
|
|
</important>
|
|
<para>
|
|
If it is impossible to avoid shell arithmetic on untrusted
|
|
inputs, refer to <xref
|
|
linkend="sect-Defensive_Coding-Shell-Input_Validation"/>.
|
|
</para>
|
|
</section>
|
|
<section id="sect-Defensive_Coding-Shell-Types">
|
|
<title>Type declarations</title>
|
|
<para>
|
|
<application>bash</application> supports explicit type
|
|
declarations for shell variables:
|
|
</para>
|
|
<informalexample>
|
|
<programlisting language="Bash">
|
|
declare -i integer_variable
|
|
declare -a array_variable
|
|
declare -A assoc_array_variable
|
|
|
|
typeset -i integer_variable
|
|
typeset -a array_variable
|
|
typeset -A assoc_array_variable
|
|
|
|
local -i integer_variable
|
|
local -a array_variable
|
|
local -A assoc_array_variable
|
|
|
|
readonly -i integer_variable
|
|
readonly -a array_variable
|
|
readonly -A assoc_array_variable
|
|
</programlisting>
|
|
</informalexample>
|
|
<para>
|
|
Variables can also be declared as arrays by assigning them an
|
|
array expression, as in:
|
|
</para>
|
|
<informalexample>
|
|
<programlisting language="Bash">
|
|
array_variable=(1 2 3 4)
|
|
</programlisting>
|
|
</informalexample>
|
|
<para>
|
|
Some built-ins (such as <literal>mapfile</literal>) can
|
|
implicitly create array variables.
|
|
</para>
|
|
<para>
|
|
Such type declarations should not be used because assignment to
|
|
such variables (independent of the concrete syntax used for the
|
|
assignment) triggers arithmetic expansion (and thus double
|
|
expansion) of the right-hand side of the assignment operation.
|
|
See <xref linkend="sect-Defensive_Coding-Shell-Arithmetic"/>.
|
|
</para>
|
|
<para>
|
|
Shell scripts which use integer or array variables should be
|
|
rewritten in another, more suitable language. Se <xref
|
|
linkend="sect-Defensive_Coding-Shell-Alternatives"/>.
|
|
</para>
|
|
</section>
|
|
</section>
|
|
<section id="sect-Defensive_Coding-Shell-Obscure">
|
|
<title>Other obscurities</title>
|
|
<para>
|
|
Obscure shell language features should not be used. Examples are:
|
|
</para>
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
Exported functions (<literal>export -f</literal> or
|
|
<literal>declare -f</literal>).
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
Function names which are not valid variable names, such as
|
|
“<literal>module::function</literal>”.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
The possibility to override built-ins or external commands
|
|
with shell functions.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
Changing the value of the <envar>IFS</envar> variable to
|
|
tokenize strings.
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
</section>
|
|
</section>
|
|
<section id="sect-Defensive_Coding-Shell-Invoke">
|
|
<title>Invoking external commands</title>
|
|
<para>
|
|
When passing shell variables as single command line arguments,
|
|
they should always be surrounded by double quotes. See
|
|
<xref linkend="sect-Defensive_Coding-Shell-Parameter_Expansion"/>.
|
|
</para>
|
|
<para>
|
|
Care is required when passing untrusted values as positional
|
|
parameters to external commands. If the value starts with a hyphen
|
|
“<literal>-</literal>”, it may be interpreted by the external
|
|
command as an option. Depending on the external program, a
|
|
“<literal>--</literal>” argument stops option processing and treats
|
|
all following arguments as positional parameters. (Double quotes
|
|
are completely invisible to the command being invoked, so they do
|
|
not prevent variable values from being interpreted as options.)
|
|
</para>
|
|
<para>
|
|
Cleaning the environment before invoking child processes is
|
|
difficult to implement in script. <application>bash</application>
|
|
keeps a hidden list of environment variables which do not correspond
|
|
to shell variables, and unsetting them from within a
|
|
<application>bash</application> script is not possible. To reset
|
|
the environment, a script can re-run itself under the “<literal>env
|
|
-i</literal>” command with an additional parameter which indicates
|
|
the environment has been cleared and suppresses a further
|
|
self-execution. Alternatively, individual commands can be executed
|
|
with “<literal>env -i</literal>”.
|
|
</para>
|
|
<important>
|
|
<para>
|
|
Complete isolation from its original execution environment
|
|
(which is required when the script is executed after a trust
|
|
transition, e.g., triggered by the SUID mechanism) is impossible
|
|
to achieve from within the shell script itself. Instead, the
|
|
invoking process has to clear the process environment (except for
|
|
few trusted variables) before running the shell script.
|
|
</para>
|
|
</important>
|
|
<para>
|
|
Checking for failures in executed external commands is recommended.
|
|
If no elaborate error recovery is needed, invoking “<literal>set
|
|
-e</literal>” may be sufficient. This causes the script to stop on
|
|
the first failed command. However, failures in pipes
|
|
(“<literal>command1 | command2</literal>”) are only detected for the
|
|
last command in the pipe, errors in previous commands are ignored.
|
|
This can be changed by invoking “<literal>set -o pipefail</literal>”.
|
|
Due to architectural limitations, only the process that spawned
|
|
the entire pipe can check for failures in individual commands;
|
|
it is not possible for a process to tell if the process feeding
|
|
data (or the process consuming data) exited normally or with
|
|
an error.
|
|
</para>
|
|
<para>
|
|
See <xref linkend="sect-Defensive_Coding-Tasks-Processes-Creation"/>
|
|
for additional details on creating child processes.
|
|
</para>
|
|
</section>
|
|
<section id="sect-Defensive_Coding-Shell-Temporary_Files">
|
|
<title>Temporary files</title>
|
|
<para>
|
|
Temporary files should be created with the
|
|
<literal>mktemp</literal> command, and temporary directories with
|
|
“<literal>mktemp -d</literal>”.
|
|
</para>
|
|
<para>
|
|
To clean up temporary files and directories, write a clean-up
|
|
shell function and register it as a trap handler, as shown in
|
|
<xref linkend="ex-Defensive_Coding-Tasks-Temporary_Files"/>.
|
|
Using a separate function avoids issues with proper quoting of
|
|
variables.
|
|
</para>
|
|
<example id="ex-Defensive_Coding-Tasks-Temporary_Files">
|
|
<title>Creating and cleaning up temporary files</title>
|
|
<informalexample>
|
|
<programlisting language="Bash">
|
|
tmpfile="$(mktemp)"
|
|
|
|
cleanup () {
|
|
rm -f -- "$tmpfile"
|
|
}
|
|
|
|
trap cleanup 0
|
|
</programlisting>
|
|
</informalexample>
|
|
</example>
|
|
</section>
|
|
<section id="sect-Defensive_Coding-Shell-Input_Validation">
|
|
<title>Performing input validation</title>
|
|
<para>
|
|
In some cases, input validation cannot be avoided. For example,
|
|
if arithmetic evaluation is absolutely required, it is imperative
|
|
to check that input values are, in fact, integers. See <xref
|
|
linkend="sect-Defensive_Coding-Shell-Arithmetic"/>.
|
|
</para>
|
|
<para>
|
|
<xref linkend="ex-Defensive_Coding-Shell-Input_Validation"/>
|
|
shows a construct which can be used to check if a string
|
|
“<literal>$value</literal>” is an integer. This construct is
|
|
specific to <application>bash</application> and not portable to
|
|
POSIX shells.
|
|
</para>
|
|
<example id="ex-Defensive_Coding-Shell-Input_Validation">
|
|
<title>Input validation in <application>bash</application></title>
|
|
<xi:include href="snippets/Shell-Input_Validation.xml"
|
|
xmlns:xi="http://www.w3.org/2001/XInclude" />
|
|
</example>
|
|
<para>
|
|
Using <literal>case</literal> statements for input validation is
|
|
also possible and supported by other (POSIX) shells, but the
|
|
pattern language is more restrictive, and it can be difficult to
|
|
write suitable patterns.
|
|
</para>
|
|
<para>
|
|
The <literal>expr</literal> external command can give misleading
|
|
results (e.g., if the value being checked contains operators
|
|
itself) and should not be used.
|
|
</para>
|
|
</section>
|
|
<section id="sect-Defensive_Coding-Shell-Edit_Guard">
|
|
<title>Guarding shell scripts against changes</title>
|
|
<para>
|
|
<application>bash</application> only reads a shell script up to
|
|
the point it is needed for executed the next command. This means
|
|
that if script is overwritten while it is running, execution can
|
|
jump to a random part of the script, depending on what is modified
|
|
in the script and how the file offsets change as a result. (This
|
|
behavior is needed to support self-extracting shell archives whose
|
|
script part is followed by a stream of bytes which does not follow
|
|
the shell language syntax.)
|
|
</para>
|
|
<para>
|
|
Therefore, long-running scripts should be guarded against
|
|
concurrent modification by putting as much of the program logic
|
|
into a <literal>main</literal> function, and invoking the
|
|
<literal>main</literal> function at the end of the script, using
|
|
this syntax:
|
|
</para>
|
|
<informalexample>
|
|
<programlisting language="Bash">
|
|
main "$@" ; exit $?
|
|
</programlisting>
|
|
</informalexample>
|
|
<para>
|
|
This construct ensures that <application>bash</application> will
|
|
stop execution after the <literal>main</literal> function, instead
|
|
of opening the script file and trying to read more commands.
|
|
</para>
|
|
</section>
|
|
</chapter>
|