Add a chapter on shell programming
This commit is contained in:
parent
fab2049127
commit
2a829115ff
2 changed files with 410 additions and 0 deletions
|
@ -8,6 +8,7 @@
|
|||
<xi:include href="CXX.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />
|
||||
<xi:include href="Java.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />
|
||||
<xi:include href="Python.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />
|
||||
<xi:include href="Shell.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />
|
||||
<xi:include href="Go.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />
|
||||
<xi:include href="Vala.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />
|
||||
</part>
|
||||
|
|
409
defensive-coding/en-US/Shell.xml
Normal file
409
defensive-coding/en-US/Shell.xml
Normal file
|
@ -0,0 +1,409 @@
|
|||
<?xml version='1.0' encoding='utf-8' ?>
|
||||
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
|
||||
]>
|
||||
<chapter id="chap-Defensive_Coding-Shell">
|
||||
<title>Shell Programming and <application>bash</application></title>
|
||||
<para>
|
||||
This chapter contains advice about shell programming, specifically
|
||||
in <application>bash</application>. Most of the advice will apply
|
||||
to scripts written for other shells because extensions such as
|
||||
integer or array variables have been implemented there as well, with
|
||||
comparable syntax.
|
||||
</para>
|
||||
<section id="sect-Defensive_Coding-Shell-Alternatives">
|
||||
<title>Consider alternatives</title>
|
||||
<para>
|
||||
Once a shell script is so complex that advice in this chapter
|
||||
applies, it is time to step back and consider the question: Is
|
||||
there a more suitable implementation language available?
|
||||
</para>
|
||||
<para>
|
||||
For example, Python with its <literal>subprocess</literal> module
|
||||
can be used to write scripts which are almost as concise as shell
|
||||
scripts when it comes to invoking external programs, and Python
|
||||
offers richer data structures, with less arcane syntax and more
|
||||
consistent behavior.
|
||||
</para>
|
||||
</section>
|
||||
<section id="sect-Defensive_Coding-Shell-Language">
|
||||
<title>Shell language features</title>
|
||||
<para>
|
||||
The following sections cover subtleties concerning the shell
|
||||
programming languages. They have been written with the
|
||||
<application>bash</application> shell in mind, but some of these
|
||||
features apply to other shells as well.
|
||||
</para>
|
||||
<para>
|
||||
Some of the features described may seem like implementation defects,
|
||||
but these features have been replicated across multiple independent
|
||||
implementations, so they now have to be considered part of the shell
|
||||
programming language.
|
||||
</para>
|
||||
<section id="sect-Defensive_Coding-Shell-Parameter_Expansion">
|
||||
<title>Parameter expansion</title>
|
||||
<para>
|
||||
The mechanism by which named shell variables and parameters are
|
||||
expanded is called <emphasis>parameter expansion</emphasis>. The
|
||||
most basic syntax is
|
||||
“<literal>$</literal><emphasis>variable</emphasis>” or
|
||||
“<literal>${</literal><emphasis>variable</emphasis><literal>}</literal>”.
|
||||
</para>
|
||||
<para>
|
||||
In almost all cases, a parameter expansion should be enclosed in
|
||||
double quotation marks <literal>"</literal>…<literal>"</literal>.
|
||||
</para>
|
||||
<informalexample>
|
||||
<programlisting language="Bash">
|
||||
external-program "$arg1" "$arg2"
|
||||
</programlisting>
|
||||
</informalexample>
|
||||
<para>
|
||||
If the double quotation marks are omitted, the value of the
|
||||
variable will be split according to the current value of the
|
||||
<envar>IFS</envar> variable. This may allow the injection of
|
||||
additional options which are then processed by
|
||||
<literal>external-program</literal>.
|
||||
</para>
|
||||
<para>
|
||||
Parameter expansion can use special syntax for specific features,
|
||||
such as substituting defaults or performing string or array
|
||||
operations. These constructs should not be used because they can
|
||||
trigger arithmetic evaluation, which can result in code execution.
|
||||
See <xref linkend="sect-Defensive_Coding-Shell-Arithmetic"/>.
|
||||
</para>
|
||||
</section>
|
||||
<section id="sect-Defensive_Coding-Shell-Double_Expansion">
|
||||
<title>Double expansion</title>
|
||||
<para>
|
||||
<emphasis>Double expansion</emphasis> occurs when, during the
|
||||
expansion of a shell variable, not just the variable is expanded,
|
||||
replacing it by its value, but the <emphasis>value</emphasis> of
|
||||
the variable is itself is expanded as well. This can trigger
|
||||
arbitrary code execution, unless the value of the variable is
|
||||
verified against a restrictive pattern.
|
||||
</para>
|
||||
<para>
|
||||
The evaluation process is in fact recursive, so a self-referential
|
||||
expression can cause an out-of-memory condition and a shell crash.
|
||||
</para>
|
||||
<para>
|
||||
Double expansion may seem like as a defect, but it is implemented
|
||||
by many shells, and has to be considered an integral part of the
|
||||
shell programming language. However, it does make writing robust
|
||||
shell scripts difficult.
|
||||
</para>
|
||||
<para>
|
||||
Double evaluation can be requested explicitly with the
|
||||
<literal>eval</literal> built-in command, or by invoking a
|
||||
subshell with “<literal>bash -c</literal>”. These constructs
|
||||
should not be used.
|
||||
</para>
|
||||
<para>
|
||||
The following sections give examples of places where implicit
|
||||
double expansion occurs.
|
||||
</para>
|
||||
<section id="sect-Defensive_Coding-Shell-Arithmetic">
|
||||
<title>Arithmetic evaluation</title>
|
||||
<para>
|
||||
<emphasis>Arithmetic evaluation</emphasis> is a process by which
|
||||
the shell computes the integer value of an expression specified
|
||||
as a string. It is highly problematic for two reasons: It
|
||||
triggers double evaluation (see <xref
|
||||
linkend="sect-Defensive_Coding-Shell-Arithmetic"/>), and the
|
||||
language of arithmetic expressions is not self-contained. Some
|
||||
constructs in arithmetic expressions (notably array subscripts)
|
||||
provide a trapdoor from the restricted language of arithmetic
|
||||
expressions to the full shell language, thus paving the way
|
||||
towards arbitrary code execution. Due to double expansion,
|
||||
input which is (indirectly) referenced from an arithmetic
|
||||
expression can trigger execution of arbitrary code, which is
|
||||
potentially harmful.
|
||||
</para>
|
||||
<para>
|
||||
Arithmetic evaluation is triggered by the follow constructs:
|
||||
</para>
|
||||
<!-- The list was constructed by looking at the bash sources and
|
||||
search for the string "expand_". -->
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>
|
||||
The <emphasis>expression</emphasis> in
|
||||
“<literal>$((</literal><emphasis>expression</emphasis><literal>))</literal>”
|
||||
is evaluated. This construct is called <emphasis>arithmetic
|
||||
expansion</emphasis>.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
“<literal>$[</literal><emphasis>expression</emphasis><literal>]</literal>”
|
||||
is a deprecated syntax with the same effect.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
The arguments to the <literal>let</literal> shell built-in
|
||||
are evaluated.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
“<literal>((</literal><emphasis>expression</emphasis><literal>))</literal>”
|
||||
is an alternative syntax for “<literal>let
|
||||
</literal><emphasis>expression</emphasis>”.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
Conditional expressions surrounded by
|
||||
“<literal>[[</literal>…<literal>]]</literal>” can trigger
|
||||
arithmetic evaluation if certain operators such as
|
||||
<literal>-eq</literal> are used. (The
|
||||
<literal>test</literal> built-in does not perform arithmetic
|
||||
evaluation, even with integer operators such as
|
||||
<literal>-eq</literal>.)
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
Certain parameter expansions, for example
|
||||
“<literal>${</literal><emphasis>variable</emphasis><literal>[</literal><emphasis>expression</emphasis><literal>]}</literal>”
|
||||
(array indexing) or
|
||||
“<literal>${</literal><emphasis>variable</emphasis><literal>:</literal><emphasis>expression</emphasis><literal>}</literal>”
|
||||
(string slicing), trigger arithmetic evaluation of
|
||||
<emphasis>expression</emphasis>.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
Assignment to array elements using
|
||||
“<emphasis>array_variable</emphasis><literal>[</literal><emphasis>subscript</emphasis><literal>]=</literal><emphasis>expression</emphasis>”
|
||||
triggers evaluation of <emphasis>subscript</emphasis>, but
|
||||
not <emphasis>expression</emphasis>.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
The expressions in the arithmetic <literal>for</literal>
|
||||
command,
|
||||
“<literal>for ((</literal><emphasis>expression1</emphasis><literal>; </literal><emphasis>expression2</emphasis><literal>; </literal><emphasis>expression3</emphasis><literal>)); do </literal><emphasis>commands</emphasis><literal>; done</literal>”
|
||||
are evaluated. This does not apply to the regular
|
||||
for command,
|
||||
“<literal>for </literal><emphasis>variable</emphasis><literal> in </literal><emphasis>list</emphasis><literal>; do </literal><emphasis>commands</emphasis><literal>; done</literal>”.
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
<important>
|
||||
<para>
|
||||
Depending on the <application>bash</application> version, the
|
||||
above list may be incomplete.
|
||||
</para>
|
||||
<para>
|
||||
If faced with a situation where using such shell features
|
||||
appears necessary, see <xref
|
||||
linkend="sect-Defensive_Coding-Shell-Alternatives"/>.
|
||||
</para>
|
||||
</important>
|
||||
</section>
|
||||
<section id="sect-Defensive_Coding-Shell-Types">
|
||||
<title>Type declarations</title>
|
||||
<para>
|
||||
<application>bash</application> supports explicit type
|
||||
declarations for shell variables:
|
||||
</para>
|
||||
<informalexample>
|
||||
<programlisting language="Bash">
|
||||
declare -i integer_variable
|
||||
declare -a array_variable
|
||||
declare -A assoc_array_variable
|
||||
|
||||
typeset -i integer_variable
|
||||
typeset -a array_variable
|
||||
typeset -A assoc_array_variable
|
||||
|
||||
local -i integer_variable
|
||||
local -a array_variable
|
||||
local -A assoc_array_variable
|
||||
|
||||
readonly -i integer_variable
|
||||
readonly -a array_variable
|
||||
readonly -A assoc_array_variable
|
||||
</programlisting>
|
||||
</informalexample>
|
||||
<para>
|
||||
Variables can also be declared as arrays by assigning them an
|
||||
array expression, as in:
|
||||
</para>
|
||||
<informalexample>
|
||||
<programlisting language="Bash">
|
||||
array_variable=(1 2 3 4)
|
||||
</programlisting>
|
||||
</informalexample>
|
||||
<para>
|
||||
Some built-ins (such as <literal>mapfile</literal>) can
|
||||
implicitly create array variables.
|
||||
</para>
|
||||
<para>
|
||||
Such type declarations should not be used because assignment to
|
||||
such variables (independent of the concrete syntax used for the
|
||||
assignment) triggers arithmetic expansion (and thus double
|
||||
expansion) of the right-hand side of the assignment operation.
|
||||
See <xref linkend="sect-Defensive_Coding-Shell-Arithmetic"/>.
|
||||
</para>
|
||||
<para>
|
||||
Shell scripts which use integer or array variables should be
|
||||
rewritten in another, more suitable language. Se <xref
|
||||
linkend="sect-Defensive_Coding-Shell-Alternatives"/>.
|
||||
</para>
|
||||
</section>
|
||||
</section>
|
||||
<section id="sect-Defensive_Coding-Shell-Obscure">
|
||||
<title>Other obscurities</title>
|
||||
<para>
|
||||
Obscure shell language features should not be used. Examples are:
|
||||
</para>
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>
|
||||
Exported functions (<literal>export -f</literal> or
|
||||
<literal>declare -f</literal>).
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
Function names which are not valid variable names, such as
|
||||
“<literal>module::function</literal>”.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
The possibility to override built-ins or external commands
|
||||
with shell functions.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
Changing the value of the <envar>IFS</envar> variable to
|
||||
tokenize strings.
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
</section>
|
||||
</section>
|
||||
<section id="sect-Defensive_Coding-Shell-Invoke">
|
||||
<title>Invoking external commands</title>
|
||||
<para>
|
||||
When passing shell variables as single command line arguments,
|
||||
they should always be surrounded by double quotes. See
|
||||
<xref linkend="sect-Defensive_Coding-Shell-Parameter_Expansion"/>.
|
||||
</para>
|
||||
<para>
|
||||
Care is required when passing untrusted values as positional
|
||||
parameters to external commands. If the value starts with a hyphen
|
||||
“<literal>-</literal>”, it may be interpreted by the external
|
||||
command as an option. Depending on the external program, a
|
||||
“<literal>--</literal>” argument stops option processing and treats
|
||||
all following arguments as positional parameters. (Double quotes
|
||||
are completely invisible to the command being invoked, so they do
|
||||
not prevent variable values from being interpreted as options.)
|
||||
</para>
|
||||
<para>
|
||||
Cleaning the environment before invoking child processes is
|
||||
difficult to implement in script. <application>bash</application>
|
||||
keeps a hidden list of environment variables which do not correspond
|
||||
to shell variables, and unsetting them from within a
|
||||
<application>bash</application> script is not possible. To reset
|
||||
the environment, a script can re-run itself under the “<literal>env
|
||||
-i</literal>” command with an additional parameter which indicates
|
||||
the environment has been cleared and suppresses a further
|
||||
self-execution. Alternatively, individual commands can be executed
|
||||
with “<literal>env -i</literal>”.
|
||||
</para>
|
||||
<important>
|
||||
<para>
|
||||
Completely isolation from its original execution environment
|
||||
(which is required when the script is executed after a trust
|
||||
transition, e.g., triggered by the SUID mechanism) is impossible
|
||||
to achieve from within the shell script itself. Instead, the
|
||||
invoking process has to clear the process environment (except for
|
||||
few trusted variables) before running the shell script.
|
||||
</para>
|
||||
</important>
|
||||
<para>
|
||||
Checking for failures in executed external commands is recommended.
|
||||
If no elaborate error recovery is needed, invoking “<literal>set
|
||||
-e</literal>” may be sufficient. This causes the script to stop on
|
||||
the first failed command. However, failures in pipes
|
||||
(“<literal>command1 | command2</literal>”) are only detected for the
|
||||
last command in the pipe, errors in previous commands are ignored.
|
||||
This can be changed by invoking “<literal>set -o pipefail</literal>”.
|
||||
Due to architectural limitations, only the process that spawned
|
||||
the entire pipe can check for failures in individual commands;
|
||||
it is not possible for a process to tell if the process feeding
|
||||
data (or the process consuming data) exited normally or with
|
||||
an error.
|
||||
</para>
|
||||
<para>
|
||||
See <xref linkend="sect-Defensive_Coding-Tasks-Processes-Creation"/>
|
||||
for additional details on creating child processes.
|
||||
</para>
|
||||
</section>
|
||||
<section id="sect-Defensive_Coding-Shell-Temporary_Files">
|
||||
<title>Temporary files</title>
|
||||
<para>
|
||||
Temporary files should be created with the
|
||||
<literal>mktemp</literal> command, and temporary directories with
|
||||
“<literal>mktemp -d</literal>”.
|
||||
</para>
|
||||
<para>
|
||||
To clean up temporary files and directories, write a clean-up
|
||||
shell function and register it as a trap handler, as shown in
|
||||
<xref linkend="ex-Defensive_Coding-Tasks-Temporary_Files"/>.
|
||||
Using a separate function avoids issues with proper quoting of
|
||||
variables.
|
||||
</para>
|
||||
<example id="ex-Defensive_Coding-Tasks-Temporary_Files">
|
||||
<title>Creating and cleaning up temporary files</title>
|
||||
<informalexample>
|
||||
<programlisting language="Bash">
|
||||
tmpfile="$(mktemp)"
|
||||
|
||||
cleanup () {
|
||||
rm -f -- "$tmpfile"
|
||||
}
|
||||
|
||||
trap cleanup 0
|
||||
</programlisting>
|
||||
</informalexample>
|
||||
</example>
|
||||
</section>
|
||||
<section id="sect-Defensive_Coding-Shell-Edit_Guard">
|
||||
<title>Guarding shell scripts against changes</title>
|
||||
<para>
|
||||
<application>bash</application> only reads a shell script up to
|
||||
the point it is needed for executed the next command. This means
|
||||
that if script is overwritten while it is running, execution can
|
||||
jump to a random part of the script, depending on what is modified
|
||||
in the script and how the file offsets change as a result. (This
|
||||
behavior is needed to support self-extracting shell archives whose
|
||||
script part is followed by a stream of bytes which does not follow
|
||||
the shell language syntax.)
|
||||
</para>
|
||||
<para>
|
||||
Therefore, long-running scripts should be guarded against
|
||||
concurrent modification by putting as much of the program logic
|
||||
into a <literal>main</literal> function, and invoking the
|
||||
<literal>main</literal> function at the end of the script, using
|
||||
this syntax:
|
||||
</para>
|
||||
<informalexample>
|
||||
<programlisting language="Bash">
|
||||
main "$@" ; exit $?
|
||||
</programlisting>
|
||||
</informalexample>
|
||||
<para>
|
||||
This construct ensures that <application>bash</application> will
|
||||
stop execution after the <literal>main</literal> function, instead
|
||||
of opening the script file and trying to read more commands.
|
||||
</para>
|
||||
</section>
|
||||
</chapter>
|
Loading…
Add table
Add a link
Reference in a new issue