Removed non-Defensive Coding Guide bits and promoted source to root

2016-07-18 10:41:17 -04:00 · 2016-07-18 10:41:17 -04:00 · 9dc8a003e5
commit 9dc8a003e5
parent 9eb72b454b
402 changed files with 0 additions and 2049 deletions
--- a/en-US/Tasks-Processes.xml
+++ b/en-US/Tasks-Processes.xml
@ -0,0 +1,483 @@
+<?xml version='1.0' encoding='utf-8' ?>
+<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
+]>
+<chapter id="sect-Defensive_Coding-Tasks-Processes">
+  <title>Processes</title>
+
+  <section id="sect-Defensive_Coding-Tasks-Processes-Creation">
+    <title>Safe process creation</title>
+    <para>
+      This section describes how to create new child processes in a
+      safe manner.  In addition to the concerns addressed below, there
+      is the possibility of file descriptor leaks, see <xref
+      linkend="sect-Defensive_Coding-Tasks-Descriptors-Child_Processes"/>.
+    </para>
+    <section>
+      <title>Obtaining the program path and the command line
+      template</title>
+      <para>
+	The name and path to the program being invoked should be
+	hard-coded or controlled by a static configuration file stored
+	at a fixed location (at an file system absolute path).  The
+	same applies to the template for generating the command line.
+      </para>
+      <para>
+	The configured program name should be an absolute path.  If it
+	is a relative path, the contents of the <envar>PATH</envar>
+	must be obtained in a secure manner (see <xref
+	linkend="sect-Defensive_Coding-Tasks-secure_getenv"/>).
+	If the <envar>PATH</envar> variable is not set or untrusted,
+	the safe default <literal>/bin:/usr/bin</literal> must be
+	used.
+      </para>
+      <para>
+	If too much flexibility is provided here, it may allow
+	invocation of arbitrary programs without proper authorization.
+      </para>
+    </section>
+
+    <section id="sect-Defensive_Coding-Tasks-Processes-execve">
+      <title>Bypassing the shell</title>
+      <para>
+	Child processes should be created without involving the system
+	shell.
+      </para>
+      <para>
+	For C/C++, <function>system</function> should not be used.
+	The <function>posix_spawn</function> function can be used
+	instead, or a combination <function>fork</function> and
+	<function>execve</function>.  (In some cases, it may be
+	preferable to use <function>vfork</function> or the
+	Linux-specific <function>clone</function> system call instead
+	of <function>fork</function>.)
+      </para>
+      <para>
+	In Python, the <literal>subprocess</literal> module bypasses
+	the shell by default (when the <literal>shell</literal>
+	keyword argument is not set to true).
+	<function>os.system</function> should not be used.
+      </para>
+      <para>
+	The Java class <type>java.lang.ProcessBuilder</type> can be
+	used to create subprocesses without interference from the
+	system shell.
+      </para>
+      <important>
+	<title>Portability notice</title>
+	<para>
+	  On Windows, there is no argument vector, only a single
+	  argument string.  Each application is responsible for parsing
+	  this string into an argument vector.  There is considerable
+	  variance among the quoting style recognized by applications.
+	  Some of them expand shell wildcards, others do not.  Extensive
+	  application-specific testing is required to make this secure.
+	</para>
+      </important>
+      <para>
+	Note that some common applications (notably
+	<application>ssh</application>) unconditionally introduce the
+	use of a shell, even if invoked directly without a shell.  It is
+	difficult to use these applications in a secure manner.  In this
+	case, untrusted data should be supplied by other means.  For
+	example, standard input could be used, instead of the command
+	line.
+      </para>
+    </section>
+
+    <section id="sect-Defensive_Coding-Tasks-Processes-environ">
+      <title>Specifying the process environment</title>
+      <para>
+	Child processes should be created with a minimal set of
+	environment variables.  This is absolutely essential if there
+	is a trust transition involved, either when the parent process
+	was created, or during the creation of the child process.
+      </para>
+      <para>
+	In C/C++, the environment should be constructed as an array of
+	strings and passed as the <varname>envp</varname> argument to
+	<function>posix_spawn</function> or <function>execve</function>.
+	The functions <function>setenv</function>,
+	<function>unsetenv</function> and <function>putenv</function>
+	should not be used.  They are not thread-safe and suffer from
+	memory leaks.
+      </para>
+      <para>
+	Python programs need to specify a <literal>dict</literal> for
+	the the <varname>env</varname> argument of the
+	<function>subprocess.Popen</function> constructor.
+	The Java class <literal>java.lang.ProcessBuilder</literal>
+	provides a <function>environment()</function> method,
+	which returns a map that can be manipulated.
+      </para>
+      <para>
+	The following list provides guidelines for selecting the set
+	of environment variables passed to the child process.
+      </para>
+      <itemizedlist>
+	<listitem>
+	  <para>
+	    <envar>PATH</envar> should be initialized to
+	    <literal>/bin:/usr/bin</literal>.
+	  </para>
+	</listitem>
+	<listitem>
+	  <para>
+	    <envar>USER</envar> and <envar>HOME</envar> can be inhereted
+	    from the parent process environment, or they can be
+	    initialized from the <literal>pwent</literal> structure
+	    for the user. <!-- ??? refer to dropping privileges -->
+	  </para>
+	</listitem>
+	<listitem>
+	  <para>The <envar>DISPLAY</envar> and <envar>XAUTHORITY</envar>
+	  variables should be passed to the subprocess if it is an X
+	  program.  Note that this will typically not work across trust
+	  boundaries because <envar>XAUTHORITY</envar> refers to a file
+	  with <literal>0600</literal> permissions.
+	  </para>
+	</listitem>
+	<listitem>
+	  <para>
+	    The location-related environment variables
+	    <envar>LANG</envar>, <envar>LANGUAGE</envar>,
+	    <envar>LC_ADDRESS</envar>, <envar>LC_ALL</envar>,
+	    <envar>LC_COLLATE</envar>, <envar>LC_CTYPE</envar>,
+	    <envar>LC_IDENTIFICATION</envar>,
+	    <envar>LC_MEASUREMENT</envar>, <envar>LC_MESSAGES</envar>,
+	    <envar>LC_MONETARY</envar>, <envar>LC_NAME</envar>,
+	    <envar>LC_NUMERIC</envar>, <envar>LC_PAPER</envar>,
+	    <envar>LC_TELEPHONE</envar> and <envar>LC_TIME</envar>
+	    can be passed to the subprocess if present.
+	  </para>
+	</listitem>
+	<listitem>
+	  <para>
+	    The called process may need application-specific
+	    environment variables, for example for passing passwords.
+	    (See <xref
+	    linkend="sect-Defensive_Coding-Tasks-Processes-Command_Line_Visibility"/>.)
+	  </para>
+	</listitem>
+	<listitem>
+	  <para>
+	    All other environment variables should be dropped.  Names
+	    for new environment variables should not be accepted from
+	    untrusted sources.
+	  </para>
+	</listitem>
+      </itemizedlist>
+    </section>
+
+    <section>
+      <title>Robust argument list processing</title>
+      <para>
+	When invoking a program, it is sometimes necessary to include
+	data from untrusted sources.  Such data should be checked
+	against embedded <literal>NUL</literal> characters because the
+	system APIs will silently truncate argument strings at the first
+	<literal>NUL</literal> character.
+      </para>
+      <para>
+	The following recommendations assume that the program being
+	invoked uses GNU-style option processing using
+	<function>getopt_long</function>.  This convention is widely
+	used, but it is just that, and individual programs might
+	interpret a command line in a different way.
+      </para>
+      <para>
+	If the untrusted data has to go into an option, use the
+	<literal>--option-name=VALUE</literal> syntax, placing the
+	option and its value into the same command line argument.
+	This avoids any potential confusion if the data starts with
+	<literal>-</literal>.
+      </para>
+      <para>
+	For positional arguments, terminate the option list with a
+	single <option>--</option> marker after the last option, and
+	include the data at the right position.  The
+	<option>--</option> marker terminates option processing, and
+	the data will not be treated as an option even if it starts
+	with a dash.
+      </para>
+    </section>
+
+    <section id="sect-Defensive_Coding-Tasks-Processes-Command_Line_Visibility">
+      <title>Passing secrets to subprocesses</title>
+      <para>
+	The command line (the name of the program and its argument) of
+	a running process is traditionally available to all local
+	users.  The called program can overwrite this information, but
+	only after it has run for a bit of time, during which the
+	information may have been read by other processes.  However,
+	on Linux, the process environment is restricted to the user
+	who runs the process.  Therefore, if you need a convenient way
+	to pass a password to a child process, use an environment
+	variable, and not a command line argument.  (See <xref
+	linkend="sect-Defensive_Coding-Tasks-Processes-environ"/>.)
+      </para>
+      <important>
+	<title>Portability notice</title>
+	<para>
+	  On some UNIX-like systems (notably Solaris), environment
+	  variables can be read by any system user, just like command
+	  lines.
+	</para>
+      </important>
+      <para>
+	If the environment-based approach cannot be used due to
+	portability concerns, the data can be passed on standard
+	input.  Some programs (notably <application>gpg</application>)
+	use special file descriptors whose numbers are specified on
+	the command line.  Temporary files are an option as well, but
+	they might give digital forensics access to sensitive data
+	(such as passphrases) because it is difficult to safely delete
+	them in all cases.
+      </para>
+    </section>
+  </section>
+
+  <section>
+    <title>Handling child process termination</title>
+    <para>
+      When child processes terminate, the parent process is signalled.
+      A stub of the terminated processes (a
+      <emphasis>zombie</emphasis>, shown as
+      <literal>&lt;defunct&gt;</literal> by
+      <application>ps</application>) is kept around until the status
+      information is collected (<emphasis>reaped</emphasis>) by the
+      parent process.  Over the years, several interfaces for this
+      have been invented:
+    </para>
+    <itemizedlist>
+      <listitem>
+	<para>
+	  The parent process calls <function>wait</function>,
+	  <function>waitpid</function>, <function>waitid</function>,
+	  <function>wait3</function> or <function>wait4</function>,
+	  without specifying a process ID.  This will deliver any
+	  matching process ID.  This approach is typically used from
+	  within event loops.
+	</para>
+      </listitem>
+      <listitem>
+	<para>
+	  The parent process calls <function>waitpid</function>,
+	  <function>waitid</function>, or <function>wait4</function>,
+	  with a specific process ID.  Only data for the specific
+	  process ID is returned.  This is typically used in code
+	  which spawns a single subprocess in a synchronous manner.
+	</para>
+      </listitem>
+      <listitem>
+	<para>
+	  The parent process installs a handler for the
+	  <literal>SIGCHLD</literal> signal, using
+	  <function>sigaction</function>, and specifies to the
+	  <literal>SA_NOCLDWAIT</literal> flag.
+	  This approach could be used by event loops as well.
+	</para>
+      </listitem>
+    </itemizedlist>
+    <para>
+      None of these approaches can be used to wait for child process
+      terminated in a completely thread-safe manner.  The parent
+      process might execute an event loop in another thread, which
+      could pick up the termination signal.  This means that libraries
+      typically cannot make free use of child processes (for example,
+      to run problematic code with reduced privileges in a separate
+      address space).
+    </para>
+    <para>
+      At the moment, the parent process should explicitly wait for
+      termination of the child process using
+      <function>waitpid</function> or <function>waitid</function>,
+      and hope that the status is not collected by an event loop
+      first.
+    </para>
+  </section>
+
+  <section>
+    <title><literal>SUID</literal>/<literal>SGID</literal>
+    processes</title>
+    <!-- ??? need to document real vs effective UID -->
+    <para>
+      Programs can be marked in the file system to indicate to the
+      kernel that a trust transition should happen if the program is
+      run.  The <literal>SUID</literal> file permission bit indicates
+      that an executable should run with the effective user ID equal
+      to the owner of the executable file.  Similarly, with the
+      <literal>SGID</literal> bit, the effective group ID is set to
+      the group of the executable file.
+    </para>
+    <para>
+      Linux supports <emphasis>fscaps</emphasis>, which can grant
+      additional capabilities to a process in a finer-grained manner.
+      Additional mechanisms can be provided by loadable security
+      modules.
+    </para>
+    <para>
+      When such a trust transition has happened, the process runs in a
+      potentially hostile environment.  Additional care is necessary
+      not to rely on any untrusted information.  These concerns also
+      apply to libraries which can be linked into such processes.
+    </para>
+
+    <section id="sect-Defensive_Coding-Tasks-secure_getenv">
+      <title>Accessing environment variables</title>
+      <para>
+	The following steps are required so that a program does not
+	accidentally pick up untrusted data from environment
+	variables.
+      </para>
+      <itemizedlist>
+	<listitem><para>
+	  Compile your C/C++ sources with <literal>-D_GNU_SOURCE</literal>.
+	  The Autoconf macro <literal>AC_GNU_SOURCE</literal> ensures this.
+	</para></listitem>
+	<listitem><para>
+	  Check for the presence of the <function>secure_getenv</function>
+	  and <function>__secure_getenv</function> function.  The Autoconf
+	  directive <literal>AC_CHECK_FUNCS([__secure_getenv secure_getenv])</literal>
+	  performs these checks.
+	</para></listitem>
+	<listitem><para>
+	  Arrange for a proper definition of the
+	  <function>secure_getenv</function> function.  See <xref
+	  linkend="ex-Defensive_Coding-Tasks-secure_getenv"/>.
+	</para></listitem>
+	<listitem><para>
+	  Use <function>secure_getenv</function> instead of
+	  <function>getenv</function> to obtain the value of critical
+	  environment variables.  <function>secure_getenv</function>
+	  will pretend the variable has not bee set if the process
+	  environment is not trusted.
+	</para></listitem>
+      </itemizedlist>
+      <para>
+	Critical environment variables are debugging flags,
+	configuration file locations, plug-in and log file locations,
+	and anything else that might be used to bypass security
+	restrictions or cause a privileged process to behave in an
+	unexpected way.
+      </para>
+      <para>
+	Either the <function>secure_getenv</function> function or the
+	<function>__secure_getenv</function> is available from GNU libc.
+      </para>
+      <example id="ex-Defensive_Coding-Tasks-secure_getenv">
+	<title>Obtaining a definition for <function>secure_getenv</function></title>
+	<programlisting language="C">
+<![CDATA[
+#include <stdlib.h>
+
+#ifndef HAVE_SECURE_GETENV
+#  ifdef HAVE__SECURE_GETENV
+#    define secure_getenv __secure_getenv
+#  else
+#    error neither secure_getenv nor __secure_getenv are available
+#  endif
+#endif
+]]>
+	</programlisting>
+      </example>
+    </section>
+  </section>
+
+  <section id="sect-Defensive_Coding-Tasks-Processes-Daemons">
+    <title>Daemons</title>
+    <para>
+      Background processes providing system services
+      (<emphasis>daemons</emphasis>) need to decouple themselves from
+      the controlling terminal and the parent process environment:
+    </para>
+    <itemizedlist>
+      <listitem>
+	<para>Fork.</para>
+      </listitem>
+      <listitem>
+	<para>
+	  In the child process, call <function>setsid</function>.  The
+	  parent process can simply exit (using
+	  <function>_exit</function>, to avoid running clean-up
+	  actions twice).
+	</para>
+      </listitem>
+      <listitem>
+	<para>
+	  In the child process, fork again.  Processing continues in
+	  the child process.  Again, the parent process should just
+	  exit.
+	</para>
+      </listitem>
+      <listitem>
+	<para>
+	  Replace the descriptors 0, 1, 2 with a descriptor for
+	  <filename>/dev/null</filename>.  Logging should be
+	  redirected to <application>syslog</application>.
+	</para>
+      </listitem>
+    </itemizedlist>
+    <para>
+      Older instructions for creating daemon processes recommended a
+      call to <literal>umask(0)</literal>.  This is risky because it
+      often leads to world-writable files and directories, resulting
+      in security vulnerabilities such as arbitrary process
+      termination by untrusted local users, or log file truncation.
+      If the <emphasis>umask</emphasis> needs setting, a restrictive
+      value such as <literal>027</literal> or <literal>077</literal>
+      is recommended.
+    </para>
+    <para>
+      Other aspects of the process environment may have to changed as
+      well (environment variables, signal handler disposition).
+    </para>
+    <para>
+      It is increasingly common that server processes do not run as
+      background processes, but as regular foreground process under a
+      supervising master process (such as
+      <application>systemd</application>).  Server processes should
+      offer a command line option which disables forking and
+      replacement of the standard output and standard error streams.
+      Such an option is also useful for debugging.
+    </para>
+  </section>
+
+  <section>
+    <title>Semantics of command line arguments</title>
+    <!-- ??? This applies in two ways, safely calling an other process
+         and support for being called safely.  Also need to address
+         untrusted current directory on USB sticks.  -->
+    <para>
+      After process creation and option processing, it is up to the
+      child process to interpret the arguments. Arguments can be
+      file names, host names, or URLs, and many other things.  URLs
+      can refer to the local network, some server on the Internet,
+      or to the local file system.  Some applications even accept
+      arbitrary code in arguments (for example,
+      <application>python</application> with the
+      <option>-c</option> option).
+    </para>
+    <para>
+      Similar concerns apply to environment variables, the contents
+      of the current directory and its subdirectories.
+      <!-- ??? refer to section on temporary directories -->
+    </para>
+    <para>
+      Consequently, careful analysis is required if it is safe to
+      pass untrusted data to another program.
+    </para>
+  </section>
+
+  <section id="sect-Defensive_Coding-Tasks-Processes-Fork-Parallel">
+    <title><function>fork</function> as a primitive for parallelism</title>
+    <para>
+      A call to <function>fork</function> which is not immediately
+      followed by a call to <function>execve</function> (perhaps after
+      rearranging and closing file descriptors) is typically unsafe,
+      especially from a library which does not control the state of
+      the entire process.  Such use of <function>fork</function>
+      should be replaced with proper child processes or threads.
+    </para>
+  </section>
+
+</chapter>