:experimental:
|
||
:toc:
|
||
|
||
include::partial$entities.adoc[]
|
||
|
||
= The C Programming Language
|
||
|
||
[[sect-Defensive_Coding-C-Language]]
|
||
== The Core Language
|
||
|
||
C provides no memory safety. Most recommendations in this section
|
||
deal with this aspect of the language.
|
||
|
||
[[sect-Defensive_Coding-C-Undefined]]
|
||
=== Undefined Behavior
|
||
|
||
Some C constructs are defined to be undefined by the C standard.
|
||
This does not only mean that the standard does not describe
|
||
what happens when the construct is executed. It also allows
|
||
optimizing compilers such as GCC to assume that this particular
|
||
construct is never reached. In some cases, this has caused
|
||
GCC to optimize security checks away. (This is not a flaw in GCC
|
||
or the C language. But C certainly has some areas which are more
|
||
difficult to use than others.)
|
||
|
||
Common sources of undefined behavior are:
|
||
|
||
* out-of-bounds array accesses
|
||
|
||
* null pointer dereferences
|
||
|
||
* overflow in signed integer arithmetic
|
||
|
||
[[sect-Defensive_Coding-C-Pointers]]
|
||
=== Recommendations for Pointers and Array Handling
|
||
|
||
Always keep track of the size of the array you are working with.
|
||
Often, code is more obviously correct when you keep a pointer
|
||
past the last element of the array, and calculate the number of
|
||
remaining elements by subtracting the current position from
|
||
that pointer. The alternative, updating a separate variable
|
||
every time the position is advanced, is usually less
|
||
obviously correct.
|
||
|
||
<<ex-Defensive_Coding-C-Pointers-remaining>>
|
||
shows how to extract Pascal-style strings from a character
|
||
buffer. The two pointers kept for length checks are
|
||
`inend` and `outend`.
|
||
`inp` and `outp` are the
|
||
respective positions.
|
||
The number of input bytes is checked using the expression
|
||
`len > (size_t)(inend - inp)`.
|
||
The cast silences a compiler warning; it is safe because
`inend` is never smaller than `inp`, so the
difference cannot be negative.
|
||
|
||
[[ex-Defensive_Coding-C-Pointers-remaining]]
|
||
.Array processing in C
|
||
====
|
||
|
||
[source,c]
|
||
----
|
||
include::example$C-Pointers-remaining.adoc[]
|
||
|
||
----
|
||
|
||
====
|
||
|
||
It is important that the length checks always have the form
|
||
`len > (size_t)(inend - inp)`, where
|
||
`len` is a variable of type
|
||
`size_t` which denotes the *total*
|
||
number of bytes which are about to be read or written next. In
|
||
general, it is not safe to fold multiple such checks into one,
|
||
as in `len1 + len2 > (size_t)(inend - inp)`,
|
||
because the expression on the left can overflow or wrap around
|
||
(see <<sect-Defensive_Coding-C-Arithmetic>>), and it
|
||
no longer reflects the number of bytes to be processed.
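
As a minimal sketch of keeping such checks separate, the helper below tests two lengths one after the other instead of adding them first. The name `have_room` and its parameters are hypothetical and are not taken from the example above.

[source,c]
----
#include <stdbool.h>
#include <stddef.h>

/* Returns true if len1 bytes followed by len2 bytes fit between inp and
   inend.  Requires inp <= inend, both pointing into (or just past) the
   same array. */
static bool
have_room(const char *inp, const char *inend, size_t len1, size_t len2)
{
  size_t remaining = (size_t)(inend - inp);
  if (len1 > remaining)
    return false;
  /* len1 <= remaining, so this subtraction cannot wrap around. */
  if (len2 > remaining - len1)
    return false;
  return true;
}
----

Each comparison has a single length on the left-hand side, so no addition can wrap around before the check takes place.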
|
||
|
||
[[sect-Defensive_Coding-C-Arithmetic]]
|
||
=== Recommendations for Integer Arithmetic
|
||
|
||
Overflow in signed integer arithmetic is undefined. This means
|
||
that it is not possible to check for overflow after it has happened;
see <<ex-Defensive_Coding-C-Arithmetic-bad>>.
|
||
|
||
[[ex-Defensive_Coding-C-Arithmetic-bad]]
|
||
.Incorrect overflow detection in C
|
||
====
|
||
|
||
[source,c]
|
||
----
|
||
include::example$C-Arithmetic-add.adoc[]
|
||
|
||
----
|
||
|
||
====
|
||
|
||
The following approaches can be used to check for overflow,
|
||
without actually causing it.
|
||
|
||
* Use a wider type to perform the calculation, check that the
|
||
result is within bounds, and convert the result to the
|
||
original type. All intermediate results must be checked in
|
||
this way.
|
||
|
||
* Perform the calculation in the corresponding unsigned type
|
||
and use bit fiddling to detect the overflow.
|
||
<<ex-Defensive_Coding-C-Arithmetic-add_unsigned>>
|
||
shows how to perform an overflow check for unsigned integer
|
||
addition. For three or more terms, all the intermediate
|
||
additions have to be checked in this way.
|
||
|
||
[[ex-Defensive_Coding-C-Arithmetic-add_unsigned]]
|
||
.Overflow checking for unsigned addition
|
||
====
|
||
|
||
[source,c]
|
||
----
|
||
include::example$C-Arithmetic-add_unsigned.adoc[]
|
||
----
|
||
|
||
====
|
||
|
||
* Compute bounds for acceptable input values which are known
|
||
to avoid overflow, and reject other values. This is the
|
||
preferred way for overflow checking on multiplications,
|
||
see <<ex-Defensive_Coding-C-Arithmetic-mult>>.
|
||
|
||
[[ex-Defensive_Coding-C-Arithmetic-mult]]
|
||
.Overflow checking for unsigned multiplication
|
||
====
|
||
|
||
[source,c]
|
||
----
|
||
include::example$C-Arithmetic-mult.adoc[]
|
||
----
|
||
|
||
====
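
The first approach in the list above (performing the calculation in a wider type) could look like the following sketch. The function name `add_int32_checked` is hypothetical; the sketch assumes that the fixed-width types from `<stdint.h>` are available.

[source,c]
----
#include <stdbool.h>
#include <stdint.h>

/* Adds two 32-bit signed values using a 64-bit intermediate result.
   Returns false, leaving *result untouched, if the sum does not fit. */
static bool
add_int32_checked(int32_t a, int32_t b, int32_t *result)
{
  int64_t wide = (int64_t)a + (int64_t)b;  /* cannot overflow in 64 bits */
  if (wide < INT32_MIN || wide > INT32_MAX)
    return false;
  *result = (int32_t)wide;
  return true;
}
----

The range check happens before the conversion back to the narrow type, so the conversion itself never has to deal with an out-of-range value.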
|
||
|
||
Basic arithmetic operations are commutative, so for bounds checks,
|
||
there are two different but mathematically equivalent
|
||
expressions. Sometimes, one of the expressions results in
|
||
better code because parts of it can be reduced to a constant.
|
||
This applies to overflow checks for multiplication `a *
|
||
b` involving a constant `a`, where the
|
||
expression is reduced to `b > C` for some
|
||
constant `C` determined at compile time. The
|
||
other expression, `b && a > ((unsigned)-1) /
|
||
b`, is more difficult to optimize at compile time.
|
||
|
||
When a value is converted to a signed integer, GCC always
|
||
chooses the result based on 2's complement arithmetic. This GCC
|
||
extension (which is also implemented by other compilers) helps a
|
||
lot when implementing overflow checks.
|
||
|
||
Sometimes, it is necessary to compare unsigned and signed
|
||
integer variables. This results in a compiler warning,
|
||
*comparison between signed and unsigned integer
|
||
expressions*, because the comparison often gives
|
||
unexpected results for negative values. When adding a cast,
|
||
make sure that negative values are covered properly. If the
|
||
bound is unsigned and the checked quantity is signed, you should
|
||
cast the checked quantity to an unsigned type at least as wide
|
||
as either operand type. As a result, negative values will fail
|
||
the bounds check. (You can still check for negative values
|
||
separately for clarity, and the compiler will optimize away this
|
||
redundant check.)
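
A minimal sketch of such a cast follows. The helper name `index_in_bounds` is hypothetical; `uintmax_t` is used here only because it is at least as wide as both operand types.

[source,c]
----
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if the signed index is a valid position in an array of
   'size' elements. */
static bool
index_in_bounds(long index, size_t size)
{
  /* A negative index converts to a very large unsigned value and
     therefore fails the bounds check. */
  return (uintmax_t)index < size;
}
----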
|
||
|
||
Legacy code should be compiled with the [option]`-fwrapv`
|
||
GCC option. As a result, GCC will provide 2's complement
|
||
semantics for integer arithmetic, including defined behavior on
|
||
integer overflow.
|
||
|
||
[[sect-Defensive_Coding-C-Globals]]
|
||
=== Global Variables
|
||
|
||
Global variables should be avoided because they usually lead to
|
||
thread safety hazards. In any case, they should be declared
|
||
`static`, so that access is restricted to a
|
||
single translation unit.
|
||
|
||
Global constants are not a problem, but declaring them can be
|
||
tricky. <<ex-Defensive_Coding-C-Globals-String_Array>>
|
||
shows how to declare a constant array of constant strings.
|
||
The second `const` is needed to make the
|
||
array constant, and not just the strings. It must be placed
|
||
after the `*`, and not before it.
|
||
|
||
[[ex-Defensive_Coding-C-Globals-String_Array]]
|
||
.Declaring a constant array of constant strings
|
||
====
|
||
|
||
[source,c]
|
||
----
|
||
include::example$C-Globals-String_Array.adoc[]
|
||
|
||
----
|
||
|
||
====
|
||
|
||
Sometimes, static variables local to functions are used as a
|
||
replacement for proper memory management. Unlike non-static
|
||
local variables, it is possible to return a pointer to static
|
||
local variables to the caller. Such variables are
|
||
well-hidden, but effectively global (just as static variables at
|
||
file scope). It is difficult to add thread safety afterwards if
|
||
such interfaces are used. Merely dropping the
|
||
`static` keyword in such cases leads to
|
||
undefined behavior.
|
||
|
||
Another source for static local variables is a desire to reduce
|
||
stack space usage on embedded platforms, where the stack may
|
||
span only a few hundred bytes. If this is the only reason why
|
||
the `static` keyword is used, it can just be
|
||
dropped, unless the object is very large (larger than
|
||
128 kilobytes on 32-bit platforms). In the latter case, it is
|
||
recommended to allocate the object using
|
||
`malloc`, to obtain proper array checking, for
|
||
the same reasons outlined in <<sect-Defensive_Coding-C-Allocators-alloca>>.
|
||
|
||
[[sect-Defensive_Coding-C-Libc]]
|
||
== The C Standard Library
|
||
|
||
Parts of the C standard library (and the UNIX and GNU extensions)
|
||
are difficult to use, so you should avoid them.
|
||
|
||
Please check the applicable documentation before using the
|
||
recommended replacements. Many of these functions allocate
|
||
buffers using `malloc` which your code must
|
||
deallocate explicitly using `free`.
|
||
|
||
[[sect-Defensive_Coding-C-Absolutely-Banned]]
|
||
=== Absolutely Banned Interfaces
|
||
|
||
The functions listed below must not be used because they are
|
||
almost always unsafe. Use the indicated replacements instead.
|
||
|
||
* `gets`
|
||
⟶ `fgets`
|
||
|
||
* `getwd`
|
||
⟶ `getcwd`
|
||
or `get_current_dir_name`
|
||
|
||
* `readdir_r` ⟶ `readdir`
|
||
|
||
* `realpath` (with a non-NULL second parameter)
|
||
⟶ `realpath` with NULL as the second parameter,
|
||
or `canonicalize_file_name`
|
||
|
||
The constants listed below must not be used, either. Instead,
|
||
code must allocate memory dynamically and use interfaces with
|
||
length checking.
|
||
|
||
* `NAME_MAX` (limit not actually enforced by
|
||
the kernel)
|
||
|
||
* `PATH_MAX` (limit not actually enforced by
|
||
the kernel)
|
||
|
||
* `_PC_NAME_MAX` (This limit, returned by the
|
||
`pathconf` function, is not enforced by
|
||
the kernel.)
|
||
|
||
* `_PC_PATH_MAX` (This limit, returned by the
|
||
`pathconf` function, is not enforced by
|
||
the kernel.)
|
||
|
||
The following structure members must not be used.
|
||
|
||
* `f_namemax` in `struct
|
||
statvfs` (limit not actually enforced by the kernel,
|
||
see `_PC_NAME_MAX` above)
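
As one possible illustration of allocating dynamically instead of relying on `PATH_MAX`, the sketch below calls `getcwd` with a growing buffer. The function name `current_directory` and the starting size are arbitrary choices made for this example.

[source,c]
----
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Returns the current directory in a malloc-allocated string, or NULL on
   error.  The caller must free the result. */
static char *
current_directory(void)
{
  size_t size = 128;            /* arbitrary starting size */
  char *buf = NULL;
  for (;;) {
    char *newbuf = realloc(buf, size);
    if (newbuf == NULL) {
      free(buf);                /* realloc failed, buf is still valid */
      return NULL;
    }
    buf = newbuf;
    if (getcwd(buf, size) != NULL)
      return buf;
    /* ERANGE means the buffer was too small; anything else is fatal. */
    if (errno != ERANGE || size > (size_t)-1 / 2) {
      free(buf);
      return NULL;
    }
    size *= 2;
  }
}
----

On GNU systems, `get_current_dir_name` already provides this behavior, so a loop like this is mainly of interest where that function is not available.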
|
||
|
||
[[sect-Defensive_Coding-C-Avoid]]
|
||
=== Functions to Avoid
|
||
|
||
The following string manipulation functions can be used securely
|
||
in principle, but their use should be avoided because they are
|
||
difficult to use correctly. Calls to these functions can be
|
||
replaced with `asprintf` or
|
||
`vasprintf`. (For non-GNU targets, these
|
||
functions are available from Gnulib.) In some cases, the
|
||
`snprintf` function might be a suitable
|
||
replacement, see <<sect-Defensive_Coding-C-String-Functions-Length>>.
|
||
|
||
* `sprintf`
|
||
|
||
* `strcat`
|
||
|
||
* `strcpy`
|
||
|
||
* `vsprintf`
|
||
|
||
Use the indicated replacements for the functions below.
|
||
|
||
* `alloca` ⟶
|
||
`malloc` and `free`
|
||
(see <<sect-Defensive_Coding-C-Allocators-alloca>>)
|
||
|
||
* `putenv` ⟶
|
||
explicit `envp` argument in process creation
|
||
(see xref:tasks/Tasks-Processes.adoc#sect-Defensive_Coding-Tasks-Processes-environ[Specifying the Process Environment])
|
||
|
||
* `setenv` ⟶
|
||
explicit `envp` argument in process creation
|
||
(see xref:tasks/Tasks-Processes.adoc#sect-Defensive_Coding-Tasks-Processes-environ[Specifying the Process Environment])
|
||
|
||
* `strdupa` ⟶
|
||
`strdup` and `free`
|
||
(see <<sect-Defensive_Coding-C-Allocators-alloca>>)
|
||
|
||
* `strndupa` ⟶
|
||
`strndup` and `free`
|
||
(see <<sect-Defensive_Coding-C-Allocators-alloca>>)
|
||
|
||
* `system` ⟶
|
||
`posix_spawn`
|
||
or `fork`pass:attributes[{blank}]/pass:attributes[{blank}]`execve`pass:attributes[{blank}]/
|
||
(see xref:tasks/Tasks-Processes.adoc#sect-Defensive_Coding-Tasks-Processes-execve[Bypassing the Shell])
|
||
|
||
* `unsetenv` ⟶
|
||
explicit `envp` argument in process creation
|
||
(see xref:tasks/Tasks-Processes.adoc#sect-Defensive_Coding-Tasks-Processes-environ[Specifying the Process Environment])
|
||
|
||
[[sect-Defensive_Coding-C-String-Functions-Length]]
|
||
=== String Functions with Explicit Length Arguments
|
||
|
||
The C run-time library provides string manipulation functions
|
||
which do not just look for NUL characters for string termination,
|
||
but also honor explicit lengths provided by the caller.
|
||
However, these functions evolved over a long period of time, and
|
||
the lengths mean different things depending on the function.
|
||
|
||
[[sect-Defensive_Coding-C-Libc-snprintf]]
|
||
==== `snprintf`
|
||
|
||
The `snprintf` function provides a way to
|
||
construct a string in a statically-sized buffer. (If the buffer
is allocated on the heap, consider using
`asprintf` instead.)
|
||
|
||
[source,c]
|
||
----
|
||
include::example$C-String-Functions-snprintf.adoc[]
|
||
|
||
----
|
||
|
||
The second argument to the `snprintf` call
|
||
should always be the size of the buffer in the first argument
|
||
(which should be a character array). Elaborate pointer and
|
||
length arithmetic can introduce errors and nullify the
|
||
security benefits of `snprintf`.
|
||
|
||
In particular, `snprintf` is not well-suited
|
||
to constructing a string iteratively, by appending to an
|
||
existing buffer. `snprintf` returns one of
|
||
two values, `-1` on errors, or the number of
|
||
characters which *would have been written to the
|
||
buffer if the buffer were large enough*. This means
|
||
that adding the result of `snprintf` to the
|
||
buffer pointer to skip over the characters just written is
|
||
incorrect and risky. However, as long as the length argument
|
||
is not zero, the buffer will remain null-terminated. <<ex-Defensive_Coding-C-String-Functions-snprintf-incremental>>
|
||
works because `end - current > 0` is a loop
|
||
invariant. After the loop, the result string is in the
|
||
`buf` variable.
|
||
|
||
[[ex-Defensive_Coding-C-String-Functions-snprintf-incremental]]
|
||
.Repeatedly writing to a buffer using `snprintf`
|
||
====
|
||
|
||
[source,c]
|
||
----
|
||
include::example$C-String-Functions-snprintf-incremental.adoc[]
|
||
|
||
----
|
||
|
||
====
|
||
|
||
If you want to avoid the call to `strlen`
|
||
for performance reasons, you have to check for a negative
|
||
return value from `snprintf` and also check
|
||
if the return value is equal to the specified buffer length or
|
||
larger. Only if neither condition applies may you advance
the pointer to the start of the write buffer by the number
returned by `snprintf`. However, this
|
||
optimization is rarely worthwhile.
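
For illustration only, the two checks described above could be written as follows. The helper `append_int` and its interface are hypothetical.

[source,c]
----
#include <stddef.h>
#include <stdio.h>

/* Appends the decimal form of 'value' at *current, never writing at or
   past 'end'.  Returns 0 and advances *current on success.  Returns -1 on
   error or truncation and leaves *current unchanged.
   Requires *current < end. */
static int
append_int(char **current, char *end, int value)
{
  size_t avail = (size_t)(end - *current);
  int ret = snprintf(*current, avail, "%d", value);
  if (ret < 0 || (size_t)ret >= avail)
    return -1;                  /* error, or output did not fit */
  *current += ret;              /* safe: ret < avail */
  return 0;
}
----

After a truncated call, the buffer still holds a null-terminated prefix of the output starting at `*current`; the caller has to decide whether to keep or discard it.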
|
||
|
||
Note that it is not permitted to use the same buffer both as
|
||
the destination and as a source argument.
|
||
|
||
[[sect-Defensive_Coding-C-Libc-vsnprintf]]
|
||
==== `vsnprintf` and Format Strings
|
||
|
||
If you use `vsnprintf` (or
|
||
`vasprintf` or even
|
||
`snprintf`) with a format string which is
|
||
not a constant, but a function argument, it is important to
|
||
annotate the function with a `format`
|
||
function attribute, so that GCC can warn about misuse of your
|
||
function (see <<ex-Defensive_Coding-C-String-Functions-format-Attribute>>).
|
||
|
||
[[ex-Defensive_Coding-C-String-Functions-format-Attribute]]
|
||
.The `format` function attribute
|
||
====
|
||
|
||
[source,c]
|
||
----
|
||
include::example$C-String-Functions-format.adoc[]
|
||
|
||
----
|
||
|
||
====
|
||
|
||
[[sect-Defensive_Coding-C-Libc-strncpy]]
|
||
==== `strncpy`
|
||
|
||
The `strncpy` function does not ensure that
|
||
the target buffer is null-terminated. A common idiom for
|
||
ensuring NUL termination is:
|
||
|
||
[source,c]
|
||
----
|
||
include::example$C-String-Functions-strncpy.adoc[]
|
||
|
||
----
|
||
|
||
Another approach uses the `strncat`
|
||
function for this purpose:
|
||
|
||
[source,c]
|
||
----
|
||
include::example$C-String-Functions-strncat-as-strncpy.adoc[]
|
||
|
||
----
|
||
|
||
[[sect-Defensive_Coding-C-Libc-strncat]]
|
||
==== `strncat`
|
||
|
||
The length argument of the `strncat`
|
||
function specifies the maximum number of characters copied
|
||
from the source buffer, excluding the terminating NUL
|
||
character. This means that the required number of bytes in
|
||
the destination buffer is the length of the original string,
|
||
plus the length argument in the `strncat`
|
||
call, plus one. Consequently, this function is rarely
|
||
appropriate for performing a length-checked string operation,
|
||
with the notable exception of the `strcpy`
|
||
emulation described in <<sect-Defensive_Coding-C-Libc-strncpy>>.
|
||
|
||
To implement a length-checked string append, you can use an
|
||
approach similar to <<ex-Defensive_Coding-C-String-Functions-snprintf-incremental>>:
|
||
|
||
[source,c]
|
||
----
|
||
include::example$C-String-Functions-strncat-emulation.adoc[]
|
||
|
||
----
|
||
|
||
In many cases, including this one, the string concatenation
|
||
can be avoided by combining everything into a single format
|
||
string:
|
||
|
||
[source,c]
|
||
----
|
||
include::example$C-String-Functions-strncat-merged.adoc[]
|
||
|
||
----
|
||
|
||
But you must not dynamically construct format strings
|
||
to avoid concatenation because this would prevent GCC from
|
||
type-checking the argument lists.
|
||
|
||
It is not possible to use format strings like
|
||
`"%s%s"` to implement concatenation, unless
|
||
you use separate buffers. `snprintf` does
|
||
not support overlapping source and target strings.
|
||
|
||
==== `strlcpy` and `strlcat`
|
||
|
||
Some systems support `strlcpy` and
`strlcat` functions, which take the total size of the destination
buffer and NUL-terminate the result whenever that size is not zero.
GNU libc did not provide these functions before version 2.38.
`strlcpy` is often replaced with
`snprintf` with a `"%s"`
format string. See <<sect-Defensive_Coding-C-Libc-snprintf>> for a caveat
related to the `snprintf` return value.
|
||
|
||
To emulate `strlcat`, use the approach
|
||
described in <<sect-Defensive_Coding-C-Libc-strncat>>.
|
||
|
||
==== ISO C11 Annex K *pass:attributes[{blank}]`_s` functions
|
||
|
||
ISO C11 adds another set of length-checking functions, but GNU
|
||
libc currently does not implement them.
|
||
|
||
==== Other `strn*` and `stpn*` functions
|
||
|
||
GNU libc contains additional functions with different variants
|
||
of length checking. Consult the documentation before using
|
||
them to find out what the length actually means.
|
||
|
||
=== Using tricky syscalls or library functions
|
||
==== `readlink`
|
||
This is the hardest system call to use correctly because of everything you have to do:

* The buffer should be of `PATH_MAX` length, which includes space for the terminating NUL character.
* The `bufsize` argument should be `sizeof(buf) - 1`.
* The `readlink` return value should be stored in a signed integer (ideally of type `ssize_t`).
* The return value should be checked for `< 0`, which indicates an error.
* The caller needs to NUL-terminate the buffer using the returned value as an index.
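
The steps above can be sketched as a small helper. The name `read_link_target` is hypothetical, and treating a completely filled buffer as a failure is an extra assumption of this sketch, made because `readlink` truncates silently.

[source,c]
----
#include <sys/types.h>
#include <unistd.h>

/* Reads the target of the symbolic link 'path' into buf (bufsize bytes in
   total) and NUL-terminates it.  Returns the length of the target, or -1
   on error or if the target may have been truncated. */
static ssize_t
read_link_target(const char *path, char *buf, size_t bufsize)
{
  if (bufsize < 2)
    return -1;
  /* Reserve one byte for the terminating NUL character. */
  ssize_t ret = readlink(path, buf, bufsize - 1);
  if (ret < 0)
    return -1;
  if ((size_t)ret == bufsize - 1)
    return -1;                  /* buffer full: possibly truncated */
  buf[ret] = '\0';
  return ret;
}
----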
|
||
|
||
==== `chroot`
|
||
* The target directory should be writable only by root (which implies that root owns it).
* You must call `chdir` immediately after `chroot`; otherwise, you are not really in the changed root.
|
||
|
||
==== `stat`, `lstat`, `fstatat`
|
||
* These functions have an inherent race because they operate on a path name, which could change in the meantime. Where `stat` would be used, opening the file and calling `fstat` on the descriptor is recommended.
* If the `S_ISLNK` macro is used, the stat buffer MUST come from `lstat` or from `fstatat` with `AT_SYMLINK_NOFOLLOW`.
* If you are doing something really important, call `fstat` after opening and compare the before and after stat buffers before trusting them.
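
A minimal sketch of the `fstat` recommendation: the file is opened first, and the check is then made on the descriptor, so it describes the file that was actually opened. The function name `open_regular_file` is hypothetical.

[source,c]
----
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Opens 'path' read-only and returns the descriptor only if it refers to
   a regular file.  Returns -1 otherwise. */
static int
open_regular_file(const char *path)
{
  int fd = open(path, O_RDONLY);
  if (fd < 0)
    return -1;
  struct stat st;
  /* fstat examines the open file, not whatever the path names now. */
  if (fstat(fd, &st) != 0 || !S_ISREG(st.st_mode)) {
    close(fd);
    return -1;
  }
  return fd;
}
----

Depending on the threat model, additional `open` flags such as `O_NOFOLLOW` or `O_NONBLOCK` may be appropriate; they are left out here to keep the sketch short.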
|
||
|
||
==== `setgid`, `setuid`

* Call these in the right order: groups first, then the uid.
* Always check the return code.
* If `setgid` and `setuid` are used, supplementary groups are not reset. This must be done with `setgroups` or `initgroups` before the uid change.
|
||
|
||
[[sect-Defensive_Coding-C-Allocators]]
|
||
== Memory Allocators
|
||
|
||
=== `malloc` and Related Functions
|
||
|
||
The C library interfaces for memory allocation are provided by
|
||
`malloc`, `free` and
|
||
`realloc`, and the
|
||
`calloc` function. In addition to these
|
||
generic functions, there are derived functions such as
|
||
`strdup` which perform allocation using
|
||
`malloc` internally, but do not return
|
||
untyped heap memory (which could be used for any object).
|
||
|
||
The C compiler knows about these functions and can use their
|
||
expected behavior for optimizations. For instance, the compiler
|
||
assumes that an existing pointer (or a pointer derived from an
|
||
existing pointer by arithmetic) will not point into the memory
|
||
area returned by `malloc`.
|
||
|
||
If the allocation fails, `realloc` does not
|
||
free the old pointer. Therefore, the idiom `ptr =
|
||
realloc(ptr, size);` is wrong because the memory
|
||
pointed to by `ptr` leaks in case of an error.
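
A leak-free variant keeps the old pointer until `realloc` has succeeded. The sketch below uses a hypothetical helper, `resize_buffer`; rejecting a zero size is an extra precaution of this sketch.

[source,c]
----
#include <stdlib.h>

/* Resizes the buffer at *ptr to new_size bytes.  Returns 0 on success.
   On failure, returns -1 and leaves *ptr and its contents untouched, so
   the caller can still use or free the old buffer. */
static int
resize_buffer(char **ptr, size_t new_size)
{
  if (new_size == 0)
    return -1;
  char *tmp = realloc(*ptr, new_size);
  if (tmp == NULL)
    return -1;                  /* *ptr is still valid here */
  *ptr = tmp;
  return 0;
}
----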
|
||
|
||
[[sect-Defensive_Coding-C-Use-After-Free]]
|
||
==== Use-after-free errors
|
||
|
||
After `free`, the pointer is invalid.
|
||
Further pointer dereferences are not allowed (and are usually
|
||
detected by [application]*valgrind*). Less obvious
|
||
is that any *use* of the old pointer value is
|
||
not allowed, either. In particular, comparisons with any other
|
||
pointer (or the null pointer) are undefined according to the C
|
||
standard.
|
||
|
||
The same rules apply to `realloc` if the
|
||
memory area cannot be enlarged in-place. For instance, the
|
||
compiler may assume that a comparison between the old and new
|
||
pointer will always return false, so it is impossible to detect
|
||
movement this way.
|
||
|
||
On a related note, `realloc` frees the memory area if the new size is
|
||
zero. If the size unintentionally becomes zero, as a result of
|
||
unsigned integer wrap-around for instance, the following idiom causes
|
||
a double-free.
|
||
|
||
[source,c]
----
new_size = size + x; /* 'x' is a very large value and the result wraps around to zero */
new_ptr = realloc(ptr, new_size);
if (!new_ptr) {
  free(ptr);
}
----
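
One way to avoid this double free is to reject the wrap-around and the zero size before calling `realloc`. The sketch below assumes that the caller wants the old buffer released on every failure; the name `grow_or_free` is hypothetical.

[source,c]
----
#include <stdlib.h>

/* Grows the buffer 'ptr' from 'size' bytes by 'x' bytes.  Returns the new
   pointer, or NULL on failure.  On failure, 'ptr' has been freed exactly
   once. */
static void *
grow_or_free(void *ptr, size_t size, size_t x)
{
  if (x > (size_t)-1 - size) {  /* size + x would wrap around */
    free(ptr);
    return NULL;
  }
  size_t new_size = size + x;
  if (new_size == 0) {          /* never pass a zero size to realloc */
    free(ptr);
    return NULL;
  }
  void *new_ptr = realloc(ptr, new_size);
  if (new_ptr == NULL)
    free(ptr);                  /* realloc failed, ptr is still allocated */
  return new_ptr;
}
----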
|
||
|
||
==== Handling Memory Allocation Errors
|
||
|
||
Recovering from out-of-memory errors is often difficult or even
|
||
impossible. In these cases, `malloc` and
|
||
other allocation functions return a null pointer. Dereferencing
|
||
this pointer leads to a crash. Such dereferences can even be
|
||
exploitable for code execution if the dereference is combined
|
||
with an array subscript.
|
||
|
||
In general, if you cannot check all allocation calls and
|
||
handle failure, you should abort the program on allocation
|
||
failure, and not rely on the null pointer dereference to
|
||
terminate the process. See
|
||
xref:tasks/Tasks-Serialization.adoc#sect-Defensive_Coding-Tasks-Serialization-Decoders[Recommendations for Manually-written Decoders]
|
||
for related memory allocation concerns.
|
||
|
||
[[sect-Defensive_Coding-C-Allocators-alloca]]
|
||
=== `alloca` and Other Forms of Stack-based Allocation
|
||
|
||
Allocation on the stack is risky because stack overflow checking
|
||
is implicit. There is a guard page at the end of the memory
|
||
area reserved for the stack. If the program attempts to read
|
||
from or write to this guard page, a `SIGSEGV`
|
||
signal is generated and the program typically terminates.
|
||
|
||
This is sufficient for detecting typical stack overflow
|
||
situations such as unbounded recursion, but it fails when the
|
||
stack grows in increments larger than the size of the guard
|
||
page. In this case, it is possible that the stack pointer ends
|
||
up pointing into a memory area which has been allocated for a
|
||
different purpose. Such misbehavior can be exploitable.
|
||
|
||
Common sources of large stack growth are calls to
|
||
`alloca` and related functions such as
|
||
`strdupa`. These functions should be avoided
|
||
because of the lack of error checking. (They can be used safely
|
||
if the allocated size is less than the page size (typically,
|
||
4096 bytes), but this case is relatively rare.) Additionally,
|
||
relying on `alloca` makes it more difficult
|
||
to reorganize the code because it is not allowed to use the
|
||
pointer after the function calling `alloca`
|
||
has returned, even if this function has been inlined into its
|
||
caller.
|
||
|
||
Similar concerns apply to *variable-length
|
||
arrays* (VLAs), a feature of the C99 standard which
|
||
started as a GNU extension. For large objects exceeding the
|
||
page size, there is no error checking, either.
|
||
|
||
In both cases, negative or very large sizes can trigger a
|
||
stack-pointer wraparound, and the stack pointer can end up
|
||
pointing into caller stack frames, which is fatal and can be
|
||
exploitable.
|
||
|
||
If you want to use `alloca` or VLAs for
|
||
performance reasons, consider using a small on-stack array (less
|
||
than the page size, large enough to fulfill most requests). If
|
||
the requested size is small enough, use the on-stack array.
|
||
Otherwise, call `malloc`. When exiting the
|
||
function, check if `malloc` had been called,
|
||
and free the buffer as needed.
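
The following sketch illustrates this pattern. The function `contains_reversed` and the 256-byte buffer size are made up for the example; any task that needs a temporary copy would do.

[source,c]
----
#include <stdlib.h>
#include <string.h>

/* Returns 1 if 'haystack' contains 'needle' spelled backwards, 0 if not,
   and -1 if memory allocation fails. */
static int
contains_reversed(const char *haystack, const char *needle)
{
  char stack_buf[256];          /* well below the page size */
  char *buf = stack_buf;
  size_t len = strlen(needle);

  if (len >= sizeof(stack_buf)) {
    buf = malloc(len + 1);      /* too large for the on-stack array */
    if (buf == NULL)
      return -1;
  }
  for (size_t i = 0; i < len; i++)
    buf[i] = needle[len - 1 - i];
  buf[len] = '\0';

  int found = strstr(haystack, buf) != NULL;

  if (buf != stack_buf)         /* free only what malloc returned */
    free(buf);
  return found;
}
----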
|
||
|
||
If portability is not important in your program, an alternative way of
|
||
automatic memory management is to leverage the `cleanup` attribute
|
||
supported by the recent versions of GCC and Clang. If a local variable
|
||
is declared with the attribute, the specified cleanup function will be
|
||
called when the variable goes out of scope.
|
||
|
||
[source,c]
----
#include <stdlib.h>
#include <string.h>

/* Cleanup handler: receives a pointer to the annotated variable. */
static inline void freep(void *p) {
    free(*(void**) p);
}

/* size: number of bytes needed for the temporary buffer */
void somefunction(const char *param, size_t size) {
    if (strcmp(param, "do_something_complex") == 0) {
        __attribute__((cleanup(freep))) char *ptr = NULL;

        /* Allocate a temporary buffer */
        ptr = malloc(size);
        if (ptr == NULL)
            return;

        /* Do something on it, but do not need to manually call free() */
    }
}
----
|
||
|
||
[[sect-Defensive_Coding-C-Allocators-Arrays]]
|
||
=== Array Allocation
|
||
|
||
When allocating arrays, it is important to check for overflows.
|
||
The `calloc` function performs such checks.
|
||
|
||
If `malloc` or `realloc`
|
||
is used, the size check must be written manually. For instance,
|
||
to allocate an array of `n` elements of type
|
||
`T`, check that the requested size is not
|
||
greater than `((size_t) -1) / sizeof(T)`. See
|
||
<<sect-Defensive_Coding-C-Arithmetic>>.
|
||
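
A minimal sketch of the manual size check, using `int` as the element type. The helper name `alloc_int_array` is hypothetical.

[source,c]
----
#include <stdlib.h>

/* Allocates an uninitialized array of n ints.  Returns NULL if the size
   computation would wrap around or if malloc fails. */
static int *
alloc_int_array(size_t n)
{
  if (n > ((size_t)-1) / sizeof(int))
    return NULL;                /* n * sizeof(int) would wrap around */
  return malloc(n * sizeof(int));
}
----

Because the bound is a compile-time constant, the check reduces to a single comparison against `n`.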
|
||
GNU libc provides a dedicated function `reallocarray` that allocates
|
||
an array with those checks performed internally. However, care must
|
||
be taken if portability is important: while the interface originated
|
||
in OpenBSD and has been adopted in many other platforms, NetBSD
|
||
exposes an incompatible behavior with the same interface.
|
||
|
||
[[sect-Defensive_Coding-C-Allocators-Custom]]
|
||
=== Custom Memory Allocators
|
||
|
||
Custom memory allocators come in two forms: replacements for
|
||
`malloc`, and completely different interfaces
|
||
for memory management. Both approaches can reduce the
|
||
effectiveness of [application]*valgrind* and similar
|
||
tools, and the heap corruption detection provided by GNU libc, so
|
||
they should be avoided.
|
||
|
||
Memory allocators are difficult to write and contain many
|
||
performance and security pitfalls.
|
||
|
||
* When computing array sizes or rounding up allocation
|
||
requests (to the next allocation granularity, or for
|
||
alignment purposes), checks for arithmetic overflow are
|
||
required.
|
||
|
||
* Size computations for array allocations need overflow
|
||
checking. See <<sect-Defensive_Coding-C-Allocators-Arrays>>.
|
||
|
||
* It can be difficult to beat well-tuned general-purpose
|
||
allocators. In micro benchmarks, pool allocators can show
|
||
huge wins, and size-specific pools can reduce internal
|
||
fragmentation. But often, utilization of individual pools
|
||
is poor, and external fragmentation increases the overall
|
||
memory usage.
|
||
|
||
=== Conservative Garbage Collection
|
||
|
||
Garbage collection can be an alternative to explicit memory
|
||
management using `malloc` and
|
||
`free`. The Boehm-Demers-Weiser allocator
|
||
can be used from C programs, with minimal type annotations.
|
||
Performance is competitive with `malloc` on
|
||
64-bit architectures, especially for multi-threaded programs.
|
||
The stop-the-world pauses may be problematic for some real-time
|
||
applications, though.
|
||
|
||
However, using a conservative garbage collector may reduce
|
||
opportunities for code reuse because once one library in a
|
||
program uses garbage collection, the whole process memory needs
|
||
to be subject to it, so that no pointers are missed. The
|
||
Boehm-Demers-Weiser collector also reserves certain signals for
|
||
internal use, so it is not fully transparent to the rest of the
|
||
program.
|
||
|
||
|
||
[[sect-Defensive_Coding-C-Other]]
|
||
== Other C-related Topics
|
||
|
||
[[sect-Defensive_Coding-C-Wrapper-Functions]]
|
||
=== Wrapper Functions
|
||
|
||
Some libraries provide wrappers for standard library functions.
|
||
Common cases include allocation functions such as
|
||
`xmalloc` which abort the process on
|
||
allocation failure (instead of returning a
|
||
`NULL` pointer), or alternatives to relatively
|
||
recent library additions such as `snprintf`
|
||
(along with implementations for systems which lack them).
|
||
|
||
In general, such wrappers are a bad idea, particularly if they
|
||
are not implemented as inline functions or preprocessor macros.
|
||
The compiler lacks knowledge of such wrappers outside the
|
||
translation unit which defines them, which means that some
|
||
optimizations and security checks are not performed. Adding
|
||
`__attribute__` annotations to function
|
||
declarations can remedy this to some extent, but these
|
||
annotations have to be maintained carefully for feature parity
|
||
with the standard implementation.
|
||
|
||
At the minimum, you should apply these attributes:
|
||
|
||
* If you wrap a function which accepts a GCC-recognized format
|
||
string (for example, a `printf`-style
|
||
function used for logging), you should add a suitable
|
||
`format` attribute, as in <<ex-Defensive_Coding-C-String-Functions-format-Attribute>>.
|
||
|
||
* If you wrap a function which carries a
|
||
`warn_unused_result` attribute and you
|
||
propagate its return value, your wrapper should be declared
|
||
with `warn_unused_result` as well.
|
||
|
||
* Duplicating the buffer length checks based on the
|
||
`__builtin_object_size` GCC builtin is
|
||
desirable if the wrapper processes arrays. (This
|
||
functionality is used by the
|
||
`-D_FORTIFY_SOURCE=2` checks to guard
|
||
against static buffer overflows.) However, designing
|
||
appropriate interfaces and implementing the checks may not
|
||
be entirely straightforward.
|
||
|
||
For other attributes (such as `malloc`),
|
||
careful analysis and comparison with the compiler documentation
|
||
is required to check if propagating the attribute is
|
||
appropriate. Incorrectly applied attributes can result in
|
||
undesired behavioral changes in the compiled code.
|
||
|
||
[[sect-Defensive_Coding-C-Common-Mistakes]]
|
||
=== Common mistakes
|
||
|
||
==== Mistakes in macros
|
||
A macro is a name that the preprocessor replaces with a block of
replacement text, typically one or more C statements. The preprocessor
performs this substitution before the code is compiled.

A macro is introduced with the preprocessor directive `#define`. It can
define a single value or any substitution, syntactically valid or
not.
|
||
|
||
A common mistake when working with macros is that programmers treat
macro arguments as they would function arguments. This becomes an issue
when the argument is expanded multiple times in a macro.
|
||
|
||
For example:
|
||
|
||
.macro-misuse.c
[source,C]
----
#define simple(thing) do { \
    if (thing < 1) { \
        y = thing; \
    } \
    else if (thing > 100) { \
        y = thing * 2 + thing; \
    } \
    else { \
        y = 200; \
    } \
} while (0)

int main(void) {
    int x = 200;
    int y = 0;
    simple(x++);

    return 0;
}
----
|
||
|
||
When the `simple()` macro is expanded, the argument `x++` is
substituted in place for every occurrence of `thing`.
|
||
|
||
The 'main' function would be processed and expanded as follows:
|
||
|
||
.macro-misuse-post-processing.c
[source,C]
----
int main(void) {
    int x = 200;
    int y = 0;
    do {
        if (x++ < 1) {
            y = x++;
        }
        else if (x++ > 100) {
            y = x++ * 2 + x++;
        }
        else {
            y = 200;
        }
    } while (0);

    return 0;
}
----
|
||
|
||
The argument to `simple` (`x++`) is evaluated each time it is
referenced, so `x` is incremented several times. The expression
`x++ * 2 + x++` also modifies `x` twice without an intervening
sequence point, which is undefined behavior.

While this may be the behaviour expected by the macro's original author,
large projects may have programmers who are unaware of how the macro
expands, and this may introduce unexpected behaviour, especially if the
value is later used as an array index or can overflow.
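
Where a function can replace the macro, the problem disappears because a function argument is evaluated exactly once. The sketch below uses a hypothetical function, `simple_value`, which returns the value that the macro assigns to `y`.

[source,c]
----
/* Function equivalent of the logic in the simple() macro.  The argument
   expression is evaluated once at the call site, before the call. */
static inline int
simple_value(int thing)
{
  if (thing < 1)
    return thing;
  else if (thing > 100)
    return thing * 2 + thing;
  else
    return 200;
}
----

With `int x = 200;`, the call `y = simple_value(x++);` passes the value 200, increments `x` once, and stores 600 in `y`.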
|