:experimental:

[[sect-Defensive_Coding-C-Language]]
== The Core Language

C provides no memory safety. Most recommendations in this section
deal with this aspect of the language.

[[sect-Defensive_Coding-C-Undefined]]
=== Undefined Behavior

The C standard leaves the behavior of some constructs undefined.
This means more than that the standard does not describe
what happens when the construct is executed. It also allows
optimizing compilers such as GCC to assume that this particular
construct is never reached. In some cases, this has caused
GCC to optimize security checks away. (This is not a flaw in GCC
or the C language. But C certainly has some areas which are more
difficult to use than others.)

Common sources of undefined behavior are:

* out-of-bounds array accesses

* null pointer dereferences

* overflow in signed integer arithmetic

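As a hedged illustration of how a check can be lost: if a pointer is dereferenced first and compared against `NULL` afterwards, the compiler may conclude that the pointer cannot be null and remove the comparison. The sketch below performs the check before the first dereference. The type `struct buffer` and the function `buffer_length` are hypothetical names invented for this example.

```c
#include <stddef.h>

/* Hypothetical type used only for this illustration. */
struct buffer {
    size_t length;
    const char *data;
};

/*
 * Returns the length of the buffer, or 0 if no buffer is given.
 *
 * The null check comes BEFORE the first dereference.  If the code
 * read b->length first and tested b afterwards, the compiler would
 * be allowed to assume that b is non-null and remove the test.
 */
size_t
buffer_length(const struct buffer *b)
{
    if (b == NULL) {
        return 0;
    }
    return b->length;
}
```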
[[sect-Defensive_Coding-C-Pointers]]
=== Recommendations for Pointers and Array Handling

Always keep track of the size of the array you are working with.
Often, code is more obviously correct when you keep a pointer
past the last element of the array, and calculate the number of
remaining elements by subtracting the current position from
that pointer. The alternative, updating a separate variable
every time the position is advanced, is usually less
obviously correct.

<<ex-Defensive_Coding-C-Pointers-remaining>>
shows how to extract Pascal-style strings from a character
buffer. The two pointers kept for length checks are
`inend` and `outend`.
`inp` and `outp` are the
respective positions.
The number of input bytes is checked using the expression
`len > (size_t)(inend - inp)`.
The cast silences a compiler warning; it is safe because
`inend` is never smaller than
`inp`, so the difference cannot be negative.

[[ex-Defensive_Coding-C-Pointers-remaining]]
.Array processing in C
====

[source,c]
----
include::example$C-Pointers-remaining.adoc[]

----

====

It is important that the length checks always have the form
`len > (size_t)(inend - inp)`, where
`len` is a variable of type
`size_t` which denotes the *total*
number of bytes that are about to be read or written next. In
general, it is not safe to fold multiple such checks into one,
as in `len1 + len2 > (size_t)(inend - inp)`,
because the expression on the left can overflow or wrap around
(see <<sect-Defensive_Coding-C-Arithmetic>>), and it then
no longer reflects the number of bytes to be processed.

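A minimal sketch of checking two lengths one after the other instead of folding them into a single comparison. The function name `two_fields_fit` and its parameters are assumptions made for this illustration; only the form of the check is taken from the text above.

```c
#include <stddef.h>

/*
 * Checks that two consecutive fields of len1 and len2 bytes fit into
 * the input that remains between inp and inend.  Returns 1 if both
 * fit, 0 otherwise.  Requires inp <= inend, with both pointers
 * referring to the same array (or one past its end).
 */
int
two_fields_fit(const char *inp, const char *inend,
               size_t len1, size_t len2)
{
    /* First field: compare against the bytes remaining. */
    if (len1 > (size_t)(inend - inp)) {
        return 0;
    }
    inp += len1;  /* Safe: len1 bytes are known to be available. */

    /* Second field: compare against what is left after the first. */
    if (len2 > (size_t)(inend - inp)) {
        return 0;
    }
    return 1;
}
```

With `len1` equal to `(size_t)-1` and `len2` equal to 2, the folded sum `len1 + len2` wraps around to 1 and would pass a combined check against an 8-byte buffer, whereas the separate checks reject the input at the first comparison.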
[[sect-Defensive_Coding-C-Arithmetic]]
=== Recommendations for Integer Arithmetic

Overflow in signed integer arithmetic is undefined. This means
that it is not possible to check for overflow after it has happened;
see <<ex-Defensive_Coding-C-Arithmetic-bad>>.

[[ex-Defensive_Coding-C-Arithmetic-bad]]
.Incorrect overflow detection in C
====

[source,c]
----
include::example$C-Arithmetic-add.adoc[]

----

====

The following approaches can be used to check for overflow
without actually causing it.

* Use a wider type to perform the calculation, check that the
result is within bounds, and convert the result to the
original type. All intermediate results must be checked in
this way.

* Perform the calculation in the corresponding unsigned type
and use bit fiddling to detect the overflow.
<<ex-Defensive_Coding-C-Arithmetic-add_unsigned>>
shows how to perform an overflow check for unsigned integer
addition. For three or more terms, all the intermediate
additions have to be checked in this way.

[[ex-Defensive_Coding-C-Arithmetic-add_unsigned]]
.Overflow checking for unsigned addition
====

[source,c]
----
include::example$C-Arithmetic-add_unsigned.adoc[]
----

====

* Compute bounds for acceptable input values which are known
to avoid overflow, and reject other values. This is the
preferred way for overflow checking on multiplications;
see <<ex-Defensive_Coding-C-Arithmetic-mult>>.

[[ex-Defensive_Coding-C-Arithmetic-mult]]
.Overflow checking for unsigned multiplication
====

[source,c]
----
include::example$C-Arithmetic-mult.adoc[]
----

====

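The first approach in the list above, performing the calculation in a wider type, might look like the following sketch for the addition of two 32-bit values. The function name `add_int32_checked` and its return convention are assumptions chosen for this illustration and are not part of the guide's own examples.

```c
#include <stdint.h>

/*
 * Adds two 32-bit signed integers.  Stores the sum in *result and
 * returns 1 on success; returns 0 and leaves *result untouched if
 * the sum does not fit into int32_t.
 */
int
add_int32_checked(int32_t a, int32_t b, int32_t *result)
{
    /* int64_t can represent every sum of two int32_t values. */
    int64_t wide = (int64_t)a + (int64_t)b;

    if (wide < INT32_MIN || wide > INT32_MAX) {
        return 0;
    }
    *result = (int32_t)wide;  /* In range, so the conversion is exact. */
    return 1;
}
```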
Basic arithmetic operations are commutative, so for bounds checks,
there are two different but mathematically equivalent
expressions. Sometimes, one of the expressions results in
better code because parts of it can be reduced to a constant.
This applies to overflow checks for multiplication `a * b`
involving a constant `a`, where the
expression is reduced to `b > C` for some
constant `C` determined at compile time. The
other expression, `b && a > ((unsigned)-1) / b`,
is more difficult to optimize at compile time.

When a value is converted to a signed integer type that cannot
represent it, GCC always
chooses the result based on 2's complement arithmetic. This GCC
extension (which is also implemented by other compilers) helps a
lot when implementing overflow checks.

Sometimes, it is necessary to compare unsigned and signed
integer variables. This results in a compiler warning,
*comparison between signed and unsigned integer expressions*,
because the comparison often gives
unexpected results for negative values. When adding a cast,
make sure that negative values are covered properly. If the
bound is unsigned and the checked quantity is signed, you should
cast the checked quantity to an unsigned type at least as wide
as either operand type. As a result, negative values will fail
the bounds check. (You can still check for negative values
separately for clarity, and the compiler will optimize away this
redundant check.)

Legacy code should be compiled with the [option]`-fwrapv`
GCC option. As a result, GCC will provide 2's complement
semantics for integer arithmetic, including defined behavior on
integer overflow.

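Returning to the comparison of a signed quantity against an unsigned bound described above, the cast can be sketched as follows. The function `index_in_bounds` is a hypothetical helper, and the sketch assumes that `size_t` is at least as wide as `int`, which holds on common platforms but is not guaranteed everywhere.

```c
#include <stddef.h>

/*
 * Returns 1 if index is a valid position in an array of count
 * elements, 0 otherwise.
 *
 * Assumption: size_t is at least as wide as int.  Converting a
 * negative index to size_t yields a very large value, so it fails
 * the comparison against count without a separate sign check.
 */
int
index_in_bounds(int index, size_t count)
{
    return (size_t)index < count;
}
```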
[[sect-Defensive_Coding-C-Globals]]
=== Global Variables

Global variables should be avoided because they usually lead to
thread safety hazards. In any case, they should be declared
`static`, so that access is restricted to a
single translation unit.

Global constants are not a problem, but declaring them can be
tricky. <<ex-Defensive_Coding-C-Globals-String_Array>>
shows how to declare a constant array of constant strings.
The second `const` is needed to make the
array constant, and not just the strings. It must be placed
after the `*`, and not before it.

[[ex-Defensive_Coding-C-Globals-String_Array]]
.Declaring a constant array of constant strings
====

[source,c]
----
include::example$C-Globals-String_Array.adoc[]

----

====

Sometimes, static variables local to functions are used as a
replacement for proper memory management. Unlike non-static
local variables, it is possible to return a pointer to static
local variables to the caller. Such variables are
well hidden, but effectively global (just like static variables at
file scope). It is difficult to add thread safety afterwards if
such interfaces are used. Merely dropping the
`static` keyword in such cases leads to
undefined behavior, because the returned pointer then refers to an
object whose lifetime has ended.

Another reason for static local variables is a desire to reduce
stack space usage on embedded platforms, where the stack may
span only a few hundred bytes. If this is the only reason why
the `static` keyword is used, it can just be
dropped, unless the object is very large (larger than
128 kilobytes on 32-bit platforms). In the latter case, it is
recommended to allocate the object using
`malloc`, to obtain proper array checking, for
the same reasons outlined in <<sect-Defensive_Coding-C-Allocators-alloca>>.

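One possible way to avoid returning a pointer to a static local variable is to let the caller supply the storage. The sketch below is an illustration under that assumption only; the function `format_label` and the `item-` prefix are invented for this example.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Formats a label for the given id into the caller's buffer.
 * Returns 0 on success, -1 if the buffer is too small or an
 * output error occurs.
 *
 * Because the storage belongs to the caller, concurrent calls from
 * several threads do not share any state.
 */
int
format_label(unsigned id, char *buf, size_t size)
{
    int n = snprintf(buf, size, "item-%u", id);

    /* snprintf reports the length it needed; reject truncation. */
    if (n < 0 || (size_t)n >= size) {
        return -1;
    }
    return 0;
}
```

A caller would typically pass a local array together with `sizeof` of that array, so that the size stays tied to the object it describes.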