2017-12-01 17:25:14 +01:00

:experimental:

[[sect-Defensive_Coding-C-Language]]
== The Core Language

C provides no memory safety. Most recommendations in this section
deal with this aspect of the language.
[[sect-Defensive_Coding-C-Undefined]]
=== Undefined Behavior

The C standard leaves the behavior of some constructs undefined.
This means more than that the standard does not describe
what happens when the construct is executed. It also allows
optimizing compilers such as GCC to assume that the
construct is never reached. In some cases, this has caused
GCC to optimize security checks away. (This is not a flaw in GCC
or the C language. But C certainly has some areas which are more
difficult to use than others.)

Common sources of undefined behavior are:

* out-of-bounds array accesses

* null pointer dereferences

* overflow in signed integer arithmetic
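
As a rough illustration of how a check can end up being removed, consider the order of a validity test and the access it guards. The sketch below shows only the safe ordering; `first_byte` is a hypothetical helper introduced for this example, not part of any library.

```c
#include <stddef.h>

/* Hypothetical helper: returns the first byte of buf,
   or -1 if there is no byte to read. */
int first_byte(const unsigned char *buf, size_t len)
{
    /* Validate before touching the data. If buf[0] were read
       above this test, the compiler could assume that buf is
       not null and drop that part of the test. */
    if (buf == NULL || len == 0)
        return -1;
    return buf[0];
}
```

The same principle applies to the other sources listed above: the test has to come before the operation that would be undefined.
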

[[sect-Defensive_Coding-C-Pointers]]
=== Recommendations for Pointers and Array Handling

Always keep track of the size of the array you are working with.
Often, code is more obviously correct when you keep a pointer
just past the last element of the array, and calculate the number of
remaining elements by subtracting the current position from
that pointer. The alternative, updating a separate variable
every time the position is advanced, is usually less
obviously correct.

<<ex-Defensive_Coding-C-Pointers-remaining>>
shows how to extract Pascal-style strings from a character
buffer. The two pointers kept for length checks are
`inend` and `outend`;
`inp` and `outp` are the
respective positions.
The number of input bytes is checked using the expression
`len > (size_t)(inend - inp)`.
The cast silences a compiler warning; it is safe because
`inend` is never smaller than
`inp`, so the difference is never negative.
[[ex-Defensive_Coding-C-Pointers-remaining]]
.Array processing in C
====

[source,c]
----
include::../snippets/C-Pointers-remaining.adoc[]

----

====
It is important that the length checks always have the form
`len > (size_t)(inend - inp)`, where
`len` is a variable of type
`size_t` that denotes the *total*
number of bytes about to be read or written next. In
general, it is not safe to fold multiple such checks into one,
as in `len1 + len2 > (size_t)(inend - inp)`,
because the expression on the left can overflow or wrap around
(see <<sect-Defensive_Coding-C-Arithmetic>>), and then it
no longer reflects the number of bytes to be processed.
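
A minimal sketch of keeping the checks separate, assuming two consecutive fields of `len1` and `len2` bytes and the invariant that `inp` and `inend` point into the same array with `inp` not after `inend`. The function name `two_fields_fit` is made up for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if a field of len1 bytes followed by a field of
   len2 bytes fits between inp and inend. */
bool two_fields_fit(const char *inp, const char *inend,
                    size_t len1, size_t len2)
{
    /* First check: compare len1 alone against the remaining bytes. */
    if (len1 > (size_t)(inend - inp))
        return false;
    inp += len1;   /* safe: len1 bytes are known to be available */

    /* Second check: the remaining byte count has shrunk, so len2
       is compared against the updated difference. */
    if (len2 > (size_t)(inend - inp))
        return false;
    return true;
}
```

Because each comparison has a single length on the left-hand side, no addition of untrusted values takes place and nothing can wrap around.
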

[[sect-Defensive_Coding-C-Arithmetic]]
=== Recommendations for Integer Arithmetic

Overflow in signed integer arithmetic is undefined. This means
that it is not possible to reliably check for overflow after it
has happened, because the compiler may assume that it never
occurs and remove the check;
see <<ex-Defensive_Coding-C-Arithmetic-bad>>.
[[ex-Defensive_Coding-C-Arithmetic-bad]]
.Incorrect overflow detection in C
====

[source,c]
----
include::../snippets/C-Arithmetic-add.adoc[]

----

====
The following approaches can be used to check for overflow
without actually causing it:

* Use a wider type to perform the calculation, check that the
result is within bounds, and convert the result to the
original type. All intermediate results must be checked in
this way.

* Perform the calculation in the corresponding unsigned type
and use bit fiddling to detect the overflow.
<<ex-Defensive_Coding-C-Arithmetic-add_unsigned>>
shows how to perform an overflow check for unsigned integer
addition. For three or more terms, all the intermediate
additions have to be checked in this way.
[[ex-Defensive_Coding-C-Arithmetic-add_unsigned]]
.Overflow checking for unsigned addition
====

[source,c]
----
include::../snippets/C-Arithmetic-add_unsigned.adoc[]
----

====
* Compute bounds for acceptable input values which are known
to avoid overflow, and reject other values. This is the
preferred way to check for overflow in multiplications;
see <<ex-Defensive_Coding-C-Arithmetic-mult>>.
[[ex-Defensive_Coding-C-Arithmetic-mult]]
.Overflow checking for unsigned multiplication
====

[source,c]
----
include::../snippets/C-Arithmetic-mult.adoc[]
----

====
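
The first approach in the list, calculating in a wider type, could look roughly like the following for 32-bit signed addition. This is only a sketch: the function name `add_int32_checked` is hypothetical, and it assumes that `int64_t` is available on the target.

```c
#include <stdbool.h>
#include <stdint.h>

/* Adds a and b. Returns false if the sum does not fit in int32_t;
   otherwise stores the sum in *result and returns true. */
bool add_int32_checked(int32_t a, int32_t b, int32_t *result)
{
    /* The sum of two 32-bit values always fits in 64 bits,
       so this addition cannot overflow. */
    int64_t wide = (int64_t)a + (int64_t)b;

    if (wide < INT32_MIN || wide > INT32_MAX)
        return false;
    *result = (int32_t)wide;   /* in range, so the conversion is exact */
    return true;
}
```

With more than two terms, each intermediate sum would need its own range check, or the wider type must be large enough for the whole expression.
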

Basic arithmetic operations are commutative, so for bounds checks,
there are two different but mathematically equivalent
expressions. Sometimes, one of the expressions results in
better code because parts of it can be reduced to a constant.
This applies to overflow checks for multiplication `a * b`
involving a constant `a`, where the
expression is reduced to `b > C` for some
constant `C` determined at compile time. The
other expression, `b && a > ((unsigned)-1) / b`,
is more difficult to optimize at compile time.
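
As an illustration of the constant case, the sketch below computes the size of an array of fixed-size records. The `struct record` type and the function name are invented for this example; the constant factor is `sizeof(struct record)`, so the divisor in the check is known at compile time.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record type used only for this illustration. */
struct record {
    char name[24];
    uint64_t id;
};

/* Computes count * sizeof(struct record). Returns false if the
   product would not fit in size_t. */
bool record_array_size(size_t count, size_t *bytes)
{
    /* SIZE_MAX / sizeof(struct record) is a compile-time constant,
       so this is a comparison of the form "b > C". */
    if (count > SIZE_MAX / sizeof(struct record))
        return false;
    *bytes = count * sizeof(struct record);
    return true;
}
```

Dividing by the variable operand instead would require a run-time division and a separate test for zero.
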

When a value that does not fit is converted to a signed integer
type, GCC always chooses the result based on 2's complement
arithmetic. This GCC behavior (which the C standard leaves
implementation-defined, and which is also implemented by other
compilers) helps a lot when implementing overflow checks.
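
One way this conversion behavior can be put to use is a signed addition that is carried out in `unsigned int` and converted back afterwards. The following is a sketch under the assumptions that `int` uses 2's complement without padding bits and that the compiler converts out-of-range values as GCC does; the name `add_int_checked` is hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

/* Adds a and b. Returns false on overflow; otherwise stores the
   sum in *result and returns true. */
bool add_int_checked(int a, int b, int *result)
{
    unsigned int ua = (unsigned int)a;
    unsigned int ub = (unsigned int)b;
    unsigned int sum = ua + ub;   /* unsigned addition wraps, never undefined */
    unsigned int sign = (unsigned int)INT_MAX + 1u;   /* mask for the sign bit */

    /* Overflow happened exactly when both operands have a sign
       that differs from the sign of the wrapped sum. */
    if ((ua ^ sum) & (ub ^ sum) & sign)
        return false;

    /* Implementation-defined conversion; assumed to follow
       2's complement as described above. */
    *result = (int)sum;
    return true;
}
```
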

Sometimes, it is necessary to compare unsigned and signed
integer variables. This results in a compiler warning,
*comparison between signed and unsigned integer
expressions*, because the comparison often gives
unexpected results for negative values. When adding a cast,
make sure that negative values are covered properly. If the
bound is unsigned and the checked quantity is signed, you should
cast the checked quantity to an unsigned type at least as wide
as either operand type. As a result, negative values will fail
the bounds check. (You can still check for negative values
separately for clarity, and the compiler will optimize away this
redundant check.)
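
A small sketch of such a cast, assuming a signed `int` index, an unsigned `size_t` bound, and a platform where `size_t` is at least as wide as `int`. The function name `index_in_bounds` is made up for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if index is a valid position in an array of
   size elements. */
bool index_in_bounds(int index, size_t size)
{
    /* A negative index converts to a very large size_t value,
       which is never smaller than size, so it is rejected. */
    return (size_t)index < size;
}
```

An explicit `index >= 0` test could be added in front for readability without changing the result.
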

Legacy code should be compiled with the [option]`-fwrapv`
GCC option. As a result, GCC will provide 2's complement
semantics for signed integer arithmetic, including defined
(wrapping) behavior on integer overflow.
[[sect-Defensive_Coding-C-Globals]]
=== Global Variables

Global variables should be avoided because they usually lead to
thread safety hazards. In any case, they should be declared
`static`, so that access is restricted to a
single translation unit.

Global constants are not a problem, but declaring them can be
tricky. <<ex-Defensive_Coding-C-Globals-String_Array>>
shows how to declare a constant array of constant strings.
The second `const` is needed to make the
array constant, and not just the strings. It must be placed
after the `*`, and not before it.
[[ex-Defensive_Coding-C-Globals-String_Array]]
.Declaring a constant array of constant strings
====

[source,c]
----
include::../snippets/C-Globals-String_Array.adoc[]

----

====
Sometimes, static variables local to functions are used as a
replacement for proper memory management. Unlike non-static
local variables, it is possible to return a pointer to static
local variables to the caller. Such variables are
well hidden, but effectively global (just as static variables at
file scope). It is difficult to add thread safety afterwards if
such interfaces are used. Merely dropping the
`static` keyword in such cases leads to
undefined behavior, because the returned pointer then refers
to an object whose lifetime has ended.
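
One common alternative to returning a pointer to a static local buffer is to let the caller supply the buffer and its size. The sketch below illustrates that interface style; `format_item_name` and the `item-` prefix are hypothetical and exist only for this example.

```c
#include <stddef.h>
#include <stdio.h>

/* Writes "item-<id>" into the caller-supplied buffer.
   Returns 0 on success, -1 if the buffer is too small. */
int format_item_name(char *buf, size_t size, unsigned int id)
{
    /* snprintf never writes more than size bytes and reports
       how many characters the full string would need. */
    int n = snprintf(buf, size, "item-%u", id);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return 0;
}
```

Each caller owns its own buffer, so two threads calling the function at the same time do not share any hidden state.
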

Another reason for static local variables is a desire to reduce
stack space usage on embedded platforms, where the stack may
span only a few hundred bytes. If this is the only reason why
the `static` keyword is used, it can just be
dropped, unless the object is very large (larger than
128 kilobytes on 32-bit platforms). In the latter case, it is
recommended to allocate the object using
`malloc`, to obtain proper array checking, for
the reasons outlined in <<sect-Defensive_Coding-C-Allocators-alloca>>.
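
A sketch of replacing a large static work area with a heap allocation is shown below. The function `count_distinct` and the table size are assumptions chosen for illustration; the table occupies 65536 `size_t` entries, which is well above the threshold mentioned above on typical platforms.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

enum { BUCKETS = 65536 };   /* one counter per possible 16-bit value */

/* Counts the distinct values in an array of 16-bit integers.
   Returns 0 on success, -1 if the work area cannot be allocated. */
int count_distinct(const uint16_t *values, size_t n, size_t *distinct)
{
    /* Allocated per call instead of being a static local array. */
    size_t *counts = malloc(BUCKETS * sizeof(*counts));
    if (counts == NULL)
        return -1;
    memset(counts, 0, BUCKETS * sizeof(*counts));

    size_t found = 0;
    for (size_t i = 0; i < n; i++) {
        if (counts[values[i]]++ == 0)   /* first time this value is seen */
            found++;
    }

    free(counts);
    *distinct = found;
    return 0;
}
```

The allocation can fail, so the caller has to handle the error return, which a static array would have hidden.
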