:experimental:
:toc:

include::partial$entities.adoc[]

= The {cpp} Programming Language

[[sect-Defensive_Coding-CXX-Language]]
== The Core Language

C++ includes a large subset of the C language. Wherever that C
subset is used, the recommendations in xref:programming-languages/C.adoc#chap-Defensive_Coding-C[Defensive Coding in C] apply.

=== Array Allocation with `operator new[]`

For very large values of `n`, an expression
like `new T[n]` can return a pointer to a heap
region which is too small. In other words, not all array
elements are actually backed by heap memory reserved for the
array. Current GCC versions generate code that performs a
computation of the form `sizeof(T) * size_t(n) + cookie_size`, where `cookie_size` is
currently at most 8. This computation can overflow, and GCC
versions prior to 4.8 generated code which did not detect this.
(Fedora 18 was the first release which fixed this in GCC.)

The `std::vector` template can be used instead
of an explicit array allocation. (The GCC implementation detects
overflow internally.)

If there is no alternative to `operator new[]`
and the sources will be compiled with older GCC versions, code
which allocates arrays with a variable length must check for
overflow manually. For the `new T[n]` example,
the size check could be `n < 0 || (n > 0 && n > (size_t(-1) - 8) / sizeof(T))`.
(See xref:programming-languages/C.adoc#sect-Defensive_Coding-C-Arithmetic[Recommendations for Integer Arithmetic].) If there are
additional dimensions (which must be constants according to the
{cpp} standard), these should be included as factors in the
divisor.

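The manual check described above could be wrapped in a helper template
along the following lines. This is only a sketch: the names
`checked_new_array` and `max_cookie_size` are hypothetical, the value 8
is the cookie bound mentioned above, the element count is assumed to
arrive as a signed `long`, and reporting rejected sizes with
`std::bad_alloc` is an assumption about the desired error handling.

```cpp
#include <cstddef>
#include <new>

// Hypothetical upper bound for the array cookie, taken from the text above.
const std::size_t max_cookie_size = 8;

// Allocates an array of n objects of type T with operator new[], after
// checking manually that the size computation cannot overflow.
// Throws std::bad_alloc if n is negative or too large.
template <typename T>
T *checked_new_array(long n)
{
  if (n < 0
      || (n > 0
          && static_cast<std::size_t>(n)
             > (std::size_t(-1) - max_cookie_size) / sizeof(T))) {
    throw std::bad_alloc();
  }
  // The caller releases the array with delete[].
  return new T[static_cast<std::size_t>(n)];
}
```
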
These countermeasures prevent out-of-bounds writes and potential
code execution. Very large memory allocations can still lead to
a denial of service. xref:tasks/Tasks-Serialization.adoc#sect-Defensive_Coding-Tasks-Serialization-Decoders[Recommendations for Manually-written Decoders]
contains suggestions for mitigating this problem when processing
untrusted data.

See xref:tasks/programming-languages/C.adoc#sect-Defensive_Coding-C-Allocators-Arrays[Array Allocation]
for array allocation advice for C-style memory allocation.

=== Overloading

Do not overload functions with versions that have different
security characteristics. For instance, do not implement a
function `strcat` which works on
`std::string` arguments. Similarly, do not name
methods after such functions.

=== ABI compatibility and preparing for security updates

A stable binary interface (ABI) is vastly preferred for security
updates. Without a stable ABI, all reverse dependencies need
recompiling, which can be a lot of work and could even be
impossible in some cases. Ideally, a security update only
updates a single dynamic shared object, and is picked up
automatically after restarting affected processes.

Outside of extremely performance-critical code, you should
ensure that a wide range of changes is possible without breaking
ABI. Some very basic guidelines are:

* Avoid inline functions.

* Use the pointer-to-implementation idiom.

* Try to avoid templates. Use them if the increased type
  safety provides a benefit to the programmer.

* Move security-critical code out of templated code, so that
  it can be patched in a central place if necessary.

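As an illustration of the first two guidelines, the following sketch
shows one possible shape of the pointer-to-implementation idiom. The
class `Session` and its members are hypothetical. The example is kept
in a single block, with comments marking what would normally go into
the public header and what would be compiled into the shared object,
and it uses a raw pointer so that it also compiles in {cpp}98 mode.

```cpp
#include <string>

// Header part: what clients compile against. Only a pointer to the
// incomplete type Impl is part of the object layout.
class Session {
public:
  explicit Session(const std::string &user);
  ~Session();
  std::string user() const;  // not inline: defined in the library

private:
  struct Impl;               // defined only in the implementation file
  Impl *impl_;
  Session(const Session &);             // copying disabled
  Session &operator=(const Session &);  // (declared, never defined)
};

// Implementation part: compiled into the shared object.
struct Session::Impl {
  explicit Impl(const std::string &u) : user(u) {}
  std::string user;
  // Members can be added here in an update without changing
  // sizeof(Session) or the layout seen by already-compiled clients.
};

Session::Session(const std::string &user) : impl_(new Impl(user))
{
}

Session::~Session()
{
  delete impl_;
}

std::string Session::user() const
{
  return impl_->user;
}
```
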
The KDE project publishes a document with more extensive
guidelines on ABI-preserving changes to {cpp} code, link:++https://community.kde.org/Policies/Binary_Compatibility_Issues_With_C%2B%2B++[Policies/Binary Compatibility Issues With {cpp}]
(*d-pointer* refers to the
pointer-to-implementation idiom).

[[sect-Defensive_Coding-CXX-Language-CXX11]]
=== {cpp}0X and {cpp}11 Support

GCC offers different language compatibility modes:

* [option]`-std=c++98` for the original 1998 {cpp}
  standard

* [option]`-std=c++03` for the 1998 standard with the
  changes from the 2003 technical corrigendum

* [option]`-std=c++11` for the 2011 {cpp} standard. This
  option should not be used.

* [option]`-std=c++0x` for several different versions
  of {cpp}11 support in development, depending on the GCC
  version. This option should not be used.

For each of these flags, there are variants which also enable
GNU extensions (mostly language features also found in C99 or
C11):

* [option]`-std=gnu++98`
* [option]`-std=gnu++03`
* [option]`-std=gnu++11`

Again, [option]`-std=gnu++11` should not be used.

If you enable {cpp}11 support, the ABI of the standard {cpp} library
`libstdc++` will change in subtle ways.
Currently, no {cpp} libraries are compiled in {cpp}11 mode, so if
you compile your code in {cpp}11 mode, it will be incompatible
with the rest of the system. Unfortunately, this is also the
case if you do not use any {cpp}11 features. Currently, there is
no safe way to enable {cpp}11 mode (except for freestanding
applications).

The meaning of {cpp}0X mode changed from GCC release to GCC
release. Earlier versions were still ABI-compatible with {cpp}98
mode, but in the most recent versions, switching to {cpp}0X mode
activates {cpp}11 support, with its compatibility problems.

Some {cpp}11 features (or approximations thereof) are available
through TR1 support in the `<tr1/*>` header files, which can be
used with [option]`-std=c++03` or
[option]`-std=gnu++03`. This includes
`std::tr1::shared_ptr` (from
`<tr1/memory>`) and
`std::tr1::function` (from
`<tr1/functional>`). For other {cpp}11
features, the Boost {cpp} library contains replacements.

[[sect-Defensive_Coding-CXX-Std]]
== The C++ Standard Library

The C++ standard library includes most of its C counterpart
by reference; see xref:programming-languages/C.adoc#chap-Defensive_Coding-C[Defensive Coding in C].

[[sect-Defensive_Coding-CXX-Std-Functions]]
=== Functions That Are Difficult to Use

This section collects functions and function templates which are
part of the standard library and are difficult to use.

[[sect-Defensive_Coding-CXX-Std-Functions-Unpaired_Iterators]]
==== Unpaired Iterators

Functions which use output iterators, or other iterators which do not
come in pairs (denoting ranges), cannot perform iterator range
checking.
(See <<sect-Defensive_Coding-CXX-Std-Iterators>>.)
Function templates which involve output iterators are
particularly dangerous:

* `std::copy`

* `std::copy_backward`

* `std::copy_if`

* `std::move` (three-argument variant)

* `std::move_backward`

* `std::partition_copy`

* `std::remove_copy`

* `std::remove_copy_if`

* `std::replace_copy`

* `std::replace_copy_if`

* `std::swap_ranges`

* `std::transform`

In addition, `std::copy_n`,
`std::fill_n` and
`std::generate_n` do not perform iterator
checking either, but the caller has to supply an explicit count,
as opposed to an implicit length
indicator in the form of a pair of forward iterators.

These output-iterator-expecting functions should only be used
with unlimited-range output iterators, such as iterators
obtained with the `std::back_inserter`
function.

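A short sketch of this recommendation, using `std::remove_copy` from the
list above. The helper name `without_zeros` is hypothetical. Because
the destination is written through `std::back_inserter`, the result
vector grows as needed and the algorithm cannot write past its end.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Returns a copy of input with all elements equal to 0 left out.
std::vector<int> without_zeros(const std::vector<int> &input)
{
  std::vector<int> result;
  // Each assignment through the back inserter calls result.push_back(),
  // so no destination size has to be computed in advance.
  std::remove_copy(input.begin(), input.end(),
                   std::back_inserter(result), 0);
  return result;
}
```
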
Other functions use single input or forward iterators, which can
read beyond the end of the input range if the caller is not careful:

* `std::equal`

* `std::is_permutation`

* `std::mismatch`

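One way for the caller to be careful is to compare the lengths of both
ranges before calling the three-iterator form. The following sketch
does this for `std::equal`; the helper name `same_contents` is
hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Compares two vectors element by element. The size check is evaluated
// first, so std::equal is only called when b has exactly as many
// elements as a and therefore never reads past the end of b.
bool same_contents(const std::vector<int> &a, const std::vector<int> &b)
{
  return a.size() == b.size()
      && std::equal(a.begin(), a.end(), b.begin());
}
```
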
[[sect-Defensive_Coding-CXX-Std-String]]
=== String Handling with `std::string`

The `std::string` class provides a convenient
way to handle strings. Unlike C strings,
`std::string` objects have an explicit length
(and can contain embedded NUL characters), and storage for their
characters is managed automatically. This section discusses
`std::string`, but these observations also
apply to other instances of the
`std::basic_string` template.

The pointer returned by the `data()` member
function does not necessarily point to a NUL-terminated string.
To obtain a C-compatible string pointer, use
`c_str()` instead, which adds the NUL
terminator.

The pointers returned by the `data()` and
`c_str()` functions and iterators are only
valid until certain events happen. It is required that the
exact `std::string` object still exists (even
if it was initially created as a copy of another string object).
Pointers and iterators are also invalidated when non-const
member functions are called, or when the string is passed to
functions with a non-const reference parameter. The behavior of the
GCC implementation
deviates from that required by the {cpp} standard if multiple
threads are present. In general, only the first call to a
non-const member function after a structural modification of the
string (such as appending a character) is invalidating, but this
also applies to member functions such as the non-const version of
`begin()`, in violation of the {cpp} standard.

Particular care is necessary when invoking the
`c_str()` member function on a temporary
object. This is convenient for calling C functions, but the
pointer will turn invalid as soon as the temporary object is
destroyed, which generally happens when the outermost expression
enclosing the expression on which `c_str()`
is called completes evaluation. Passing the result of
`c_str()` to a function which does not store
or otherwise leak that pointer is safe, though.

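The difference between the safe and the unsafe pattern can be sketched
as follows. The functions `make_greeting`, `greeting_length` and
`greeting_length_named` are hypothetical, and `std::strlen` stands in
for any C function that does not keep the pointer.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Returns a string by value, which is a temporary at the call site.
std::string make_greeting(const std::string &name)
{
  return "Hello, " + name;
}

// Safe: the temporary lives until the end of the full expression, and
// std::strlen does not store the pointer.
std::size_t greeting_length(const std::string &name)
{
  return std::strlen(make_greeting(name).c_str());
}

// Safe: the named string object outlives every use of the pointer.
std::size_t greeting_length_named(const std::string &name)
{
  std::string greeting = make_greeting(name);
  const char *p = greeting.c_str();
  return std::strlen(p);
}

// Unsafe, shown only as a comment: the temporary is destroyed at the
// end of the first statement, so p dangles in the second one.
//   const char *p = make_greeting(name).c_str();
//   std::strlen(p);  // undefined behavior
```
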
As with `std::vector` and
`std::array`, subscripting with
`operator[]` does not perform bounds checks.
Use the `at(size_type)` member function
instead. See <<sect-Defensive_Coding-CXX-Std-Subscript>>.
Furthermore, accessing the terminating NUL character using
`operator[]` is not possible. (In some
implementations, the `c_str()` member function
writes the NUL character on demand.)

Never write to the pointers returned by
`data()` or `c_str()`
after casting away `const`. If you need a
C-style writable string, use a
`std::vector<char>` object and its
`data()` member function. In this case, you
have to explicitly add the terminating NUL character.

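A minimal sketch of this approach follows. The helper name
`writable_copy` is hypothetical. The pointer obtained from `data()` on
the returned vector can then be passed to C functions that modify the
string in place, for as long as the vector itself is kept alive and is
not resized.

```cpp
#include <string>
#include <vector>

// Returns a writable, NUL-terminated copy of the characters in s.
std::vector<char> writable_copy(const std::string &s)
{
  std::vector<char> buffer(s.begin(), s.end());
  buffer.push_back('\0');  // the terminator has to be added explicitly
  return buffer;
}
```
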
GCC's implementation of `std::string` is
currently based on reference counting. It is expected that a
future version will remove the reference counting, due to
performance and conformance issues. As a result, code that
implicitly assumes sharing by holding on to pointers or iterators
for too long will break, resulting in run-time crashes or worse.
On the other hand, non-const iterator-returning functions will
no longer give other threads an opportunity for invalidating
existing iterators and pointers because iterator invalidation
does not depend on sharing of the internal character array
object anymore.

[[sect-Defensive_Coding-CXX-Std-Subscript]]
=== Containers and `operator[]`

Many sequence containers similar to `std::vector`
provide both `operator[](size_type)` and a
member function `at(size_type)`. This applies
to `std::vector` itself,
`std::array`, `std::string`
and other instances of `std::basic_string`.

`operator[](size_type)` is not required by the
standard to perform bounds checking (and the implementation in
GCC does not). In contrast, `at(size_type)`
must perform such a check. Therefore, in code which is not
performance-critical, you should prefer
`at(size_type)` over
`operator[](size_type)`, even though it is
slightly more verbose.

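The following sketch shows what the check performed by `at(size_type)`
looks like from the caller's side. The helper name `element_or` and the
choice to fall back to a default value are assumptions made for
illustration; other code may prefer to let the exception propagate.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns the element at index, or fallback if index is out of range.
// at() throws std::out_of_range for an invalid index, whereas
// operator[] would access memory outside the vector.
int element_or(const std::vector<int> &values, std::size_t index,
               int fallback)
{
  try {
    return values.at(index);
  } catch (const std::out_of_range &) {
    return fallback;
  }
}
```
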
The `front()` and `back()`
member functions have undefined behavior if a vector object is empty. You
can use `vec.at(0)` and
`vec.at(vec.size() - 1)` as checked
replacements. For an empty vector, `data()` is
defined; it returns an arbitrary pointer, but not necessarily
the NULL pointer.

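The checked replacements can be wrapped in small helpers, sketched
below with the hypothetical names `checked_front` and `checked_back`.
For an empty vector, `vec.size() - 1` wraps around to the largest
`size_type` value, which `at()` rejects like any other invalid index.

```cpp
#include <stdexcept>
#include <vector>

// Checked replacement for vec.front(): throws std::out_of_range if
// vec is empty.
int checked_front(const std::vector<int> &vec)
{
  return vec.at(0);
}

// Checked replacement for vec.back(): throws std::out_of_range if
// vec is empty, because size() - 1 is then not a valid index.
int checked_back(const std::vector<int> &vec)
{
  return vec.at(vec.size() - 1);
}
```
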
[[sect-Defensive_Coding-CXX-Std-Iterators]]
=== Iterators

Iterators do not perform any bounds checking. Therefore, all
functions that work on iterators should accept them in pairs,
denoting a range, and make sure that iterators are not moved
outside that range. For forward iterators and bidirectional
iterators, you need to check for equality before moving the
first or last iterator in the range. For random-access
iterators, you need to compute the difference before adding or
subtracting an offset. It is not possible to perform the
operation and check for an invalid iterator afterwards.

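Both kinds of check can be sketched as follows. The names `Iter`,
`advance_checked` and `jump_checked` are hypothetical, and returning
`last` for an offset that does not fit is just one possible way to
report the problem. Both functions assume that `first` and `last`
denote a valid range.

```cpp
#include <cstddef>
#include <vector>

typedef std::vector<int>::const_iterator Iter;

// Stepwise movement, as needed for forward and bidirectional iterators:
// compare against the end of the range before each increment. Moves
// first forward by at most count steps and never beyond last.
Iter advance_checked(Iter first, Iter last, std::size_t count)
{
  while (count > 0 && first != last) {
    ++first;
    --count;
  }
  return first;
}

// Random-access movement: compute the remaining distance before adding
// the offset. Returns last if offset is negative or larger than the
// number of remaining elements.
Iter jump_checked(Iter first, Iter last, std::ptrdiff_t offset)
{
  if (offset < 0 || offset > last - first) {
    return last;
  }
  return first + offset;
}
```
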
Output iterators cannot be compared for equality. Therefore, it
is impossible to write code that detects that it has been
supplied an output area that is too small, and their use should
be avoided.

These issues make some of the standard library functions
difficult to use correctly; see <<sect-Defensive_Coding-CXX-Std-Functions-Unpaired_Iterators>>.