defensive-coding-guide/en-US/Java-Language.xml

291 lines
12 KiB
XML

<?xml version='1.0' encoding='utf-8' ?>
<!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
]>
<section id="sect-Defensive_Coding-Java-Language">
<title>The core language</title>
<para>
Implementations of the Java programming language provide strong
memory safety, even in the presence of data races in concurrent
code. This prevents a large range of security vulnerabilities
from occurring, unless certain low-level features are used; see
<xref linkend="sect-Defensive_Coding-Java-LowLevel"/>.
</para>
<section id="sect-Defensive_Coding-Java-Language-ReadArray">
<title>Inceasing robustness when reading arrays</title>
<para>
External data formats often include arrays, and the data is
stored as an integer indicating the number of array elements,
followed by this number of elements in the file or protocol data
unit. This length specified can be much larger than what is
actually available in the data source.
</para>
<para>
To avoid allocating extremely large amounts of data, you can
allocate a small array initially and grow it as you read more
data, implementing an exponential growth policy. See the
<function>readBytes(InputStream, int)</function> function in
<xref linkend="ex-Defensive_Coding-Java-Language-ReadArray"/>.
</para>
<example id="ex-Defensive_Coding-Java-Language-ReadArray">
<title>Incrementally reading a byte array</title>
<xi:include href="snippets/Java-Language-ReadArray.xml"
xmlns:xi="http://www.w3.org/2001/XInclude" />
</example>
<para>
When reading data into arrays, hash maps or hash sets, use the
default constructor and do not specify a size hint. You can
simply add the elements to the collection as you read them.
</para>
</section>
<section id="sect-Defensive_Coding-Java-Language-Resources">
<title>Resource management</title>
<para>
Unlike C++, Java does not offer destructors which can deallocate
resources in a predictable fashion. All resource management has
to be manual, at the usage site. (Finalizers are generally not
usable for resource management, especially in high-performance
code; see <xref
linkend="sect-Defensive_Coding-Java-Language-Finalizers"/>.)
</para>
<para>
The first option is the
<literal>try</literal>-<literal>finally</literal> construct, as
shown in <xref linkend="ex-Defensive_Coding-Java-Language-Finally"/>.
The code in the <literal>finally</literal> block should be as short as
possible and should not throw any exceptions.
</para>
<example id="ex-Defensive_Coding-Java-Language-Finally">
<title>Resource management with a
<literal>try</literal>-<literal>finally</literal> block</title>
<xi:include href="snippets/Java-Finally.xml"
xmlns:xi="http://www.w3.org/2001/XInclude" />
</example>
<para>
Note that the resource allocation happens
<emphasis>outside</emphasis> the <literal>try</literal> block,
and that there is no <literal>null</literal> check in the
<literal>finally</literal> block. (Both are common artifacts
stemming from IDE code templates.)
</para>
<para>
If the resource object is created freshly and implements the
<literal>java.lang.AutoCloseable</literal> interface, the code
in <xref
linkend="ex-Defensive_Coding-Java-Language-TryWithResource"/> can be
used instead. The Java compiler will automatically insert the
<function>close()</function> method call in a synthetic
<literal>finally</literal> block.
</para>
<example id="ex-Defensive_Coding-Java-Language-TryWithResource">
<title>Resource management using the
<literal>try</literal>-with-resource construct</title>
<xi:include href="snippets/Java-TryWithResource.xml"
xmlns:xi="http://www.w3.org/2001/XInclude" />
</example>
<para>
To be compatible with the <literal>try</literal>-with-resource
construct, new classes should name the resource deallocation
method <function>close()</function>, and implement the
<literal>AutoCloseable</literal> interface (the latter breaking
backwards compatibility with Java 6). However, using the
<literal>try</literal>-with-resource construct with objects that
are not freshly allocated is at best awkward, and an explicit
<literal>finally</literal> block is usually the better approach.
</para>
<para>
In general, it is best to design the programming interface in
such a way that resource deallocation methods like
<function>close()</function> cannot throw any (checked or
unchecked) exceptions, but this should not be a reason to ignore
any actual error conditions.
</para>
</section>
<section id="sect-Defensive_Coding-Java-Language-Finalizers">
<title>Finalizers</title>
<para>
Finalizers can be used a last-resort approach to free resources
which would otherwise leak. Finalization is unpredictable,
costly, and there can be a considerable delay between the last
reference to an object going away and the execution of the
finalizer. Generally, manual resource management is required;
see <xref linkend="sect-Defensive_Coding-Java-Language-Resources"/>.
</para>
<para>
Finalizers should be very short and should only deallocate
native or other external resources held directly by the object
being finalized. In general, they must use synchronization:
Finalization necessarily happens on a separate thread because it is
inherently concurrent. There can be multiple finalization
threads, and despite each object being finalized at most once,
the finalizer must not assume that it has exclusive access to
the object being finalized (in the <literal>this</literal>
pointer).
</para>
<para>
Finalizers should not deallocate resources held by other
objects, especially if those objects have finalizers on their
own. In particular, it is a very bad idea to define a finalizer
just to invoke the resource deallocation method of another object,
or overwrite some pointer fields.
</para>
<para>
Finalizers are not guaranteed to run at all. For instance, the
virtual machine (or the machine underneath) might crash,
preventing their execution.
</para>
<para>
Objects with finalizers are garbage-collected much later than
objects without them, so using finalizers to zero out key
material (to reduce its undecrypted lifetime in memory) may have
the opposite effect, keeping objects around for much longer and
prevent them from being overwritten in the normal course of
program execution.
</para>
<para>
For the same reason, code which allocates objects with
finalizers at a high rate will eventually fail (likely with a
<literal>java.lang.OutOfMemoryError</literal> exception) because
the virtual machine has finite resources for keeping track of
objects pending finalization. To deal with that, it may be
necessary to recycle objects with finalizers.
</para>
<para>
The remarks in this section apply to finalizers which are
implemented by overriding the <function>finalize()</function>
method, and to custom finalization using reference queues.
</para>
</section>
<section id="sect-Defensive_Coding-Java-Language-Exceptions">
<title>Recovering from exceptions and errors</title>
<para>
Java exceptions come in three kinds, all ultimately deriving
from <literal>java.lang.Throwable</literal>:
</para>
<itemizedlist>
<listitem>
<para>
<emphasis>Run-time exceptions</emphasis> do not have to be
declared explicitly and can be explicitly thrown from any
code, by calling code which throws them, or by triggering an
error condition at run time, like division by zero, or an
attempt at an out-of-bounds array access. These exceptions
derive from from the
<literal>java.lang.RuntimeException</literal> class (perhaps
indirectly).
</para>
</listitem>
<listitem>
<para>
<emphasis>Checked exceptions</emphasis> have to be declared
explicitly by functions that throw or propagate them. They
are similar to run-time exceptions in other regards, except
that there is no language construct to throw them (except
the <literal>throw</literal> statement itself). Checked
exceptions are only present at the Java language level and
are only enforced at compile time. At run time, the virtual
machine does not know about them and permits throwing
exceptions from any code. Checked exceptions must derive
(perhaps indirectly) from the
<literal>java.lang.Exception</literal> class, but not from
<literal>java.lang.RuntimeException</literal>.
</para>
</listitem>
<listitem>
<para>
<emphasis>Errors</emphasis> are exceptions which typically
reflect serious error conditions. They can be thrown at any
point in the program, and do not have to be declared (unlike
checked exceptions). In general, it is not possible to
recover from such errors; more on that below, in <xref
linkend="sect-Defensive_Coding-Java-Language-Exceptions-Errors"/>.
Error classes derive (perhaps indirectly) from
<literal>java.lang.Error</literal>, or from
<literal>java.lang.Throwable</literal>, but not from
<literal>java.lang.Exception</literal>.
</para>
</listitem>
</itemizedlist>
<para>
The general expection is that run-time errors are avoided by
careful programming (e.g., not dividing by zero). Checked
exception are expected to be caught as they happen (e.g., when
an input file is unexpectedly missing). Errors are impossible
to predict and can happen at any point and reflect that
something went wrong beyond all expectations.
</para>
<section id="sect-Defensive_Coding-Java-Language-Exceptions-Errors">
<title>The difficulty of catching errors</title>
<para>
Errors (that is, exceptions which do not (indirectly) derive
from <literal>java.lang.Exception</literal>), have the
peculiar property that catching them is problematic. There
are several reasons for this:
</para>
<itemizedlist>
<listitem>
<para>
The error reflects a failed consistenty check, for example,
<literal>java.lang.AssertionError</literal>.
</para>
</listitem>
<listitem>
<para>
The error can happen at any point, resulting in
inconsistencies due to half-updated objects. Examples are
<literal>java.lang.ThreadDeath</literal>,
<literal>java.lang.OutOfMemoryError</literal> and
<literal>java.lang.StackOverflowError</literal>.
</para>
</listitem>
<listitem>
<para>
The error indicates that virtual machine failed to provide
some semantic guarantees by the Java programming language.
<literal>java.lang.ExceptionInInitializerError</literal>
is an example—it can leave behind a half-initialized
class.
</para>
</listitem>
</itemizedlist>
<para>
In general, if an error is thrown, the virtual machine should
be restarted as soon as possible because it is in an
inconsistent state. Continuing running as before can have
unexpected consequences. However, there are legitimate
reasons for catching errors because not doing so leads to even
greater problems.
</para>
<para>
Code should be written in a way that avoids triggering errors.
See <xref linkend="sect-Defensive_Coding-Java-Language-ReadArray"/>
for an example.
</para>
<para>
It is usually necessary to log errors. Otherwise, no trace of
the problem might be left anywhere, making it very difficult
to diagnose realted failures. Consequently, if you catch
<literal>java.lang.Exception</literal> to log and suppress all
unexpected exceptions (for example, in a request dispatching
loop), you should consider switching to
<literal>java.lang.Throwable</literal> instead, to also cover
errors.
</para>
<para>
The other reason mainly applies to such request dispatching
loops: If you do not catch errors, the loop stops looping,
resulting in a denial of service.
</para>
<para>
However, if possible, catching errors should be coupled with a
way to signal the requirement of a virtual machine restart.
</para>
</section>
</section>
</section>