<?xml version='1.0' encoding='utf-8' ?>
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
]>
<chapter id="chap-Defensive_Coding-Web_Applications">
	<title>Web Applications</title>
	<para>How to protect your web application against some of the more common forms of attack.</para>

	<section id="sect-Defensive_Coding-Web_Applications-Cross_site_scripting">
		<title>Cross site scripting</title>
		<para>Cross Site Scripting (XSS) vulnerabilities are often used as a foothold for more sophisticated attacks, which is why they need to be taken seriously. Generally speaking there are two forms of XSS attack: reflected and persisted.</para>

		<section>
			<title>Reflected XSS attack</title>
			<para>A reflected XSS attack occurs when an attacker can directly manipulate an HTTP request parameter to inject malicious JavaScript code into the rendered page. The attacker tricks a victim into clicking on a specially crafted link, which executes the malicious script.</para>
			<para>Consider a simple servlet that greets a user via an HTTP request parameter:</para>
<code>
  protected void doGet(...){
    response.getWriter().printf("Welcome %s", request.getParameter("name"));
  }
</code>

			<para>An attacker could in theory create a link that contains JavaScript that will be executed in the victim's browser like so:</para>
<code>
http://example.com/?name=&#60;script&#62;alert(1)&#60;/script&#62;
</code>
		</section>
		<section>
			<title>Persisted XSS attack</title>
			<para>This variation of the XSS attack relies on the ability of a registered user to store content in the web application's database. The attacker saves the malicious script to the database via a web form, knowing that it will be embedded in a rendered page when that page is later viewed by a victim. This scenario is most common in sites that allow users to post comments or rate products without taking the appropriate security precautions.</para>
		</section>
	</section>
	<section>
		<title>Defending against XSS attacks</title>
		<para/>

		<section>
			<title>Escaping user controlled content</title>
			<para>The most effective mechanism for preventing XSS attacks is to ensure that any user controlled content is safely escaped. Most modern web frameworks have mechanisms to do this which are automatically enabled. There should be limited reasons to stray outside of this practice for all user controlled input.</para>
			<para>The Seam framework escapes content by default unless explicitly told not to.</para>

<code>
Will be escaped
&#60;h:outputText value="#{param.name}" /&#62;

Won't be escaped (DON'T DO THIS!)
&#60;h:outputText value="#{param.name}" escape="false" /&#62;
</code>

			<para>Similarly Flask templates will escape content by default unless explicitly told not to.</para>

<code>
This will be escaped
{{ user_controlled_input }}

Unless you do this
{{ user_controlled_input | safe }}
</code>
			<para>To escape HTML content directly within a Flask request handler you need only import escape.</para>

<code>
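# escape is provided by markupsafe; recent Flask releases no longer export it.
from flask import Flask, request
from markupsafe import escape

app = Flask(__name__)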

@app.route('/say_hello')
def say_hello():
  return "&#60;p&#62;Hi %s&#60;/p&#62;" % (escape(request.args.get('name')))
</code>

			<para>You can also do this directly in Java code using Apache Commons Lang.</para>
<code>
import static org.apache.commons.lang.StringEscapeUtils.escapeHtml;

void doGet(...){
  response.getWriter().printf("&#60;p&#62;Hello, %s&#60;/p&#62;", escapeHtml(request.getParameter("name")));
}
</code>

			<important>
				<para>You should <emphasis>not</emphasis> try to implement your own escaping mechanisms or blacklists when a better, security-tested alternative is available.</para>
			</important>

			<para>Some caveats around how to escape content correctly and safely will be covered in more detail in the output escaping and encoding section.</para>

			<section>
				<title>Monitoring and validating HTTP parameters</title>
				<para>Tampering with HTTP parameters can be a good indication that your site is being probed for weaknesses. It is a good idea to validate user-supplied HTTP parameters against a whitelist of trusted input characters that omits anything that may be used to construct malicious input.</para>
<code>
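import re
from flask import request, abort
from markupsafe import escape

# Accept only word characters (letters, digits and underscore).
whitelist = re.compile(r'\w+')

def say_hello():
  # A missing parameter becomes '' and is rejected like any other bad value.
  user_input = request.args.get('name', '')
  if not whitelist.fullmatch(user_input):
    return abort(400)

  return "Hello %s" % escape(user_input)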
</code>
			</section>
		</section>
		<section>
			<title>Leveraging browser security features</title>
			<para>Most modern browsers have added capabilities to thwart these types of attacks. However to take advantage of them some effort is required by the web application developer.</para>
			<section>
				<title>X-XSS-Protection</title>
				<para>The first of these measures is a simple header that should be supplied with each HTTP response from the server. It ensures that an XSS filter is enabled in some older browsers (IE8 and IE9). The filter mechanism in these browsers is disabled by default. Other browsers will ignore this header.</para>
<code>
// You can manually add this header to each response using
// a custom filter and enabling it in your web.xml
public void doFilter(...){
  response.addHeader("X-XSS-Protection", "1; mode=block");
  ...
}
</code>
<code>
# (flask)
@app.after_request
def xss_protection(rsp):
  rsp.headers['X-XSS-Protection'] = '1; mode=block'
  # after_request handlers must return the response object.
  return rsp
</code>
			</section>
			<section>
				<title>Content Security Policy (CSP)</title>
				<para>The trouble with XSS from a browser's perspective is that it is difficult to distinguish between a script that's intended to be part of a web application and a script that has been injected by an attacker. Content Security Policy (CSP) is a standard that allows application developers to specify whitelists of trusted sources for their content.</para>
				<para>There are however some caveats to using this feature. In-lined content is considered harmful so all content must be loadable from an external source. This means that you cannot embed scripts and styles within HTML content.</para>
<code>
&#60;!--
       You can no longer use inline content when working with CSP.
  This is an example of things that will no longer work.
--&#62;
&#60;html&#62;
  &#60;head&#62;
    &#60;title&#62;Example&#60;/title&#62;
    &#60;!--
               All styles need to be loaded from an explicit source
      rather than being embedded in a page.
    --&#62;
    &#60;style&#62;
      #foo {
        padding-top: 2em;
        margin-left: 1.5em;
      }
    &#60;/style&#62;
  &#60;/head&#62;
  &#60;body&#62;
    &#60;div id="foo"&#62;
      &#60;h1&#62;Example&#60;/h1&#62;
      &#60;p&#62;
        This is an example of the kind of things that you can no longer do.
        You can't invoke &#60;a id='say_hello' href="javascript:sayHello()"&#62;javascript via links&#60;/a&#62;
        anymore.
      &#60;/p&#62;
      &#60;p id="bad_style" style="color:red"&#62;
        Inline styles are also prevented.
      &#60;/p&#62;
    &#60;/div&#62;

    &#60;!--
               All scripts need to be loaded from an explicit source rather
      than being embedded in a page.
    --&#62;
    &#60;script&#62;
      function sayHello(){
        alert("Hello world");
      }
    &#60;/script&#62;
  &#60;/body&#62;
&#60;/html&#62;
</code>
				<para>Instead you should load content from an explicit source. For example the previous code snippet may be rewritten as follows.</para>
<code>
/* hello.css */
#foo {
  padding-top: 2em;
  margin-left: 1.5em;
}

#bad_style {
  color: red;
}
</code>
<code>
// hello.js
function sayHello(){
  alert("Hello world");
}

document.addEventListener('DOMContentLoaded', function(){
  document.getElementById('say_hello').addEventListener('click', sayHello);
});
</code>
<code>
&#60;html&#62;
  &#60;head&#62;
    &#60;link rel="stylesheet" type="text/css" href="hello.css"&#62;
  &#60;/head&#62;
  &#60;body&#62;
    &#60;div id="foo"&#62;
      &#60;h1&#62;Example&#60;/h1&#62;
      &#60;p&#62;
        In this example the scripts and styles are loaded from
        an external source. The onclick event handler of &#60;a id='say_hello' href="#"&#62;this link&#60;/a&#62;
        is handled by the external javascript source file.  This can be used with
        CSP.
      &#60;/p&#62;
      &#60;p id="bad_style"&#62;
        Similarly the style of this paragraph is managed by the external CSS source. This
        approach can also be used with CSP.
      &#60;/p&#62;
    &#60;/div&#62;
    &#60;script type="text/javascript" src="hello.js"&#62;&#60;/script&#62;
  &#60;/body&#62;
&#60;/html&#62;
</code>
				<para>The content security policy can be set using the following headers:</para>
<code>
[options="header"]
|===
| Header | Browsers

| Content-Security-Policy
| Chrome 25+, Firefox 23+

| X-Content-Security-Policy
| Firefox 4+, IE 10+ (partial)

| X-Webkit-CSP
| Chrome 14+, Safari 6+

|===
</code>
<para>When specifying the header you can create a fine grained policy using a combination
of the following directives:</para>
<code>
[options="header"]
|===
| Directive | Description

| default-src
| The default policy for loading all content such as images, scripts, stylesheets,
fonts etc. This can be used when a simple broad policy will apply for all content
within the web application.

| script-src
| Allows you to set policy around valid sources for Javascript content.

| object-src
| Defines valid sources of plugins such as &#60;object&#62;, &#60;embed&#62;, and &#60;applet&#62;.

| style-src
| Allows you to set policy around valid sources for stylesheets.

| img-src
| Allows you to set policy around valid sources of images.

| media-src
| Allows you to set policy around valid sources of HTML5 content like
&#60;audio&#62; and &#60;video&#62;.

| frame-src
| Allows you to set policy around valid sources for loading frames.

| font-src
| Allows you to set policy around valid sources for loading fonts.

| connect-src
| Allows you to put restrictions on AJAX requests, websockets and
event source.

| sandbox
| The sandbox feature allows further preventative restrictions to be
placed on the browser context. When specified, the most restrictive
environment is enabled by default, and sandboxing can then be relaxed for
specific features that are required by the site.

*allow-top-navigation*: If not specified along with the sandbox directive, auxiliary browsing contexts will be disabled. Using target, window.open() or showModalDialog() will be blocked by the browser.

*allow-same-origin*: If not specified along with the sandbox directive, the content is forced into a unique origin, preventing it from accessing other content from the same origin. It also means that scripts will not be able to access document.cookie or local storage.

*allow-forms*: If not specified along with the sandbox directive form submission will be blocked at the browser.

*allow-scripts*: If not specified with the sandbox directive then no scripts will be executed by the browser.

| report-uri
| CSP failures can be reported back to the application server. This directive allows you to
specify the URI to send the CSP report to.


|===
</code>
<para>Most of these directives need to be applied to a source to be enforced by the browser.
The exceptions are the sandbox and report-uri directives, which have their own special
purposes. For the other directives you may enforce your whitelist by selective usage
of the following sources.</para>
<code>
[options="header"]
|===
| Source | Description

| *
| Wildcard will allow anything to be loaded for this directive.

| 'none'
| Prevents loading the content from anywhere.

| 'self'
| Restricts content to being loaded from the same origin (scheme, host and port).

| data:
| Allows content to be loaded via the data scheme. For example a base64 encoded img.

| hostname.example.com or *.example.com
| Allows content to be loaded from the specified hostname.

| https://cdn.example.com
| An explicit URI to load content from.

|===
</code>
<para>
Creating a policy requires knowing where your web assets are coming from and restricting
external sources as much as possible. Most web applications will be able to get away with
disabling media, embedded content and frames entirely. The best way to retrofit a policy
is to start in a restrictive mode, run your test suite and examine any reported CSP failures.
The following code snippet demonstrates how you might create a policy for browsers
supporting CSP.
</para>
<code>
  // Policy explicitly disables all content then selectively enables
  // features required by most modern sites.
  String policy = "default-src 'none';";

  // Enable 'self' for commonly used directives.
  String[] directives = {
    "script-src",
    "connect-src",
    "img-src",
    "style-src"
  };

  for (String directive : directives){
    policy += String.format(" %s 'self';", directive);
  }

  // Add the appropriate header for the browser.
  // Pattern and Matcher come from java.util.regex.
  String userAgent = request.getHeader("user-agent");
  Pattern chrome = Pattern.compile(" Chrome/([0-9]+)");
  Matcher browser = null;

  if (userAgent != null &#38;&#38; (browser = chrome.matcher(userAgent)).find()){
    int version = Integer.parseInt(browser.group(1));
    if (version &#62;= 25){
      response.addHeader("Content-Security-Policy", policy);
    } else if (version &#62;= 14){
      response.addHeader("X-Webkit-CSP", policy);
    } else {
      log.debug("CSP not supported by : " + userAgent);
    }
  }

  // ... etc.
</code>
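<para>The same deny-by-default policy string can be assembled in a few lines of Python. This is a minimal sketch rather than a definitive implementation; the function name build_csp_policy and its parameters are assumptions introduced here for illustration only.</para>

```python
def build_csp_policy(self_directives, default="'none'"):
    """Return a CSP header value that denies everything by default
    and then allows same-origin loading for the listed directives."""
    parts = ["default-src %s" % default]
    for directive in self_directives:
        parts.append("%s 'self'" % directive)
    return "; ".join(parts)


# Hypothetical usage: the same directives that the Java snippet enables.
policy = build_csp_policy(["script-src", "connect-src", "img-src", "style-src"])
print(policy)
```

<para>Keeping the policy in one helper makes it easier to start from the most restrictive setting and relax individual directives as CSP failures are reported.</para>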
			</section>
			<section>
				<title>Summary</title>
				<para>Using CSP in conjunction with the aforementioned mitigation strategies is a good defense-in-depth approach that reduces the likelihood of XSS attacks. It requires relatively little effort from web developers for a substantial gain in security, so it should be considered a high priority for those looking to boost the security of their applications.</para>
<note><title>TIP</title><para><simplelist>
<member>Validate user input against a whitelist</member>
<member>Escape user controlled input</member>
<member>Monitor the use of HTTP parameters for signs of attacks.</member>
<member>Use CSP to limit the trusted sources of scripts in your web application.</member>
</simplelist></para></note>
			</section>
		</section>
	</section>
	<section>
		<title>Session Hijacking</title>
		<para />
		<section>
			<title>Overview</title>
			<para>A session hijacking attack is when an attacker manages to steal an existing
user's session token and impersonate them when talking to the server.
Session hijacking is one possible use of an XSS attack, although sessions may
be hijacked by other means. For instance, an attacker may construct a link that
pre-emptively sets a user's session identifier to a known value, which the user
then authenticates (session fixation). Other attacks only require the attacker to be on the
same network as the victim, where they can simply sniff the network traffic
to obtain the session token.</para>

			<para>As a result it is essential to ensure all authenticated network traffic is
performed over a secure channel via TLS. Session tokens must also be transmitted
in a secure manner.</para>
		</section>
		<section>
			<title>Defending against Session Hijacking</title>
			<para />
			<section>
				<title>Use non-deterministic session identifiers</title>
				<para>Predictable session identifiers can also be used in a session hijacking attack. An attacker
only has to guess a valid session identifier to impersonate a different user. Most
web frameworks already include a vetted session cookie implementation. It is not
recommended that you concoct your own session identification mechanism. It is however
recommended that you verify that the session identifier is suitably random. A useful tool for doing this is called <ulink url="http://lcamtuf.coredump.cx/soft/stompy.tgz">Stompy</ulink>.</para>
<para>Stompy is a command line entropy verifier for session cookies and XSRF tokens.</para>

<!--

.TODO
  - Include stompy usage example?
-->
			</section>
			<section>
				<title>Set the HTTP Only flag</title>
				<para>A good protective measure against session hijacking via XSS is to use
a session cookie that cannot be accessed by client side Javascript. This
is achieved by setting the HTTP only flag on the cookie.</para>

				<para>For most web containers you can specify the following option in your web.xml file.</para>
<code>
...

  &#60;session-config&#62;
    &#60;cookie-config&#62;
      &#60;http-only&#62;true&#60;/http-only&#62;
    &#60;/cookie-config&#62;
  &#60;/session-config&#62;

...
</code>
<para>Of course the <literal>HttpOnly</literal> flag can be set programmatically when
creating the cookie too.</para>

<code>
  response.addHeader("Set-Cookie", "foo=bar; HttpOnly;");
</code>
			</section>
			<section>
				<title>Set the Secure Flag</title>
				<para>Setting the HttpOnly flag will prevent the session cookie from
being accessed via client side JavaScript, but it doesn't protect the
token from being transmitted over an insecure channel. Setting the <literal>Secure</literal>
flag on the session cookie will tell the browser not to transmit the
cookie over an insecure channel.</para>

<para>Again, this can be done via the web container's web.xml file.</para>

<code>
...

  &#60;session-config&#62;
    &#60;cookie-config&#62;
      &#60;http-only&#62;true&#60;/http-only&#62;
      &#60;secure&#62;true&#60;/secure&#62;
    &#60;/cookie-config&#62;
  &#60;/session-config&#62;

...
</code>
<para>Or programmatically.</para>
<code>
  response.addHeader("Set-Cookie", "foo=bar; HttpOnly; Secure;");
</code>
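<para>For Python services, the standard library can produce the same header value. The sketch below uses http.cookies from the standard library; the helper name session_cookie_header and the cookie name and value are placeholders chosen for illustration. A web framework's own session cookie settings should be preferred where they exist.</para>

```python
from http.cookies import SimpleCookie


def session_cookie_header(name, value):
    """Build a Set-Cookie header value with HttpOnly and Secure set."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["httponly"] = True  # hide the cookie from client side scripts
    cookie[name]["secure"] = True    # only send the cookie over HTTPS
    cookie[name]["path"] = "/"
    return cookie[name].OutputString()


# Placeholder name and value, matching the Java example above.
header = session_cookie_header("foo", "bar")
print("Set-Cookie: " + header)
```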
			</section>
			<section>
				<title>Strict transport security (HSTS)</title>
				<para>Taking this a step further, you should ensure that all authenticated
network traffic is sent via a TLS connection. Enabling HTTP Strict Transport
Security instructs compliant browsers to only interact with the web service
via a secure HTTPS connection. This protection mechanism is most effective
in preventing TLS stripping attacks and helps prevent hijacking by
ensuring a secure connection is always used with the server.</para>

				<para>To enable HSTS you need only add a Strict-Transport-Security header to
client responses with a <literal>max-age</literal> value in seconds. The <literal>max-age</literal>
attribute indicates to the browser how long it should honour the HSTS
transport request.</para>

<code>
  ...
  response.addHeader("Strict-Transport-Security", "max-age=86400; includeSubDomains");
  ...
</code>
			</section>
		</section>
		<section>
			<title>Summary</title>
			<para>Using a combination of all these measures will help prevent the incidence of
session hijacking on your site.</para>
			<note><title>TIP</title><para><simplelist>
<member>DON'T use deterministic session identifiers.</member>
<member>DON'T send a session identifier via an HTTP parameter or in the URI.</member>
<member>DON'T allow client side JavaScript to have access to the session token.</member>
<member>DON'T send session tokens over an insecure channel.</member>
<member>DON'T allow authenticated content to travel via HTTP.</member>
</simplelist></para></note>
		</section>
	</section>
	<section>
		<title>Click Jacking</title>
		<para />
		<section>
			<title>Overview</title>
			<para>Click Jacking is when an attacker conceals the true nature of a site using
techniques such as cursor spoofing or iframe overlays to trick a user into
clicking on a malicious link and performing unintended actions.</para>
		</section>
		<section>
			<title>Content Security Policy</title>
			<para>The Content Security Policy (CSP) section of the XSS discussion introduced
content security policy, and defining a tight CSP can also
help mitigate this class of attack. W3C currently has
a <ulink url="https://www.w3.org/TR/UISafety/">working draft</ulink>
of how CSP may be further used to mitigate user interface redressing.</para>
		</section>
		<section>
			<title>X-Frame-Options</title>
			<para>It is common for this class of attack to use iframes to obscure the
actual nature of the site. The <literal>X-Frame-Options</literal> header was introduced
by Microsoft in IE8. It allows web application developers to prevent frame
based UI redressing. All modern browsers now support this option so it
is worth turning on.</para>

There are three configuration options for this header.
[options="header"]
|===
| Option | Description

| DENY
| The page cannot be displayed in a frame, regardless of the site attempting to do so.

| SAMEORIGIN
| The page can only be displayed in a frame on the same origin as the
page itself.

| ALLOW-FROM uri
| The page can only be displayed in a frame on the specified origin.
|===

[source, java]
----

// prevent all frames
response.addHeader("X-Frame-Options", "DENY");

// allow frames from same origin
response.addHeader("X-Frame-Options", "SAMEORIGIN");

// allow frames from specific uri
response.addHeader("X-Frame-Options", "ALLOW-FROM http://example.com");

----
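
For Python web applications the header can be added in one place rather than in every handler. The following is a minimal sketch of a WSGI wrapper that uses no third party libraries; the names frame_options_middleware and hello_app are hypothetical and exist only for this illustration.

```python
def frame_options_middleware(app, value="DENY"):
    """Wrap a WSGI application so every response carries X-Frame-Options."""
    def wrapped(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            # Drop any existing value so the header is only sent once.
            headers = [(k, v) for (k, v) in headers if k.lower() != "x-frame-options"]
            headers.append(("X-Frame-Options", value))
            return start_response(status, headers, exc_info)
        return app(environ, patched_start_response)
    return wrapped


# Hypothetical application used only to demonstrate the wrapper.
def hello_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


protected_app = frame_options_middleware(hello_app)
```

Defaulting to +DENY+ means a page can only be framed after someone deliberately passes a different value.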


=== Confirmation

It may sound simple but to protect against click jacking another
approach is to use +window.confirm()+ to notify the user of
the action they are about to perform. This is because it creates
a popup that cannot be framed and hidden by the attacker.

=== Summary

Click Jacking can be serious and cause carnage to users of your site. It is worth
taking some basic precautions when creating your site.

[TIP]
====
* Set the X-Frame-Options header to +DENY+ unless frames are needed for the site.
* Request user confirmation before attempting irreversible actions.
====


== Cross site request forgery (CSRF/XSRF)

=== Overview

A Cross site request forgery (CSRF) attack leverages an authenticated
user's existing session to issue commands on their behalf. This attack
generally involves some aspect of social engineering to get a user to click
on a malicious link on a third party site or in an email. The link is usually
specially crafted to execute business logic on a site for which the user
already has an authenticated session.

=== CSRF Token

It is common for web frameworks to include a mechanism that creates a
unique CSRF token for each request. The CSRF attack is thwarted by this
mechanism as the attacker cannot guess or spoof the random value.


Starting from Seam 2.2.1 you can add a CSRF token to user forms as follows.

[source, html]
----
&#60;h:form&#62;
  &#60;s:token/&#62;
  ...
&#60;/h:form&#62;

----

Python Flask-WTF forms have CSRF token protection enabled by default. You need
only remember to include the hidden tags in your form template. This will
include the random _csrf_token.

[source, html]
----
&#60;form method="POST" action="/transfer"&#62;
  {{ form.hidden_tag() }}
  &#60;!-- The rest of the form fields go here --&#62;
  &#60;button type="submit"&#62;Transfer funds&#60;/button&#62;
&#60;/form&#62;

----

[IMPORTANT]
====
The CSRF token must be sufficiently random and non deterministic. This can
be confirmed by tools like Stompy. Again you should *NOT* try and implement
your own CSRF mechanism.

====
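
To make "sufficiently random" concrete, the sketch below shows the two properties a framework's token handling is expected to have: tokens drawn from a cryptographically secure source, and a constant-time comparison. It is an illustration of those properties only, not a replacement for a framework's CSRF mechanism; the function names are hypothetical.

```python
import hmac
import secrets


def new_csrf_token():
    """Return an unpredictable token from the operating system's CSPRNG."""
    return secrets.token_urlsafe(32)


def tokens_match(expected, submitted):
    """Compare tokens in constant time to avoid leaking timing information."""
    return hmac.compare_digest(expected.encode(), submitted.encode())


# Hypothetical flow: the token is stored in the session and echoed in the form.
session_token = new_csrf_token()
```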


=== Referrer Header Checks

It is also difficult for an attacker to spoof the referrer header that
is sent by the browser in each request. It is therefore worthwhile to
enable referrer checks within your web framework. These checks should
ensure that the request originated from the same origin as the website
executing the business logic.

This technique can be effective but is not fool proof. Browsers and
proxies can be configured to strip the referrer header for privacy
reasons. It is therefore recommended to use this in conjunction with
CSRF tokens to counter this class of attack.

If your framework does not include referrer checks you can add them
by comparing the scheme, hostname and port of the web application
against the supplied referrer header.

[source, python]
----
# TODO (This is completely untested)
from functools import wraps
from urllib.parse import urlparse
from flask import request, abort, render_template, redirect
from forms import TransferForm
from app import app

# Referrer check decorator. Checks the referrer
# against the configured SERVER_NAME for the webapp.
def referrer_check(f):
  @wraps(f)
  def decorated(*args, **kwargs):
    # request.referrer is None when the header was stripped.
    url = urlparse(request.referrer or '')
    if url.netloc and url.netloc != app.config['SERVER_NAME']:
      return abort(400)
    return f(*args, **kwargs)
  return decorated

# app.route must be the outermost decorator so that the
# referrer-checked function is the one that gets registered.
@app.route('/transfer', methods=['GET', 'POST'])
@referrer_check
def transfer_funds():
  form = TransferForm()
  if form.validate_on_submit():
    # process form
    return redirect('/')

  # Render form to user
  return render_template("transfer.html", form=form)

----
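
The comparison itself can be isolated from the framework so that it is easy to test. The sketch below compares scheme, hostname and port using only the standard library. The names origin_of and same_origin are hypothetical, and rejecting a missing referrer is a policy choice made for this example; a site that must tolerate stripped headers would need to decide differently.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url):
    """Return (scheme, hostname, port) for a URL, filling in default ports."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def same_origin(referrer, site_url):
    """True when the referrer is present and shares the site's origin."""
    if not referrer:
        return False  # policy choice: treat a missing referrer as a failure
    return origin_of(referrer) == origin_of(site_url)
```

Comparing all three components avoids accepting a plain HTTP referrer for an HTTPS site, which a hostname-only check would allow.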


=== Summary

CSRF attacks are commonplace but can be prevented by deploying a secure
random CSRF token on a per-request or per-session basis. When tokens are used in
conjunction with referrer header checks it is difficult for an attack
of this class to be successful.

[TIP]
====
* Enable a CSRF token for business logic.
* Ensure the CSRF token is non deterministic.
* Enable referrer header checks in your web framework.

====

== Remote code execution

=== Overview

Remote code execution is usually the ultimate goal
of most attacks. In 1996, when the infamous _Smashing The
Stack For Fun And Profit_ article appeared in Phrack magazine,
this was achieved via buffer overruns and remote injection
of shell code. However, more modern garbage collected languages
aren't exempt from this category of attack. There are still
many ways in which remote code can find its way into the
execution context of the application and cause unexpected
behaviours. This section covers some of the common
flaws in applications that can lead to remote code or command
execution.

=== Unsafe use of Serialization

Serialization in both the Java and Python languages can lead
to malicious code execution. Technologies such as Java
RMI rely on serialization, so care needs to be taken when defining
and deserializing objects that could have been tampered with by a remote
attacker.

==== When it comes to serialization consider practicing abstinence

Generally speaking it is better to avoid serialization altogether and
fall back to a safer transport format such as JSON to send messages
between services. JSON does not contain any features that would allow
an attacker to overwrite object code, and must be explicitly unpacked
to instantiate a remote object.
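
As an illustration of explicit unpacking, the sketch below decodes a JSON message and copies only the expected, type-checked fields into a plain object. The Transfer class and its field names are hypothetical and are used only to show the pattern.

```python
import json
from dataclasses import dataclass


@dataclass
class Transfer:
    account: str
    amount: int


def parse_transfer(raw):
    """Decode JSON and copy only the expected, type-checked fields."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    account = data.get("account")
    amount = data.get("amount")
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(account, str) or isinstance(amount, bool) or not isinstance(amount, int):
        raise ValueError("unexpected field types")
    return Transfer(account=account, amount=amount)
```

Unknown keys in the message are ignored, so the sender has no way to influence which class is instantiated or which attributes are set.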

==== Control the serialization process

If serialization is an absolute must then there are several things that
you need to consider when developing your class.

  * Mark sensitive fields as transient
  * Use serialPersistentFields to restrict serialized state
  * Implement readObject, writeObject, and readObjectNoData methods with
    the following signatures:
     - private void writeObject(java.io.ObjectOutputStream out) throws IOException;
     - private void readObject(java.io.ObjectInputStream in) throws IOException, ClassNotFoundException;
     - private void readObjectNoData() throws ObjectStreamException;

  * Selectively deserialize each field using readFields rather than readObject
    from the ObjectInputStream.
  * Ensure that the class is declared as final or has a private constructor that
    performs a security check with the security manager.
  * Reduce privileges when deserializing objects from an untrusted source
  * Do not use public static nonfinal variables
  * Always generate a unique serialization id for classes.
  * Don't serialize instances of inner classes

Even after taking these precautions it is still dangerous to
deserialize data that could be received from an external and
untrusted source.
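
Python's pickle module carries the same risk. One documented way to narrow it is to override +Unpickler.find_class+ so that only an explicit list of globals can be resolved. The sketch below follows that approach; the contents of the ALLOWED set are an assumption for illustration. It reduces the attack surface but does not make unpickling untrusted data safe.

```python
import io
import pickle

# Assumption: only these builtins are ever needed by the application.
ALLOWED = {("builtins", "set"), ("builtins", "frozenset")}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Refuse every global that is not explicitly allowed.
        if (module, name) in ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError("global '%s.%s' is forbidden" % (module, name))


def restricted_loads(data):
    """Like pickle.loads(), but only resolves allowed globals."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```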


=== Abuse of JSP Expression Language (EL)

The Java Expression Language provides a mechanism to dynamically access
Bean methods in the current scope and request context. Unfortunately
this capability can be abused if untrusted user input is not correctly
validated.

A simple example of this attack would be to bypass the HttpOnly flag of a
session cookie by injecting the EL expression +${cookie["JSESSIONID"].value}+.
[source,html]
----
&#60;!--
     Consider what happens when any of these values are supplied
     for the parameter without server side validation.
  http://example.com/?foo=${cookie["JSESSIONID"].value}
  or
  http://example.com/?foo=${pageContext.request.getSession().setAttribute("admin", true)}
--&#62;

...
&#60;h:outputText value="${param.foo}" /&#62;
...

----

To check for instances of EL injection vulnerabilities in your application
you should try sending a token value for each request parameter that you
can check for in the resulting pages. For example you may send +${"IF_WE_FIND_THIS_VALUE_BAD_THINGS_COULD_HAPPEN"}+. If a generic error message is returned or an exception is thrown
then chances are the field is vulnerable to injection.

A simple way to protect against EL injection is to properly sanitize
user controlled input. You should restrict all parameters using a
whitelist that does not include the characters '$#{}'. A defense in depth
approach will also restrict the capabilities of the EL interpreter using
a Java security manager policy.
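
The whitelist check is independent of the language the application is written in. The following sketch expresses it in Python; the permitted character set is an assumption and should be narrowed to whatever each parameter actually needs.

```python
import re

# Assumption: parameters only need letters, digits and a little punctuation.
SAFE_PARAM = re.compile(r"[A-Za-z0-9 _.,@-]*")


def is_safe_parameter(value):
    """True when the value consists solely of whitelisted characters,
    which excludes the EL metacharacters $ # { }."""
    return SAFE_PARAM.fullmatch(value) is not None
```

Because the check is a whitelist, characters that were never anticipated are rejected by default rather than allowed through.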

.TODO
  - Run this by SRT.

=== Remote command execution

Command injection occurs when user controlled input is passed
into an operating system command execution context or shell without
proper sanitization.

==== Python command injection gotchas

Python's subprocess module is reasonably resistant to command
injection, however there are some use cases where unvalidated
user input could still lead to command injection.

===== Injection is possible when shell=True

Creating a subprocess in Python and setting the
shell setting to True can expose the application
to command injection attacks. If the application
does not escape or validate user input against a
secure whitelist of values an attacker may execute
multiple commands. For example.

[source, python]
----
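import subprocess

def local_command_injection(cmd):
  pipe = subprocess.PIPE
  # shell=True hands the whole string to the shell, so ';' starts a new command.
  proc = subprocess.Popen(cmd, shell=True, stdin=pipe, stdout=pipe, stderr=pipe, close_fds=True)
  # communicate() reads both pipes and avoids the deadlock that wait() can cause.
  out, err = proc.communicate()
  if err:
    print(err)
  else:
    print(out)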


# A contrived usage which constructs the command string without
# validating user input.
user_input = ";cat /etc/passwd"
vulnerable_command = "ls %s" % user_input
local_command_injection(vulnerable_command)

----

Where possible constructing a subprocess using shell=True
should be avoided. If it must be used then caution
must be taken to escape user controlled content.
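
A safer pattern is to pass the command as an argument list and leave the shell out entirely. The sketch below is one way to do that; the helper name run_without_shell is hypothetical, and a child Python interpreter that echoes its argument stands in for a real command so that the example runs anywhere.

```python
import subprocess
import sys


def run_without_shell(argv):
    """Run a command from an argument list; no shell parses the input."""
    result = subprocess.run(argv, shell=False, capture_output=True, text=True, check=True)
    return result.stdout


# The hostile string is passed as a single argument and is never interpreted.
user_input = ";cat /etc/passwd"
# Hypothetical stand-in for a real command: a child Python that echoes its argument.
output = run_without_shell([sys.executable, "-c", "import sys; print(sys.argv[1])", user_input])
```

The semicolon reaches the child process as ordinary text because no shell is involved in starting it.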

.TODO
  - Safe example


===== Injection is possible for commands run over ssh

Similarly, care needs to be taken when constructing command
strings that will be executed on remote machines. The
following code snippet demonstrates how user controlled input
could easily be executed using the Paramiko SSH library.

[source, python]
----
import getpass
import paramiko

def remote_command_injection(host, port, cmd):
  user = getpass.getuser()
  passwd = getpass.getpass('Enter password for %s: ' % user)
  client = paramiko.SSHClient()
  client.load_system_host_keys()
  client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
  client.connect(host, username=user, password=passwd, port=port)
  stdin, stdout, stderr = client.exec_command(cmd)
  for line in stdout:
    print('... ' + line.strip('\n'))

  client.close()

# A contrived usage which constructs the command string without
# validating user input. The host name and port are placeholders.
user_input = ";cat /etc/passwd"
vulnerable_command = "ls %s" % user_input
remote_command_injection("host.example.com", 22, vulnerable_command)

----

To protect against this scenario it is necessary to either
escape user input using shlex.quote and/or use a whitelist
of allowed characters to restrict the user input.
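
Python's standard library provides +shlex.quote+ for escaping a single shell argument. The sketch below shows how the contrived command string could be built with it; the helper name build_remote_command is hypothetical.

```python
import shlex


def build_remote_command(user_input):
    """Quote user input so the remote shell sees it as one literal argument."""
    # "--" stops ls from treating input that starts with "-" as an option.
    return "ls -- " + shlex.quote(user_input)


command = build_remote_command(";cat /etc/passwd")
```

Quoting only protects against a POSIX shell on the remote side, so a whitelist of allowed characters is still worth applying first.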

.TODO
  - Safe example


==== Java command injection gotchas

Java is also at risk of command injection. Again it comes back to
correctly sanitizing user input and using a safe API where possible.
Instead of invoking something like +Runtime.getRuntime().exec(cmd + userArguments)+
it is better to use +ProcessBuilder+ and pass each argument as a separate
list element. Even so, all input that comes from an untrusted source should
be validated against a whitelist of safe characters that cannot be used to
escape into a shell.


==== Explicit mapping of command arguments

To further reduce the likelihood of command injection attacks you can
restrict command line arguments to a predefined subset of values that
the user may select by an enumerated value. This prevents passing
user controlled information directly to the command string.
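
As an illustrative sketch (Python 3, with hypothetical names), the user's
choice can be mapped through an enumeration to a fixed argument list, so that
no user supplied text ever reaches the command:

```python
import enum


class Listing(enum.Enum):
    SHORT = "short"
    LONG = "long"
    ALL = "all"


# Every command that can run is spelled out here in full.
LISTING_ARGS = {
    Listing.SHORT: ["ls"],
    Listing.LONG: ["ls", "-l"],
    Listing.ALL: ["ls", "-la"],
}


def command_for(choice):
    # Listing(choice) raises ValueError for any value that is not one of
    # the predefined options.
    return list(LISTING_ARGS[Listing(choice)])
```

The returned list can be passed to +subprocess+ without +shell=True+.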

=== Summary

It is difficult to construct applications that safely execute external commands
on the operating system. However, if care is taken to limit the attack
surface and frequency of these types of calls then it is possible to
do it safely.

The same care must be taken when leveraging the serialization and expression language
features of Java. Whilst these language features can make program development
easier, the security implications associated with them also need
consideration.

[TIP]
====
* Choose a safe serialization mechanism such as JSON.
* Avoid using +shell=True+ with the Python +subprocess+ module.
* Validate user input against a safe subset of characters that cannot be used
to cause a side effect.
* Restrict the external commands that can be executed by the application.
* Favour libraries and SDKs over external command execution.

====


== File system attacks

=== Overview

When accessing and manipulating file system objects there are several factors
that need to be considered. Python and Java are both cross platform environments and
the discretionary access controls provided by operating systems alone do not
always provide adequate protection against certain classes of attacks.

Typically an attacker will exploit unrestricted file system access within an
application, inadequate file system permissions or time-of-check to time-of-use (TOCTOU)
flaws within the application. All of these attacks can be mitigated by restricting
the scope of file system access and creating files in a way that reduces the
chances of race conditions occurring.

=== Defeating path traversal

A path traversal attack occurs when an attacker supplies input that is used
directly to access a file on the file system. The input usually attempts
to break out of the application's working directory and access a file elsewhere
on the file system.

[source, java]
----
// Contrived example of how user input may be used to
// access unintended system files.

File getFile(String filename) throws IOException {
  // The user supplied file name is used without any validation.
  return new File(filename);
}

// oops
String userInput = "../../../../../../../../etc/passwd";
File f = getFile(userInput);

----

There are a couple of ways that this can be prevented. The first
explicitly restricts file access to a known safe directory. The other
only allows users to access files indirectly, by using an enumerated
type to reference a particular known safe file.

==== Limiting file system access to specific safe directory

The key to preventing attackers from accessing external system files is
to limit access to the file system to a known safe directory. To do this
when the file name is directly supplied by user controlled input you *must*
process the file path in its canonical form. This means resolving all
directory switching (such as +../+) and symbolic links to get the absolute
form of the path string. You should then ensure that the absolute path
resides in the subset of paths that have the safe directory as the root.


Java provides the method +getCanonicalPath+ which will safely resolve
the absolute path based on the supplied input. This can then be used
to determine if the file resides within a safe directory.

[source, java]
----

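// This solution is suitable for Java 1.6 and earlier. If you are
// using Java 1.7 or later you should consider the java.nio.file.Path
// methods for resolving paths, and its startsWith method.

boolean isSafeFilePath(String safeDir, String path) throws IOException {
  String base = new File(safeDir).getCanonicalPath();
  String target = new File(path).getCanonicalPath();
  // Compare against the base plus a separator so that a sibling such as
  // /var/www-private is not mistaken for a child of /var/www.
  return target.equals(base) || target.startsWith(base + File.separator);
}

File getFile(String userControlledInput) throws IOException {
  if (isSafeFilePath("/var/www", userControlledInput)){
    return new File(userControlledInput);
  }
  throw new IOException("Attempted path traversal");
}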

----

A similar feat can be achieved using Python. In this example you have
the option of resolving symbolic links to ensure that they cannot be
used to escape the supplied base directory.

[source,python]
----
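import os

def is_safe_path(basedir, path, follow_symlinks=True):

  # realpath resolves symbolic links; abspath only normalises the path.
  resolve = os.path.realpath if follow_symlinks else os.path.abspath
  base = resolve(basedir)
  target = resolve(path)

  # commonpath (Python 3.4+) compares whole path components, so
  # /var/www-private is not treated as being inside /var/www.
  return os.path.commonpath([base, target]) == base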

----
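
The same check can also be sketched with +pathlib+, assuming Python 3.6 or
later. The function name +resolve_inside+ is illustrative. It resolves the
user supplied path relative to the base directory and refuses anything that
ends up outside it:

```python
from pathlib import Path


def resolve_inside(basedir, user_path):
    base = Path(basedir).resolve()
    # Joining with an absolute user_path discards base entirely, so the
    # containment check below is still required.
    candidate = (base / user_path).resolve()
    try:
        # relative_to raises ValueError when candidate is not under base.
        candidate.relative_to(base)
    except ValueError:
        raise ValueError("path escapes base directory: %r" % (user_path,))
    return candidate
```

Because +resolve()+ follows symbolic links, a link inside the base directory
that points elsewhere is rejected as well.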


==== Indirect file system object mapping

Another approach to restricting file system access is to maintain an
indirect mapping between a unique identifier and a file path that
exists on the operating system. This prevents users from supplying
malicious input to access unintended files.

[source,python]
----

localfiles = {
  "01" : "/var/www/img/001.png",
  "02" : "/var/www/img/002.png",
  "03" : "/var/www/img/003.png",
}

# Will raise an error if an invalid key is used.
def get_file(file_id):
  return open(localfiles[file_id])

----

=== Safely creating temporary files and directories

On most systems the temporary directory provides a shared location that
can be used to create temporary files that can be purged periodically
or on system restart. This is beneficial for applications performing
some operation that may or may not succeed and to prevent leaving
dangling files all over the file system.

However there is also the problem that this is in fact a shared area.
This means that an attacker may preemptively or actively exploit a
race condition to substitute a file for their own, resulting in
the injection of untrusted content into your application, or the reading or
writing of a file with escalated privileges.

TOCTOU attacks exist because programmers continually create files
in a shared directory with a predictable path and / or insufficiently
restrictive permissions. Both the Java (1.7) and Python programming environments provide
an API that will create a temporary file or directory in a secure manner.


[source, java]
----
// Unfortunately prior to Java 1.7 there was no standard way to create a file
// with exclusive write access and set the default permissions. Therefore even
// using File.createTempFile to create a file with a random file name
// does not ensure that the file was created exclusively
// with restrictive permissions, and it cannot be trusted.
//
// Java 7 contains features within java.nio to securely create a
// temporary file atomically and set the permissions at the same time.

import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Securely creating a temporary file in Java 1.7

// Restrict read / write to current user only.
Set&#60;PosixFilePermission&#62; filePerms = PosixFilePermissions.fromString("rw-------");
// A directory also needs the execute bit so that its owner can enter it.
Set&#60;PosixFilePermission&#62; dirPerms = PosixFilePermissions.fromString("rwx------");

String prefix = null; // use default
String suffix = null; // use default

// Create a temporary file
Path file = Files.createTempFile(prefix, suffix, PosixFilePermissions.asFileAttributes(filePerms));

// Create a temporary directory
Path dir = Files.createTempDirectory(prefix, PosixFilePermissions.asFileAttributes(dirPerms));

----

The Python solution is a little bit less verbose.

[source, python]
----
import tempfile

# Securely creating a temporary file in Python.
# mkstemp returns an open file descriptor and the path of the new file.
fd, path = tempfile.mkstemp()

# Securely creating a temporary directory in Python
tmpdir = tempfile.mkdtemp()

----

Whilst both of these mechanisms are safe, it is worth mentioning that
temporary files should still be used sparingly where possible.
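
When a temporary file is needed, a small helper keeps its handling in one
place. The following is a sketch only, assuming Python 3; +write_scratch+
and the +scratch-+ prefix are hypothetical names. It writes through the
descriptor returned by +mkstemp+ instead of reopening the file by name, and
leaves removal to the caller.

```python
import os
import tempfile


def write_scratch(data):
    # mkstemp creates the file exclusively, readable and writable only by
    # the current user, and returns an already open descriptor.
    fd, path = tempfile.mkstemp(prefix="scratch-")
    # Writing through the descriptor avoids reopening the path by name.
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return path


path = write_scratch(b"intermediate result")
try:
    with open(path, "rb") as handle:
        handle.read()
finally:
    # Remove the file as soon as it is no longer needed.
    os.remove(path)
```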


== XML attacks

=== Overview

XML is used extensively in Java EE. Unfortunately it has several weak points
from a security perspective. Not all of these are faults of the programmer; some
of them are flaws in XML's design.

=== Importance of schema validation

XML, like many topics covered previously in this document, is also susceptible to
injection attacks. It is also prone to attacks focused on algorithmic exhaustion
and denial of service. Both of these problems originate in the parsing of
the XML content.

There are generally two camps when it comes to XML parsers.
The first is a stream parser, such as SAX, that interprets the XML on the fly whilst maintaining
state around nested tags and attributes. This event driven approach typically feeds
nodes to callback functions to yield values from the XML content. Without schema
validation this approach is susceptible to XML injection attacks. If an attacker
is able to inject a node into the document structure, values can be overwritten
because the SAX parser processes the document on the fly.

[source, xml]
----

&#60;account&#62;
  &#60;username&#62;fred&#60;/username&#62;
  &#60;roles&#62;
    &#60;role&#62;staff&#60;/role&#62;
  &#60;/roles&#62;

  &#60;!-- Consider what would happen if the
           following user controlled input is supplied for nickname.
    &#60;/nickname&#62;&#60;roles&#62;&#60;role&#62;admin&#60;/role&#62;&#60;/roles&#62;&#60;nickname&#62;freddie
  --&#62;
  &#60;nickname&#62;
    {{ user_controlled_input }}
  &#60;/nickname&#62;
&#60;/account&#62;

----
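
In addition to schema validation, escaping user controlled values before they
are placed in the document stops markup from being injected in the first
place. A minimal Python 3 sketch using the standard library follows; the
function name +render_nickname+ is illustrative.

```python
from xml.sax.saxutils import escape


def render_nickname(user_controlled_input):
    # escape() replaces &, < and > with entity references, so the value
    # can only ever be character data inside the nickname element.
    return "<nickname>%s</nickname>" % escape(user_controlled_input)
```

Supplying the malicious nickname from the example above now yields a single
+nickname+ element whose text is the literal attack string.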

The alternative approach is to make a full parse of the XML document and use a
document object model to represent the entire document. Without validation
and other checks this approach can be vulnerable to resource exhaustion
attacks. The XML document may have been maliciously constructed to exhaust the
stack limit or system memory, causing a denial of service of a remote
service.

By validating the XML document against a known good DTD or schema, as well
as placing some common sense limits on the size of the XML document being processed,
these types of attacks can be mitigated.


[source, java]
----

SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema schema = schemaFactory.newSchema(new File("schema.xsd"));

try {
  schema.newValidator().validate(xmlContent);
  // XML is in format defined by schema.xsd

} catch (SAXException e){
  // XML is invalid. Do not try to process it any further.
}


----


=== Defending XML entity attacks

==== Malicious entity expansion


XML is susceptible to a denial of service attack through entity expansion.
The following is an example of the infamous billion laughs attack, which
uses nested entity references to expand a small XML document to a huge size.


[source, xml]
----
&#60;?xml version="1.0"?&#62;
&#60;!-- billion lolz attack --&#62;
&#60;!DOCTYPE lolz [
&#60;!ENTITY lol "lol"&#62;
&#60;!ELEMENT lolz (#PCDATA)&#62;
&#60;!ENTITY lol1 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;"&#62;
&#60;!ENTITY lol2 "&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;"&#62;
&#60;!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;"&#62;
&#60;!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;"&#62;
&#60;!ENTITY lol5 "&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;"&#62;
&#60;!ENTITY lol6 "&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;"&#62;
&#60;!ENTITY lol7 "&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;"&#62;
&#60;!ENTITY lol8 "&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;"&#62;
&#60;!ENTITY lol9 "&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;"&#62;
]&#62;
&#60;lolz&#62;&lol9;&#60;/lolz&#62;
----

To protect against entity expansion in JAXP 1.3 you will need to turn on
the secure processing feature to set a limit for the DOM and SAX parsers.

  entityExpansionLimit = 64,000;
  elementAttributeLimit = 10,000;

However you will need to enable this feature explicitly:

[source,java]
----
  SAXParserFactory spf = SAXParserFactory.newInstance();
  spf.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
----

In JAXP 1.4 the secure processing feature is turned on by default.
You can also set these limits using system properties. For example,
if you wanted to apply something more restrictive across the board
you could start your process with the flag +-DentityExpansionLimit=10+
or explicitly set it in your application:

[source,java]
----
System.setProperty("entityExpansionLimit", "10");

----
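
Python applications face the same problem. The following is a sketch using
only the standard library +xml.parsers.expat+ module, not a JAXP equivalent.
It takes the stricter approach of refusing any document that declares an
entity at all. The names +EntitiesForbidden+ and +parse_rejecting_entities+
are hypothetical.

```python
import xml.parsers.expat


class EntitiesForbidden(ValueError):
    """Raised when a document tries to declare an entity."""


def parse_rejecting_entities(data):
    elements = []
    parser = xml.parsers.expat.ParserCreate()

    def reject_entity(name, is_parameter_entity, value, base,
                      system_id, public_id, notation_name):
        # Called for every entity declaration in the DTD, internal or
        # external, before any reference to it is expanded.
        raise EntitiesForbidden("entity declaration not allowed: %s" % name)

    parser.EntityDeclHandler = reject_entity
    parser.StartElementHandler = lambda name, attrs: elements.append(name)
    parser.Parse(data, True)
    # Returns the element names seen, in document order.
    return elements
```

Documents without entity declarations parse normally, while the billion
laughs document is rejected at its first +ENTITY+ declaration.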


==== Unauthorized file access through external entities

Another way in which an XML entity can be abused is to access
local or remote content.

[source,xml]
----

&#60;?xml version="1.0" encoding="ISO-8859-1"?&#62;
&#60;!DOCTYPE foo [
  &#60;!ELEMENT foo ANY &#62;
  &#60;!ENTITY xxe SYSTEM "file:///etc/passwd" &#62;]&#62;
  &#60;foo&#62;&xxe;&#60;/foo&#62;
----


To prevent this class of attack you should use your own custom
resolver.

[source, java]
----

public class RestrictedEntityResolver implements EntityResolver {

  private static final String EXTERNAL_ENTITY_DIR = "/usr/share/example/xml";

  public InputSource resolveEntity(String publicId, String systemId)
    throws SAXException, IOException {

    // Example restricting external entities to safe directory
    String resolvedPath = new File(systemId).getCanonicalPath();
    if (resolvedPath.startsWith(EXTERNAL_ENTITY_DIR + File.separator)){
      return new InputSource(resolvedPath);
    }

    // Return empty InputSource which will cause malformed url exception
    return new InputSource();

  }
}


// The resolver can then be used with an XML reader to
// limit access to external entities.
SAXParser parser = spf.newSAXParser();
XMLReader reader = parser.getXMLReader();
reader.setEntityResolver(new RestrictedEntityResolver());
reader.parse(new InputSource(input));

----


=== Detecting XML deserialization attacks

.TODO
 - http://blog.diniscruz.com/2013/08/using-xmldecoder-to-execute-server-side.html


== Redirection attacks

=== Overview

It is common to redirect a user after form submission. However in
some cases developers achieve this by using a URL parameter that
could be manipulated by an attacker. An attacker may manipulate the
redirection URL to trick a victim into entering their username
and password to a fake login dialog.

=== Validating redirects

An example of how an attacker may exploit URL redirects:
    https://www.example.com/login.php?next=http://attacker.com/phonylogin.php

To counter this type of attack, redirect targets need to be validated
before the user is redirected. Essentially
you should confirm that the redirection will take the user
to another page within your site.

[source, java]
----
static String safeRedirect(URI baseUrl,  String target){

    try {

      URI redirect = URI.create(target).normalize();

      // Relative redirect
      if (! redirect.isAbsolute()){
        return new URI(baseUrl.getScheme(),
          baseUrl.getUserInfo(),
          baseUrl.getHost(),
          baseUrl.getPort(),
          redirect.getPath(),
          redirect.getQuery(),
          redirect.getFragment()).normalize().toString();
      }

      // Assert that normalized URI hostname, port and scheme match
      if (baseUrl.getHost().equals(redirect.getHost())
        && baseUrl.getPort() == redirect.getPort()
        && baseUrl.getScheme().equals(redirect.getScheme())){

        return redirect.toString();
      }

    } catch (Exception e){
    }

  // or Redirect to default base URL
  return baseUrl.toString();
}


String redirect = safeRedirect(URI.create("https://example.com"), request.getParameter("next"));
response.sendRedirect(redirect);

----
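
The same validation can be sketched in Python 3 with +urllib.parse+. This is
an illustration rather than a drop-in implementation, and the function name
+safe_redirect+ is an assumption. It treats a target without scheme and host
as relative to the base URL, accepts an absolute target only when its scheme
and network location match the base exactly, and otherwise falls back to the
base URL.

```python
from urllib.parse import urlsplit, urlunsplit


def safe_redirect(base_url, target):
    base = urlsplit(base_url)
    try:
        redirect = urlsplit(target or "")
    except ValueError:
        return base_url

    # Relative redirect: rebuild it on top of the trusted scheme and host.
    if not redirect.scheme and not redirect.netloc:
        path = redirect.path
        if not path.startswith("/"):
            path = "/" + path
        return urlunsplit((base.scheme, base.netloc, path,
                           redirect.query, redirect.fragment))

    # Absolute redirect: scheme and host[:port] must match the base.
    if (redirect.scheme, redirect.netloc) == (base.scheme, base.netloc):
        return urlunsplit(redirect)

    # Anything else is sent to the default base URL.
    return base_url
```

Protocol relative targets such as +//attacker.com/+ carry a host but no
scheme, so they fail the comparison and are redirected to the base URL.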

=== Summary

[TIP]
====
* Validate all redirects performed by your web application
* Warn the user when redirecting them to a site outside the hosted web application.
====


== SQL injection

=== Overview

SQL injection is a well known vulnerability that may be exploited when a programmer
does not safely construct a database query from user supplied data. The problem is
well known within the industry, yet it still has a high incidence.

A classic SQL injection may occur when concatenating or interpolating user input
while constructing a query string, as follows:

[source, java]
----

// User controlled
String username = "admin' OR '1' ='1";
String password = "";

boolean login(String username, String password) throws SQLException {

  // Constructing query string
  String sql = "SELECT * FROM user where username='"
    + username + "' and password='" + password + "'";

  ResultSet rs = db.createStatement().executeQuery(sql);
  // next() returns true when the query matched at least one row
  return rs.next();

}

----

To prevent this type of attack you should always use parameterized queries
and validate user input against a whitelist of safe characters. You should
*NEVER* construct a SQL query using concatenation or interpolation.

=== Use JDBC prepared statements

When accessing a SQL database using JDBC it is essential that you use parameterized
queries. This prevents injection attacks as all parameters are safely escaped when
a value is set for them. A parameterized query means that placeholders are used
when declaring the initial SQL statement, these placeholders are then substituted
for a typesafe variable at runtime.

[source, java]
----
PreparedStatement findAllEmployeesByFirstName = null;

try {

  // ? is the parameter that will be substituted
  String queryString = "SELECT * FROM Employee WHERE firstName = ?";
  findAllEmployeesByFirstName = conn.prepareStatement(queryString);

  // The value for the parameter is explicitly set
  findAllEmployeesByFirstName.setString(1, userControlledInput);

  ResultSet rs = findAllEmployeesByFirstName.executeQuery();
  // Process results ...

} catch (SQLException e){
  System.err.println(e.toString());
} finally {
  if (findAllEmployeesByFirstName != null){
    findAllEmployeesByFirstName.close();
  }
}

----

Using a prepared statement in this way prevents injection as each parameter
is escaped when constructing the query string. It also provides an element
of type safety to ensure that supplied parameters are of the correct type.
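
Python database drivers offer the same facility. As a small sketch, using the
standard library +sqlite3+ module and an assumed +Employee+ table, the value
is passed separately from the SQL text and is never interpolated into it:

```python
import sqlite3


def find_employees_by_first_name(conn, first_name):
    # ? is the placeholder; the driver binds first_name as a value.
    cursor = conn.execute(
        "SELECT id, firstName FROM Employee WHERE firstName = ?",
        (first_name,))
    return cursor.fetchall()
```

An input such as +x' OR '1'='1+ is then compared as a literal string and
matches no rows.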


=== Use JPA named queries

The Java Persistence API provides a similar mechanism for creating parameterized
queries. The best practice is to use named queries to restrict the way in which
the database may be accessed. It also prevents dynamic construction of query strings.

[source,java]
----
// Parameter :firstname will be safely escaped in this query.
@Entity
@NamedQuery(
  name="findAllEmployeesByFirstName",
  query="SELECT OBJECT(emp) FROM Employee emp WHERE emp.firstName = :firstname"
)
public final class Employee implements Serializable {
...
}

Query queryEmployeesByFirstName = em.createNamedQuery("findAllEmployeesByFirstName");
queryEmployeesByFirstName.setParameter("firstname", userInput);
Collection employees = queryEmployeesByFirstName.getResultList();

----

In a similar fashion to JDBC parameterized queries, using named parameters will
prevent SQL injection and is the preferred approach over dynamically generating
queries by concatenating user input.


=== Stored Procedures
.TODO
  - Decide if I want to keep /add this..

=== Summary

SQL injection can be easily avoided by using safer libraries and frameworks to
access the database. All user input should be properly sanitized before constructing
SQL statements from it.


[TIP]
====
  * Beware string concatenation and interpolation
  * Use prepared statements in JDBC
  * Use named queries with Hibernate and JPA
  * Stored procedures
  * Use SQLAlchemy in Python

====

== Session management

=== Overview

Attackers commonly attempt to exploit the improper use and protection of
session identifiers. Deploying and using sessions correctly is therefore
essential to securing your web application.

=== Session Fixation

Session fixation occurs when an attacker supplies a valid session identifier
to a user and tricks them into authenticating that session when they log in.
To prevent this class of attack you need to ensure that a new session identifier
is generated for each login. Any supplied session identifier should be discarded.

=== Session Lifecycle

The lifetime of sessions needs to be tightly managed. A session should have an
expiry and an idle timeout. Ensure that your +web.xml+ sets a +session-timeout+ for
Java EE applications. It is also vital to invalidate a session when a user logs out.
Ensure that +session.invalidate()+ is called in all places where the user can
log out of the application.

=== Insecure session identifiers

A session identifier must be non-deterministic, bound to client
properties, and provide non-repudiation. If the session identifier
does not possess these qualities it may be subject to a number of
attack vectors.

If, for example, a web application uses an incremental number to
maintain state for each client then it is trivial for an attacker
to guess the correct session identifier and impersonate
a user. You should always use the session management mechanism provided by
your application container rather than your own implementation.

If you are using client side sessions then they should be digitally
signed on the server to ensure that a malicious attacker cannot
tamper with session information in a request. This is not a common
approach in Java EE applications, however it is common in Python.
An additional technique is to bind a session to client properties
such as the user agent. This can be useful in detecting anomalies
in the middle of an established session and acting accordingly, however
it is more of an obstacle for an attacker than a prevention mechanism.
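
To illustrate what signing a client side session involves, here is a minimal
Python 3 sketch using HMAC-SHA256 from the standard library. The function
names and token layout are assumptions made for this example. In practice you
should rely on the signing support of your web framework rather than on code
like this.

```python
import base64
import hashlib
import hmac
import json


def sign_session(data, secret_key):
    # Token layout (hypothetical): base64(json) + "." + hex HMAC
    payload = base64.urlsafe_b64encode(
        json.dumps(data, sort_keys=True).encode("utf-8"))
    mac = hmac.new(secret_key, payload, hashlib.sha256).hexdigest()
    return payload + b"." + mac.encode("ascii")


def load_session(token, secret_key):
    try:
        payload, mac = token.rsplit(b".", 1)
    except ValueError:
        return None
    expected = hmac.new(secret_key, payload, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time to avoid leaking how many
    # characters of the signature were correct.
    if not hmac.compare_digest(mac, expected.encode("ascii")):
        return None
    return json.loads(base64.urlsafe_b64decode(payload).decode("utf-8"))
```

Signing only detects tampering. The payload is still readable by the client,
so secrets should not be stored in it.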


=== Insecure transmission of session data

This has already been touched on in the &#60;&#60;_session_hijacking, session hijacking&#62;&#62;
section. To reiterate the main points:

  * Session data should *ALWAYS* be transmitted over a secure channel.
  * Session data should *NEVER* be passed in the URL as a HTTP parameter.
  * Session data should not be accessible via client side scripts.

=== Summary

Most web containers, if up to date, should have solid session management
capabilities built in, so it is essential that they are deployed and utilized
correctly.

[TIP]
====
- Ensure your session identifier is sufficiently random.
- Ensure session data cannot be tampered with on the client side.
- Ensure session tokens are only sent over a secure channel.

====


== Input validation

=== Overview

The majority of attacks covered in this guide can be traced back to the
unsafe processing and use of user input. This section intends to focus on the
merits of having a strong input validation ethos in your web application
development projects. Applications need to set stringent guidelines that
dictate on what terms data may cross a trust boundary and become part of
the application state.

=== Whitelists and Blacklists

A blacklist can be used to filter input by trying to filter out or reject
input that contains banned characters. This may be implemented as a
regular expression or just a list of banned character sequences.

[source,java]
----

// NOTE: A case is intentionally missed in this list
private static final String PATH_BLACKLIST = "^.*[./].*$";

String input = "../../../../../../../../etc/passwd";
if (input.matches(PATH_BLACKLIST)){
  // ERROR reject input
  throw new InvalidInputException(input);
}

// Do something with input
----

The trouble with blacklists, as is demonstrated in the code snippet above,
is that the list may miss a corner case that allows a protected code path
to still be executed. (The blacklist in the example above does not include
the path separator character for the Windows platform.) A much safer approach
is to flip the usage: define what constitutes safe input and reject everything
else. For instance, if you are expecting a value for a phone number you might
restrict input to 10 digits.

[source,java]
----
  private static final String PHONE_NUMBER_WHITELIST = "^\\d{10}$";
  if (input.matches(PHONE_NUMBER_WHITELIST)){
    // Valid input continue processing
    return;
  }

  // All other cases are rejected as invalid
  throw new InvalidInputException(input);

----
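
The equivalent whitelist check in Python 3 might look like the sketch below.
The names are illustrative. It uses +[0-9]+ instead of +\d+ because +\d+ also
matches non-ASCII digits in Python 3, and +fullmatch+ so that nothing may
precede or follow the ten digits.

```python
import re

PHONE_NUMBER_WHITELIST = re.compile(r"[0-9]{10}")


def validate_phone_number(value):
    # Anything that is not exactly ten ASCII digits is rejected.
    if (not isinstance(value, str)
            or PHONE_NUMBER_WHITELIST.fullmatch(value) is None):
        raise ValueError("invalid phone number")
    return value
```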

=== Form validation

==== Seam validation framework

http://docs.jboss.org/seam/latest/reference/html/validation.html

==== GWT validation

http://www.gwtproject.org/doc/latest/DevGuideValidation.html

==== Python WTForms

The wtforms module comes with several default validators that can
be used to ensure that invalid form data is rejected.

[options="header"]
|===
| Validator | Description

| Required
| Ensures that data has been provided for this form field.

| Email
| Ensures the form field is a valid email address.

| EqualTo
| Ensures that the value of this form field equals the value supplied
for another form field. This is useful for confirmation fields.

| IPAddress
| Ensures that the form field is a valid IP address (either ipv4 or ipv6).

| Length
| Limits the input for the field to a specific length range.

| MacAddress
| Expects the form field to contain a valid MAC address.

| NumberRange
| Expects the form field contain a value within a specific number range.

| Regexp
| Allows you to specify a regular expression. Useful for specifying
a whitelist to validate user information.

| URL
| Expects a valid URL in the form field.

| UUID
| Expects a valid UUID to be specified in the form field.

| AnyOf
| Expects the form field to contain a value matching any of the values
in the supplied list.

| NoneOf
| Expects that the form field won't match any value in the supplied list
of values.

|===

When defining a form using wtforms you should always ensure that the
validation rules mirror the restrictions that are in place for the
database. For example a MySQL schema requires hard limits on VARCHAR
columns, so the corresponding form field should use the +Length+ validator
to enforce that limitation.

Custom validation is also possible with wtforms. This can be achieved by
implementing a +validate_x+ method in your form class where +x+ is the
name of a form field to validate. You can also implement inline checks.

[source, python]
----

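import string
from wtforms.validators import ValidationError

# A custom validator that checks password strength
def password_strength_check(form, field):
  """ Password must contain an uppercase letter, a lowercase letter, a digit and a symbol. """
  password = field.data or ""
  strong = (
    any(c in string.ascii_uppercase for c in password) and
    any(c in string.ascii_lowercase for c in password) and
    any(c in string.digits for c in password) and
    any(c in string.punctuation for c in password))
  # wtforms validators signal failure by raising ValidationError.
  if not strong:
    raise ValidationError("Password must contain uppercase, lowercase, digit and symbol characters")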


# Form definition for user registration.
class RegisterForm(RedirectForm):

  # Name field has validators to restrict its length and to allow
  # only characters that are valid for a person's name.
  name = TextField('Name',
    [
      Required(),
      Regexp(whitelists.IDENTITY_WHITELIST),
      Length(min=3, max=constants.MAX_NAME_LENGTH)
    ])

  # Using default email validator along with length restrictions.
  email = TextField('Email address',
    [
      Required(),
      Email(),
      Length(max=constants.MAX_EMAIL_LENGTH)
    ])

  # Password field has a restricted minimum length
  # and also uses a custom validator (password_strength_check)
  # to check the password complexity.
  password = PasswordField('Password',
    [
      Required(),
      Length(min=constants.MIN_PASSWORD_LENGTH, max=constants.MAX_PASSWORD_LENGTH),
      password_strength_check
    ])

  # Confirmation validator is used to make sure
  # password entered matches.
  confirm = PasswordField('Repeat Password',
    [
      Required(),
      EqualTo('password', message="Passwords must match")
    ])

----


=== Database validation

==== Input validation with Hibernate

  * Database Java   - Hibernate validation

==== Input validation with SQLAlchemy

Input validation can also be enforced at the database layer
of your application. SQLAlchemy includes a validates decorator
that can be used in your ORM definition.

[source, python]
----

from sqlalchemy.orm import validates
...

class User(Base):
  id = Column(Integer, primary_key=True)
  email = Column(String)
  #...

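  # Custom email validation for the supplied field.
  @validates('email')
  def validate_email(self, key, address):
    # Raise rather than assert: assertions are removed when Python runs with -O.
    if '@' not in address:
      raise ValueError("invalid email address")
    return address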


----

=== Validating data exchange formats

==== JSON schema validation

JSON is a dominant format for interacting with RESTful APIs. An effort to introduce
a standard form of schema validation link:http://json-schema.org/examples.html[is underway].

The link:https://github.com/Julian/jsonschema[jsonschema] module
implements full support of Draft3 and Draft4 of the standard.
It can be used to enforce type safety and input validation
and is recommended for defining and checking API entrypoints
for any RESTful web services.

[options="header"]
|===
| Validator | Description | Works on

|multipleOf
|Expects a number which is a multiple of the
supplied constraint.

+jsonschema.validate(32, {"multipleOf": 2})+

| +number+

|maximum and exclusiveMaximum
|Sets a maximum value for a number field.
+jsonschema.validate(101, {"maximum": 100})+
| +number+

|minimum and exclusiveMinimum
|Sets a minimum value required for a number field.
+jsonschema.validate(0, {"minimum": 21})+
| +number+

|maxLength
|Checks that the supplied string is at most N characters
long. +jsonschema.validate("The F word", {"maxLength": 4})+
| +string+

|minLength
|Checks that the supplied string is at least N characters
long. +jsonschema.validate("foo", {"minLength" : 6 })+
| +string+

|pattern
|Validate using a regular expression in ECMA 262 format.
+jsonschema.validate("aardvark", {"pattern" : "^aa.*$" })+
| +string+

|additionalItems and items
| The +items+ validator allows you to apply validation rules
over an entire array. This may be to ensure that each item
is of a specific type or only a subset of values. The additionalItems
allows the array to contain supplementary values to be included
at the end of the array.

With +additionalItems+ set to +False+ this would fail validation
as it contains an additional entry over the defined schema.

+jsonschema.validate([1, 2, "This is a string"], {
  "items" : [
      { "type" : "number" },
      { "type" : "number" }
    ],
  "additionalItems" : False
  })+

Additional data at the end of the array is allowed when
+additionalItems+ is omitted or set to +True+.

| +array+

|maxItems
|Sets a hard limit on the maximum number of items that can be
supplied.
+jsonschema.validate([1, 2, 3], { "maxItems" : 3 })+
| +array+

|minItems
|Sets a hard limit on the minimum number of items that can be
supplied.
+jsonschema.validate([1, 2, 3], { "minItems" : 1 })+
| +array+

|uniqueItems
|True or false value to indicate that all items in the array must
be unique.
+jsonschema.validate([1, 2, 2], { "uniqueItems" : True })+
| +array+

|maxProperties
|Validates that an object has at most N properties.
+jsonschema.validate({'name' : 'bob', 'age' : 21 }, { "maxProperties" : 2 })+
| +object+

|minProperties
|Validates that an object has at least N properties.
+jsonschema.validate({'name' : 'bob', 'age' : 21 }, { "minProperties" : 2 })+
| +object+

|required
|Dictates which properties an object must possess. Should be specified as
a list of property names.

+jsonschema.validate({ 'firstname' : 'bob', 'lastname' : 'marley' }, { 'required' : ['firstname', 'lastname']})+
|+object+

|additionalProperties, properties and patternProperties
| Checks an object for the supplied properties. A property name
may be matched by a pattern or explicitly. The additionalProperties
attribute can restrict or allow additional properties to exist within
the object.
|


|dependencies
|
|

|enum
|
|

|type
|Defines the basic types that instances can take.
* object
* array
* string
* number
* boolean
* null
* any
+jsonschema.validate("foo", { "type" : "string" })+
| +any+

|allOf
|Provides a list of schemas that a value must adhere to.
It is a way to separate schemas into separate definitions
and use all of them to validate a value.
+jsonschema.validate("foo", {
  "allOf" : [
    { "title" : "first schema", "type" : "string"},
    { "title": "second schema", "minLength" : 3 }
  ]})+
| +any+

|anyOf
|Requires that at least one of the provided schemas applies
to the value being validated.
+jsonschema.validate("foo", {
  "anyOf" : [
    { "title" : "first schema", "type" : "string"},
    { "title": "second schema", "minLength" : 5 }
  ]})+
| +any+


|oneOf
|Requires that *EXACTLY* one of the provided schemas
applies to the value being validated.

+jsonschema.validate({ "status" : True }, {
  "oneOf" : [
    { "title": "[on]", "properties": { "status" : {"enum" : [True]} }},
    { "title": "[off]", "properties": { "status" : {"enum" : [False]} }}
  ]})+
|+any+


|not
|Negates the validation result of the supplied schema.
+jsonschema.validate(1, { "not" : { "type" : "number" } })+
|+any+


|format
| There are several useful built-in format verification
attributes that can be applied to a value. These include:
* date-time
* email
* hostname
* ipv4
* ipv6
* uri

+jsonschema.validate("user@example.com", {"format" : "email"})+
| +string+


|===
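
One detail worth keeping in mind with the +format+ attribute: the Python
+jsonschema+ library only enforces it when a format checker is passed to
the validator. The following is a minimal sketch of that behaviour; the
helper name +is_valid+ is hypothetical and exists only for illustration.

```python
import jsonschema


def is_valid(value, schema, check_formats):
    """Return True if value conforms to schema.

    Hypothetical helper: "format" is only enforced when
    check_formats is True and a FormatChecker is supplied.
    """
    checker = jsonschema.FormatChecker() if check_formats else None
    try:
        jsonschema.validate(value, schema, format_checker=checker)
        return True
    except jsonschema.ValidationError:
        return False


email_schema = {"type": "string", "format": "email"}

# Without a format checker the "format" attribute is ignored
print(is_valid("not-an-email", email_schema, check_formats=False))

# With a format checker the same value is rejected
print(is_valid("not-an-email", email_schema, check_formats=True))
```

The built-in email check is deliberately loose, so a stricter +pattern+
may still be needed for fields where the exact shape matters.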

Bringing all of this together allows you to safely define
the constraints and expectations around your API usage.
Once the schema is defined, it dramatically reduces the
complexity of validating an inbound request.

[source, python]
----
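import jsonschema
from flask import Flask, abort, jsonify, request

app = Flask(__name__)


def validate_person(data):

  # Could externally define and document
  # expected request format for each API call
  Person = {
    # Only JSON objects are accepted
    "type": "object",
    "properties": {
      "firstname" : {
        "type"      : "string",
        "minLength" : 2,
        "pattern"   : "^[A-Za-z]+$"
      },
      "lastname"  : {
        "type"      : "string",
        "minLength" : 2,
        "pattern"   : "^[A-Za-z]+$"
      },
      "age"       : {
        "type"    : "number",
        "minimum" : 0,
        "maximum" : 150,
      }
    },
    # Reject any property that is not listed above
    "additionalProperties" : False
  }
  # Raises jsonschema.ValidationError if data does not conform
  return jsonschema.validate(data, Person)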


@app.route("/person", methods=["POST"])
def create_person():

  if not request.json:
    return abort(400)

  try:
    validate_person(request.json)
    id = save_person(request.json)
    return jsonify(status="success", id=id)

  except jsonschema.ValidationError as e:
    # Report only the validation failure back to the caller
    return jsonify(status="error", message=e.message), 400

----


==== XML schema validation

The &#60;&#60;_importance_of_schema_validation,importance of XML schema validation&#62;&#62;
has already been touched on in this guide. To summarize, you should explicitly
validate XML input against an XSD to ensure it is a valid document.


=== Summary

Input validation should be considered a high priority activity
when developing web applications. Having strict constraints on the
type of input that can cross a trust boundary will help prevent
applications being exploited by attackers.

You should try to identify your application's attack surface and
ensure that all input that comes from an external source is correctly
validated before allowing your application to interact with it.

[TIP]
====
* Validate all fields when processing user forms
* Ensure the database constraints are protected by input validation
* Validate all input at API boundaries
* Use whitelists to define what input is valid and reject everything else.
====
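
The whitelist approach from the tips above can be sketched in a few lines
of Python using only the standard library. The field names and patterns
below are assumptions chosen for illustration and not rules taken from this
guide. Any field that is not explicitly listed, or any value that does not
match its pattern, is rejected.

```python
import re

# Hypothetical whitelist: the field names and patterns are
# illustrative assumptions, not recommendations for real forms.
ALLOWED_FIELDS = {
    "username": re.compile(r"[a-z][a-z0-9_]{2,31}"),
    "country": re.compile(r"[A-Z]{2}"),
}


def validate_form(form):
    """Return a list of errors; an empty list means the form is accepted."""
    errors = []

    # Reject anything that is not explicitly expected
    for name in form:
        if name not in ALLOWED_FIELDS:
            errors.append("unexpected field: %s" % name)

    # Every expected field must be present and match its pattern in full
    for name, pattern in ALLOWED_FIELDS.items():
        value = form.get(name)
        if not isinstance(value, str) or not pattern.fullmatch(value):
            errors.append("invalid value for field: %s" % name)

    return errors
```

Using +fullmatch+ rather than +search+ means the whole value has to fit the
pattern, so a valid prefix followed by injected content is still rejected.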


== Output escaping and encoding

  * Safely escaping and encoding

== Authentication and Authorization

  * JAAS
  * JBoss declarative security
  * Federated authentication
  * LDAP authentication
  * Kerberos authentication

== Deployment issues

  * Incorrectly configured TLS and keystores
  * Deploying using known vulnerable artifacts
  * Beware the risks of embedded dependencies


=== Environment Hardening

A good defensive strategy for command or code injection is to
reduce the attack surface and limit exposure of the application
as much as possible.

  * Linux containers and SELinux
  * Java security policy

=== Summary

.TODO
  - Insert summary here.


== Verifying application correctness

  * OWASP application verification standard.

== Supporting libraries and tools

  * OWASP ESAPI
  * OWASP AntiSammy
  * OWASP CSRF Guard Project
  * Picketbox
  * Apache Santuario
  * Apache Shiro
  * Bouncy Castle
  * Checker framework (@Tainted)
  * Google Guava Libraries
    - Using and avoiding null
    - Preconditions
    - Immutable Collections
  * Java Simplified Encryption - jasypt
  * Coverity security library - https://github.com/coverity/coverity-security-library
  * Coverity scanner
  * FindBugs
  * Web application firewall


</chapter>