PEP333: The WSGI Specification

Purpose

  • The PEP, “Python Web Server Gateway Interface v1.0”[1], defines a standard interface for connecting conforming Python web applications to conforming web servers.

    This document specifies a proposed standard interface between web servers and Python web applications or frameworks, to promote web application portability across a variety of web servers.

  • Intended to fill the same role as the CGI specification[2] or the Java “servlet” API[3]

[1] http://www.python.org/dev/peps/pep-0333/

[2] http://www.ietf.org/rfc/rfc3875

[3] http://java.sun.com/products/servlet/

Example Application

This is the “simplest possible WSGI application” from the PEP:

def simple_app(environ, start_response):
    """Simplest possible application object"""
    status = '200 OK'
    response_headers = [('Content-Type', 'text/plain')]
    start_response(status, response_headers)
    return ['Hello world!\n']

if __name__ == '__main__':
    from paste import httpserver
    httpserver.serve(simple_app, host='127.0.0.1', port='8080')
  • Line 1 defines a function taking two arguments.
  • Line 3 sets the HTTP status line for the response.
  • Line 4 constructs the list of HTTP response headers.
  • Line 5 passes the status and headers to the start_response callback.
  • Line 6 returns the iterable body of the response.
  • Lines 8 through 10 run your application under the Paste webserver.
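
Because the application is just a callable, it can also be exercised without any server. The following is a minimal sketch, not taken from the PEP: fake_start_response and captured are hypothetical names used only for illustration, and the body uses a b'' literal (an ordinary str on Python 2.6) so that the snippet also runs on Python 3.

```python
def simple_app(environ, start_response):
    """Same application as above, repeated so the snippet is self-contained."""
    status = '200 OK'
    response_headers = [('Content-Type', 'text/plain')]
    start_response(status, response_headers)
    return [b'Hello world!\n']


captured = {}


def fake_start_response(status, response_headers, exc_info=None):
    """Stand-in for the server's callback: record what the app sends."""
    captured['status'] = status
    captured['headers'] = response_headers


# An empty dict is enough here because simple_app never reads environ.
body = simple_app({}, fake_start_response)
print(captured['status'])
print(body)
```

Calling the application this way is also a convenient basis for unit tests, since no socket or browser is involved.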

Exercise: Run the simplest WSGI application

First, create a virtualenv for the application:

$ cd /tmp
$ wget http://pypi.python.org/packages/source/v/virtualenv/virtualenv-1.4.8.tar.gz
$ tar xzf virtualenv-1.4.8.tar.gz
$ cd virtualenv-1.4.8
$ /opt/Python-2.6.5/bin/python virtualenv.py --no-site-packages \
   /tmp/simplest

and populate it with the egg for Paste:

$ /tmp/simplest/bin/easy_install Paste

Copy the text from the example above to a file in the virtualenv, simplest.py:

$ cd /tmp/simplest
$ vim simplest.py

Finally, run the application:

$ bin/python simplest.py

and visit the application in your browser at http://localhost:8080/

Architecture

The spec defines two interfaces, one for conforming applications, which may be any callable with the following signature:

def __call__(environ, start_response):
    """ Return an iterable.

    'environ' is the WSGI environment dictionary.

    Normally, call 'start_response' with a status and headers before
    returning.
    """

and one for conforming webservers, which must populate the environ dictionary, and provide the start_response callback.
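
Since any callable with that signature qualifies, an instance of a class that defines __call__ works just as well as a function. The sketch below assumes nothing beyond the signature shown above; Greeter and record are made-up names, and the few lines at the bottom play the server's part by hand.

```python
class Greeter(object):
    """Application object: configured once, then called once per request."""

    def __init__(self, greeting):
        self.greeting = greeting

    def __call__(self, environ, start_response):
        body = self.greeting + b'\n'
        headers = [('Content-Type', 'text/plain'),
                   ('Content-Length', str(len(body)))]
        start_response('200 OK', headers)
        return [body]


app = Greeter(b'Hello from an instance!')

# Server side, reduced to its essentials: supply environ and a callback.
seen = []


def record(status, response_headers, exc_info=None):
    seen.append((status, response_headers))


chunks = app({}, record)
```

A class is handy when the application needs configuration that should be fixed before the first request arrives.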

WSGI Environment

  • The environ dictionary passed by the server is a plain Python dictionary populated with the standard CGI keys (a CGI-style gateway typically starts from a copy of os.environ), including:

    SCRIPT_NAME

    the “base” of the URL, representing the root of the application.

    PATH_INFO

    the remainder of the URL, to be interpreted by the application.

  • The environ also contains additional, WSGI-specific keys, of which the most important are:

    wsgi.input represents the body / payload of the request.

    wsgi.errors represents a stream to which error logging may be done.

    wsgi.url_scheme is typically either “http” or “https”.
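
To make these keys concrete, here is a small sketch that builds an environ by hand and reads the values back. It uses setup_testing_defaults from the standard library's wsgiref.util to fill in the keys a real server would provide; where_am_i and the /blog mount point are hypothetical, chosen only for illustration.

```python
from wsgiref.util import setup_testing_defaults


def where_am_i(environ):
    """Return (scheme, application root, remaining path) for a request."""
    return (environ['wsgi.url_scheme'],
            environ.get('SCRIPT_NAME', ''),
            environ.get('PATH_INFO', ''))


# Pretend the application is mounted at /blog and /2010/05 was requested.
environ = {'SCRIPT_NAME': '/blog', 'PATH_INFO': '/2010/05'}
setup_testing_defaults(environ)  # fills in the other keys with test values

print(where_am_i(environ))

# wsgi.input is a file-like object; with no request body it reads as empty.
payload = environ['wsgi.input'].read()
```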

The start_response Callback

The start_response callback takes two arguments, conventionally named “status” and “headers”:

  • The first argument is a string containing the contents of the HTTP response status line, e.g. '200 OK'.
  • The second argument is a list of two-tuples representing HTTP response headers, e.g. [('Content-Type', 'text/html'), ('Content-Length', '15')]
  • The start_response callable returns another callable object, conventionally named “write”. This callable can be used to do “streaming” writes to the response body, under unusual circumstances.

Note

Use of the “write” callback is often poorly supported by servers and web frameworks – usually, you are better off structuring your application to stay in the mainstream and return an iterable.
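
Seen from the server's side, the callback might look roughly like the sketch below. This is an illustration only: make_start_response and the sent dictionary are hypothetical, and a real server would transmit the status line and headers to the client rather than store them.

```python
def make_start_response(sent):
    """Return a start_response that stores its arguments in the dict `sent`."""
    def start_response(status, response_headers, exc_info=None):
        sent['status'] = status
        sent['headers'] = list(response_headers)
        sent['written'] = []

        def write(data):
            # Legacy streaming API: a real server would push `data`
            # to the client; this sketch only records it.
            sent['written'].append(data)

        return write
    return start_response


def app(environ, start_response):
    write = start_response('200 OK', [('Content-Type', 'text/html'),
                                      ('Content-Length', '15')])
    write(b'<p>streamed</p>')  # 15 bytes, matching Content-Length
    return []                  # nothing further to iterate over


sent = {}
result = app({}, make_start_response(sent))
```

The application here uses write only to show the mechanics; returning [b'<p>streamed</p>'] instead would be the mainstream choice.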

Response Body as Iterable Chunks

As noted above, the standard way for an application to provide the response is as an iterable of strings, most often containing just one. There is no benefit to using more than one chunk unless it makes the application simpler or easier to write (e.g., returning a generator).
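
As an illustration of the generator case, the following sketch yields the body one line at a time. countdown_app is a made-up example, and the lambda stands in for the server's callback.

```python
def countdown_app(environ, start_response):
    """Yield the body line by line instead of building one big string."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    for n in (3, 2, 1):
        yield ('%d\n' % n).encode('ascii')
    yield b'Liftoff!\n'


calls = []
chunks = countdown_app(
    {}, lambda status, headers, exc_info=None: calls.append(status))

# The server iterates over the result; joining the chunks does the same here.
body = b''.join(chunks)
```

Whether the server sends each chunk separately or buffers them is up to the server; the application only promises an iterable.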

Middleware

A WSGI middleware component is one which plays both the role of the application and the role of the server: the “upstream” server calls it, passing the environ and start_response arguments, and expects it to return the iterable response body. The middleware component in turn calls the “downstream” component, perhaps mutating environ first, or replacing start_response with another callable. The middleware component may also intercept the returned iterator and transform or replace it, and may add exception handling.
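
Replacing start_response is the least obvious of these options, so here is a minimal sketch of it. The names add_header, hello, record and the X-Served-By header are all hypothetical; the point is only that the middleware hands the downstream application its own callback and forwards an amended header list upstream.

```python
def add_header(app, name, value):
    """Wrap `app` so that every response carries one extra header."""
    def middleware(environ, start_response):
        def patched_start_response(status, response_headers, exc_info=None):
            # Build a new list so the downstream app's list is left untouched.
            headers = list(response_headers) + [(name, value)]
            return start_response(status, headers, exc_info)
        return app(environ, patched_start_response)
    return middleware


def hello(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello world!\n']


seen = {}


def record(status, response_headers, exc_info=None):
    """Plays the upstream server: remember the headers it was given."""
    seen['headers'] = response_headers


wrapped = add_header(hello, 'X-Served-By', 'demo')
body = wrapped({}, record)
```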

Example Middleware

This middleware component filters the output of the “downstream” application, converting it all to lowercase:

class Caseless:

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        for chunk in self.app(environ, start_response):
            yield chunk.lower()
  • Lines 3-4: save the “downstream” application as an attribute. This middleware should be created before the application starts serving requests.
  • Line 6: the instance supports the WSGI application interface.
  • Lines 7-8: iterate over the chunks returned by the downstream application. Note that the __call__ method is a Python generator function, which is typical for this sort of “post-processing” task.
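
One consequence of using a generator is worth seeing in action: the downstream application does not run until the server starts iterating. The sketch below repeats the class so that it is self-contained; shouty and record are hypothetical helpers.

```python
class Caseless(object):

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        for chunk in self.app(environ, start_response):
            yield chunk.lower()


def shouty(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'HELLO ', b'WORLD!\n']


statuses = []


def record(status, response_headers, exc_info=None):
    statuses.append(status)


result = Caseless(shouty)({}, record)
called_before = list(statuses)  # snapshot taken before any iteration
body = b''.join(result)         # iterating drives the downstream app
```

Here called_before is still empty, because start_response is only reached once the generator is consumed.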

Exercise: Add middleware to the simplest application

In the virtualenv you created in the first exercise, add a module named caseless.py. Copy the Caseless class defined above into the module, and add the following at the end:

if __name__ == '__main__':
    from paste import httpserver
    from simplest import simple_app
    httpserver.serve(Caseless(simple_app), host='127.0.0.1', port='8080')

Note that we are creating the middleware instance as a wrapper around simple_app.

Now run the application and view the result in your browser:

$ bin/python caseless.py

Pipelines: Chaining Middleware together

Such components can be chained together to form a WSGI “pipeline.” This name is perhaps a poor choice, as the structure is really a kind of functional composition rather than a stream-processing chain like a Unix pipeline:

REQUEST ----> WSGI Server creates environ, start_response

          ----> Middleware A's ``__call__``

            ----> Middleware B's ``__call__``

              e.g., begin transaction

              ----> Middleware C's ``__call__``

                ----> Application's ``__call__``
                      calls ``start_response`` w/ status, headers
                <---- returns iterable

              <----

              e.g., commit or abort transaction

            <----

            e.g., catch exceptions, return error view

          <----

        <----
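
The nesting in the diagram can be reproduced with a small tracing sketch. The tracer helper and the labels A, B and C are invented for illustration; each layer logs when it is entered and when control returns to it.

```python
def tracer(label, log):
    """Return a middleware factory that logs entry to and exit from `app`."""
    def factory(app):
        def middleware(environ, start_response):
            log.append('enter ' + label)
            try:
                return app(environ, start_response)
            finally:
                log.append('leave ' + label)
        return middleware
    return factory


def application(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'done\n']


log = []
# Composition reads inside-out: A is outermost, C is closest to the app.
pipeline = tracer('A', log)(tracer('B', log)(tracer('C', log)(application)))
body = pipeline({}, lambda status, headers, exc_info=None: None)
print(log)
```

The log shows the layers being entered in the order A, B, C and left in the reverse order, which is the functional composition the diagram describes.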

Exercise: Chaining Middleware

First, copy the simplest.py file defined above:

$ cp simplest.py green.py

and edit the application to use a value, if set, from the WSGI environment:

def green(environ, start_response):
    """Simplest possible application object"""
    status = '200 OK'
    response_headers = [('Content-Type', 'text/plain')]
    start_response(status, response_headers)
    return ['%s\n' % environ.get('GREETING', 'Hello world!')]

Below that, add a middleware component:

def greetingSetter(app):

    def _curried(environ, start_response):
        environ['GREETING'] = 'Bonjour, le monde!'
        return app(environ, start_response)

    return _curried

Note

We don’t have to use a class as a middleware component. Here, we use a “closure” to capture the downstream app.

and wire it in, along with the Caseless middleware defined above:

if __name__ == '__main__':
    from paste import httpserver
    from caseless import Caseless
    httpserver.serve(Caseless(greetingSetter(green)),
                     host='127.0.0.1', port='8080')

Now run the application and view the result in your browser:

$ bin/python green.py