| Author: | Tres Seaver |
|---|---|
| Version: | 0.1 |
Overview
repoze.urispace implements the URISpace [1] 1.0 spec, as proposed to the W3C by Akamai. Its aim is to provide an implementation of that language as a vehicle for asserting declarative metadata about a resource based on pattern matching against its URI.
Once asserted, such metadata can be used to guide the application in serving the resource, with possible applications including:
The URISpace [1] specification provides for matching on the following portions of a URI:
scheme
host, including wildcarding (leading only) and port
user (if specified in the URI)
path elements, including nesting and wildcarding, as well as parameters, where used.
query elements, including test for presence or for specific value
fragments (likely irrelevant for server-side applications)
Note
repoze.urispace does not yet provide support for fragment matching.
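To make the list of URI portions concrete, here is a minimal sketch that splits a URI into those portions using only the Python 3 standard library. It is not part of repoze.urispace; the function name `uri_portions` and the sample URI are assumptions made for illustration.

```python
from urllib.parse import parse_qs, urlsplit


def uri_portions(uri):
    """Split a URI into the portions a URISpace selector can match on."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "user": parts.username,
        # Path elements, without the empty strings around the slashes.
        "path": [segment for segment in parts.path.split("/") if segment],
        # keep_blank_values lets a selector test for mere presence of a key.
        "query": parse_qs(parts.query, keep_blank_values=True),
        "fragment": parts.fragment,
    }


portions = uri_portions(
    "http://alice@www.example.com:8080/news/world/index.html?page=2&print#top"
)
```

Here the `print` query element has no value, so it appears with an empty string; that is what makes a presence-only test possible.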
Match statements against these URI portions are called selectors. An individual selector may be scalar, may contain multiple values separated by whitespace, or may use RDF Bags and Alternates to indicate groups of possible matches.
Note
repoze.urispace does not yet support parsing multi-option selectors, whether they are expressed using RDF or separated by whitespace.
When multiple matches occur within a single selector or within sibling selectors, the most specific match takes precedence. In cases where there are multiple matches of equal specificity, the first such match takes precedence.
Note
repoze.urispace does not yet observe these rules. Currently the final match among siblings will get precedence, regardless of specificity.
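The difference between the precedence rule in the specification and the current behaviour can be sketched as two small selection functions. This is an illustration only, not the library's code: the helper names are hypothetical, shell-style wildcard matching via `fnmatch` is an assumption, and "specificity" is approximated here as the number of literal (non-wildcard) characters in a pattern, which the specification may define differently.

```python
from fnmatch import fnmatchcase


def specificity(pattern):
    # Assumed measure: count of literal (non-wildcard) characters.
    return sum(1 for char in pattern if char not in "*?")


def pick_most_specific(siblings, segment):
    """Spec-style rule: most specific match wins; ties go to the first match."""
    matches = [(p, v) for p, v in siblings if fnmatchcase(segment, p)]
    if not matches:
        return None
    # max() returns the first of several equally large items.
    return max(matches, key=lambda match: specificity(match[0]))[1]


def pick_last(siblings, segment):
    """Current behaviour: the final matching sibling wins."""
    matches = [v for p, v in siblings if fnmatchcase(segment, p)]
    return matches[-1] if matches else None


# Hypothetical sibling selectors, written with the exact match first.
siblings = [("index.xhtml", "index.xml"), ("*.xhtml", "story.xml")]
by_spec = pick_most_specific(siblings, "index.xhtml")
by_order = pick_last(siblings, "index.xhtml")
```

With the exact match listed first, the two policies disagree, so until specificity is honoured the more specific selector should be written after the more general one.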
The asserted metadata can be scalar or can use RDF Bags and Sequences to indicate sets or ordered collections.
Note
repoze.urispace does not yet provide support for parsing multi-valued assertions using RDF.
Operators are provided to allow for incrementally updating or clearing the value for a given metadata element. Specified operators include:
Suppose we want to select different Deliverance themes and/or rulesets based on the URI of the resource being themed. In particular:
A URISpace file specifying these policies would look like:
<?xml version="1.0" ?>
<themeselect
   xmlns:uri='http://www.w3.org/2000/urispace'
   xmlns:urix='http://repoze.org/repoze.urispace/extensions'
   xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
   >

  <!-- default theme and rules -->
  <theme>http://themes.example.com/default.html</theme>
  <rules>http://static.example.com/rules/default.xml</rules>

  <uri:path uri:match="news">
    <theme>http://themes.example.com/news.html</theme>
    <uri:path uri:match="world">
      <theme>http://themes.example.com/news.html?style=world</theme>
    </uri:path>
    <uri:path uri:match="national">
      <theme>http://themes.example.com/news.html?style=national</theme>
    </uri:path>
    <uri:path uri:match="local">
      <theme>http://themes.example.com/news.html?style=local</theme>
    </uri:path>
  </uri:path>

  <uri:path uri:match="lifestyle">
    <theme>http://themes.example.com/lifestyle.html</theme>
  </uri:path>

  <uri:path uri:match="sports">
    <theme>http://themes.example.com/sports.html</theme>
  </uri:path>

  <!-- Note that the following rules match "across" sections -->
  <urix:pathlast uri:match="*.xhtml">
    <rules>http://static.example.com/rules/story.xml</rules>
  </urix:pathlast>

  <urix:pathlast uri:match="index.xhtml">
    <rules>http://static.example.com/rules/index.xml</rules>
  </urix:pathlast>

  <!-- Note that the following rules fail to match "across" sections -->
  <uri:path uri:match="*.html">
    <rules>http://static.example.com/rules/story.xml</rules>
  </uri:path>

  <uri:path uri:match="index.html">
    <rules>http://static.example.com/rules/index.xml</rules>
  </uri:path>

</themeselect>
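To show why the uri:path selectors nest while the urix:pathlast selectors match "across" sections, here is a simplified, self-contained sketch of a resolver for a cut-down version of the file above. It is not how repoze.urispace is implemented: the names `SAMPLE`, `resolve` and `lookup` are hypothetical, wildcards are handled with `fnmatch` as an assumption, every assertion simply replaces the previous value, and later siblings override earlier ones.

```python
import xml.etree.ElementTree as ET
from fnmatch import fnmatchcase

URI_NS = "http://www.w3.org/2000/urispace"
URIX_NS = "http://repoze.org/repoze.urispace/extensions"
MATCH = "{%s}match" % URI_NS
PATH = "{%s}path" % URI_NS
PATHLAST = "{%s}pathlast" % URIX_NS

# Cut-down version of the theme selection file, for illustration only.
SAMPLE = """\
<themeselect
   xmlns:uri='http://www.w3.org/2000/urispace'
   xmlns:urix='http://repoze.org/repoze.urispace/extensions'>
  <theme>http://themes.example.com/default.html</theme>
  <rules>http://static.example.com/rules/default.xml</rules>
  <uri:path uri:match="news">
    <theme>http://themes.example.com/news.html</theme>
    <uri:path uri:match="world">
      <theme>http://themes.example.com/news.html?style=world</theme>
    </uri:path>
  </uri:path>
  <urix:pathlast uri:match="*.xhtml">
    <rules>http://static.example.com/rules/story.xml</rules>
  </urix:pathlast>
  <urix:pathlast uri:match="index.xhtml">
    <rules>http://static.example.com/rules/index.xml</rules>
  </urix:pathlast>
</themeselect>
"""


def resolve(node, segments, found):
    """Visit children in document order; later assertions replace earlier ones."""
    for child in node:
        if child.tag == PATH:
            # A path selector consumes the next path element, so selectors
            # nested inside it only see the rest of the path.
            if segments and fnmatchcase(segments[0], child.get(MATCH)):
                resolve(child, segments[1:], found)
        elif child.tag == PATHLAST:
            # pathlast looks only at the final path element, at any depth.
            if segments and fnmatchcase(segments[-1], child.get(MATCH)):
                resolve(child, [], found)
        else:
            # Any other element is treated as a scalar assertion.
            found[child.tag] = (child.text or "").strip()
    return found


def lookup(path):
    segments = [segment for segment in path.split("/") if segment]
    return resolve(ET.fromstring(SAMPLE), segments, {})
```

For example, `lookup("/news/world/index.xhtml")` picks up the world theme from the nested path selectors and the index rules from the second pathlast selector, because that selector comes last among the matching siblings.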
Given that URISpace file, one can test how given URIs match using the uri_test script:
$ /path/to/bin/uri_test examples/dv_news.xml \
    http://example.com/ \
    http://example.com/foo \
    http://example.com/news/ \
    http://example.com/news/index.html \
    http://example.com/news/index.xhtml \
    http://example.com/news/world/index.html \
    http://example.com/news/world/index.xhtml \
    http://example.com/sports/ \
    http://example.com/sports/world_series_2008.html \
    http://example.com/sports/world_series_2008.xhtml
------------------------------------------------------------------------------
URI: http://example.com/
------------------------------------------------------------------------------
rules = http://static.example.com/rules/default.xml
theme = http://themes.example.com/default.html
------------------------------------------------------------------------------
URI: http://example.com/foo
------------------------------------------------------------------------------
rules = http://static.example.com/rules/default.xml
theme = http://themes.example.com/default.html
------------------------------------------------------------------------------
URI: http://example.com/news/
------------------------------------------------------------------------------
rules = http://static.example.com/rules/default.xml
theme = http://themes.example.com/news.html
------------------------------------------------------------------------------
URI: http://example.com/news/index.html
------------------------------------------------------------------------------
rules = http://static.example.com/rules/default.xml
theme = http://themes.example.com/news.html
------------------------------------------------------------------------------
URI: http://example.com/news/index.xhtml
------------------------------------------------------------------------------
rules = http://static.example.com/rules/index.xml
theme = http://themes.example.com/news.html
------------------------------------------------------------------------------
URI: http://example.com/news/world/index.html
------------------------------------------------------------------------------
rules = http://static.example.com/rules/default.xml
theme = http://themes.example.com/news.html?style=world
------------------------------------------------------------------------------
URI: http://example.com/news/world/index.xhtml
------------------------------------------------------------------------------
rules = http://static.example.com/rules/index.xml
theme = http://themes.example.com/news.html?style=world
------------------------------------------------------------------------------
URI: http://example.com/sports/
------------------------------------------------------------------------------
rules = http://static.example.com/rules/default.xml
theme = http://themes.example.com/sports.html
------------------------------------------------------------------------------
URI: http://example.com/sports/world_series_2008.html
------------------------------------------------------------------------------
rules = http://static.example.com/rules/default.xml
theme = http://themes.example.com/sports.html
------------------------------------------------------------------------------
URI: http://example.com/sports/world_series_2008.xhtml
------------------------------------------------------------------------------
rules = http://static.example.com/rules/story.xml
theme = http://themes.example.com/sports.html
Once parsing is complete, the URISpace is available as a tree-like object. The canonical steps to extract metadata for a given URI are:
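try:
    from urllib.parse import parse_qs, urlsplit  # Python 3
except ImportError:
    from urlparse import parse_qs, urlsplit  # Python 2.6+

# `uri` is the URI string to look up; `urispace` is the parsed URISpace tree.
scheme, nethost, path, query, fragment = urlsplit(uri)

# Split the path into elements, dropping the empty one before the leading '/'.
path = path.split('/')
if len(path) > 1 and path[0] == '':
    path = path[1:]

info = {'scheme': scheme,
        'nethost': nethost,
        'path': path,
        'query': parse_qs(query, keep_blank_values=1),
        'fragment': fragment,
       }

# Collect the operators whose selectors match, then apply them in order.
operators = urispace.collect(info)
assertions = {}
for operator in operators:
    operator.apply(assertions)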
At this point, assertions will contain keys and values for all operators found while matching against the URI.
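The collect-then-apply pattern can be illustrated without the library by using stand-in operator objects. The sketch below assumes only what the snippet above shows, namely that each operator exposes an `apply(assertions)` method; the classes `ReplaceOperator` and `ClearOperator` and the helper `apply_all` are hypothetical and are not the names repoze.urispace uses.

```python
class ReplaceOperator:
    """Hypothetical stand-in: sets (or overwrites) one metadata element."""

    def __init__(self, name, value):
        self.name = name
        self.value = value

    def apply(self, assertions):
        assertions[self.name] = self.value


class ClearOperator:
    """Hypothetical stand-in: removes one metadata element if present."""

    def __init__(self, name):
        self.name = name

    def apply(self, assertions):
        assertions.pop(self.name, None)


def apply_all(operators):
    # Operators are applied in the order they were collected, so each one
    # sees the result of all the operators before it.
    assertions = {}
    for operator in operators:
        operator.apply(assertions)
    return assertions


collected = [
    ReplaceOperator("theme", "http://themes.example.com/default.html"),
    ReplaceOperator("rules", "http://static.example.com/rules/default.xml"),
    ReplaceOperator("theme", "http://themes.example.com/news.html"),
    ClearOperator("rules"),
]
assertions = apply_all(collected)
```

Because each operator mutates the same dictionary, the order in which operators are collected determines the final value of every key.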
One application of a URISpace might be to make assertions about the URI of a WSGI request, in order to allow other parts of the application to use those assertions. repoze.urispace provides a component which can be used as middleware for this purpose.
To configure the middleware in a PasteDeploy config file:
[filter:urispace]
use = egg:repoze.urispace#urispace
file = %(here)s/urispace.xml
You should then be able to add the middleware to your pipeline:
[pipeline:main]
pipeline =
    urispace
    your_app
In your application, you can get to the assertions made by the middleware using the repoze.urispace.middleware.getAssertions() API, e.g.:
from repoze.urispace.middleware import getAssertions
def your_app(environ, start_response):
    assertions = getAssertions(environ)
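A fuller sketch of an application that consumes the assertions might look like the following. It assumes that the assertions behave like a dictionary, as in the parsing example earlier, and it takes the lookup function as a parameter so that the code runs without the middleware installed; `make_app` and `fake_assertions` are hypothetical names. In a real pipeline one would pass `repoze.urispace.middleware.getAssertions` instead of the fake.

```python
def make_app(get_assertions):
    """Build a WSGI app that reports the theme asserted for the request."""

    def app(environ, start_response):
        assertions = get_assertions(environ)
        theme = assertions.get("theme", "(no theme asserted)")
        body = ("theme = %s\n" % theme).encode("utf-8")
        start_response(
            "200 OK",
            [
                ("Content-Type", "text/plain; charset=utf-8"),
                ("Content-Length", str(len(body))),
            ],
        )
        return [body]

    return app


def fake_assertions(environ):
    # Stand-in for the middleware lookup, used only to exercise the app.
    return {"theme": "http://themes.example.com/news.html"}


# Call the app by hand with a minimal environ and a recording start_response.
calls = []
app = make_app(fake_assertions)
response = app({"PATH_INFO": "/news/"}, lambda status, headers: calls.append(status))
```

Passing the lookup function in keeps the application testable in isolation from the middleware.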
| [1] | http://www.w3.org/TR/urispace.html |
| [2] | http://www.ietf.org/rfc/rfc2396.txt |