A design for a Scheme web application framework (by alaric)
While sitting on a train yesterday, I typed up some thoughts...
THE ULTIMATE WEB APPLICATION PLATFORM
As I've said before, I think we need better web application frameworks; I'm a fan of modular software; and I think the framework should support caching better to keep things scalable.
Also, I like lisp. So when I heard that the maintainer of Spiffy was looking to extend it with app framework features, what could I do but come up with a proposal...
Request dispatch
The core of any WAF is the request dispatch system. A request comes in; how do we handle it?
I propose that the primary dispatch method be via the filesystem. URLs have hierarchical structure with named links, and the FS is the standard way of representing that. A hierarchy stored in a file system is amenable to manipulation with a wide variety of tools.
But the ability to map parts of the URL namespace to 'virtual' hierarchies, from databases or other protocols or whatever, is also useful.
Therefore, the algorithm I propose is:
1. Strip off any trailing query parameters in the URL. Store them in CP.
2. Split the URL into components, using the / character. Store them on a stack, R, with the first component at the top.
3. Set the current search directory path, SD, to the application root directory.
4. If the stack is empty, we have been routed to a directory; the results of the algorithm are SD, NULL, NULL, CP, and an empty stack. Terminate. Otherwise, pop a component off the stack and call it CF.
5. Check that CF is not "." or ".."; if it is, reject the request with a 400. Terminate.
6. Is CF the name of a directory in SD? If so, append CF to SD, and go to step 4.
7. Is CF the name of a file in SD? If so, we have found our handler. The final values SD, CF, CF, CP, and R are the results of the algorithm. Terminate.
8. Does CF specify a file extension (e.g., does it contain a . character)? If so, then it names a file that does not exist, so reject the request with a 404. Terminate.
9. Search SD for files with names of the form "CF.*".
10. If there are no matches, reject the request with a 404 and terminate.
11. If there is exactly one match, we have found our handler. The results of the algorithm are SD, CF, the name of the matching file, CP, and R. Terminate.
12. If there is more than one match, reject the request with 300 ('Multiple Choices'), listing the matches. Terminate.
At the end of this algorithm, we will have five results:
- The directory in which the handler is to be found
- The name of the file as requested, possibly lacking an extension, or NULL if the directory itself is the target.
- The name of the handler file within that directory, or NULL if the directory itself is the target.
- Any parameters to pass to that handler.
- Any remaining path components to pass to that handler.
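To make the algorithm concrete, here's a minimal sketch in Scheme. Everything in it is hypothetical: query-params, split-url-path, path-append, directory-exists?, file-exists?, has-extension?, files-matching, and reject are assumed helpers, and NULL is rendered as #f.

(define (dispatch url app-root)
  (define cp (query-params url))                    ; step 1
  (let loop ((sd app-root)                          ; step 3
             (r (split-url-path url)))              ; step 2
    (cond
      ((null? r)                                    ; step 4: routed to a directory
       (values sd #f #f cp '()))
      ((member (car r) '("." ".."))                 ; step 5
       (reject 400))
      ((directory-exists? (path-append sd (car r))) ; step 6: descend
       (loop (path-append sd (car r)) (cdr r)))
      ((file-exists? (path-append sd (car r)))      ; step 7: exact filename
       (values sd (car r) (car r) cp (cdr r)))
      ((has-extension? (car r))                     ; step 8: named file missing
       (reject 404))
      (else                                         ; steps 9-12: search "CF.*"
       (let ((matches (files-matching sd (car r))))
         (cond
           ((null? matches) (reject 404))           ; step 10
           ((null? (cdr matches))                   ; step 11
            (values sd (car r) (car matches) cp (cdr r)))
           (else (reject 300))))))))                ; step 12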
Note one interesting property: If we have a file called "foo.jpeg" we can reference it as "foo.jpeg" or as just "foo", unless there is another "foo.*" file in the same directory. This allows us to keep file extensions such as ".php", ".aspx", etc. - which expose our internals - hidden away.
It makes it easy to have URLs like "/user/1" handled by "/user.php", with "1" passed as a parameter.
It's easy to optimise the above algorithm: whenever one of the cases tested in steps 6-12 succeeds, store the outcome in a weak hash table, indexed on SD and CF. Then add a step 4a that looks up SD and CF in the table; on a hit, jump straight to the stored case rather than re-running the tests. This will even cache 'negative hits' that result in 404s or 400s.
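As a sketch of that cache, using a plain SRFI-69 hash table for illustration (the weak table suggested above would additionally let stale entries be collected); lookup-component stands for the tests of steps 6-12:

(define lookup-cache (make-hash-table))

(define (cached-lookup sd cf)
  (hash-table-ref lookup-cache (cons sd cf)
    (lambda ()                                  ; cache miss: run the tests
      (let ((outcome (lookup-component sd cf)))
        (hash-table-set! lookup-cache (cons sd cf) outcome)
        outcome))))                             ; negative hits are cached too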
However, as it stands, we're not doing very much more than Apache does. Things become a little more interesting, however...
Handling a request
Once we've dispatched our request to a file, we need to decide how to handle that file. The default behaviour for a file is to send it to the client verbatim, deriving a MIME type from the filename. If the filename is NULL, we've been dispatched to a directory; we'd probably best handle that by trying to find an "index.*" file in that directory and then handling that as if it had been the target of the request. Generating a directory listing probably isn't desirable behaviour for an actual application server, so if there's no index file, we'd probably best return a 404.
Either way, the core of the request handler will be a lookup table mapping file extensions to handlers. If the requested file extension isn't in the table, then the file is returned verbatim. I recommend the extension of a file called "a.b.c" be considered "b.c" rather than just "c", but there are various pros and cons to each option.
The handlers are just procedures, which are passed the results of the dispatch algorithm along with the details of the HTTP request (the original URL and all the headers) and access to an HTTP protocol implementation that can be used to send a response. Alternatively, instead of sending an HTTP response, a handler might return a new relative URL, causing a tail-call back to the request dispatcher for an 'internal server-side redirect'.
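As a sketch of that core step (handler-table, file-extension, and send-file-verbatim are assumed names, not part of any real library; env is the execution environment introduced below):

(define (handle-file env request sd name file cp rest)
  (let* ((ext (file-extension file))            ; e.g. "b.c" for "a.b.c"
         (entry (assoc ext handler-table)))
    (if entry
        ((cdr entry) env request sd file cp rest)  ; invoke the handler
        (send-file-verbatim request sd file))))    ; no handler: send as-is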
But most interestingly, the procedure is also passed an execution environment hash.
Execution environments
Now it starts getting interesting.
For now, the execution environment will be global, with the same environment passed to all handlers; but we explicitly pass it in to allow future expansion into context-dependent environments.
The app framework generates the execution environment at startup (or restart) by starting with an empty hash, then loading and executing every installed plugin. A plugin is just a Scheme script. It should call register-plugin, passing in a plugin name, an alist of environment mappings, and perhaps other metadata in future.
The framework might want to keep a list of all the plugins it has initialised, along with a copy of the environment entries each one returned, so that it can rebuild the environment without reloading everything, perhaps to unload a specific plugin. This is also useful for introspection: the server can be asked (by a debugger, for example) for a list of installed plugins, and then look up what each plugin has provided to the execution environment.
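A minimal sketch of register-plugin itself; the hash-based environment and the *plugins* bookkeeping list are assumptions for illustration, not a settled design:

(define *execution-environment* (make-hash-table))  ; name -> value
(define *plugins* '())       ; (name . bindings) pairs, for introspection

(define (register-plugin name bindings)
  (set! *plugins* (cons (cons name bindings) *plugins*))
  (for-each
    (lambda (binding)
      (hash-table-set! *execution-environment*
                       (car binding)
                       (cdr binding)))
    bindings))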
If the same execution environment name is bound by more than one plugin, we have two choices. Either we require the installer to specify the plugins in order, and use that order to resolve disputes; or we consider it an error and refuse to start. The latter has the advantage that plugins can then be installed by just dropping them into a special directory.
Plugins, of course, may also hook themselves into other globals than the execution environment. A loaded plugin might well define some globals in the Scheme top-level environment for handlers to use.
The list of file type handlers used in the previous section actually lives in the execution environment. A plugin that wants to handle "shtml" files, for example, will return an execution environment binding for file-extension:shtml, mapping to a handler procedure, which will then be called for any file whose name ends in .shtml.
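From the plugin's side, that might look like the following (handle-shtml here is just a stub; send-file-verbatim is the same assumed helper as above):

;; the handler procedure: a real one would expand the template
(define (handle-shtml env request sd file cp rest)
  (send-file-verbatim request sd file))   ; stub body, for completeness

(register-plugin 'shtml-templates
  `((file-extension:shtml . ,handle-shtml)))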
Another use of the execution environment by the core system might be default error pages. If the dispatch process, or a handler, wants to send a 404, it might use a core procedure to do so, passing in the error code and an optional detail message (if no detail message is supplied, a standard one based on the error code can be used: 'Not Found' for 404, etc). If there is an entry in the execution environment named error-handler:404, it should be retrieved and applied to the request details, the error code, and the detail message. If not, a default handler can be used that spits out a standard HTML page.
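A sketch of that core procedure, with env-ref, default-reason, and send-default-error-page as assumed helpers (detail may be #f):

(define (send-error env request code detail)
  (let ((handler (env-ref env (error-handler-key code)))
        (message (or detail (default-reason code))))  ; "Not Found" for 404
    (if handler
        (handler request code message)                ; plugin-supplied page
        (send-default-error-page request code message))))

(define (error-handler-key code)        ; e.g. 404 -> error-handler:404
  (string->symbol
    (string-append "error-handler:" (number->string code))))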
Handlers
It gets really fun with the handlers.
For a start, obviously, we can have handlers that provide a CGI protocol (bound to .cgi extensions, perhaps), executing the file as a subprocess. Or ones like PHP or ASP or JSP or PSP or whatever. All the things we do with Apache modules.
And we might have a handler that does redirects. Put a .redirect file in the filesystem, containing (in plain text) a list of regular expressions mapping to new URLs, which can of course backreference any groups in the regexps. This gives us functionality like Apache's mod_rewrite, but more elegantly handled within the filesystem rather than in a .htaccess file. If the request doesn't match any of the regexps, the handler can reply with a 404; otherwise, it can either issue an HTTP redirect or return an internal redirect to the dispatcher, depending on the flags specified.
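The exact file syntax is undecided, but a hypothetical .redirect file might read (pattern, destination, flag):

^old-catalogue/(.*)$   /products/$1   http       # full HTTP redirect
^item-([0-9]+)$        /product/$1    internal   # re-dispatch server-side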
But this is all copying what has gone before. None of these make interesting use of the execution environment and other plugins.
A decoupled application framework
WAFs like Ruby on Rails, Zope, Django, and so on all work as monolithic systems. They combine object-relational mapping, templating, request dispatching, and so on.
This sucks. Why can't we combine different models? Why can't we use a shared storage model across our site (or different storage models for different parts of our data store; perhaps take user accounts from LDAP, read-only data from a master-slave replicated database, and read-write data from a partitioned cluster of database servers), and then write different parts of the site in different ways?
Some very linear page flows (checkout processes, wizards, complex editors) benefit from a continuation-based system. But simpler forms and pages that idempotently view stuff don't. Yet most sites contain mixtures of all of the above.
So I propose creating a suite of handlers for content generation, which use resources made available by other plugins.
Templating
There are a number of templating languages out there, each with different pros and cons.
So let's support more than one.
Now, in many WAFs, a request is routed to a 'controller' (to borrow Rails terminology); a block of code written by a programmer, which fetches stuff from the database and sets up an environment in which the template is then rendered. Basically, it creates an environment of mappings from names to values.
I don't like this model, for reasons I have described; I think the HTML designers should be free to create new pages, calling upon a library of data getters provided for them by the programmers. It lets them do things like merge two pages, or split a page into two, or rearrange the site, without needing to bother the coders.
So instead, I'd suggest that a standard be defined for template extensions. Let's examine two different types of template language that both use this model, with some examples.
<% include "header.thtml" %>
<% load cart = ShoppingCart %>
<h1>Shopping Cart</h1>
<ul>
<% foreach item in cart.items %>
<li><%= item.title %> (<%= item.price | price %>)</li>
<% end foreach %>
</ul>
<p>Total: <%= cart.total | price %></p>
<% include "footer.thtml" %>
This is a fairly familiar approach, inspired by Rails and Django. Features to note are:
- HTML with embedded control commands
- The control commands are from a limited vocabulary, like Django; no arbitrary code like Rails or PHP. Keep that presentation logic isolated!
- We provide mechanisms to include other templates, for common page components. These includable templates should be outside of the document root; the templating engine should use a configuration parameter to select a template directory.
- There are iteration control structures, like foreach.
- There are formatting 'filters', like in Django. The <%= ... %> syntax outputs the result of some expression, while <% ... %> outputs nothing in itself; and <%= foo | bar %> computes foo, then passes it through the filter bar to generate output HTML. If no filter is specified, a default filter that escapes HTML entities is used; if you want to output raw HTML, you have to explicitly invoke a null filter called raw. Unsurprisingly, the filters are procedures in the execution environment, with names such as template-filter:price.
- We obtained the contents of the user's cart with the load statement, which loads some data. Behind the scenes, it looks in the execution environment for a binding called template-data-source:ShoppingCart, invokes the resulting procedure, and binds the result to cart within the templating environment. Once the HTML designers have been told about ShoppingCart and what properties the returned objects provide, they can include current cart information in any page they like, without needing to bother the programmers. A win for everyone. (A sketch of such a plugin follows this list.)
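As a sketch of the plugin side, here's how the price filter and the ShoppingCart data source might be registered (the pence-based price representation, session-ref, and current-session are all invented for illustration):

(register-plugin 'shop
  `((template-filter:price                  ; 1234 -> "£12.34"
     . ,(lambda (pence)
          (let ((p (remainder pence 100)))
            (string-append
              "£" (number->string (quotient pence 100)) "."
              (if (< p 10) "0" "")
              (number->string p)))))
    (template-data-source:ShoppingCart      ; no arguments: use the session
     . ,(lambda ()
          (session-ref (current-session) 'cart)))))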
Let's look at another templating language:
(html
(* include "header.sxml")
(* load cart ("ShoppingCart"))
(h1 "Shopping Cart")
(ul (* foreach item (items cart)
(li
(= (title item))
" ("
(= (price item) price)
")")))
(p "Total: " (= (total cart) price))
(* include "footer.sxml"))
This one follows the same approach as the previous, but is based on SXML syntax instead of plain text.
The shopping cart is a simple example of a data source plugin, of course. If we were writing a photo viewer application, we would have a data source that obtains data for a specified photograph, and we'd need to pass it an argument. The load statements should make it possible to pass arguments into data sources, either from request parameters, request URL components, or any other value expression that could be used in <%= ... %> or (= ... ).
I think the arguments to a page should be declared at the top of the page in one block, something like:
(* positional-params
(date (date))
(account (id "GetAccount"))
(path (rest)))
(* query-params
(search-terms "st" (string)))
That declares the positional parameters in the URL. It means that after the name of this template, the next URL component is a date which gets bound to the name date, and the next URL component after that is an object ID; we should call the data source called "GetAccount" with the ID as the sole parameter, and if it returns nil, reject the request with a 404, otherwise bind account to the result. If any more path components are supplied, then they go (as a list) into path.
If the URL stops short, then any unmatched parameters are bound to nil.
It then declares a query parameter which is called st in the URL, of type string, which is bound to search-terms in the code.
Declaring the parameters up front like this makes it easy to see what the parameters to a page are, allows up-front type checking to prevent nastiness, allows for elegant and clean handling of invalid IDs (an automatic 404), allows us to use nice internal names like search-terms from a concise query parameter like st, lets us rearrange where parameters come from (URL components or query string) without needing to change anything else in the template, and so on.
Here's a more unusual example:
(load user ("UserDetails"))
(base-image "splash.png")
(text 80 45 ("Times New Roman" bold 12)
(format "${name}'s Special Offers"
`((name ,(name user)))))
This templating language generates images. In this case, we're loading a base image (looked up in the same template resources directory any other template engine looks for includes and the like), then overlaying some text on it at a specified position in a specified font. By default, the result will be a PNG image with the size of the base image, but the handler will automatically look for a URL parameter called thumbnail, containing a width and height, and scale the image to fit within that width and height.
The fun part is, you can combine these different templating languages in the same site. We could also have template languages that generate PDFs or Flash.
But it's not just templates that we'd want handlers for. We could have a handler that makes it easy to generate SOAP or XML-RPC services. Perhaps a .soap file might contain Scheme code declaring an interface, with the handler code embedded. The handler would examine the request and either return WSDL generated from the interface spec, or handle an actual RPC request by invoking the appropriate code. Obviously, only programmers would write .soap files, not HTML designers!
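For instance, a hypothetical .soap file, using an invented define-soap-service form (cart-total and find-cart standing in for application code):

(define-soap-service "CartService"
  (operation "GetCartTotal"
    (input  (cart-id integer))       ; named, typed parameters for the WSDL
    (output (total integer))
    (handler
      (lambda (cart-id)
        (cart-total (find-cart cart-id))))))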
And we could have a handler for stateful page flows. The content would be Scheme code, but with access to a (send-form ...) procedure that renders a nominated template file in the supplied environment (just like views are rendered in Rails), then stores the continuation and terminates. Within the template, URLs that advance the page flow are generated by using a special template extension, provided in the execution environment just for that template, which returns a URL pointing back to the original URL but with a continuation ID and any other parameters specified in the template. If that URL is invoked, the continuation is resurrected and passed the given parameters, which to the user of the (send-form ...) procedure appears to be just a return from the procedure, with a value indicating which link the user clicked, or, if they submitted a form, which form and the form field values.
So it'll be easy to mix continuational page flows with other bits of the site that work totally differently.
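A sketch of how such a page-flow script might read, assuming send-form returns the submitted form's field values and place-order is application code:

(define (checkout-flow request)
  ;; each send-form renders a template, saves the continuation, and
  ;; resumes here with the user's input when the flow URL is invoked
  (let* ((address (send-form "ask-address.thtml" '()))
         (card    (send-form "ask-card.thtml"
                             `((address . ,address)))))
    (place-order address card)
    (send-form "order-confirmed.thtml" '())))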
Get-out clause
And, of course, we should have a file type handler that just loads the file as Scheme code, and runs it, with the request parameters and HTTP library and execution environment and global Scheme bindings and so on all available to it. For when one needs to do something special, or handle a form post.
Handling POST
Personally, unless they're handled by a continuation manager, I think POST requests should be dispatched directly to Scheme code that handles them, which should then issue a redirect (potentially an internal one, but ideally a full HTTP redirect, so the result is bookmarkable) to a templated HTML page.
But the community can write whatever handlers they like, that handle POST however they wish.
Application structure
You would normally configure your application framework server to load any third-party plugins you wish, then load one or more plugins that contain your application logic, then ask your HTML expert to write templates that use your application logic. And, where necessary, you write things like SOAP handlers and place them in with the templates.
Next Steps
There are a few things I still need to add. For a start, it'd be good if there were a way for plugins to perform global operations on requests. For example, a session manager plugin that gets a peek at every request to check for cookies, issues cookies into the response, and adds things to the per-request execution environment based on the 'session state'.
HTTP authentication could be handled with a similar mechanism, but that's sadly out of fashion these days.
It would also be neat to enable template plugins to ask the data sources for last-modified timestamps on objects. This would enable automatic generation of Last-Modified headers, and the subsequent ability for template handlers to handle conditional HTTP requests, thus making caching more effective. The effective last-modified date of a page would be the latest last-modified date of any loaded data source, the source of the page itself, and any templates it uses.
This should probably be abstracted out; a file type handler should perhaps be extended from a simple procedure to a more complex object with procedures to compute last-modified dates for the entire page as well as to generate the content. Perhaps cache friendliness should be integral to the interface between core and file type handlers.
That way, the core could itself cache responses in memory, avoiding the cost of invoking a handler for a given URL. It could have the option of a local cache, no cache (for development), or using memcached for distributed caching.
It would also be nice if file type handlers exposed a high-level URL generation interface, accessed via a core procedure that accepts the path to a file (relative to the document root), an optional set of named parameters, and an optional 'rest' list of path components to add. The system would apply an appropriate URL prefix, then find the file type handler for that file and ask it to generate the rest of the URL. It could peek into the file, if it's a template, and extract the parameter definitions, thus mapping from the nice programmer-friendly names in the supplied parameter alist to terse query parameter names and/or positional parameters. Other file type handlers that aren't really template processors can apply whatever logic seems relevant.
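Such a core procedure might be sketched as follows; env-ref, url-prefix, file-extension, and default-url are assumed helpers, and the file-extension-url-generation: naming anticipates the convention discussed below:

(define (generate-url env file-path params rest)
  (let* ((ext (file-extension file-path))
         (gen (env-ref env
                (string->symbol
                  (string-append "file-extension-url-generation:" ext)))))
    (string-append
      (url-prefix)
      (if gen
          (gen file-path params rest)          ; handler-specific URL layout
          (default-url file-path params rest)))))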
Extending plugin interfaces
The previous section has raised the general issue of extending plugin interfaces. I began by describing data sources as simple procedures taking a set of parameters and returning an object, then started getting all crazy about adding caching metadata and the like. And file type handlers went from simple procedures to also having interfaces for getting last-modified times to feed a generic caching infrastructure, and interfaces for properly generating URLs.
So should I go back and change the original spec to say that the objects bound in the execution environment aren't simple procedures, but actually CLOS objects, or alists mapping interface names to procedures, or procedures that accept an operation parameter and return a procedure that does what you want, or...?
Well, I think that caching is key enough that the data source interface convention should be extended to return a list consisting of the data object and a last-modified timestamp, to support caching. But for extensions that add new features rather than just returning extra information, such as extending the file type handlers to support URL generation and computing an overall last-modified timestamp without rendering the result, it would set a good precedent for backwards-compatible future expansion to just add extra environment entries for the extended functionality.
Eg, a file type handler plugin might register file-extension:ssp, binding it to a procedure that renders .ssp files to an HTTP response (presumably in HTML). But if it supports clever URL generation, it can also register file-extension-url-generation:ssp, providing a suitable procedure to perform the mapping. And file-extension-modtime-computation:ssp, providing a procedure that works out the overall last-modification timestamp for .ssp files.
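In code, such a plugin's registration might read (render-ssp, ssp-generate-url, and ssp-last-modified being the plugin's own procedures):

(register-plugin 'ssp
  `((file-extension:ssp                     . ,render-ssp)
    (file-extension-url-generation:ssp      . ,ssp-generate-url)
    (file-extension-modtime-computation:ssp . ,ssp-last-modified)))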