Saturday, September 19, 2009

Twisted Web in 60 seconds: static URL dispatch


Welcome to the third installment of "Twisted Web in 60 seconds". The goal of this installment is to show you how to serve different content at different URLs using APIs from Twisted Web (the first and second installments covered ways in which you might want to generate this content).




Key to understanding how different URLs are handled with the resource APIs in Twisted Web is understanding that any URL can be used to address a node in a tree. Resources in Twisted Web exist in such a tree, and a request for a URL will be responded to by the resource which that URL addresses. The addressing scheme considers only the path segments of the URL. Starting with the root resource (the one used to construct the Site) and the first path segment, a child resource is looked up. As long as there are more path segments, this process is repeated using the result of the previous lookup and the next path segment. For example, to handle a request for /foo/bar, first the root's "foo" child is retrieved, then that resource's "bar" child is retrieved, then that resource is used to create the response.
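To make the lookup concrete, here is a minimal sketch of the segment-by-segment walk described above. It is a toy model in plain Python, not Twisted's actual implementation: the names Node, put_child, get_child, and locate are made up for illustration, and it skips details such as the empty path segment that a real request for / carries.

```python
# Toy model of URL-to-resource lookup. NOT Twisted's real code;
# Node, put_child, get_child, and locate are hypothetical names.
class Node:
    def __init__(self, name):
        self.name = name
        self.children = {}

    def put_child(self, segment, child):
        # Attach a child node under a single path segment.
        self.children[segment] = child

    def get_child(self, segment):
        # Look up the child addressed by a single path segment.
        return self.children[segment]


def locate(root, path):
    """Walk the tree one path segment at a time, starting at the root."""
    node = root
    for segment in path.strip("/").split("/"):
        if segment:  # simplification: ignore empty segments
            node = node.get_child(segment)
    return node


root = Node("root")
foo = Node("foo")
bar = Node("bar")
root.put_child("foo", foo)
foo.put_child("bar", bar)

# /foo/bar: first the root's "foo" child, then that node's "bar" child.
found = locate(root, "/foo/bar")
print(found.name)
```

The node returned by locate plays the role of the resource that would be asked to produce the response.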




With that out of the way, let's consider an example that can serve a few different resources at a few different URLs.




First things first: we need to import Site, the factory for HTTP servers; Resource, a convenient base class for custom pages; and reactor, the object which implements the Twisted main loop. We'll also import File to use as the resource at each of the example URLs.

  from twisted.web.server import Site
  from twisted.web.resource import Resource
  from twisted.internet import reactor
  from twisted.web.static import File




Now we create a resource which will correspond to the root of the URL hierarchy: all URLs are children of this resource.

  root = Resource()




Here comes the interesting part of this example. I'm now going to create three more resources and attach them to the three URLs /foo, /bar, and /baz:

  # Path segments are bytes: required on Python 3, and also valid on Python 2.6+.
  root.putChild(b"foo", File("/tmp"))
  root.putChild(b"bar", File("/lost+found"))
  root.putChild(b"baz", File("/opt"))




Last, all that's required is to create a Site with the root resource, associate it with a listening server port, and start the reactor:

  factory = Site(root)
  reactor.listenTCP(8880, factory)
  reactor.run()

With this server running, http://localhost:8880/foo will serve a listing of files from /tmp, http://localhost:8880/bar will serve a listing of files from /lost+found, and http://localhost:8880/baz will serve a listing of files from /opt.




Here's the whole example uninterrupted:

  from twisted.web.server import Site
  from twisted.web.resource import Resource
  from twisted.internet import reactor
  from twisted.web.static import File

  root = Resource()
  # Path segments are bytes: required on Python 3, and also valid on Python 2.6+.
  root.putChild(b"foo", File("/tmp"))
  root.putChild(b"bar", File("/lost+found"))
  root.putChild(b"baz", File("/opt"))

  factory = Site(root)
  reactor.listenTCP(8880, factory)
  reactor.run()



Next time I'll show you how to handle URLs dynamically. Also, hey! I want your feedback. Do you find these posts useful? Am I presenting the information clearly? Tell me about it.

15 comments:

  1. These posts are useful. Keep them coming. I'm just getting into Twisted.

  2. Hi JP. Yes, they're great! Are you on Twitter? I guess I could do a search...

    And, thanks.

  3. Hi Terry,

    I can't imagine what I would ever use Twitter for, so I've never signed up.

    Thanks for the feedback. :)

  4. Thanks for the post.

    I'm looking at the Twisted source. It looks like file operations are blocking... it's just using the standard Python "open" call and isn't setting the os.O_NONBLOCK flag.

    If file I/O is blocking, and everything runs in a single OS thread under Twisted, how performant is this? Should we still delegate serving of static content to a separate web server (whether it's nginx, lighttpd, etc.)?

    Thanks again.

  5. Yes and yes. How deep do you intend to take this series?

  6. I'm enjoying the posts, too. I like that each one is very focused
    but that in sequence they tell a bigger story.

  7. Yes, twisted.web.static.File is implemented in terms of the normal, blocking file I/O calls. If reading data from the filesystem is slow, then this can be a problem. Generally, "slow" could reasonably mean that the filesystem is actually mounted from the network (eg NFS). Since this typically isn't the case, it's not very common to need to worry about it. For a normal, local filesystem, file I/O only blocks for a very short period. Overall, it's something to worry about later, not now.

    There are a variety of options for dealing with the issue when you do need to consider it. Serving static content with another web server is certainly one. At some point, you don't even really care what the web server is, you just want to dump your static data onto a CDN and move on. :) However, it's also possible to implement something like File that makes use of some platform's asynchronous I/O APIs. O_NONBLOCK doesn't really help here - POSIX more or less lets systems pretend that file-based I/O is always non-blocking. O_NONBLOCK is around mostly for FIFOs and for its interaction with fcntl(2). However, Windows does have real asynchronous file I/O (via IOCP) and it's possible that someday Linux and other POSIX platforms will too. Any of these could be used to create a more event-driven-friendly version of File.

  8. Yes they are useful. I hope after this series you start another one "Twisted in 5 minutes" to include some more advanced concepts ;)

  9. These are exactly the types of articles and examples I'd have loved to see when I was first delving into twisted.web.

  10. Thanks for the feedback. :)

    I don't think Twisted Web itself is very deep. I'll go to the end, or so, I think. Suggestions for topics to cover are welcome, of course.

  11. Just starting with twisted. It's a great framework and I'm learning a lot from it. Your posts make it really enjoyable!

    I'm waiting for the next one...

    Thanks for that!

  12. I have a server with an ad hoc, informally-specified, bug-ridden, slow implementation of half of Twisted, which I want to supplement with an internal webserver. (Preferably on the same port as its other protocol, evil as that may sound...) After looking at Twisted's documentation, I gave up and started looking at other asynchronous frameworks; you've given me hope that I can use it after all.

  13. Isn't there AIO for such purposes? Though I don't know if it's stable enough.

    Another solution would be a separate worker thread which would take requests out of a queue, serve them and report the results in a loop; the Twisted reactor would then fill the queue and poll for interesting events from that worker.

    Don't know if it smells too much like duct tape solution, but at least we would have more or less complete asynchronous support for all I/O bound operations.

  14. The AIO APIs available for POSIX and on Linux are among the APIs which may someday be good enough to use to solve this problem, yes. :) There seems to be little interest from Linux kernel developers in actually solving this problem, though, so it may take a while.

    You're absolutely right that a userspace threadpool could also be used to do this, though. Great point.

  15. Roommate and I are excited. This is illuminating. Thanks a lot, I feel indebted :D
