Tuesday, December 1, 2009

Twisted Web in 60 seconds: session endings


Welcome back to "Twisted Web in 60 seconds". Over the previous two entries, I introduced Twisted Web's session APIs. This included accessing the session object, storing state on it, and retrieving it later. I described how the Session object has a lifetime which is tied to the notional session it represents. In this installment, I'll describe how you can exert some control over that lifetime and react when it expires.




The lifetime of a session is controlled by the sessionTimeout attribute of the Session class. This attribute gives the number of seconds a session may go without being accessed before it expires. The default is 15 minutes. In this example, I'll show you how to change that to a different value.




One way to override the value is with a subclass:



  from twisted.web.server import Session

  class ShortSession(Session):
      sessionTimeout = 60



To have Twisted Web actually make use of this session class, rather than the default, it is also necessary to override the sessionFactory attribute of Site. I could do this with another subclass, but I can also do it to just one instance of Site:



  from twisted.web.server import Site

  factory = Site(rootResource)
  factory.sessionFactory = ShortSession



Sessions given out for requests served by this Site will use ShortSession and only last one minute without activity.
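To make "without activity" concrete, here is a toy model in plain Python. It is not Twisted's implementation: ToySession, touch and is_expired are names I made up for illustration, and the current time is passed in by hand instead of being read from a clock.

```python
class ToySession(object):
    """Toy stand-in for a session with an inactivity timeout (not Twisted code)."""
    sessionTimeout = 900  # seconds; mirrors the 15 minute default described above

    def __init__(self, now):
        self.lastAccessed = now

    def touch(self, now):
        # Every access resets the inactivity clock.
        self.lastAccessed = now

    def is_expired(self, now):
        return now - self.lastAccessed >= self.sessionTimeout


class ToyShortSession(ToySession):
    sessionTimeout = 60  # overriding the class attribute is all the subclass does


session = ToyShortSession(now=0)
session.touch(now=45)                    # a request arrives 45 seconds in
print(session.is_expired(now=100))       # 55 idle seconds so far
print(session.is_expired(now=105))       # 60 idle seconds
```

The point to take away is that the timeout counts from the most recent access, not from the moment the session was created.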




You can have arbitrary functions run when sessions expire, too. This can be useful for cleaning up external resources associated with the session, tracking usage statistics, and more. This functionality is provided via Session.notifyOnExpire. It accepts a single argument: a function to call when the session expires. Here's a trivial example which prints a message whenever a session expires:



  from twisted.web.resource import Resource

  class ExpirationLogger(Resource):
      sessions = set()

      def render_GET(self, request):
          session = request.getSession()
          if session.uid not in self.sessions:
              self.sessions.add(session.uid)
              session.notifyOnExpire(lambda: self._expired(session.uid))
          return ""

      def _expired(self, uid):
          print "Session", uid, "has expired."
          self.sessions.remove(uid)



Keep in mind that using a method as the callback will keep the instance (in this case, the ExpirationLogger resource) in memory until the session expires.
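The memory point can be checked without Twisted at all. In this sketch the callbacks list is only a stand-in for wherever a session keeps its expiry callbacks, and ToyLogger is a made-up class; the weak reference is there purely to observe whether the instance is still alive.

```python
import functools
import gc
import weakref


class ToyLogger(object):
    def _expired(self, uid):
        print("Session", uid, "has expired.")


callbacks = []                      # stand-in for a session's expiry callbacks
logger = ToyLogger()
probe = weakref.ref(logger)         # lets us watch the instance without keeping it alive

# The bound method inside the partial holds a reference to the instance.
callbacks.append(functools.partial(logger._expired, "abc123"))

del logger
gc.collect()
still_alive_while_registered = probe() is not None

callbacks.pop()()                   # "expire": run the callback and discard it
gc.collect()
alive_after_expiry = probe() is not None

print(still_alive_while_registered, alive_after_expiry)
```

As long as the callback is registered, the object it belongs to cannot be collected, even though nothing else refers to it.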




With those pieces in hand, here's an example that prints a message whenever a session expires, and uses sessions which last for 5 seconds:



from twisted.web.server import Site, Session
from twisted.web.resource import Resource
from twisted.internet import reactor

class ShortSession(Session):
   sessionTimeout = 5

class ExpirationLogger(Resource):
   sessions = set()

   def render_GET(self, request):
       session = request.getSession()
       if session.uid not in self.sessions:
           self.sessions.add(session.uid)
           session.notifyOnExpire(lambda: self._expired(session.uid))
       return ""

   def _expired(self, uid):
       print "Session", uid, "has expired."
       self.sessions.remove(uid)

rootResource = Resource()
rootResource.putChild("logme", ExpirationLogger())
factory = Site(rootResource)
factory.sessionFactory = ShortSession

reactor.listenTCP(8080, factory)
reactor.run()



Since Site customization is required, this example can't be rpy-based, so it brings back the manual reactor.listenTCP and reactor.run calls. Run it and visit /logme to see it in action. Keep visiting it to keep your session active. Stop visiting it for five seconds to see your session expiration message.




That pretty much wraps things up for Twisted Web's built-in session support. Next time I'll cover some of Twisted Web's proxying features.

Sunday, November 29, 2009

Divmod Software Releases


I've made some new releases of software formerly developed by Divmod, Inc. and now maintained by the community. These releases include changes from both before and after the end of Divmod. Click through to find links to release details.





Enjoy.

Saturday, November 28, 2009

Twisted Web in 60 seconds: storing objects in the session


Welcome to the 16th installment of "Twisted Web in 60 seconds". Last time I introduced the basic APIs for interacting with Twisted Web sessions. In this installment, I'll show you how you can persist objects across requests in the session object.




As I discussed last time, instances of Session last as long as the notional session itself does. Each time Request.getSession is called, if the session for the request is still active, then the same Session instance is returned as was returned previously. Because of this, Session instances can be used to keep other objects around for as long as the session exists.




It's easier to demonstrate how this works than explain it, so here's an example:



  >>> from zope.interface import Interface, Attribute, implements
  >>> from twisted.python.components import registerAdapter
  >>> from twisted.web.server import Session
  >>> class ICounter(Interface):
  ...     value = Attribute("An int value which counts up once per page view.")
  ...
  >>> class Counter(object):
  ...     implements(ICounter)
  ...     def __init__(self, session):
  ...         self.value = 0
  ...
  >>> registerAdapter(Counter, Session, ICounter)
  >>> ses = Session(None, None)
  >>> data = ICounter(ses)
  >>> print data
  <__main__.Counter object at 0x8d535ec>
  >>> print data is ICounter(ses)
  True
  >>>



What?, I hear you say.




What's shown in this example is the interface and adaption based API which Session exposes for persisting state. There are several critical pieces interacting here:




  • ICounter is an interface which serves several purposes. Like all interfaces, it documents the API of some class of objects (in this case, just the value attribute). It also serves as a key into what is basically a dictionary within the session object: the interface is used to store or retrieve a value on the session (the Counter instance, in this case).

  • Counter is the class which actually holds the session data in this example. It implements ICounter (again, mostly for documentation purposes). It also has a value attribute, as the interface declared.

  • The registerAdapter call sets up the relationship between its three arguments so that adaption will do what we want in this case.

  • Adaption is performed by the expression ICounter(ses). This is read as adapt ses to ICounter. Because of the registerAdapter call, it is roughly equivalent to Counter(ses). However (because of certain things Session does), it also saves the Counter instance created so that it will be returned the next time this adaption is done. This is why the last statement produces True.




If you're still not clear on some of the details there, don't worry about it and just remember this: ICounter(ses) gives you an object you can persist state on. It can be as much or as little state as you want, and you can use as few or as many different Interface classes as you want on a single Session instance.
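If the "dictionary keyed by interface" picture helps, here is that picture as a toy in plain Python. It is a sketch of the idea only, not how Session is implemented: ToySession, its adapt method and ICounterKey are names invented for this illustration.

```python
class ToySession(object):
    """Toy model: remembers one state object per key (not Twisted's Session)."""

    def __init__(self):
        self._components = {}

    def adapt(self, key, factory):
        # First request for this key: build the object and remember it.
        if key not in self._components:
            self._components[key] = factory(self)
        return self._components[key]


class ICounterKey(object):
    """Plays the role of the ICounter interface: here it is only used as a key."""


class Counter(object):
    def __init__(self, session):
        self.value = 0


ses = ToySession()
first = ses.adapt(ICounterKey, Counter)
first.value += 1
second = ses.adapt(ICounterKey, Counter)
print(second is first, second.value)
```

The second lookup hands back the object created by the first, which is the same behaviour the True at the end of the interpreter session demonstrates.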




With those conceptual dependencies out of the way, it's a very short step to actually getting persistent state into a Twisted Web application. Here's an example which implements a simple counter, re-using the definitions from the example above:



  from twisted.web.resource import Resource

  class CounterResource(Resource):
      def render_GET(self, request):
          session = request.getSession()
          counter = ICounter(session)
          counter.value += 1
          return "Visit #%d for you!" % (counter.value,)



Pretty simple from this side, eh? All this does is use Request.getSession and the adaption from above, plus some integer math to give you a session-based visit counter.




Here's the complete source for an rpy script based on this example:



cache()

from zope.interface import Interface, Attribute, implements
from twisted.python.components import registerAdapter
from twisted.web.server import Session
from twisted.web.resource import Resource

class ICounter(Interface):
   value = Attribute("An int value which counts up once per page view.")

class Counter(object):
   implements(ICounter)
   def __init__(self, session):
       self.value = 0

registerAdapter(Counter, Session, ICounter)

class CounterResource(Resource):
   def render_GET(self, request):
       session = request.getSession()
       counter = ICounter(session)  
       counter.value += 1
       return "Visit #%d for you!" % (counter.value,)

resource = CounterResource()



One more thing to note is the cache() call at the top of this example. As with the previous example where this came up, this rpy script is stateful. This time, it's the ICounter definition and the registerAdapter call that need to be executed only once. If we didn't use cache, every request would define a new, different interface named ICounter. Each of these would be a different key in the session, so the counter would never get past one.
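The "new, different interface every time" problem is easy to reproduce in plain Python. This sketch uses exec to mimic a script being re-executed; SOURCE, run_script and the two dictionaries are made up for the demonstration, and a plain class stands in for the interface.

```python
SOURCE = """
class ICounter(object):
    pass
"""


def run_script():
    # Mimic re-executing a script: each run defines ICounter from scratch.
    namespace = {}
    exec(SOURCE, namespace)
    return namespace["ICounter"]


# Without caching: every "request" re-runs the script and gets a new key.
uncached_state = {}
for _ in range(3):
    key = run_script()
    uncached_state[key] = uncached_state.get(key, 0) + 1

# With caching: the script runs once and the same key is used every time.
cached_key = run_script()
cached_state = {}
for _ in range(3):
    cached_state[cached_key] = cached_state.get(cached_key, 0) + 1

print(sorted(uncached_state.values()), cached_state[cached_key])
```

Three runs without caching leave three separate counters stuck at one, while the cached key counts up to three.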




There's one more interesting thing you can do with sessions in Twisted Web right out of the box. Tune in next time to find out what.

Wednesday, November 18, 2009

Twisted Web in 60 seconds: session basics

Welcome to the 15th installment of "Twisted Web in 60 seconds". As promised, I'll be covering sessions in this installment. Or, more accurately, I'll be covering a tiny bit of sessions. As this is the most complicated topic I've covered so far, I'm going to take a few installments to cover all the different aspects.



In this installment, you can expect to learn the very basics of the Twisted Web session API: how to get the session object for the current request and how to prematurely expire a session.



Before I get into the APIs, though, I should explain the big picture of sessions in Twisted Web. Sessions are represented by instances of Session. The Site creates a new instance of Session the first time an application asks for it for a particular session. Session instances are kept on the Site instance until they expire (due to inactivity or because they are explicitly expired). Each time after the first that a particular session's Session object is requested, it is retrieved from the Site.
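Here is that big picture as a toy in plain Python, in case it helps. ToySite and ToySiteSession are invented names and this is only a sketch of the bookkeeping described above, not Twisted's code; in particular, the real identifier travels in a cookie rather than being passed as an argument.

```python
import uuid


class ToySiteSession(object):
    def __init__(self, site, uid):
        self.site = site
        self.uid = uid

    def expire(self):
        # Expiring a session removes it from the site that owns it.
        del self.site.sessions[self.uid]


class ToySite(object):
    """Toy model of a site that hands out session objects."""

    def __init__(self):
        self.sessions = {}

    def get_session(self, uid=None):
        # A known, unexpired identifier returns the stored object...
        if uid in self.sessions:
            return self.sessions[uid]
        # ...otherwise a fresh session is created and remembered.
        session = ToySiteSession(self, uuid.uuid4().hex)
        self.sessions[session.uid] = session
        return session


site = ToySite()
first = site.get_session()
again = site.get_session(first.uid)
first.expire()
fresh = site.get_session(first.uid)
print(again is first, fresh is first)
```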



With the conceptual underpinnings of the upcoming API in place, here comes the example. This will be a very simple rpy script which tells a user what their unique session identifier is and lets them prematurely expire it.



First, I'll import Resource so I can define a couple subclasses of it:



  from twisted.web.resource import Resource


Next I'll define the resource which tells the client what its session identifier is. This is done easily by first getting the session object using Request.getSession and then getting the session object's uid attribute.



  class ShowSession(Resource):
     def render_GET(self, request):
         return 'Your session id is: ' + request.getSession().uid


To let the client expire their own session before it times out, I'll define another resource which expires whatever session it is requested with. This is done using the Session.expire method.



  class ExpireSession(Resource):
     def render_GET(self, request):
         request.getSession().expire()
         return 'Your session has been expired.'


Finally, to make the example an rpy script, I'll make an instance of ShowSession and give it an instance of ExpireSession as a child using Resource.putChild (covered earlier).



  resource = ShowSession()
  resource.putChild("expire", ExpireSession())


And that is the complete example. You can fire this up and load the top page. You'll see a (rather opaque) session identifier that remains the same across reloads (at least until you flush the TWISTED_SESSION cookie from your browser or enough time passes). You can then visit the expire child and go back to the top page and see that you have a new session.



Here's the complete source for the example.



from twisted.web.resource import Resource

class ShowSession(Resource):
   def render_GET(self, request):
       return 'Your session id is: ' + request.getSession().uid

class ExpireSession(Resource):
   def render_GET(self, request):
       request.getSession().expire()
       return 'Your session has been expired.'

resource = ShowSession()
resource.putChild("expire", ExpireSession())


Next time I'll talk about how you can persist information in the session object.

Friday, November 6, 2009

Twisted Web in 60 seconds: HTTP authentication


Welcome to the 14th installment of "Twisted Web in 60 seconds". In many of the previous installments, I've demonstrated how to serve content by using existing resource classes or implementing new ones. In this installment, I'll demonstrate how you can use Twisted Web's basic or digest HTTP authentication to control access to these resources.




Guard, the Twisted Web module which provides most of the APIs which will be used in this example, helps you to add authentication and authorization to a resource hierarchy. It does this by providing a resource which implements getChild to return a dynamically selected resource. The selection is based on the authentication headers in the request. If those headers indicate the request is made on behalf of Alice, then Alice's resource will be returned. If they indicate it was made on behalf of Bob, his will be returned. If the headers contain invalid credentials, an error resource is returned. Whatever happens, once this resource is returned, URL traversal continues as normal from that resource.




The resource which implements this is HTTPAuthSessionWrapper, though it is directly responsible for very little of the process. It will extract headers from the request and hand them off to a credentials factory to parse them according to the appropriate standards (e.g. HTTP Authentication: Basic and Digest Access Authentication) and then hand the resulting credentials object off to a portal, the core of Twisted Cred, a system for uniform handling of authentication and authorization. I am not going to discuss Twisted Cred in much depth here. To make use of it with Twisted Web, the only thing you really need to know is how to implement a realm.




You need to implement a realm because the realm is the object which actually decides which resources are used for which users. This can be as complex or as simple as is suitable for your application. For this example, I'll keep it very simple: each user will have a resource which is a static file listing of the public_html directory in their UNIX home directory. First, I need to import implements from zope.interface and IRealm from twisted.cred.portal. Together these will let me mark this class as a realm (this is mostly - but not entirely - a documentation thing). I'll also need File for the actual implementation later.



  from zope.interface import implements

  from twisted.cred.portal import IRealm
  from twisted.web.static import File

  class PublicHTMLRealm(object):
      implements(IRealm)



A realm only needs to implement one method, requestAvatar. This is called after any successful authentication attempt (ie, Alice supplied the right password). Its job is to return the avatar for the user who succeeded in authenticating. An avatar is just an object that represents a user. In this case, it will be a File. In general, with Guard, the avatar must be a resource of some sort.



      def requestAvatar(self, avatarId, mind, *interfaces):
         if IResource in interfaces:
             return (IResource, File("/home/%s/public_html" % (avatarId,)), lambda: None)
         raise NotImplementedError()



A few notes on this method:




  • The avatarId parameter is essentially the username. It's the job of some other code to extract the username from the request headers and make sure it gets passed here.

  • The mind is always None when writing a realm to be used with Guard. You can ignore it until you want to write a realm for something else.

  • Guard always passes IResource for the interfaces parameter. If interfaces only contains interfaces your code doesn't understand, raising NotImplementedError is the thing to do, as above. You'll only need to worry about getting a different interface when you write a realm for something other than Guard.

  • If you want to track when a user logs out, that's what the last element of the returned tuple is for. It will be called when this avatar logs out. lambda: None is the idiomatic no-op logout function.

  • Notice that I have written the path handling code in this example very poorly. This example may be vulnerable to certain unintentional information disclosure attacks. This sort of problem is exactly the reason FilePath exists. However, that's an example for another day...
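To show what can go wrong with the string interpolation, here is a small sketch using only the standard library. naive_public_html repeats the formatting from the realm; checked_public_html is a hypothetical guard of my own, not something Guard or FilePath provides, and whether a hostile avatarId could ever reach the realm depends on what the credentials checker accepts.

```python
import posixpath


def naive_public_html(avatarId):
    # Same string interpolation as the realm above.
    return "/home/%s/public_html" % (avatarId,)


def checked_public_html(avatarId):
    # Hypothetical guard: accept only a single, plain path segment.
    if not avatarId or "/" in avatarId or avatarId in (".", ".."):
        raise ValueError("suspicious avatarId: %r" % (avatarId,))
    return naive_public_html(avatarId)


# normpath shows where each path really points once ".." is resolved.
print(posixpath.normpath(naive_public_html("alice")))
print(posixpath.normpath(naive_public_html("../etc")))

try:
    checked_public_html("../etc")
except ValueError as e:
    print("rejected:", e)
```

An identifier containing ".." walks straight out of /home, which is the kind of mistake a dedicated path abstraction is meant to rule out.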



We're almost ready to set up the resource for this example. To create an HTTPAuthSessionWrapper, though, we need two things. First, a portal, which requires the realm above, plus at least one credentials checker:



  from twisted.cred.portal import Portal
  from twisted.cred.checkers import FilePasswordDB

  portal = Portal(PublicHTMLRealm(), [FilePasswordDB('httpd.password')])



FilePasswordDB is that credentials checker I mentioned. It knows how to read passwd(5)-style (loosely) files to check credentials against. It is responsible for the authentication work after HTTPAuthSessionWrapper extracts the credentials from the request.
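For a rough idea of what goes in httpd.password, this sketch parses the layout I'm assuming here: one user per line, username first, then the password, separated by a colon. The usernames and passwords are invented, parse_password_lines is a hypothetical helper rather than anything FilePasswordDB exposes, and the passwords are assumed to be stored as plain text.

```python
def parse_password_lines(lines, delim=":"):
    """Return a {username: password} dict from passwd-like lines."""
    credentials = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        fields = line.split(delim)
        credentials[fields[0]] = fields[1]
    return credentials


# What the contents of httpd.password might look like (made-up users).
example_file = [
    "alice:wonderland\n",
    "bob:builder\n",
]

print(parse_password_lines(example_file))
```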




Next we need either BasicCredentialFactory or DigestCredentialFactory. The former knows how to challenge HTTP clients to do basic authentication; the latter, digest authentication. I'll use digest here:



  from twisted.web.guard import DigestCredentialFactory

  credentialFactory = DigestCredentialFactory("md5", "example.org")



The two parameters to this constructor are the hash algorithm and the HTTP authentication realm which will be used. The only other valid hash algorithm is "sha" (but be careful, MD5 is more widely supported than SHA). The HTTP authentication realm is mostly just a string that is presented to the user to let them know why they're authenticating (you can read more about this in the RFC).
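To see where the hash algorithm and the realm string end up, here is a sketch of the MD5 response calculation from RFC 2617 (the qop=auth variant), using only hashlib. md5_hex and digest_response are names I've made up; this is not Twisted's code, and the input values are the ones from the RFC's worked example.

```python
import hashlib


def md5_hex(text):
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri,
                    nonce, nc, cnonce, qop="auth"):
    # HA1 covers who is authenticating; note that the realm is part of it.
    ha1 = md5_hex("%s:%s:%s" % (username, realm, password))
    # HA2 covers what is being requested.
    ha2 = md5_hex("%s:%s" % (method, uri))
    # The server-issued nonce ties the response to one particular challenge.
    return md5_hex(":".join([ha1, nonce, nc, cnonce, qop, ha2]))


response = digest_response(
    username="Mufasa",
    realm="testrealm@host.com",
    password="Circle Of Life",
    method="GET",
    uri="/dir/index.html",
    nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093",
    nc="00000001",
    cnonce="0a4f113b",
)
print(response)
```

Because the realm is hashed together with the username and password, changing the realm string changes every valid response. The nonce comes from the server's challenge, so whoever checks the response has to know which challenges were issued.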




With those things created, we can finally instantiate HTTPAuthSessionWrapper:



  from twisted.web.guard import HTTPAuthSessionWrapper

  resource = HTTPAuthSessionWrapper(portal, [credentialFactory])



There's just one last thing that needs to be done here. When I introduced rpy scripts, I mentioned that they're evaluated in an unusual context. This is the first example which actually needs to take this into account. It so happens that DigestCredentialFactory instances are actually stateful. Authentication will only succeed if the same instance is used to generate challenges and examine the responses to those challenges. However, the normal mode of operation for an rpy script is for it to be re-executed for every request. This leads to a new DigestCredentialFactory being created for every request, preventing any authentication attempt from ever succeeding.




There are two ways to deal with this. First, the better of the two ways, I could move almost all of the code into a real Python module, including the code which instantiates the DigestCredentialFactory. This would ensure the same instance was used for every request. Second, the easier of the two ways, I could add a call to cache to the beginning of the rpy script:



  cache()



cache is part of the globals of any rpy script, so you don't need to import it (it's okay to be cringing at this point). Calling cache makes Twisted re-use the result of the first evaluation of the rpy script for subsequent requests too. Just what I want in this case.




Here's the complete example (with imports re-arranged to the more conventional style):



cache()

from zope.interface import implements

from twisted.cred.portal import IRealm, Portal
from twisted.cred.checkers import FilePasswordDB
from twisted.web.static import File
from twisted.web.resource import IResource
from twisted.web.guard import HTTPAuthSessionWrapper, DigestCredentialFactory

class PublicHTMLRealm(object):
   implements(IRealm)

   def requestAvatar(self, avatarId, mind, *interfaces):
       if IResource in interfaces:
           return (IResource, File("/home/%s/public_html" % (avatarId,)), lambda: None)
       raise NotImplementedError()

portal = Portal(PublicHTMLRealm(), [FilePasswordDB('httpd.password')])

credentialFactory = DigestCredentialFactory("md5", "localhost:8080")
resource = HTTPAuthSessionWrapper(portal, [credentialFactory])



And voila, a password-protected per-user Twisted Web server.




I've gotten several requests to write something about sessions, so there's a good chance that's what you'll find in the next installment.

Thursday, November 5, 2009

Free Agent


As of Monday, the 9th, I will be considering opportunities for short term consulting and contract work. Please feel free to contact me if you have a software challenge to tackle, particularly if it involves one or more of Python, Twisted, networking, event-driven architectures, massive scaling, or open source software.




Immediately following the demise of Divmod this summer, I took a job at a major international corporation. A number of factors conspired to make this decision non-viable in the long term. Today I gave notice. I'm excited to be able to get back to doing what I love - solving challenging, interesting problems in a flexible, open environment.




One of the other things I've been unable to do since even before Divmod's end is commit serious time to Twisted development and maintenance. This is something else I'm looking forward to re-engaging in. I made a sizable dent in Twisted's open ticket count last fall and winter, thanks to funding from the Twisted Software Foundation (in turn thanks to all of the Twisted founding sponsors). I'll be able to continue this work thanks to this year's sponsors (visible on the front page of the Twisted site), though perhaps not to the same extent. If you'd like to help out in this regard, become a sponsor! All donations are useful and appreciated!

Monday, November 2, 2009

September - October Reading List


  • The Player of Games. Iain Banks.

  • The State of the Art. Iain Banks.

  • Use of Weapons. Iain Banks.

  • Excession. Iain Banks.

  • Inversions. Iain Banks.

  • The Planck Dive. Greg Egan.

  • The Name of the Wind. Patrick Rothfuss.

  • Red Seas Under Red Skies. Scott Lynch.

Twisted Halloween

A bunch of us carved pumpkins for Halloween, and Ying was thoughtful enough to bring along her very nice camera and got some nice shots.

Sunday, October 25, 2009

Twisted Web in 60 seconds: WSGI


Welcome to the 13th installment of Twisted Web in 60 seconds. For a while, I've been writing about how you can implement pages by working with the Twisted Web resource model. The very first example I showed you used an existing Resource subclass to serve static content from the filesystem. In this installment, I'll show you how to use WSGIResource, another existing Resource subclass which lets you serve WSGI applications in a Twisted Web server.




First, a few things about WSGIResource. It is a multithreaded WSGI container. Like any other WSGI container, you can't do anything asynchronous in your WSGI applications, even though this is a Twisted WSGI container. In the latest release of Twisted as of this post, 8.2, WSGIResource also has a few significant bugs. These are fixed in trunk (and the fixes will be included in 9.0), so if you want to play around with WSGI in any significant way, you probably want trunk for now.




The first new thing in this example is the import of WSGIResource:



  from twisted.web.wsgi import WSGIResource



Nothing too surprising there. We still need one of the other usual suspects, too:



  from twisted.internet import reactor



You'll see why in a minute. Next, we need a WSGI application. Here's a really simple one just to get things going:



  def application(environ, start_response):
     start_response('200 OK', [('Content-type', 'text/plain')])
     return ['Hello, world!']



If this doesn't make sense to you, take a look at one of these fine tutorials. Otherwise, or once you're done with that, the next step is to create a WSGIResource instance - as this is going to be another rpy script example.



  resource = WSGIResource(reactor, reactor.getThreadPool(), application)



I need to dwell on this line for a minute. The first parameter passed to WSGIResource is the reactor. Despite the fact that the reactor is global and any code that wants it can always just import it (as, in fact, this rpy script simply does itself), passing it around as a parameter leaves the door open for certain future possibilities. For example, having more than one reactor. There are also testing implications. Consider how much easier it is to unit test a function that accepts a reactor - perhaps a mock reactor specially constructed to make your tests easy to write ;) - rather than importing the real global reactor. Anyhow, that's why WSGIResource requires you to pass the reactor to it.




The second parameter passed to WSGIResource is a thread pool. WSGIResource uses this to actually call the application object passed in to it. To keep this example short, I'm passing in the reactor's internal threadpool here, letting me skip its creation and shutdown-time destruction. For finer control over how many WSGI requests are served in parallel, you may want to create your own thread pool to use with your WSGIResource. But for simple testing, using the reactor's is fine (although I'm cheating here a little - I apologize - getThreadPool is a new API, not present in 8.2: you need trunk for this example to work; please ask Chris Armstrong to release 9.0 already).




The final argument is the application object. This is pretty typical of how WSGI containers work.
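If you're curious what a container does with that application object, here is a heavily simplified sketch in plain Python. call_application is a made-up helper, not part of WSGIResource; it skips most of what a real container must do (threads, the write callable, error handling), and the body is a byte string as Python 3 requires.

```python
def application(environ, start_response):
    start_response('200 OK', [('Content-type', 'text/plain')])
    return [b'Hello, world!']


def call_application(app, environ):
    """Hypothetical mini-container: call the app once and collect the reply."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        # A real container would begin writing the HTTP response here.
        captured['status'] = status
        captured['headers'] = headers

    body = b''.join(app(environ, start_response))
    return captured['status'], captured['headers'], body


status, headers, body = call_application(
    application, {'REQUEST_METHOD': 'GET', 'PATH_INFO': '/'})
print(status, headers, body)
```

The container's whole relationship with the application is that one call, which is why the application object is all it needs to be given.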




The example, sans interruption:



  from twisted.web.wsgi import WSGIResource
  from twisted.internet import reactor

  def application(environ, start_response):
      start_response('200 OK', [('Content-type', 'text/plain')])
      return ['Hello, world!']

  resource = WSGIResource(reactor, reactor.getThreadPool(), application)



Up to the point where the WSGIResource instance defined here exists in the resource hierarchy, the normal resource traversal rules apply - getChild will be called to handle each segment. Once the WSGIResource is encountered, though, that process stops and all further URL handling is the responsibility of the WSGI application. Of course this application does nothing with the URL, so you won't be able to tell that.
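For an application that does look at the URL, the remaining path is available in the environ dictionary under the standard PATH_INFO key. This is a small sketch written for Python 3 (hence the encoded body); the direct call at the bottom, with a throwaway start_response, is only there to exercise it without a server.

```python
def application(environ, start_response):
    # PATH_INFO holds the part of the URL left for the application to handle.
    path = environ.get('PATH_INFO', '/')
    start_response('200 OK', [('Content-type', 'text/plain')])
    return [('You asked for %s' % (path,)).encode('utf-8')]


def ignore_start_response(status, headers, exc_info=None):
    # Stand-in for the container's callable; it discards what it is given.
    pass


body = b''.join(application({'PATH_INFO': '/some/deep/path'},
                            ignore_start_response))
print(body)
```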




Oh, and as was the case with the first static file example, there's also a command line option you can use to avoid a lot of this. If you just put the above application function, without all of the WSGIResource stuff, into a file, say, foo.py, then you can launch a roughly equivalent server like this:



  $ twistd -n web --wsgi foo.application



Tune in next time, when I'll discuss HTTP authentication.

Thursday, October 22, 2009

Twisted Web in 60 seconds: logging errors


Welcome to the twelfth installment of "Twisted Web in 60 seconds". The previous installment created a server which dealt with response errors by aborting response generation, potentially avoiding pointless work. However, it did this silently for any error. In this installment, I'll modify the previous example so that it logs each failed response.




This example will use the Twisted API for logging errors. As I mentioned in the first post covering Deferreds, errbacks are passed an error. In the previous example, the _responseFailed errback accepted this error as a parameter but ignored it. The only way this example will differ is that this _responseFailed will use that error parameter to log a message.




This example will require all of the imports required by the previous example, which I will not repeat here, plus one new import:



  from twisted.python.log import err



The only other part of the previous example which changes is the _responseFailed errback, which will now log the error passed to it:



      def _responseFailed(self, failure, call):
         call.cancel()
         err(failure, "Async response demo interrupted response")



I'm passing two arguments to err here. The first is the error which is being passed in to the callback. This is always an object of type Failure, a class which represents an exception and (sometimes, but not always) a traceback. err will format this nicely for the log. The second argument is a descriptive string that tells someone reading the log what the source of the error was.




Here's the full example with the two above modifications:



from twisted.web.resource import Resource
from twisted.web.server import NOT_DONE_YET
from twisted.internet import reactor
from twisted.python.log import err

class DelayedResource(Resource):
   def _delayedRender(self, request):
       request.write("Sorry to keep you waiting.")
       request.finish()

   def _responseFailed(self, failure, call):
       call.cancel()
       err(failure, "Async response demo interrupted response")

   def render_GET(self, request):
       call = reactor.callLater(5, self._delayedRender, request)
       request.notifyFinish().addErrback(self._responseFailed, call)
       return NOT_DONE_YET

resource = DelayedResource()



Run this server (see the end of the previous installment if you need a reminder about how to do that) and interrupt a request. Unlike the previous example, where the server gave no indication that this had happened, you'll see a message in the log output with this version.




Next time I'll tell you about a resource that lets you host WSGI applications in a Twisted Web server.

Sunday, October 18, 2009

Twisted Web in 60 seconds: interrupted responses


Welcome to the eleventh installment of "Twisted Web in 60 seconds". Previously I gave an example of a Resource which generates its response asynchronously rather than immediately upon the call to its render method. When generating responses asynchronously, the possibility is introduced that the connection to the client may be lost before the response is generated. In such a case, it is often desirable to abandon the response generation entirely, since there is nothing to do with the data once it is produced. In this installment, I'll show you how to be notified that the connection has been lost.




This example will build upon the example from installment nine which simply (if not very realistically) generated its response after a fixed delay. I will expand that resource so that as soon as the client connection is lost, the delayed event is canceled and the response is never generated.




The feature this example relies on is provided by another Request method: notifyFinish. This method returns a new Deferred which will fire with None if the request is successfully responded to or with an error otherwise - for example if the connection is lost before the response is sent.




The example starts in a familiar way, with the requisite Twisted imports and a resource class with the same _delayedRender used previously:



  from twisted.web.resource import Resource
  from twisted.web.server import NOT_DONE_YET
  from twisted.internet import reactor

  class DelayedResource(Resource):
      def _delayedRender(self, request):
          request.write("<html><body>Sorry to keep you waiting.</body></html>")
          request.finish()



Before defining the render method, I'm going to define an errback (an errback being a callback that gets called when there's an error), though. This will be the errback attached to the Deferred returned by Request.notifyFinish. It will cancel the delayed call to _delayedRender.



      def _responseFailed(self, err, call):
         call.cancel()



Finally, the render method will set up the delayed call just as it did before, and return NOT_DONE_YET likewise. However, it will also use Request.notifyFinish to make sure _responseFailed is called if appropriate.



     def render_GET(self, request):
         call = reactor.callLater(5, self._delayedRender, request)
         request.notifyFinish().addErrback(self._responseFailed, call)
         return NOT_DONE_YET



Notice that since _responseFailed needs a reference to the delayed call object in order to cancel it, I passed that object to addErrback. Any additional arguments passed to addErrback (or addCallback) will be passed along to the errback after the Failure instance which is always passed as the first argument. Passing call here means it will be passed to _responseFailed, where it is expected and required.




That covers almost all the code for this example. Here's the entire example without interruptions, as an rpy script:



from twisted.web.resource import Resource
from twisted.web.server import NOT_DONE_YET
from twisted.internet import reactor

class DelayedResource(Resource):
   def _delayedRender(self, request):
       request.write("Sorry to keep you waiting.")
       request.finish()

   def _responseFailed(self, err, call):
       call.cancel()

   def render_GET(self, request):
       call = reactor.callLater(5, self._delayedRender, request)
       request.notifyFinish().addErrback(self._responseFailed, call)
       return NOT_DONE_YET

resource = DelayedResource()



Toss this into example.rpy, fire it up with twistd -n web --path ., and hit http://localhost:8080/example.rpy. If you wait five seconds, you'll get the page content. If you interrupt the request before then, say by hitting escape (in Firefox, at least), then you'll see perhaps the most boring demonstration ever - no page content, and nothing in the server logs. Success!




Next time I'll digress slightly to cover the basics of Twisted logging and expand this example to use it to show when clients fail to receive the response they requested.

Twisted Security Outreach


Following the second of Matasano's recommendations for how to get security right, Twisted now has a security outreach page. All you security researchers out there who've been holding back because you thought we wouldn't pay attention, bring it on. :)

Saturday, October 10, 2009

Twisted Web in 60 seconds: asynchronous responses (via Deferred)


Welcome to the tenth installment of "Twisted Web in 60 (make that 90) seconds". Previously I gave an example of a Resource which generates its response asynchronously rather than immediately upon the call to its render method. Though it was a useful demonstration of the NOT_DONE_YET feature of Twisted Web, the example itself didn't reflect what a realistic application might want to do. In this installment, I'll introduce Deferred, the Twisted class which is used to provide a uniform interface to many asynchronous events, and show you an example of using a Deferred-returning API to generate an asynchronous response to a request in Twisted Web [1].




Deferred is the result of two consequences of the asynchronous programming approach. First, asynchronous code is frequently (if not always) concerned with some data (in Python, an object) which is not yet available but which probably will be soon. Asynchronous code needs a way to define what will be done to the object once it does exist. It also needs a way to define how to handle errors in the creation or acquisition of that object. These two needs are satisfied by the callbacks and errbacks of a Deferred. Callbacks are added to a Deferred with Deferred.addCallback; errbacks are added with Deferred.addErrback. When the object finally does exist, it is passed to Deferred.callback which passes it on to the callback added with addCallback. Similarly, if an error occurs, Deferred.errback is called and the error is passed along to the errback added with addErrback. Second, the events that make asynchronous code actually work often take many different, incompatible forms. Deferred acts as the uniform interface which lets different parts of an asynchronous application interact and isolates them from implementation details they shouldn't be concerned with.




That's almost all there is to Deferred. To solidify your new understanding, now consider this rewritten version of DelayedResource which uses a Deferred-based delay API. It does exactly the same thing as the previous example. Only the implementation is different.




First, the example must import that new API I just mentioned, deferLater:



  from twisted.internet.task import deferLater



Next, all the other imports (these are the same as last time):



  from twisted.web.resource import Resource
  from twisted.web.server import NOT_DONE_YET
  from twisted.internet import reactor



With the imports done, here's the first part of the DelayedResource implementation. Again, this part of the code is identical to the previous version:



 class DelayedResource(Resource):
     def _delayedRender(self, request):
         request.write("<html><body>Sorry to keep you waiting.</body></html>")
         request.finish()



Next I also need to define the render method. Here's where things change a bit. Instead of using callLater, I'm going to use deferLater this time. deferLater accepts a reactor, delay (in seconds, as with callLater), and a function to call after the delay to produce that elusive object I was talking about above in my description of Deferreds. I'm also going to use _delayedRender as the callback to add to the Deferred returned by deferLater. Since it expects the request object as an argument, I'm going to set up the deferLater call to return a Deferred which has the request object as its result.



     def render_GET(self, request):
         d = deferLater(reactor, 5, lambda: request)



The Deferred referenced by d now needs to have the _delayedRender callback added to it. Once this is done, _delayedRender will be called with the result of d (which will be request, of course — the result of (lambda: request)()).



         d.addCallback(self._delayedRender)



Finally, the render method still needs to return NOT_DONE_YET, for exactly the same reasons as it did in the previous version of the example.



         return NOT_DONE_YET



And with that, DelayedResource is now implemented based on a Deferred. The example still isn't very realistic, but remember that since Deferreds offer a uniform interface to many different asynchronous event sources, this code now resembles a real application even more closely; you could easily replace deferLater with another Deferred-returning API and suddenly you might have a resource that does something useful.




Finally, here's the complete, uninterrupted example source, as an rpy script:



from twisted.internet.task import deferLater
from twisted.web.resource import Resource
from twisted.web.server import NOT_DONE_YET
from twisted.internet import reactor

class DelayedResource(Resource):
   def _delayedRender(self, request):
       request.write("Sorry to keep you waiting.")
       request.finish()

   def render_GET(self, request):
       d = deferLater(reactor, 5, lambda: request)
       d.addCallback(self._delayedRender)
       return NOT_DONE_YET

resource = DelayedResource()



[1] I know I promised an example of handling lost client connections, but I realized that example would also involve Deferreds, so I wanted to introduce Deferreds by themselves first. Tune in next time for the example I told you I'd show you this time.

Twisted Web in 60 seconds: Index

Here's an index of all the "Twisted Web in 60 seconds" entries, for your linking and searching convenience:

Wednesday, October 7, 2009

Twisted Web in 60 seconds: asynchronous responses


Welcome to the ninth installment of "Twisted Web in 60 seconds". In all the previous installments, the resource examples I presented generated responses immediately. One of the most interesting features of Twisted Web, though, is the ability to generate a response over a longer period of time while leaving the server free to respond to other requests. In other words, asynchronously. In this installment, I'll show you how you can write a resource like this.




A resource which generates a response asynchronously looks like one which generates a response synchronously in many ways. The same base class, Resource, is used either way; the same render methods are used. There are three basic differences, though.




First, instead of returning the string which will be used as the body of the response, the resource uses Request.write. This method can be called repeatedly. Each call appends another string to the response body. Second, when the entire response body has been passed to Request.write, the application must call Request.finish. As you might expect from the name, this ends the response. Finally, in order to make Twisted Web not end the response as soon as the render method returns, the render method must return NOT_DONE_YET. Consider this example:



 from twisted.web.resource import Resource
 from twisted.web.server import NOT_DONE_YET
 from twisted.internet import reactor

 class DelayedResource(Resource):
     def _delayedRender(self, request):
         request.write("<html><body>Sorry to keep you waiting.</body></html>")
         request.finish()

     def render_GET(self, request):
         reactor.callLater(5, self._delayedRender, request)
         return NOT_DONE_YET



If you're not familiar with reactor.callLater, all you really need to know about it to understand this example is that the above usage of it arranges to have self._delayedRender(request) run about 5 seconds after callLater is invoked from this render method and that it returns immediately.




All three of the elements I mentioned earlier can be seen in this example. The resource uses Request.write to set the response body. It uses Request.finish after the entire body has been specified (all with just one call to write in this case). And it returns NOT_DONE_YET from its render method. So there you have it, asynchronous rendering with Twisted Web.




Here's a complete rpy script based on this resource class (see the previous installment if you need a reminder about rpy scripts):



from twisted.web.resource import Resource
from twisted.web.server import NOT_DONE_YET
from twisted.internet import reactor

class DelayedResource(Resource):
   def _delayedRender(self, request):
       request.write("<html><body>Sorry to keep you waiting.</body></html>")
       request.finish()

   def render_GET(self, request):
       reactor.callLater(5, self._delayedRender, request)
       return NOT_DONE_YET

resource = DelayedResource()



Drop this source into a .rpy file and fire up a server using twistd -n web --path /directory/containing/script/. You'll see that loading the page takes 5 seconds. If you try to load the page a second time before the first request completes, that request will also take 5 seconds from the time you make it (but it won't be delayed by any other outstanding requests).




Something else to consider when generating responses asynchronously is that the client may not wait around to get the response to its request. Next time I'll demonstrate how to detect that the client has abandoned the request and that the server shouldn't bother to finish generating its response.

Friday, October 2, 2009

Twisted Web in 60 seconds: rpy scripts (or, how to save yourself some typing)


Welcome to the eighth installment of "Twisted Web in 60 seconds". In the previous installment, I griped about how much typing I had to do for each of the examples. The goal of this installment is to show you another way to run a Twisted Web server with a custom resource which doesn't require as much code.




The feature I'm talking about is called an rpy script. An rpy script is a Python source file which defines a resource and can be loaded into a Twisted Web server. The advantages of this approach are that you don't have to write code to create the site or set up a listening port with the reactor. That means fewer lines of code that aren't dedicated to the task you're trying to accomplish.




There are some disadvantages, though. An rpy script must have the extension .rpy. This means you can't import it using the usual Python import statement. This means it's hard to re-use code in an rpy script. This also means you can't easily unit test it. The code in an rpy script is evaluated in an unusual context. So, while rpy scripts may be useful for testing out ideas, I would not recommend them for much more than that.




Okay, with that warning out of the way, let's dive in. First, as I mentioned, rpy scripts are Python source files with the .rpy extension. So, open up an appropriately named file (for example, example.rpy) and put this code in it:



import time

from twisted.web.resource import Resource

class ClockPage(Resource):
   isLeaf = True
   def render_GET(self, request):
       return "<html><body>%s</body></html>" % (time.ctime(),)

resource = ClockPage()




You may recognize this as the resource from the first dynamic rendering example. What's different is what you don't see: I didn't import reactor or Site. There are no calls to listenTCP or run. Instead, and this is the core idea for rpy scripts, I just bound the name resource to the resource I want the script to serve. Every rpy script must bind this name, and this name is the only thing Twisted Web will pay attention to in an rpy script.
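To make the role of the resource name concrete, here's a rough illustration of the idea in plain Python: run the script's source in a fresh namespace and pull out whatever got bound to resource. This is not Twisted's actual loader, just a sketch of the concept; load_script is a made-up helper, and a string stands in for a real Resource.

```python
def load_script(source):
    # Run the script in its own namespace rather than importing it as a
    # module, then return whatever it bound to the name "resource".
    namespace = {}
    exec(source, namespace)
    return namespace["resource"]

script = '''
helper = "ignored by the server"
resource = "pretend this is ClockPage()"
'''

loaded = load_script(script)
```

Everything else the script defines, like helper here, is only useful to the script itself.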




All that's left is to drop this rpy script into a Twisted Web server. There are a few ways to do this. The simplest way is with twistd:




$ twistd -n web --path .




Hit http://localhost:8080/example.rpy to see it run. You can pass other arguments here too. twistd web has options for specifying which port number to bind, whether to set up an HTTPS server, and plenty more. You can also pass options to twistd here, for example to configure logging to work differently, to select a different reactor, etc. For a full list of options, see twistd --help and twistd web --help.




That's it for rpy scripts for now. I'll probably make use of them in future examples to keep the focus on the new material. And speaking of which, check out the next installment to learn about asynchronous rendering.

Twisted Web in 60 seconds: handling POSTs


Welcome to the seventh installment of "Twisted Web in 60 seconds" in which I'll show you how to handle POST requests. All of the previous installments have focused on GET requests. Unlike GET requests, POST requests can have a request body - extra data after the request headers; for example, data representing the contents of an HTML form. Twisted Web makes this data available to applications via the Request object.




Here's an example web server which renders a static HTML form and then generates a dynamic page when that form is posted back to it. (While it's convenient for this example, it's often not a good idea to make a resource that POSTs to itself; this isn't about Twisted Web, but the nature of HTTP in general; if you do this, make sure you understand the possible negative consequences).




As usual, we start with some imports (see previous installments for details). In addition to the Twisted imports, this example uses the cgi module to escape user-entered content for inclusion in the output.

 from twisted.web.server import Site
 from twisted.web.resource import Resource
 from twisted.internet import reactor

 import cgi




Next, we'll define a resource which is going to do two things. First, it will respond to GET requests with a static HTML form:

 class FormPage(Resource):
     def render_GET(self, request):
         return '<html><body><form method="POST"><input name="the-field" type="text" /></form></body></html>'

This is similar to the static resource I used as an example in a previous installment. However, I'll now add one more method to give it a second behavior; this render_POST method will allow it to accept POST requests:
     def render_POST(self, request):
         return '<html><body>You submitted: %s</body></html>' % (cgi.escape(request.args["the-field"][0]),)




The main thing to note here is the use of request.args. This is a dictionary-like object that provides access to the contents of the form. The keys in this dictionary are the names of inputs in the form. Each value is a list containing strings (since there can be multiple inputs with the same name), which is why I had to extract the first element to pass to cgi.escape. request.args will be populated from form contents whenever a POST request is made with a content type of application/x-www-form-urlencoded or multipart/form-data (it's also populated by query arguments for any type of request).
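The shape of request.args is easy to mimic with the standard library. The sketch below is not how Twisted parses forms internally; it uses urllib.parse.parse_qs to produce a similar name-to-list mapping, and html.escape in place of cgi.escape (which was removed from the standard library in Python 3.8). The form body is invented for the example.

```python
from html import escape
from urllib.parse import parse_qs

# A made-up urlencoded form body with a repeated input name.
body = "the-field=%3Cb%3Ehi%3C%2Fb%3E&the-field=second"

args = parse_qs(body)

# Every value is a list, even when the input appears only once.
first = args["the-field"][0]

# Escape the user-supplied text before putting it into the page.
page = "<html><body>You submitted: %s</body></html>" % (escape(first),)
```

Skipping the escape step would let the submitted markup through to the browser untouched, which is why render_POST escapes the value before formatting it into the page.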




Finally, the example just needs the usual site creation and port setup:

 root = Resource()
 root.putChild("form", FormPage())
 factory = Site(root)
 reactor.listenTCP(8880, factory)
 reactor.run()

Run the server and visit http://localhost:8880/form, submit the form, and watch it generate a page including the value you entered into the single field.




Here's the complete source for the example:

from twisted.web.server import Site
from twisted.web.resource import Resource
from twisted.internet import reactor

import cgi

class FormPage(Resource):
   def render_GET(self, request):
       return '<html><body><form method="POST"><input name="the-field" type="text" /></form></body></html>'

   def render_POST(self, request):
       return '<html><body>You submitted: %s</body></html>' % (cgi.escape(request.args["the-field"][0]),)

root = Resource()
root.putChild("form", FormPage())
factory = Site(root)
reactor.listenTCP(8880, factory)
reactor.run()




Since I'm getting a little bored with some of the boilerplate involved in these examples, the next installment will introduce rpy files, a good way to try out new concepts and APIs (like the ones presented in this series) without all the repetitive boilerplate.

Thursday, September 24, 2009

Twisted Web in 60 seconds: custom response codes


Welcome to the sixth edition of "Twisted Web in 60 seconds". In the previous installment I introduced NoResource, a Twisted Web error resource which responds with a 404 (not found) code. In this installment, I'll show you the APIs which NoResource uses to do this so that you can generate your own custom response codes as desired.




First, the now-standard import preamble (see previous installments for details):

 from twisted.web.server import Site
 from twisted.web.resource import Resource
 from twisted.internet import reactor




Now I'll define a new resource class which always returns a 402 (payment required) response. This is really not very different from the resources which I've defined in previous examples. The fact that it has a response code other than 200 doesn't change anything else about its role. This will require using the request object, though, which none of the previous examples have done.




The request object has shown up in a couple places, but so far I've ignored it. It's a parameter to the getChild API as well as to render methods such as render_GET. As you might have suspected, it represents the request for which a response is to be generated. It also represents the response being generated. In this example, I'm going to use its setResponseCode method to - you guessed it - set the response's status code.

 class PaymentRequired(Resource):
     def render_GET(self, request):
         request.setResponseCode(402)
         return "<html><body>Please swipe your credit card.</body></html>"

Just like the other resources I've demonstrated, this one returns a string from its render_GET method to define the body of the response. All that's different is the call to setResponseCode to override the default response code, 200, with a different one.




Finally, the code to set up the site and reactor. I'll put an instance of the above defined resource at /buy:

  root = Resource()
  root.putChild("buy", PaymentRequired())
  factory = Site(root)
  reactor.listenTCP(8880, factory)
  reactor.run()




Here's the complete example:

from twisted.web.server import Site
from twisted.web.resource import Resource
from twisted.internet import reactor

class PaymentRequired(Resource):
   def render_GET(self, request):
       request.setResponseCode(402)
       return "<html><body>Please swipe your credit card.</body></html>"

root = Resource()
root.putChild("buy", PaymentRequired())
factory = Site(root)
reactor.listenTCP(8880, factory)
reactor.run()




Run the server and visit http://localhost:8880/buy in your browser. It'll look pretty boring, but if you use Firefox's View Page Info right-click menu item (or your browser's equivalent), you'll be able to see that the server indeed sent back a 402 response code.




Check out the next installment to see how the request object can also be used to get the request body (eg, for form submissions).

Tuesday, September 22, 2009

Twisted Web in 60 seconds: error handling


Welcome to the fifth installment of "Twisted Web in 60 seconds". In the previous installment, I demonstrated how a Twisted Web server can decide how to respond to requests based on dynamic inspection of the request URL. In this installment, I'll show you how to extend such dynamic dispatch to return a 404 (not found) response when a client requests a non-existent URL.




As in the previous installments, we'll start with Site, Resource, and reactor imports (see the first and second installments for explanations of these):

 from twisted.web.server import Site
 from twisted.web.resource import Resource
 from twisted.internet import reactor




Next, we'll add one more import. NoResource is one of the pre-defined error resources provided by Twisted Web. It generates the necessary 404 response code and renders a simple HTML page telling the client there is no such resource.

  from twisted.web.error import NoResource




Next, we'll define a custom resource which does some dynamic URL dispatch. This example is going to be just like the previous one, where the path segment is interpreted as a year; the difference is that this time, we'll handle requests which don't conform to that pattern by returning the not found response:

 class Calendar(Resource):
     def getChild(self, name, request):
         try:
             year = int(name)
         except ValueError:
             return NoResource()
         else:
             return YearPage(year)




Aside from including the definition of YearPage from the previous installment, the only other thing left to do is the normal Site and reactor setup. Here's the complete code for this example:

from twisted.web.server import Site
from twisted.web.resource import Resource
from twisted.internet import reactor
from twisted.web.error import NoResource

from calendar import calendar

class YearPage(Resource):
   def __init__(self, year):
       Resource.__init__(self)
       self.year = year

   def render_GET(self, request):
       return "<html><body><pre>%s</pre></body></html>" % (calendar(self.year),)

class Calendar(Resource):
   def getChild(self, name, request):
       try:
           year = int(name)
       except ValueError:
           return NoResource()
       else:
           return YearPage(year)

root = Calendar()
factory = Site(root)
reactor.listenTCP(8880, factory)
reactor.run()




This server hands out the same calendar views as the one from the previous installment, but it will also hand out a nice error page with a 404 response when a request is made for a URL which cannot be interpreted as a year.
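Stripped of the Twisted classes, the dispatch decision is just a try/except around int. A minimal sketch, where resolve is a hypothetical function that returns a (status, body) pair instead of a resource, and the 404 body text is a placeholder:

```python
from calendar import calendar

def resolve(segment):
    # Mirror Calendar.getChild: anything int() rejects is "not found".
    try:
        year = int(segment)
    except ValueError:
        return 404, "No Such Resource"
    else:
        return 200, calendar(year)

ok_status, ok_body = resolve("2009")
missing_status, missing_body = resolve("favicon.ico")
```

The first call produces a full-year calendar for 2009; the second takes the error path, just as a browser's request for /favicon.ico would.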




Next time I'll show you how you can define resources like NoResource yourself.

Sunday, September 20, 2009

Twisted Web in 60 seconds: dynamic URL dispatch


Welcome to the fourth installment of "Twisted Web in 60 seconds". In the previous installment, I showed how to statically configure Twisted Web to serve different content at different URLs. The goal of this installment is to show you how to do this dynamically instead. I suggest reading the previous installment if you haven't already in order to get an overview of how URLs are treated when using Twisted Web's resource APIs.




Site (the object which associates a listening server port with the HTTP implementation), Resource (a convenient base class to use when defining custom pages), and reactor (the object which implements the Twisted main loop) return once again:

  from twisted.web.server import Site
  from twisted.web.resource import Resource
  from twisted.internet import reactor




With that out of the way, here's the interesting part of this example. I'm going to define a resource which renders a whole-year calendar. The year it will render the calendar for will be the year in the request URL. So, for example, /2009 will render a calendar for 2009. So, first, here's a resource which renders a calendar for the year passed to its initializer:

  from calendar import calendar

  class YearPage(Resource):
      def __init__(self, year):
          Resource.__init__(self)
          self.year = year

      def render_GET(self, request):
          return "<html><body><pre>%s</pre></body></html>" % (calendar(self.year),)

Pretty simple - not much different from the first dynamic resource I demonstrated. Now here's the resource which handles URLs with a year in them by creating a suitable instance of this YearPage class:
  class Calendar(Resource):
    def getChild(self, name, request):
        return YearPage(int(name))

By implementing getChild here, I've just defined how Twisted Web should find children of Calendar instances when it's resolving a URL into a resource. This implementation defines all integers as the children of Calendar (and punts on error handling, more on that later).




All that's left is to create a Site using this resource as its root and then start the reactor:

  root = Calendar()
  factory = Site(root)
  reactor.listenTCP(8880, factory)
  reactor.run()




And that's all. Any resource-based dynamic URL handling is going to look basically like Calendar.getChild. Here's the full example code:

from twisted.web.server import Site
from twisted.web.resource import Resource
from twisted.internet import reactor

from calendar import calendar

class YearPage(Resource):
    def __init__(self, year):
        Resource.__init__(self)
        self.year = year

    def render_GET(self, request):
        return "<html><body><pre>%s</pre></body></html>" % (calendar(self.year),)

class Calendar(Resource):
  def getChild(self, name, request):
      return YearPage(int(name))

root = Calendar()
factory = Site(root)
reactor.listenTCP(8880, factory)
reactor.run()





Next time I'll talk about what to do when Firefox requests /favicon.ico from your web app and you don't have one to serve... (ie, error handling).

Saturday, September 19, 2009

Twisted Web in 60 seconds: static URL dispatch


Welcome to the third installment of "Twisted Web in 60 seconds". The goal of this installment is to show you how to serve different content at different URLs using APIs from Twisted Web (the first and second installments covered ways in which you might want to generate this content).




Key to understanding how different URLs are handled with the resource APIs in Twisted Web is understanding that any URL can be used to address a node in a tree. Resources in Twisted Web exist in such a tree, and a request for a URL will be responded to by the resource which that URL addresses. The addressing scheme considers only the path segments of the URL. Starting with the root resource (the one used to construct the Site) and the first path segment, a child resource is looked up. As long as there are more path segments, this process is repeated using the result of the previous lookup and the next path segment. For example, to handle a request for /foo/bar, first the root's "foo" child is retrieved, then that resource's "bar" child is retrieved, then that resource is used to create the response.
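Here's that lookup walk as a toy model in plain Python. It isn't Twisted's implementation (which also has to deal with things like isLeaf and empty path segments); Node, put_child, and locate are made-up names that exist only to show the loop.

```python
class Node:
    def __init__(self, label):
        self.label = label
        self.children = {}

    def put_child(self, segment, child):
        self.children[segment] = child
        return child

def locate(root, path):
    # Start at the root and follow one child per path segment.
    node = root
    for segment in path.strip("/").split("/"):
        if not segment:
            continue
        node = node.children[segment]
    return node

root = Node("root")
foo = root.put_child("foo", Node("foo"))
foo.put_child("bar", Node("bar"))

found = locate(root, "/foo/bar")
```

Here found is the node labelled "bar", reached by way of the root's "foo" child.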




With that out of the way, let's consider an example that can serve a few different resources at a few different URLs.




First things first: we need to import Site, the factory for HTTP servers, Resource, a convenient base class for custom pages, and reactor, the object which implements the Twisted main loop. We'll also import File to use as the resource at one of the example URLs.

  from twisted.web.server import Site
  from twisted.web.resource import Resource
  from twisted.internet import reactor
  from twisted.web.static import File




Now we create a resource which will correspond to the root of the URL hierarchy: all URLs are children of this resource.

  root = Resource()




Here comes the interesting part of this example. I'm now going to create three more resources and attach them to the three URLs /foo, /bar, and /baz:

  root.putChild("foo", File("/tmp"))
  root.putChild("bar", File("/lost+found"))
  root.putChild("baz", File("/opt"))




Last, all that's required is to create a Site with the root resource, associate it with a listening server port, and start the reactor:

  factory = Site(root)
  reactor.listenTCP(8880, factory)
  reactor.run()

With this server running, http://localhost:8880/foo will serve a listing of files from /tmp, http://localhost:8880/bar will serve a listing of files from /lost+found, and http://localhost:8880/baz will serve a listing of files from /opt.




Here's the whole example uninterrupted:

from twisted.web.server import Site
from twisted.web.resource import Resource
from twisted.internet import reactor
from twisted.web.static import File

root = Resource()
root.putChild("foo", File("/tmp"))
root.putChild("bar", File("/lost+found"))
root.putChild("baz", File("/opt"))

factory = Site(root)
reactor.listenTCP(8880, factory)
reactor.run()



Next time I'll show you how to handle URLs dynamically. Also, hey! I want your feedback. Do you find these posts useful? Am I presenting the information clearly? Tell me about it.

Friday, September 18, 2009

Planet Python syndication and Yahoo! Pipes bugs

I finally broke down and asked to have this blog added to Planet Python. More readers can't hurt, right? But I don't actually want everything I write here to appear on Planet Python. I definitely write some things which would be off-topic there. LiveJournal has tagging, but apparently it won't serve up an RSS feed filtered by tag (if I overlooked some feature, please let me know!). Then I remembered Yahoo! Pipes, a pretty neat web app that provides a GUI-based web mangling service. It can grab all kinds of inputs, perform all kinds of operations on the data thus retrieved, and serve the result.

It was a pretty trivial pipe I created to grab my LiveJournal RSS feed, filter out everything without the "python" keyword, and serve the result. Only one hitch. It seems Yahoo! Pipes mangles whitespace inside pre tags. Oh dear. So basically any code I try to post will get mangled by the pipe and be more or less unreadable.

Yahoo knows about this problem (people have been complaining about it for years). Bummer. There's a year-old post in their suggestions forum about the problem. Maybe if all my readers go vote it up or comment on it, it will get some attention and my posts on Planet Python will appear as intended.

Thursday, September 17, 2009

Twisted Web in 60 seconds: generate a page dynamically


Welcome to the second installment of "Twisted Web in 60 seconds". The goal of this installment is to show you how to dynamically generate the contents of a page using APIs from Twisted Web. If you missed the first installment on serving static content, you may want to take a look at that first. Ready? Let's begin.




Taking care of some of the necessary imports first, we'll import Site and the reactor:


from twisted.internet import reactor
from twisted.web.server import Site


The Site is a factory which associates a listening port with the HTTP protocol implementation. The reactor is the main loop that drives any Twisted application; we'll use it to actually create the listening port in a moment.




Next, we'll import one more thing from Twisted Web, Resource. An instance of Resource (or a subclass) represents a page (technically, the entity addressed by a URI).


from twisted.web.resource import Resource




Since I'm going to make the demo resource a clock, we'll also import the time module:


import time




With imports taken care of, the next step is to define a Resource subclass which has the dynamic rendering behavior we want. Here's a resource which generates a page giving the time:


class ClockPage(Resource):
    isLeaf = True
    def render_GET(self, request):
        return "<html><body>%s</body></html>" % (time.ctime(),)




Setting isLeaf to True indicates that ClockPage resources will never have any children.




The render_GET method here will be called whenever the URI we hook this resource up to is requested with the GET method. The byte string it returns is what will be sent to the browser.
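
One caveat if you're running this on Python 3: there, Twisted expects render_GET to return bytes, and a plain str won't do. Here's a minimal sketch of building the same page body as a byte string. The helper name clock_page_body and its optional now argument are my own invention for illustration, not part of Twisted.

```python
import time


def clock_page_body(now=None):
    """Build the clock page and return it as bytes.

    now is an optional number of seconds since the epoch; when it is
    None, time.ctime() uses the current time.
    """
    text = "<html><body>%s</body></html>" % (time.ctime(now),)
    # Encode explicitly so the result is a byte string on Python 3.
    return text.encode("utf-8")


if __name__ == "__main__":
    print(clock_page_body())
```

With a helper like this, render_GET could simply return clock_page_body().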




With the resource defined, we can create a Site from it:


resource = ClockPage()
factory = Site(resource)


Just as with the previous static content example, this configuration puts our resource at the very top of the URI hierarchy, i.e. at /.

And with that Site instance, we can tell the reactor to create a TCP server and start servicing requests:

reactor.listenTCP(8880, factory)
reactor.run()




Here's the code with no interruptions:


from twisted.internet import reactor
from twisted.web.server import Site
from twisted.web.resource import Resource
import time

class ClockPage(Resource):
    isLeaf = True
    def render_GET(self, request):
        return "<html><body>%s</body></html>" % (time.ctime(),)

resource = ClockPage()
factory = Site(resource)
reactor.listenTCP(8880, factory)
reactor.run()



Tune in next time to learn about how to put different resources at different URIs.

Wednesday, September 16, 2009

Twisted Web in 60 seconds: serve static content from a directory


Welcome to the first installment of "Twisted Web in 60 seconds". The goal of this installment is to show you how to serve static content from a filesystem using some APIs from Twisted Web (while Twisted also includes some command line tools, I will not be discussing those here) and have you understand it in 60 seconds or less (if you don't already know Python, you might want to stop here). Where possibly useful, I'll include links to the Twisted documentation, but consider these as tips for further exploration, not necessary prerequisites for understanding the example. So, let's dive in.




First, we need to import some things:





  • Site, a factory which glues a listening server port to the HTTP protocol implementation:

    from twisted.web.server import Site



  • File, a resource which glues the HTTP protocol implementation to the filesystem:

    from twisted.web.static import File



  • The reactor, which drives the whole process, actually accepting TCP connections and moving bytes into and out of them:

    from twisted.internet import reactor





Next, we create an instance of the File resource pointed at the directory to serve:


resource = File("/tmp")




Then we create an instance of the Site factory with that resource:


factory = Site(resource)




Now we glue that factory to a TCP port:


reactor.listenTCP(8888, factory)




Finally, we start the reactor so it can make the program work:


reactor.run()

And that's it.




Here's the complete program without annoying explanations:


from twisted.web.server import Site
from twisted.web.static import File
from twisted.internet import reactor

resource = File('/tmp')
factory = Site(resource)
reactor.listenTCP(8888, factory)
reactor.run()




The Twisted site has more web examples, as well as some longer-form documentation.




Bonus example! For those times when you don't actually want to write a new program, the functionality implemented above is one of the things which the command line twistd tool can do. In this case, the command twistd -n web --path /tmp will accomplish the same thing as the above server.




Keep an eye out for the next installment, in which I'll describe simple dynamic resources.

Thursday, September 10, 2009

On Civility and Common Courtesy

I'm sure there's no one reading this who expects the internet to be a congenial place at all times. The internet is just a bunch of people, usually shouting, after all. And people can't yet be relied upon to always get along (and I'm no exception - but that doesn't make it right).

Still, we should always be striving to be better. And in certain subsets of the internet - communities, if you will - the bar should be set even higher still. Of course I have someone in particular in mind as I write this, someone in one of the many open source software communities on the 'net.

There's a lot of competition between similar open source projects. As with most competition, this can be both beneficial and... otherwise. Competition can be a great source of motivation. But it can also lead to resentment, grudges, vitriol, and worse. It's easy for things to spiral when this happens. And even if matters remain at just a low level of antipathy, no one benefits from it. No one wakes up each day happier because they know someone is going to insult their work, and no software is improved by slinging around baseless claims about the poor quality of other people's work.

I have no idea if the person who prompted this post will read it, and now I think it doesn't really matter. You read it. Just keep it in mind.