Sunday, November 29, 2009

Divmod Software Releases


I've made some new releases of software formerly developed by Divmod, Inc. and now maintained by the community. These releases include changes from both before and after the end of Divmod. Click through to find links to release details.





Enjoy.

Saturday, November 28, 2009

Twisted Web in 60 seconds: storing objects in the session


Welcome to the 16th installment of "Twisted Web in 60 seconds". Last time I introduced the basic APIs for interacting with Twisted Web sessions. In this installment, I'll show you how you can persist objects across requests in the session object.




As I discussed last time, instances of Session last as long as the notional session itself does. Each time Request.getSession is called, if the session for the request is still active, then the same Session instance is returned as was returned previously. Because of this, Session instances can be used to keep other objects around for as long as the session exists.




It's easier to demonstrate how this works than explain it, so here's an example:



  >>> from zope.interface import Interface, Attribute, implements
  >>> from twisted.python.components import registerAdapter
  >>> from twisted.web.server import Session
  >>> class ICounter(Interface):
  ...     value = Attribute("An int value which counts up once per page view.")
  ...
  >>> class Counter(object):
  ...     implements(ICounter)
  ...     def __init__(self, session):
  ...         self.value = 0
  ...
  >>> registerAdapter(Counter, Session, ICounter)
  >>> ses = Session(None, None)
  >>> data = ICounter(ses)
  >>> print data
  <__main__.Counter object at 0x8d535ec>
  >>> print data is ICounter(ses)
  True
  >>>



"What?", I hear you say.




What's shown in this example is the interface- and adaption-based API which Session exposes for persisting state. There are several critical pieces interacting here:




  • ICounter is an interface which serves several purposes. Like all interfaces, it documents the API of some class of objects (in this case, just the value attribute). It also serves as a key into what is basically a dictionary within the session object: the interface is used to store or retrieve a value on the session (the Counter instance, in this case).

  • Counter is the class which actually holds the session data in this example. It implements ICounter (again, mostly for documentation purposes). It also has a value attribute, as the interface declared.

  • The registerAdapter call sets up the relationship between its three arguments so that adaption will do what we want in this case.

  • Adaption is performed by the expression ICounter(ses). This is read as adapt ses to ICounter. Because of the registerAdapter call, it is roughly equivalent to Counter(ses). However (because of certain things Session does), it also saves the Counter instance created so that it will be returned the next time this adaption is done. This is why the last statement produces True.




If you're still not clear on some of the details there, don't worry about it and just remember this: ICounter(ses) gives you an object you can persist state on. It can be as much or as little state as you want, and you can use as few or as many different Interface classes as you want on a single Session instance.
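
If it helps, here is a rough pure-Python model of the "adapt once, then remember the result" behaviour described above. It does not use Twisted or zope.interface at all: ToySession, register_adapter and adapt are names I made up for this sketch, and the real machinery is more involved than this.

```python
# A toy model of interface-keyed storage on a session object.
# ToySession, register_adapter and adapt are hypothetical names.

_adapter_registry = {}  # (source class, interface) -> factory


def register_adapter(factory, source_class, interface):
    _adapter_registry[(source_class, interface)] = factory


class ToySession(object):
    def __init__(self):
        # One remembered object per interface, for the life of the session.
        self._components = {}


def adapt(interface, session):
    # Build the adapter on first use, then hand back the saved one.
    if interface not in session._components:
        factory = _adapter_registry[(type(session), interface)]
        session._components[interface] = factory(session)
    return session._components[interface]


class ICounter(object):
    """Stands in for a real Interface; here it is only used as a key."""


class Counter(object):
    def __init__(self, session):
        self.value = 0


register_adapter(Counter, ToySession, ICounter)

ses = ToySession()
first = adapt(ICounter, ses)
first.value += 1
second = adapt(ICounter, ses)
print(second is first, second.value)
```

The second adapt call returns the object created by the first, so the incremented value is still there. A different ToySession would get its own, separate Counter.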




With those conceptual dependencies out of the way, it's a very short step to actually getting persistent state into a Twisted Web application. Here's an example which implements a simple counter, re-using the definitions from the example above:



  from twisted.web.resource import Resource

  class CounterResource(Resource):
      def render_GET(self, request):
          session = request.getSession()
          counter = ICounter(session)
          counter.value += 1
          return "Visit #%d for you!" % (counter.value,)



Pretty simple from this side, eh? All this does is use Request.getSession and the adaption from above, plus some integer math to give you a session-based visit counter.




Here's the complete source for an rpy script based on this example:



cache()

from zope.interface import Interface, Attribute, implements
from twisted.python.components import registerAdapter
from twisted.web.server import Session
from twisted.web.resource import Resource

class ICounter(Interface):
    value = Attribute("An int value which counts up once per page view.")

class Counter(object):
    implements(ICounter)
    def __init__(self, session):
        self.value = 0

registerAdapter(Counter, Session, ICounter)

class CounterResource(Resource):
    def render_GET(self, request):
        session = request.getSession()
        counter = ICounter(session)
        counter.value += 1
        return "Visit #%d for you!" % (counter.value,)

resource = CounterResource()



One more thing to note is the cache() call at the top of this example. As with the previous example where this came up, this rpy script is stateful. This time, it's the ICounter definition and the registerAdapter call that need to be executed only once. If we didn't use cache, every request would define a new, different interface named ICounter. Each of these would be a different key in the session, so the counter would never get past one.
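
The "new, different interface" point is plain Python behaviour and can be seen without Twisted. In this sketch, SCRIPT and run_script are hypothetical stand-ins for an rpy script and for the server evaluating it, and session_state plays the role of the per-session storage.

```python
# Evaluating the same source text twice yields two distinct class objects,
# even though both are named ICounter.

SCRIPT = """
class ICounter(object):
    pass
"""


def run_script():
    # Each call evaluates the source from scratch in a fresh namespace.
    namespace = {}
    exec(SCRIPT, namespace)
    return namespace["ICounter"]


session_state = {}

first_key = run_script()
session_state[first_key] = 1          # first request stores a counter

second_key = run_script()             # second request, evaluated again
print(first_key.__name__ == second_key.__name__)
print(first_key is second_key)
print(second_key in session_state)
```

The names match, but the objects do not, so the second lookup misses the value stored by the first. Evaluating the script only once avoids this.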




There's one more interesting thing you can do with sessions in Twisted Web right out of the box. Tune in next time to find out what.

Wednesday, November 18, 2009

Twisted Web in 60 seconds: session basics

Welcome to the 15th installment of "Twisted Web in 60 seconds". As promised, I'll be covering sessions in this installment. Or, more accurately, I'll be covering a tiny bit of sessions. As this is the most complicated topic I've covered so far, I'm going to take a few installments to cover all the different aspects.



In this installment, you can expect to learn the very basics of the Twisted Web session API: how to get the session object for the current request and how to prematurely expire a session.



Before I get into the APIs, though, I should explain the big picture of sessions in Twisted Web. Sessions are represented by instances of Session. The Site creates a new instance of Session the first time an application asks for it for a particular session. Session instances are kept on the Site instance until they expire (due to inactivity or because they are explicitly expired). Each time after the first that a particular session's Session object is requested, it is retrieved from the Site.
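
As a rough mental model of that lifecycle, here is a small pure-Python sketch. ToySite and ToySession are hypothetical classes written only for illustration; they mirror the description above rather than Twisted's actual code, and they leave out expiry due to inactivity.

```python
import uuid


class ToySession(object):
    def __init__(self, site, uid):
        self.site = site
        self.uid = uid

    def expire(self):
        # Forget this session; the next lookup will create a fresh one.
        del self.site.sessions[self.uid]


class ToySite(object):
    def __init__(self):
        self.sessions = {}  # uid -> ToySession

    def make_session(self):
        uid = uuid.uuid4().hex
        session = self.sessions[uid] = ToySession(self, uid)
        return session

    def get_session(self, uid):
        # uid is None for a first-time visitor who has no cookie yet.
        if uid in self.sessions:
            return self.sessions[uid]
        return self.make_session()


site = ToySite()
first = site.get_session(None)        # created on first request
again = site.get_session(first.uid)   # retrieved on later requests
first.expire()
fresh = site.get_session(first.uid)   # old uid no longer known
print(again is first, fresh is first, fresh.uid == first.uid)
```

While the session is active, the same object keeps coming back. After it expires, the old identifier is no longer recognized and a new session with a new identifier is made.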



With the conceptual underpinnings of the upcoming API in place, here comes the example. This will be a very simple rpy script which tells a user what their unique session identifier is and lets them prematurely expire it.



First, I'll import Resource so I can define a couple of subclasses of it:



  from twisted.web.resource import Resource


Next I'll define the resource which tells the client what its session identifier is. This is done easily by first getting the session object using Request.getSession and then getting the session object's uid attribute.



  class ShowSession(Resource):
      def render_GET(self, request):
          return 'Your session id is: ' + request.getSession().uid


To let the client expire their own session before it times out, I'll define another resource which expires whatever session it is requested with. This is done using the Session.expire method.



  class ExpireSession(Resource):
      def render_GET(self, request):
          request.getSession().expire()
          return 'Your session has been expired.'


Finally, to make the example an rpy script, I'll make an instance of ShowSession and give it an instance of ExpireSession as a child using Resource.putChild (covered earlier).



  resource = ShowSession()
  resource.putChild("expire", ExpireSession())


And that is the complete example. You can fire this up and load the top page. You'll see a (rather opaque) session identifier that remains the same across reloads (at least until you flush the TWISTED_SESSION cookie from your browser or enough time passes). You can then visit the expire child and go back to the top page and see that you have a new session.



Here's the complete source for the example.



from twisted.web.resource import Resource

class ShowSession(Resource):
    def render_GET(self, request):
        return 'Your session id is: ' + request.getSession().uid

class ExpireSession(Resource):
    def render_GET(self, request):
        request.getSession().expire()
        return 'Your session has been expired.'

resource = ShowSession()
resource.putChild("expire", ExpireSession())


Next time I'll talk about how you can persist information in the session object.

Friday, November 6, 2009

Twisted Web in 60 seconds: HTTP authentication


Welcome to the 14th installment of "Twisted Web in 60 seconds". In many of the previous installments, I've demonstrated how to serve content by using existing resource classes or implementing new ones. In this installment, I'll demonstrate how you can use Twisted Web's basic or digest HTTP authentication to control access to these resources.




Guard, the Twisted Web module which provides most of the APIs which will be used in this example, helps you to add authentication and authorization to a resource hierarchy. It does this by providing a resource which implements getChild to return a dynamically selected resource. The selection is based on the authentication headers in the request. If those headers indicate the request is made on behalf of Alice, then Alice's resource will be returned. If they indicate it was made on behalf of Bob, his will be returned. If the headers contain invalid credentials, an error resource is returned. Whatever happens, once this resource is returned, URL traversal continues as normal from that resource.




The resource which implements this is HTTPAuthSessionWrapper, though it is directly responsible for very little of the process. It will extract headers from the request and hand them off to a credentials factory to parse them according to the appropriate standards (e.g. HTTP Authentication: Basic and Digest Access Authentication) and then it hands the resulting credentials object off to a portal, the core of Twisted Cred, a system for uniform handling of authentication and authorization. I am not going to discuss Twisted Cred in much depth here. To make use of it with Twisted Web, the only thing you really need to know is how to implement a realm.
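
To give a feel for the header-parsing step, here is a standalone sketch of decoding a basic authentication header using only the standard library. This is not Twisted's code (the credentials factory does this job for you); parse_basic_authorization is a name I made up, and the username and password are invented.

```python
import base64


def parse_basic_authorization(header_value):
    """Return (username, password) from an Authorization header value,
    or None if it is not a well-formed Basic credential."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic":
        return None
    try:
        decoded = base64.b64decode(encoded.strip(), validate=True)
        text = decoded.decode("utf-8")
    except ValueError:
        # Covers bad base64 as well as bytes that are not valid UTF-8.
        return None
    username, separator, password = text.partition(":")
    if not separator:
        return None
    return username, password


header = "Basic " + base64.b64encode(b"alice:wonderland").decode("ascii")
print(parse_basic_authorization(header))
```

The result is just a username and password pair. Deciding whether that pair is actually valid is a separate job, which is where the portal and its credentials checkers come in.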




You need to implement a realm because the realm is the object which actually decides which resources are used for which users. This can be as complex or as simple as is suitable for your application. For this example, I'll keep it very simple: each user will have a resource which is a static file listing of the public_html directory in their UNIX home directory. First, I need to import implements from zope.interface and IRealm from twisted.cred.portal. Together these will let me mark this class as a realm (this is mostly - but not entirely - a documentation thing). I'll also need File and IResource for the actual implementation later.



  from zope.interface import implements

  from twisted.cred.portal import IRealm
  from twisted.web.resource import IResource
  from twisted.web.static import File

  class PublicHTMLRealm(object):
      implements(IRealm)



A realm only needs to implement one method, requestAvatar. This is called after any successful authentication attempt (i.e., Alice supplied the right password). Its job is to return the avatar for the user who succeeded in authenticating. An avatar is just an object that represents a user. In this case, it will be a File. In general, with Guard, the avatar must be a resource of some sort.



      def requestAvatar(self, avatarId, mind, *interfaces):
          if IResource in interfaces:
              return (IResource, File("/home/%s/public_html" % (avatarId,)), lambda: None)
          raise NotImplementedError()



A few notes on this method:




  • The avatarId parameter is essentially the username. It's the job of some other code to extract the username from the request headers and make sure it gets passed here.

  • The mind is always None when writing a realm to be used with Guard. You can ignore it until you want to write a realm for something else.

  • Guard always passes IResource for the interfaces parameter. If interfaces only contains interfaces your code doesn't understand, raising NotImplementedError is the thing to do, as above. You'll only need to worry about getting a different interface when you write a realm for something other than Guard.

  • If you want to track when a user logs out, that's what the last element of the returned tuple is for. It will be called when this avatar logs out. lambda: None is the idiomatic no-op logout function.

  • Notice that I have written the path handling code in this example very poorly. This example may be vulnerable to certain unintentional information disclosure attacks. This sort of problem is exactly the reason FilePath exists. However, that's an example for another day...
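
To illustrate the kind of problem the last note is about, here is a standard-library sketch that refuses usernames which would resolve to somewhere other than a directory directly under the home root. It is a stand-in for illustration, not FilePath and not a complete defence; public_html_for and the /srv/toy-home path are made up for the example.

```python
import os.path


def public_html_for(home_root, username):
    """Return the public_html path for username, or raise ValueError if
    the name would resolve to somewhere unexpected."""
    root = os.path.realpath(home_root)
    candidate = os.path.realpath(os.path.join(root, username, "public_html"))
    # The result must be exactly <root>/<one segment>/public_html.
    if os.path.dirname(os.path.dirname(candidate)) != root:
        raise ValueError("suspicious username: %r" % (username,))
    return candidate


print(public_html_for("/srv/toy-home", "alice"))

for name in ["../etc", "alice/..", "a/b", ""]:
    try:
        public_html_for("/srv/toy-home", name)
    except ValueError as error:
        print("rejected:", error)
```

With naive string formatting, a name such as ../etc would quietly point outside the home root. Here each of the odd names in the loop is rejected instead.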



We're almost ready to set up the resource for this example. To create an HTTPAuthSessionWrapper, though, we need two things. First, a portal, which requires the realm above, plus at least one credentials checker:



  from twisted.cred.portal import Portal
  from twisted.cred.checkers import FilePasswordDB

  portal = Portal(PublicHTMLRealm(), [FilePasswordDB('httpd.password')])



FilePasswordDB is that credentials checker I mentioned. It knows how to read passwd(5)-style (loosely) files to check credentials against. It is responsible for the authentication work after HTTPAuthSessionWrapper extracts the credentials from the request.
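
For a sense of what such a file might contain, here is a standard-library sketch that writes and reads a colon-delimited file with the username in the first field and the password in the second. I am assuming that simple layout for illustration; load_password_file is a made-up helper, not part of Twisted, and the users are invented.

```python
import os
import tempfile


def load_password_file(path, delimiter=":"):
    """Map username -> password from a delimited text file."""
    credentials = {}
    with open(path) as f:
        for line in f:
            fields = line.rstrip("\n").split(delimiter)
            if len(fields) < 2:
                continue  # skip blank or malformed lines
            credentials[fields[0]] = fields[1]
    return credentials


# Made-up users, written to a throwaway file for the demonstration.
handle, path = tempfile.mkstemp(suffix=".password")
with os.fdopen(handle, "w") as f:
    f.write("alice:wonderland\nbob:builder\n")

users = load_password_file(path)
os.remove(path)
print(sorted(users.items()))
```

A file like this, saved as httpd.password next to the rpy script, is the sort of thing the checker in the example above would be pointed at.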




Next we need either BasicCredentialFactory or DigestCredentialFactory. The former knows how to challenge HTTP clients to do basic authentication; the latter, digest authentication. I'll use digest here:



  from twisted.web.guard import DigestCredentialFactory

  credentialFactory = DigestCredentialFactory("md5", "example.org")



The two parameters to this constructor are the hash algorithm and the HTTP authentication realm which will be used. The only other valid hash algorithm is "sha" (but be careful, MD5 is more widely supported than SHA). The HTTP authentication realm is mostly just a string that is presented to the user to let them know why they're authenticating (you can read more about this in the RFC).
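
To show where the hash algorithm and the realm end up, here is a sketch of the MD5 response calculation a client performs for digest authentication with qop=auth, following RFC 2617. The input values are the worked example from that RFC, not anything from this post's server, and md5_hex and digest_response are names I made up.

```python
import hashlib


def md5_hex(text):
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri,
                    nonce, nc, cnonce, qop="auth"):
    # The realm is mixed into the first hash along with the credentials.
    ha1 = md5_hex("%s:%s:%s" % (username, realm, password))
    # The second hash covers the request method and URI.
    ha2 = md5_hex("%s:%s" % (method, uri))
    return md5_hex(":".join([ha1, nonce, nc, cnonce, qop, ha2]))


print(digest_response(
    username="Mufasa",
    realm="testrealm@host.com",
    password="Circle Of Life",
    method="GET",
    uri="/dir/index.html",
    nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093",
    nc="00000001",
    cnonce="0a4f113b",
))
```

Because the realm is part of the hashed material, changing the realm string changes every response, so the client and server have to agree on it exactly.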




With those things created, we can finally instantiate HTTPAuthSessionWrapper:



  from twisted.web.guard import HTTPAuthSessionWrapper

  resource = HTTPAuthSessionWrapper(portal, [credentialFactory])



There's just one last thing that needs to be done here. When I introduced rpy scripts, I mentioned that they're evaluated in an unusual context. This is the first example which actually needs to take this into account. It so happens that DigestCredentialFactory instances are actually stateful. Authentication will only succeed if the same instance is used to generate challenges and examine the responses to those challenges. However, the normal mode of operation for an rpy script is for it to be re-executed for every request. This leads to a new DigestCredentialFactory being created for every request, preventing any authentication attempt from ever succeeding.
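
Here is a toy illustration of why a fresh instance per request breaks things. I'm not reproducing what DigestCredentialFactory does internally; ToyChallengeFactory is a hypothetical class which simply keeps a random per-instance secret and uses it to sign the challenges it hands out.

```python
import hashlib
import hmac
import os


class ToyChallengeFactory(object):
    def __init__(self):
        # A fresh random secret per instance: this is the hidden state.
        self._key = os.urandom(32)

    def _sign(self, nonce):
        return hmac.new(self._key, nonce, hashlib.sha256).hexdigest()

    def challenge(self):
        nonce = os.urandom(16).hex().encode("ascii")
        return nonce, self._sign(nonce)

    def accepts(self, nonce, signature):
        # Only challenges signed with this instance's secret pass.
        return hmac.compare_digest(self._sign(nonce), signature)


shared = ToyChallengeFactory()
nonce, signature = shared.challenge()

print(shared.accepts(nonce, signature))                 # same instance
print(ToyChallengeFactory().accepts(nonce, signature))  # brand new instance
```

The instance that issued the challenge recognizes it; a newly created one does not, which is the situation a re-executed rpy script would be in on every request.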




There are two ways to deal with this. First, the better of the two ways, I could move almost all of the code into a real Python module, including the code which instantiates the DigestCredentialFactory. This would ensure the same instance was used for every request. Second, the easier of the two ways, I could add a call to cache to the beginning of the rpy script:



  cache()



cache is part of the globals of any rpy script, so you don't need to import it (it's okay to be cringing at this point). Calling cache makes Twisted re-use the result of the first evaluation of the rpy script for subsequent requests too. Just what I want in this case.




Here's the complete example (with imports re-arranged to the more conventional style):



cache()

from zope.interface import implements

from twisted.cred.portal import IRealm, Portal
from twisted.cred.checkers import FilePasswordDB
from twisted.web.static import File
from twisted.web.resource import IResource
from twisted.web.guard import HTTPAuthSessionWrapper, DigestCredentialFactory

class PublicHTMLRealm(object):
    implements(IRealm)

    def requestAvatar(self, avatarId, mind, *interfaces):
        if IResource in interfaces:
            return (IResource, File("/home/%s/public_html" % (avatarId,)), lambda: None)
        raise NotImplementedError()

portal = Portal(PublicHTMLRealm(), [FilePasswordDB('httpd.password')])

credentialFactory = DigestCredentialFactory("md5", "localhost:8080")
resource = HTTPAuthSessionWrapper(portal, [credentialFactory])



And voila, a password-protected per-user Twisted Web server.




I've gotten several requests to write something about sessions, so there's a good chance that's what you'll find in the next installment.

Thursday, November 5, 2009

Free Agent


As of Monday, the 9th, I will be considering opportunities for short-term consulting and contract work. Please feel free to contact me if you have a software challenge to tackle, particularly if it involves one or more of Python, Twisted, networking, event-driven architectures, massive scaling, or open source software.




Immediately following the demise of Divmod this summer, I took a job at a major international corporation. A number of factors conspired to make this decision non-viable in the long term. Today I gave notice. I'm excited to be able to get back to doing what I love - solving challenging, interesting problems in a flexible, open environment.




One of the other things I've been unable to do since even before Divmod's end is commit serious time to Twisted development and maintenance. This is something else I'm looking forward to re-engaging in. I made a sizable dent in Twisted's open ticket count last fall and winter, thanks to funding from the Twisted Software Foundation (in turn thanks to all of the Twisted founding sponsors). I'll be able to continue this work thanks to this year's sponsors (visible on the front page of the Twisted site), though perhaps not to the same extent. If you'd like to help out in this regard, become a sponsor! All donations are useful and appreciated!

Monday, November 2, 2009

September - October Reading List


  • The Player of Games. Iain Banks.

  • The State of the Art. Iain Banks.

  • Use of Weapons. Iain Banks.

  • Excession. Iain Banks.

  • Inversions. Iain Banks.

  • The Planck Dive. Greg Egan.

  • The Name of the Wind. Patrick Rothfuss.

  • Red Seas Under Red Skies. Scott Lynch.

Twisted Halloween

A bunch of us carved pumpkins for Halloween, and Ying was thoughtful enough to bring along her very nice camera and got some nice shots.