Thursday, September 24, 2009

Twisted Web in 60 seconds: custom response codes


Welcome to the sixth edition of "Twisted Web in 60 seconds". In the previous installment I introduced NoResource, a Twisted Web error resource which responds with a 404 (not found) code. In this installment, I'll show you the APIs which NoResource uses to do this so that you can generate your own custom response codes as desired.




First, the now-standard import preamble (see previous installments for details):

  from twisted.web.server import Site
  from twisted.web.resource import Resource
  from twisted.internet import reactor




Now I'll define a new resource class which always returns a 402 (payment required) response. This is really not very different from the resources which I've defined in previous examples. The fact that it has a response code other than 200 doesn't change anything else about its role. This will require using the request object, though, which none of the previous examples have done.




The request object has shown up in a couple of places, but so far I've ignored it. It's a parameter to the getChild API as well as to render methods such as render_GET. As you might have suspected, it represents the request for which a response is to be generated. It also represents the response being generated. In this example, I'm going to use its setResponseCode method to - you guessed it - set the response's status code.

  class PaymentRequired(Resource):
      def render_GET(self, request):
          request.setResponseCode(402)
          return "<html><body>Please swipe your credit card.</body></html>"

Just like the other resources I've demonstrated, this one returns a string from its render_GET method to define the body of the response. All that's different is the call to setResponseCode to override the default response code, 200, with a different one.
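
The number passed to setResponseCode is an ordinary HTTP status code. As a quick sanity check outside of Twisted, here's a small sketch that looks up the standard reason phrase for a code. It assumes Python 3 (for http.HTTPStatus), and the describe helper is a name made up for this illustration; none of it is part of Twisted's API.

```python
from http import HTTPStatus


def describe(code):
    """Return a 'code phrase' string for a standard HTTP status code.

    Hypothetical helper for illustration only. HTTPStatus raises
    ValueError for numbers that are not registered status codes.
    """
    status = HTTPStatus(code)
    return "%d %s" % (status.value, status.phrase)


print(describe(402))
print(describe(404))
```

This is handy for checking that the code you're about to send means what you think it means.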




Finally, the code to set up the site and reactor. I'll put an instance of the above defined resource at /buy:

  root = Resource()
  root.putChild("buy", PaymentRequired())
  factory = Site(root)
  reactor.listenTCP(8880, factory)
  reactor.run()




Here's the complete example:

from twisted.web.server import Site
from twisted.web.resource import Resource
from twisted.internet import reactor

class PaymentRequired(Resource):
    def render_GET(self, request):
        request.setResponseCode(402)
        return "<html><body>Please swipe your credit card.</body></html>"

root = Resource()
root.putChild("buy", PaymentRequired())
factory = Site(root)
reactor.listenTCP(8880, factory)
reactor.run()




Run the server and visit http://localhost:8880/buy in your browser. It'll look pretty boring, but if you use Firefox's View Page Info right-click menu item (or your browser's equivalent), you'll be able to see that the server indeed sent back a 402 response code.
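
If you'd rather check the status code from a script than from the browser, something like the following should do. It's only a sketch using Python 3's standard urllib, not anything from Twisted; fetch_status is a made-up helper name, and it assumes the server is reachable directly (proxy settings are deliberately ignored, since the server here is local).

```python
import urllib.error
import urllib.request


def fetch_status(url):
    """Return the HTTP status code a server sends back for a GET of url."""
    # An empty ProxyHandler bypasses any configured proxies.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx and 5xx responses; the code is on the exception.
        return error.code


# With the example server running, you would call:
#   fetch_status("http://localhost:8880/buy")
```

Note that urllib treats a 402 as an error, which is why the code is read from the exception rather than from a normal response object.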




Check out the next installment to see how the request object can also be used to get the request body (eg, for form submissions).

Tuesday, September 22, 2009

Twisted Web in 60 seconds: error handling


Welcome to the fifth installment of "Twisted Web in 60 seconds". In the previous installment, I demonstrated how a Twisted Web server can decide how to respond to requests based on dynamic inspection of the request URL. In this installment, I'll show you how to extend such dynamic dispatch to return a 404 (not found) response when a client requests a non-existent URL.




As in the previous installments, we'll start with Site, Resource, and reactor imports (see the first and second installments for explanations of these):

  from twisted.web.server import Site
  from twisted.web.resource import Resource
  from twisted.internet import reactor




Next, we'll add one more import. NoResource is one of the pre-defined error resources provided by Twisted Web. It generates the necessary 404 response code and renders a simple html page telling the client there is no such resource.

  from twisted.web.error import NoResource




Next, we'll define a custom resource which does some dynamic URL dispatch. This example is going to be just like the previous one, where the path segment is interpreted as a year; the difference is that this time, we'll handle requests which don't conform to that pattern by returning the not found response:

  class Calendar(Resource):
      def getChild(self, name, request):
          try:
              year = int(name)
          except ValueError:
              return NoResource()
          else:
              return YearPage(year)




Aside from including the definition of YearPage from the previous installment, the only other thing left to do is the normal Site and reactor setup. Here's the complete code for this example:

from twisted.web.server import Site
from twisted.web.resource import Resource
from twisted.internet import reactor
from twisted.web.error import NoResource

from calendar import calendar

class YearPage(Resource):
    def __init__(self, year):
        Resource.__init__(self)
        self.year = year

    def render_GET(self, request):
        return "<html><body><pre>%s</pre></body></html>" % (calendar(self.year),)

class Calendar(Resource):
    def getChild(self, name, request):
        try:
            year = int(name)
        except ValueError:
            return NoResource()
        else:
            return YearPage(year)

root = Calendar()
factory = Site(root)
reactor.listenTCP(8880, factory)
reactor.run()




This server hands out the same calendar views as the one from the previous installment, but it will also hand out a nice error page with a 404 response when a request is made for a URL which cannot be interpreted as a year.
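
The decision about which path segments count as a year rests entirely on int(). Here's a tiny sketch of just that decision, pulled out of the resource so it can be tried at a Python prompt. parse_year is a made-up name for this illustration, and returning None stands in for "serve the not found page".

```python
def parse_year(segment):
    """Return the year named by a path segment, or None if it isn't one."""
    try:
        return int(segment)
    except ValueError:
        # Anything int() rejects is treated as "no such resource".
        return None


for segment in ["2009", "favicon.ico", ""]:
    print(repr(segment), "->", parse_year(segment))
```

Anything int() accepts is treated as a year, so a segment like "0042" also gets through; tighter validation would be up to you.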




Next time I'll show you how you can define resources like NoResource yourself.

Sunday, September 20, 2009

Twisted Web in 60 seconds: dynamic URL dispatch


Welcome to the fourth installment of "Twisted Web in 60 seconds". In the previous installment, I showed how to statically configure Twisted Web to serve different content at different URLs. The goal of this installment is to show you how to do this dynamically instead. I suggest reading the previous installment if you haven't already in order to get an overview of how URLs are treated when using Twisted Web's resource APIs.




Site (the object which associates a listening server port with the HTTP implementation), Resource (a convenient base class to use when defining custom pages), and reactor (the object which implements the Twisted main loop) return once again:

  from twisted.web.server import Site
  from twisted.web.resource import Resource
  from twisted.internet import reactor




With that out of the way, here's the interesting part of this example. I'm going to define a resource which renders a whole-year calendar. The year it will render the calendar for will be the year in the request URL. So, for example, /2009 will render a calendar for 2009. So, first, here's a resource which renders a calendar for the year passed to its initializer:

  from calendar import calendar

  class YearPage(Resource):
      def __init__(self, year):
          Resource.__init__(self)
          self.year = year

      def render_GET(self, request):
          return "<html><body><pre>%s</pre></body></html>" % (calendar(self.year),)

Pretty simple - not much different from the first dynamic resource I demonstrated. Now here's the resource which handles URLs with a year in them by creating a suitable instance of this YearPage class:

  class Calendar(Resource):
      def getChild(self, name, request):
          return YearPage(int(name))

By implementing getChild here, I've just defined how Twisted Web should find children of Calendar instances when it's resolving a URL into a resource. This implementation defines all integers as the children of Calendar (and punts on error handling, more on that later).




All that's left is to create a Site using this resource as its root and then start the reactor:

  root = Calendar()
  factory = Site(root)
  reactor.listenTCP(8880, factory)
  reactor.run()




And that's all. Any resource-based dynamic URL handling is going to look basically like Calendar.getChild. Here's the full example code:

from twisted.web.server import Site
from twisted.web.resource import Resource
from twisted.internet import reactor

from calendar import calendar

class YearPage(Resource):
    def __init__(self, year):
        Resource.__init__(self)
        self.year = year

    def render_GET(self, request):
        return "<html><body><pre>%s</pre></body></html>" % (calendar(self.year),)

class Calendar(Resource):
    def getChild(self, name, request):
        return YearPage(int(name))

root = Calendar()
factory = Site(root)
reactor.listenTCP(8880, factory)
reactor.run()
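
To get a feel for what YearPage sends back without starting a server, the same string can be built with nothing but the standard library. This is just a sketch; year_page_body is a made-up name for illustration and only mirrors the expression used in render_GET above.

```python
from calendar import calendar


def year_page_body(year):
    """Build the same HTML string that YearPage.render_GET returns."""
    return "<html><body><pre>%s</pre></body></html>" % (calendar(year),)


body = year_page_body(2009)
print(body)
```

The pre tag matters here: calendar() lays the months out with spaces and newlines, and the browser would collapse them otherwise.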





Next time I'll talk about what to do when Firefox requests /favicon.ico from your web app and you don't have one to serve... (ie, error handling).

Saturday, September 19, 2009

Twisted Web in 60 seconds: static URL dispatch


Welcome to the third installment of "Twisted Web in 60 seconds". The goal of this installment is to show you how to serve different content at different URLs using APIs from Twisted Web (the first and second installments covered ways in which you might want to generate this content).




Key to understanding how different URLs are handled with the resource APIs in Twisted Web is understanding that any URL can be used to address a node in a tree. Resources in Twisted Web exist in such a tree, and a request for a URL will be responded to by the resource which that URL addresses. The addressing scheme considers only the path segments of the URL. Starting with the root resource (the one used to construct the Site) and the first path segment, a child resource is looked up. As long as there are more path segments, this process is repeated using the result of the previous lookup and the next path segment. For example, to handle a request for /foo/bar, first the root's "foo" child is retrieved, then that resource's "bar" child is retrieved, then that resource is used to create the response.
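
The lookup described above can be sketched in a few lines of plain Python. This is a toy model, not Twisted's implementation: Node, put_child, and locate are names made up for this illustration, and it simplifies things by skipping empty path segments.

```python
class Node(object):
    """A stand-in for a resource: it only knows its named children."""

    def __init__(self, label):
        self.label = label
        self.children = {}

    def put_child(self, name, child):
        self.children[name] = child


def locate(root, path):
    """Walk path one segment at a time, starting from the root node."""
    node = root
    for segment in path.strip("/").split("/"):
        if segment:
            # A missing child raises KeyError in this toy version.
            node = node.children[segment]
    return node


root = Node("root")
foo = Node("foo")
bar = Node("bar")
root.put_child("foo", foo)
foo.put_child("bar", bar)

print(locate(root, "/foo/bar").label)
```

For /foo/bar, the loop first fetches the root's "foo" child and then that node's "bar" child, which is the node that would go on to produce the response.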




With that out of the way, let's consider an example that can serve a few different resources at a few different URLs.




First things first: we need to import Site, the factory for HTTP servers, Resource, a convenient base class for custom pages, and reactor, the object which implements the Twisted main loop. We'll also import File to use as the resource at one of the example URLs.

  from twisted.web.server import Site
  from twisted.web.resource import Resource
  from twisted.internet import reactor
  from twisted.web.static import File




Now we create a resource which will correspond to the root of the URL hierarchy: all URLs are children of this resource.

  root = Resource()




Here comes the interesting part of this example. I'm now going to create three more resources and attach them to the three URLs /foo, /bar, and /baz:

  root.putChild("foo", File("/tmp"))
  root.putChild("bar", File("/lost+found"))
  root.putChild("baz", File("/opt"))




Last, all that's required is to create a Site with the root resource, associate it with a listening server port, and start the reactor:

  factory = Site(root)
  reactor.listenTCP(8880, factory)
  reactor.run()

With this server running, http://localhost:8880/foo will serve a listing of files from /tmp, http://localhost:8880/bar will serve a listing of files from /lost+found, and http://localhost:8880/baz will serve a listing of files from /opt.




Here's the whole example uninterrupted:

from twisted.web.server import Site
from twisted.web.resource import Resource
from twisted.internet import reactor
from twisted.web.static import File

root = Resource()
root.putChild("foo", File("/tmp"))
root.putChild("bar", File("/lost+found"))
root.putChild("baz", File("/opt"))

factory = Site(root)
reactor.listenTCP(8880, factory)
reactor.run()



Next time I'll show you how to handle URLs dynamically. Also, hey! I want your feedback. Do you find these posts useful? Am I presenting the information clearly? Tell me about it.

Friday, September 18, 2009

Planet Python syndication and Yahoo! Pipes bugs

I finally broke down and asked to have this blog added to Planet Python. More readers can't hurt, right? But I don't actually want everything I write here to appear on Planet Python. I definitely write some things which would be off-topic there. LiveJournal has tagging, but apparently it won't serve up an RSS feed filtered by tag (if I overlooked some feature, please let me know!). Then I remembered Yahoo! Pipes, a pretty neat web app that provides a GUI-based web mangling service. It can grab all kinds of inputs, perform all kinds of operations on the data thus retrieved, and serve the result.

It was a pretty trivial pipe I created to grab my LiveJournal RSS feed, filter out everything without the "python" keyword, and serve the result. Only one hitch. It seems Yahoo! Pipes mangles whitespace inside pre tags. Oh dear. So basically any code I try to post will get mangled by the pipe and be more or less unreadable.
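
For comparison, the filtering step of that pipe can be sketched in plain Python. This is only an illustration under assumptions: it supposes the feed is RSS 2.0 with tags exposed as category elements on each item, SAMPLE_FEED is invented test data, and filter_items is a made-up name. It assumes Python 3.

```python
import xml.etree.ElementTree as ET

# Invented sample data; a real feed would be fetched over HTTP.
SAMPLE_FEED = """<rss version="2.0"><channel><title>Example journal</title>
<item><title>Twisted Web in 60 seconds</title><category>python</category></item>
<item><title>August reading list</title><category>books</category></item>
</channel></rss>"""


def filter_items(rss_text, keyword):
    """Return the feed with every item lacking the keyword tag removed."""
    root = ET.fromstring(rss_text)
    channel = root.find("channel")
    for item in list(channel.findall("item")):
        tags = [category.text for category in item.findall("category")]
        if keyword not in tags:
            channel.remove(item)
    return ET.tostring(root, encoding="unicode")


filtered = filter_items(SAMPLE_FEED, "python")
print(filtered)
```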

Yahoo knows about this problem (people have been complaining about it for years). Bummer. There's a year-old post about the problem in their suggestions forum. Maybe if all my readers go vote it up or comment on it, it will get some attention and my posts on Planet Python will appear as intended.

Thursday, September 17, 2009

Twisted Web in 60 seconds: generate a page dynamically


Welcome to the second installment of "Twisted Web in 60 seconds". The goal of this installment is to show you how to dynamically generate the contents of a page using APIs from Twisted Web. If you missed the first installment on serving static content, you may want to take a look at that first. Ready? Let's begin.




Taking care of some of the necessary imports first, we'll import Site and the reactor:


from twisted.internet import reactor
from twisted.web.server import Site


The Site is a factory which associates a listening port with the HTTP protocol implementation. The reactor is the main loop that drives any Twisted application; we'll use it to actually create the listening port in a moment.




Next, we'll import one more thing from Twisted Web, Resource. An instance of Resource (or a subclass) represents a page (technically, the entity addressed by a URI).


from twisted.web.resource import Resource




Since I'm going to make the demo resource a clock, we'll also import the time module:


import time




With imports taken care of, the next step is to define a Resource subclass which has the dynamic rendering behavior we want. Here's a resource which generates a page giving the time:


class ClockPage(Resource):
    isLeaf = True
    def render_GET(self, request):
        return "<html><body>%s</body></html>" % (time.ctime(),)




Setting isLeaf to True indicates that ClockPage resources will never have any children.




The render_GET method here will be called whenever the URI we hook this resource up to is requested with the GET method. The byte string it returns is what will be sent to the browser.
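
The way a request method ends up selecting a render_ method can be pictured with a toy model. This is a sketch of the naming convention only, not Twisted's code: ToyResource and ToyClock are made-up classes, there is no request object, and the string returned for unsupported methods is just a marker chosen for this example.

```python
import time


class ToyResource(object):
    """Dispatch on the HTTP method name, in the spirit of render_GET."""

    def render(self, method):
        handler = getattr(self, "render_" + method, None)
        if handler is None:
            # Marker value for this sketch only.
            return "method not supported"
        return handler()


class ToyClock(ToyResource):
    def render_GET(self):
        return "<html><body>%s</body></html>" % (time.ctime(),)


clock = ToyClock()
print(clock.render("GET"))
print(clock.render("POST"))
```

Supporting another method in this model is just a matter of defining another render_ method with the right suffix.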




With the resource defined, we can create a Site from it:


resource = ClockPage()
factory = Site(resource)


Just as with the previous static content example, this configuration puts our resource at the very top of the URI hierarchy, ie at /.

And with that Site instance, we can tell the reactor to create a TCP server and start servicing requests:

reactor.listenTCP(8880, factory)
reactor.run()




Here's the code with no interruptions:


from twisted.internet import reactor
from twisted.web.server import Site
from twisted.web.resource import Resource
import time

class ClockPage(Resource):
    isLeaf = True
    def render_GET(self, request):
        return "<html><body>%s</body></html>" % (time.ctime(),)

resource = ClockPage()
factory = Site(resource)
reactor.listenTCP(8880, factory)
reactor.run()



Tune in next time to learn about how to put different resources at different URIs.

Wednesday, September 16, 2009

Twisted Web in 60 seconds: serve static content from a directory


Welcome to the first installment of "Twisted Web in 60 seconds". The goal of this installment is to show you how to serve static content from a filesystem using some APIs from Twisted Web (while Twisted also includes some command line tools, I will not be discussing those here) and have you understand it in 60 seconds or less (if you don't already know Python, you might want to stop here). Where possibly useful, I'll include links to the Twisted documentation, but consider these as tips for further exploration, not necessary prerequisites for understanding the example. So, let's dive in.




First, we need to import some things:





  • Site, a factory which glues a listening server port to the HTTP protocol implementation:

    from twisted.web.server import Site



  • File, a resource which glues the HTTP protocol implementation to the filesystem:

    from twisted.web.static import File



  • The reactor, which drives the whole process, actually accepting TCP connections and moving bytes into and out of them:

    from twisted.internet import reactor





Next, we create an instance of the File resource pointed at the directory to serve:


resource = File("/tmp")




Then we create an instance of the Site factory with that resource:


factory = Site(resource)




Now we glue that factory to a TCP port:


reactor.listenTCP(8888, factory)




Finally, we start the reactor so it can make the program work:


reactor.run()

And that's it.




Here's the complete program without annoying explanations:


from twisted.web.server import Site
from twisted.web.static import File
from twisted.internet import reactor

resource = File('/tmp')
factory = Site(resource)
reactor.listenTCP(8888, factory)
reactor.run()




The Twisted site has more web examples, as well as some longer-form documentation.




Bonus example! For those times when you don't actually want to write a new program, the above implemented functionality is one of the things which the command line twistd tool can do. In this case, the command twistd -n web --path /tmp will accomplish the same thing as the above server.




Keep an eye out for the next installment, in which I'll describe simple dynamic resources.

Thursday, September 10, 2009

On Civility and Common Courtesy

I'm sure there's no one reading this who expects the internet to be a congenial place at all times. The internet is just a bunch of people, usually shouting, after all. And people can't yet be relied upon to always get along (and I'm no exception - but that doesn't make it right).

Still, we should always be striving to be better. And in certain subsets of the internet - communities, if you will - the bar should be set even higher still. Of course I have someone in particular in mind as I write this, someone in one of the many open source software communities on the 'net.

There's a lot of competition between similar open source projects. As with most competition, this can be both beneficial and... otherwise. Competition can be a great source of motivation. But it can also lead to resentment, grudges, vitriol, and worse. It's easy for things to spiral when this happens. And even if matters remain at just a low level of antipathy, no one benefits from this. No one wakes up each day happier because they know someone is going to insult their work, and no software is improved by slinging around baseless claims about the poor quality of other people's work.

I have no idea if the person who prompted this post will read it, and now I think it doesn't really matter. You read it. Just keep it in mind.

Wednesday, September 2, 2009

August reading list


  • The Future of Life. Edward O. Wilson.

  • Fleet of Worlds. Larry Niven.

  • Not Long Before the End. Larry Niven.

  • What Good Is a Glass Dagger? Larry Niven.

  • The Draco Tavern. Larry Niven.

  • Fault-Intolerant. Isaac Asimov.

  • Galactic North. Alastair Reynolds.

  • Consider Phlebas. Iain Banks.