Saturday, April 23, 2011

pyOpenSSL on PyPy

You may know that I'm the maintainer of pyOpenSSL, a Python extension library which wraps some OpenSSL APIs. pyOpenSSL was started in mid 2001, around the time of Python 2.1, by AB Strakt (now Open End). Shortly afterwards Twisted picked it up as a dependency for its SSL features (the standard library SSL support was unsuitable for non-blocking use). When Twisted bumped into some of pyOpenSSL's limitations and no one else was around to address them, I decided to take responsibility for the project.

Fast forward almost a decade. pyOpenSSL now runs on Python 2.4 through Python 3.2. And soon I hope it will run on PyPy, too.

This post is about some of the things I learned while getting pyOpenSSL to work on PyPy.  All of this work is made possible, of course, by the "cpyext" module of PyPy which implements CPython's C extension API for PyPy. 

PyModule_AddObject steals references

When an extension module wants to expose a name, the most common way it does this is by using PyModule_AddObject. This adds a new attribute to the module, given a char* name and a PyObject* object. CPython uses reference counting, so the PyObject* has a counter on it recording how many different pieces of code still want it to remain alive. When populating a new module, the PyObject* you have generally has a reference count of 1. PyModule_AddObject steals this reference: it doesn't increase the reference count, it just assumes that the caller is giving up its interest in the object remaining alive; now it is the module to which the PyObject* was added which has that interest. So the reference count is still 1.

pyOpenSSL exposes a few names which are just aliases for other names (for example, X509Name and X509NameType refer to the same object). It does this by calling PyModule_AddObject twice with the same PyObject* but different char* names. Considering what I wrote above about reference counts, you might guess that this needs extra work to get the reference counting right. Otherwise the second PyModule_AddObject would steal a reference from the first PyModule_AddObject, since that's what stole it from the module initialization code. This wouldn't work very well, since there really are two references to the PyObject* now, not just one.

It turns out that on CPython, though, it doesn't really matter. The reference count for one of these types, say X509Name again, ends up at around 20 by the time everything is initialized. Being off by one doesn't make a difference, because most of those 20 references last for the entire process lifetime. The value never gets close to 0, so the missing reference is never noticed.

However, on PyPy, it turns out the missing reference does matter. I won't try to explain how PyPy manages to support CPython's C extension API, nor how it manages to make a reference counting system play together with a fully garbage collected runtime (i.e., PyPy doesn't normally do reference counting). Suffice it to say that on PyPy, sometimes the reference count does get close to 0, and at those times, being off by 1 can be important - because it might mean that the reference count is exactly 0 when it was supposed to be 1. When that happens, PyPy frees the object, but other code continues to use it, and after that the behavior you get is difficult to predict due to memory corruption - but it's certainly not correct.

The fix for this one is simple - add a Py_INCREF before PyModule_AddObject. This was by far the most pervasive bug in pyOpenSSL which needed to be fixed, since it was repeated for each aliased type pyOpenSSL exposed. I added 28 Py_INCREF calls in total to address these.
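
To make the off-by-one concrete, here is a toy model of stolen references in plain Python. It is only a sketch: ToyObject, ToyModule and expose_aliases are made-up names, and none of this is how CPython or PyPy's cpyext actually tracks references. It just shows why two aliases need two references.

```python
# Toy model of "stolen" references; NOT how CPython or cpyext is implemented.
# Every class and function name here is hypothetical.

class ToyObject:
    """Stand-in for a PyObject* with an explicit reference count."""

    def __init__(self, name):
        self.name = name
        self.refcount = 1      # the creator holds the only reference
        self.freed = False

    def incref(self):          # plays the role of Py_INCREF
        self.refcount += 1

    def decref(self):          # plays the role of Py_DECREF
        self.refcount -= 1
        if self.refcount == 0:
            self.freed = True  # real code would release the memory here


class ToyModule:
    def __init__(self):
        self.attrs = {}

    def add_object(self, name, obj):
        # Like PyModule_AddObject: take over the caller's reference
        # without touching the count.
        self.attrs[name] = obj

    def drop(self, name):
        # Removing an attribute gives up the reference that attribute owned.
        self.attrs.pop(name).decref()


def expose_aliases(extra_incref):
    module = ToyModule()
    x509name = ToyObject("X509Name")
    module.add_object("X509Name", x509name)
    if extra_incref:
        x509name.incref()      # the fix: one reference per alias
    module.add_object("X509NameType", x509name)
    return module, x509name


# Without the fix, dropping one alias frees an object the other alias still uses.
buggy_module, buggy = expose_aliases(extra_incref=False)
buggy_module.drop("X509NameType")
print(buggy.freed, "X509Name" in buggy_module.attrs)   # True True

# With the fix, the object survives until both aliases are gone.
fixed_module, fixed = expose_aliases(extra_incref=True)
fixed_module.drop("X509NameType")
print(fixed.freed)                                     # False
```

In the buggy run the count starts at 1 and both aliases share it, so releasing either one takes it to 0 while the other name still points at the object.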

PyPy doesn't support tp_setattr (yet?)

The type I mentioned above, X509Name, customizes attribute access. It needs to delegate to OpenSSL to determine if an attribute is valid or not, and if so what its current value is. It does this by implementing two C functions, one for the tp_setattr slot and one for the tp_getattro slot. No, that o isn't a typo. The CPython C extension API provides two different ways to customize attribute access. Using one way, tp_setattr and tp_getattr, CPython hands the extension function the name of the attribute as a char*. Using the other way, tp_setattro or tp_getattro, a PyObject* is passed in, instead of a char*.

So far, PyPy only implements tp_setattro and tp_getattro, not tp_setattr and tp_getattr. It would have been nice to implement this missing feature for PyPy, but instead I switched pyOpenSSL over to the already supported mechanism. This was a very simple change, since most of the lookup code is the same either way; there's just a little extra code at the beginning of the function to convert from PyObject* to char*.

I also learned about a quirk of the tp_setattro API while doing this. I expected setattr(name, u"name", "value") to pass in a PyUnicodeObject*. However, CPython actually encodes unicode to ASCII and passes in a PyStringObject* instead.
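
For readers who don't write C extensions, the attribute hooks described above have a rough pure-Python analog in __setattr__ and __getattr__, which also receive the attribute name as an object. The sketch below is hypothetical: ToyName is not pyOpenSSL's X509Name, and VALID_FIELDS is a hard-coded stand-in for the validity check that pyOpenSSL delegates to OpenSSL.

```python
# Hypothetical pure-Python analog of X509Name's attribute hooks.
# VALID_FIELDS stands in for the lookup that pyOpenSSL delegates to OpenSSL.
VALID_FIELDS = {"commonName", "countryName", "organizationName"}


class ToyName:
    def __init__(self):
        # Bypass our own __setattr__ while creating the backing store.
        object.__setattr__(self, "_fields", {})

    def __setattr__(self, name, value):
        # Like tp_setattro, the hook receives the name as an object
        # and decides whether it is a valid field before storing anything.
        if name not in VALID_FIELDS:
            raise AttributeError("No such attribute: %s" % (name,))
        self._fields[name] = value

    def __getattr__(self, name):
        # Only called when normal lookup fails, so _fields is found normally.
        if name not in VALID_FIELDS:
            raise AttributeError("No such attribute: %s" % (name,))
        return self._fields.get(name)


name = ToyName()
name.commonName = "example.com"
print(name.commonName)            # example.com

try:
    name.bogus = "value"
except AttributeError as exc:
    rejected = str(exc)
print(rejected)                   # No such attribute: bogus
```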

X509Name.__setattr__ was missing some cleanup code for the AttributeError case

While making the switch to tp_setattro, I noticed a bug in pyOpenSSL where it failed to flush the OpenSSL error queue properly, causing a spurious OpenSSL.SSL.Error to be raised whenever an attempt was made to set an invalid attribute on an X509Name instance. This was easy to fix by adding a call to the function which flushes the error queue.
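
Here is a toy illustration of why a stale entry on an error queue matters. It is a sketch under loud assumptions: error_queue, ToyError, set_field and later_call are all made up, and I'm assuming the spurious error surfaces when some later operation finds a leftover entry. It does not reproduce pyOpenSSL's actual code paths.

```python
# Toy model only: error_queue stands in for OpenSSL's error queue,
# and every name below is hypothetical.
error_queue = []


class ToyError(Exception):
    """Stand-in for OpenSSL.SSL.Error."""


def lookup_field(name):
    # A failed lookup leaves an entry on the queue, like a failing OpenSSL call.
    if name != "commonName":
        error_queue.append("unknown field: " + name)
        return False
    return True


def set_field(fields, name, value, flush):
    if not lookup_field(name):
        if flush:
            del error_queue[:]   # the cleanup that was missing
        raise AttributeError(name)
    fields[name] = value


def later_call():
    # Assumption: some later operation treats any queued entry as its own failure.
    if error_queue:
        raise ToyError(list(error_queue))
    return "ok"


def try_invalid(flush):
    del error_queue[:]           # start each run with a clean queue
    try:
        set_field({}, "bogus", "x", flush)
    except AttributeError:
        pass
    try:
        return later_call()
    except ToyError:
        return "spurious error"


print(try_invalid(flush=False))  # spurious error
print(try_invalid(flush=True))   # ok
```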

PyPy doesn't yet support all of the PyArg_ParseTuple format specifiers

Finally, I had to work on PyPy itself a little bit to implement the s* and t# format specifiers for PyArg_ParseTuple. PyArg_ParseTuple is how C extension functions unpack the arguments passed to them. A call to this function looks something like PyArg_ParseTuple(args, "s*|i:send", &pbuf, &flags). The string specifies how many and what type of arguments are expected, and the values are unpacked from args into the rest of the arguments passed in. PyPy did not yet support a couple of argument types which pyOpenSSL relies on, so I added this support. This code is still in a branch of PyPy, but I hope it will be merged into the default branch soon.
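
If the format string looks cryptic, here is a loose Python-level reading of "s*|i:send". This is a sketch, not pyOpenSSL's send and not a faithful emulation of PyArg_ParseTuple (for instance, the real s* also accepts text strings, which memoryview does not); the send function below is purely illustrative.

```python
def send(buf, flags=0):
    """Loose analog of PyArg_ParseTuple(args, "s*|i:send", &pbuf, &flags)."""
    # "s*": accept an object exposing the buffer interface (bytes, memoryview, ...).
    view = memoryview(buf)
    # "|": everything after it is optional; "i": a C int.
    if not isinstance(flags, int):
        # ":send": the name after the colon is used in error messages.
        raise TypeError("send() flags must be an integer")
    return view.nbytes, flags


print(send(b"abc"))                     # (3, 0)
print(send(memoryview(b"hello"), 1))    # (5, 1)
```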

Remaining work

There is one thing left to do before pyOpenSSL will be 100% supported on PyPy. Though I said I implemented s* for PyArg_ParseTuple, I actually only implemented part of it. My code will handle the case where a str is passed in, but not the case where a memoryview is supplied instead. Handling memoryview involves a bit more work and a bit more understanding than I currently have of how PyPy's CPython bridge works. Fortunately there are many useful things that can be done with pyOpenSSL on PyPy even without this feature (when was the last time you constructed a memoryview? :), so I'm still very happy with where things currently stand.
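
In case you really haven't constructed a memoryview lately, this short sketch shows what one is: a window onto another object's bytes that can be sliced without copying. That is why someone might hand one to a function like send instead of a str. The variable names are only for illustration.

```python
data = bytearray(b"hello world")
view = memoryview(data)[6:]   # a window onto data; no bytes are copied

print(bytes(view))            # b'world'

data[6] = ord("W")            # change the underlying buffer in place
print(bytes(view))            # b'World'
```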

The code

As of this posting, the PyPy code needed to make this work is in the pyarg-parsebuffer-new branch and the pyOpenSSL code is in the run-on-pypy branch. I'll be psyched if the PyPy branch can be merged in time for PyPy 1.5 so that the next pyOpenSSL release can work with the next PyPy release - we'll see!

4 comments:

  1. Heh, when William Reade was implementing Ironclad he solved the whole "reimplementing PyArg_ParseTuple is a pain in the ass" problem by just reusing the CPython implementation verbatim instead of reimplementing...

  2. Hi Michael!

    Yea, that seems to be a popular strategy. PyPy does it too. It still ends up being some work, in a couple ways: first there aren't really any unit tests for PyArg_ParseTuple, since it's such a deep assumption of the runtime, everything exercises it constantly, but not on PyPy; second, the hard part of supporting memoryview is actually implementing the memoryview APIs for PyPy, so that the existing getargs.c code can call them. :)

  3. It would appear that it doesn't work with PyPy's nightly builds at this point in time.

    All one needs to do to reproduce the problem is (please forgive the messed up formatting, not familiar with Blogger's comments):

    #!/usr/bin/env python

    from werkzeug.testapp import test_app
    from werkzeug.serving import run_simple

    if __name__ == '__main__':
        run_simple('localhost', 8000, test_app, ssl_context='adhoc')

    Note that this tiny test script *does* work with vanilla Python 2.7.1, and PyOpenSSL. (:

  4. Hmm. I looked into the werkzeug example a little bit. It seems the SSL handshake is failing on PyPy. It's not clear why though. I tried taking werkzeug's SSL setup code and using it in Twisted, where it worked on PyPy. So there's something about how werkzeug is using pyOpenSSL that I don't understand that seems to be causing the problem.
