Wednesday, March 3, 2010

Include version information in your Python packages

Quite some time ago, I wrote a list of what to do and what not to do when creating a Python package. In the spirit of that post, I have another do item:

Do include version information somewhere in your package. When I say package here, I'm referring to the Python concept - the thing which can be imported and manipulated programmatically. Here is a random assortment of real-life examples of what I mean:

>>> import sys
>>> sys.version
'2.6.4 (r264:75706, Dec  7 2009, 18:45:15) \n[GCC 4.4.1]'
>>> import OpenSSL
>>> OpenSSL.__version__
'0.10'
>>> import gmpy
>>> gmpy.version()
'1.04'
>>> import gtk
>>> gtk.gtk_version
(2, 18, 3)
>>> gtk.pygtk_version
(2, 16, 0)
>>> import pyasn1
>>> pyasn1.majorVersionId
'1'
>>> import win32api
>>> win32api.GetFileVersionInfo(win32api.__file__, chr(92))['FileVersionLS'] >> 16
212

You can see several conventions represented here. Some of them are better than others (in case you need a hint, gtk is doing something pretty nice; win32api is towards the other end of the spectrum). However, even the worst of these is providing the desired information. Compare this to:

>>> import zope.interface
>>> [x for x in dir(zope.interface) if 'version' in x]
[]
>>> import subunit
>>> [x for x in dir(subunit) if 'version' in x]
[]

This is a simple piece of information, and after you've actually picked a version of something and installed it, it's hardly ever of much interest. However, when it is of interest, you really want to be able to get it. You don't want to rely on the memory of some user about which version they installed when you're tracking down some poor interaction.

So. Do you maintain a Python package? Does it expose its version somehow? If not, fix it!

14 comments:

  1. This could be a nice thing to add to modern-package-template: http://bitbucket.org/srid/modern-package-template/issue/5/add-__version__-to-package-module

  2. If you have setuptools or Distribute installed, you can also do this:

    >>> import pkg_resources
    >>> pkg_resources.get_distribution('somepkg').version
    '0.4.4dev'

    This is slightly less direct than getting the version from the package itself. On the other hand, it provides a uniform way to get at the version.

  3. Aha! I was pretty sure there was some way to do this with pkg_resources, but my five minutes of investigation weren't sufficient to discover the correct API.

    This does help a lot, but I stand by everything in my post, since it's still not necessarily the case that pkg_resources will be available (particularly on end-user machines and the like). Perhaps someday in the future though this will change.
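
For readers on newer interpreters: Python 3.8 and later include importlib.metadata in the standard library, which answers the same question without requiring setuptools. A rough sketch of a best-effort lookup that asks the package first and falls back to installer metadata might look like this. The find_version() helper is hypothetical, and it assumes the distribution name matches the import name, which is not always the case.

```python
import importlib

try:
    # In the standard library since Python 3.8.
    from importlib import metadata
except ImportError:
    metadata = None


def find_version(name):
    """Best-effort version lookup for an importable package called `name`."""
    # 1. Ask the package itself, if it can be imported.
    try:
        module = importlib.import_module(name)
    except ImportError:
        module = None
    for attr in ("__version__", "version"):
        value = getattr(module, attr, None)
        if callable(value):
            # Some packages expose a function, as gmpy.version() does.
            value = value()
        if isinstance(value, (str, tuple)):
            return value

    # 2. Fall back to whatever the installer recorded.
    if metadata is not None:
        try:
            return metadata.version(name)
        except metadata.PackageNotFoundError:
            pass
    return None
```

It returns None when neither source knows the version, which is exactly the situation the post is complaining about.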

  4. All installers record the version information somewhere when a library is installed. Well, Distutils used to toss this information out, but now records it in an .egg-info file. Setuptools expanded upon this to create a .egg-info directory so that a richer set of metadata could be stored. Then Tarek created PEP 376 to standardize and unify the metadata format, so that different installation tools could understand each other's package installs. It provides an API for reading version information:

    >>> from pkgutil import get_distribution
    >>> dist = get_distribution('docutils')
    >>> dist.name
    'docutils'
    >>> dist.metadata.version
    '0.5'

    Which I guess now will become part of 'distutils2'.

    The tool doing the downloading and installing of a package knows the version information before the library is ever imported. Versioning is metadata - it happens *external* to the library. Which also makes it easy to consistently generate lists of libraries and their versions for any given working set of libraries. And tools make it trivial to re-install that same set elsewhere (pip, buildout).

  5. Would you mind including a motivation for your blanket edicts?

  6. There are a lot of packages using __version__ - I've always seen it as the 'standard' way of specifying the version of a Python package / module.

  7. > All installers record the version information somewhere when a library is installed. Well, Distutils used to toss this information out, but now records it in an .egg-info file.

    So, all installers record it, except those that don't? :)

    As I mentioned in my reply to the previous anonymous comment, it will be a while before everyone is using these new installers, so there's still value in exposing this information directly on the package, rather than relying on the installation tools to be able to provide the information.

    That said, the installers can only provide the information given to them by the authors. So even once they're widespread, it will still be important for authors to provide useful version information. For example, subunit "0.0.3" doesn't even include a setup.py.

  8. It also leads to circular references:

    * In order to know its own version, the package uses 'pkg_resources.get_distribution("foo")' to get a reference to itself.

    * This requires that the package already be built via Setuptools before this querying can be done.

    * In order to build with Setuptools, the package's 'setup.py' program needs to be run.

    * In order to know various metadata stuff about the package, the 'setup.py' program does 'import foo'.

    * In order to 'import foo', the package is executed, including the query of its own version.

    * But at this point, the package isn't yet built, so the 'pkg_resources.get_distribution("foo")' will fail.

  9. > Do you maintain a Python package? Does it expose its version somehow? If not, fix it!

    I'd like to fix it for my packages, yes. But there's no one obvious way to do it. I have the following criteria:

    * Don't Repeat Yourself. The version information should have one canonical location in the source code, and be dynamically retrieved from there as needed.

    * Simple maintenance of version string. Ideally the version string should be stored in a plain text file, called 'version' at the top of the VCS working tree for the project, that has the sole purpose of being the canonical current version string of the package.

    * Ability to get other metadata from importing the package. Fields like copyright years, short/long description, and author info are all useful to expose via the package; so I want 'setup.py' to do 'import foo' to get them from there. That, of course, means that the package needs to be importable during the build process.

    * Ability to get the version information while the package code is running. This is to provide the exported version attribute as you describe above.

    I haven't yet come up with a good solution for all this. My existing attempts have been convoluted and flawed. I'd love to see some examples of good practice that meet all the above criteria.

  10. As far as I can tell, no one knows how to do this in a way that satisfies all your criteria. I asked on distutils-sig about a year ago and received no satisfactory answers.

    But don't let better be the enemy of good. In Twisted, we have a _version.py that just defines the version, and import it into setup.py. That basically works, except for obscure (sorry zooko) setuptools uses. Doing something with a plain text file also seems somewhat straightforward and workable, if mildly unpleasant.

    I suggest trying one of those. When you run into something you don't like, complain about it to someone maintaining the software responsible for the issue. :)
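
A minimal sketch of the _version.py idea, for a setup.py that wants the version without executing the package. Note that this variant parses the file with a regular expression instead of importing it, which is a swapped-in technique and not necessarily what Twisted does. The read_version() helper and the foo/_version.py path are hypothetical.

```python
import re


def read_version(path):
    """Extract the version string from a file such as foo/_version.py.

    The file is read as text and never imported, so the package itself
    does not need to be importable (or built) when setup.py runs.
    """
    with open(path) as f:
        source = f.read()
    # Look for a line of the form: __version__ = "1.2.3"
    match = re.search(
        r'^__version__\s*=\s*["\']([^"\']+)["\']', source, re.MULTILINE
    )
    if match is None:
        raise RuntimeError("no __version__ assignment found in %s" % path)
    return match.group(1)
```

A setup.py could then pass read_version("foo/_version.py") as its version argument, while foo/__init__.py does "from foo._version import __version__" to expose the same value at runtime. That keeps one canonical location and sidesteps the circular import described in the earlier comment.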

  11. Thanks, this helped me to answer my own question on StackOverflow about finding the version of pywin32.

    And I appreciate and agree with your point. I'll try to do the same for my own packages.

  12. Getting version and other details about 'zope.interface' is possible:

    >>> import pkg_resources
    >>> dist = pkg_resources.get_distribution('zope.interface')
    >>> dist.version
    '3.6.1'
    >>> dist.parsed_version
    ('00000003', '00000006', '00000001', '*final')
    >>> dist.egg_name()
    'zope.interface-3.6.1-py2.7'

    For other attributes:
    http://peak.telecommunity.com/DevCenter/PkgResources#distribution-attributes

  13. @baijum81: Thanks! Someone pointed that out before, but the comment got lost. It's nice to have that answer back up here. My very first comment (starting "Aha! I was pretty sure ...") above is a response to that original comment.
