Friday, December 21, 2007

Filesystem structure of a Python project

Do:
  • name the directory something related to your project. For example, if your project is named "Twisted", name the top-level directory for its source files Twisted. When you do releases, you should include a version number suffix: Twisted-2.5.
  • create a directory Twisted/bin and put your executables there, if you have any. Don't give them a .py extension, even if they are Python source files. Don't put any code in them except an import of and call to a main function defined somewhere else in your project. (Slight wrinkle: since on Windows the interpreter is selected by the file extension, your Windows users actually do want the .py extension. So, when you package for Windows, you may want to add it. Unfortunately there's no easy distutils trick that I know of to automate this process. Considering that on POSIX the .py extension is only a wart, whereas on Windows the lack is an actual bug, if your userbase includes Windows users, you may want to opt to just have the .py extension everywhere.)
  • if your project is expressible as a single Python source file, then put it into the directory and name it something related to your project. For example, Twisted/twisted.py. If you need multiple source files, create a package instead (Twisted/twisted/, with an empty Twisted/twisted/__init__.py) and place your source files in it. For example, Twisted/twisted/internet.py.
  • put your unit tests in a sub-package of your package (note - this means that the single Python source file option above was a trick - you always need at least one other file for your unit tests). For example, Twisted/twisted/test/. Of course, make it a package with Twisted/twisted/test/__init__.py. Place tests in files like Twisted/twisted/test/test_internet.py.
  • add Twisted/README and Twisted/setup.py to explain and install your software, respectively, if you're feeling nice.
Don't:
  • put your source in a directory called src or lib. This makes it hard to run without installing.
  • put your tests outside of your Python package. This makes it hard to run the tests against an installed version.
  • create a package that only has a __init__.py and then put all your code into __init__.py. Just make a module instead of a package; it's simpler.
  • try to come up with magical hacks to make Python able to import your module or package without having the user add the directory containing it to their import path (either via PYTHONPATH or some other mechanism). You will not correctly handle all cases and users will get angry at you when your software doesn't work in their environment.
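
The layout above can be sketched end to end. The following is a minimal illustration, not a definitive recipe: the project name LayoutDemo, the package layoutdemo, and the greeting module are all hypothetical, and the code targets a current Python 3 rather than the Python of this post's era. It writes the recommended tree into a temporary directory, then runs the package's tests straight from the source tree by explicitly adding the project directory to the import path.

```python
import importlib
import io
import os
import sys
import tempfile
import textwrap
import unittest

# Hypothetical project "LayoutDemo" containing the package "layoutdemo".
# Every name below is illustrative; none comes from the post.
LAYOUT = {
    "README": "LayoutDemo: a placeholder project.\n",
    "bin/layoutdemo": textwrap.dedent("""\
        #!/usr/bin/env python
        # Nothing here but an import of, and a call to, main.
        from layoutdemo.greeting import main
        main()
        """),
    "layoutdemo/__init__.py": "",
    "layoutdemo/greeting.py": textwrap.dedent("""\
        def greet(name):
            return "Hello, %s!" % (name,)

        def main():
            print(greet("world"))
        """),
    "layoutdemo/test/__init__.py": "",
    "layoutdemo/test/test_greeting.py": textwrap.dedent("""\
        import unittest
        from layoutdemo.greeting import greet

        class GreetTests(unittest.TestCase):
            def test_greet(self):
                self.assertEqual(greet("world"), "Hello, world!")
        """),
}


def write_layout(root):
    """Create the files listed in LAYOUT beneath the directory root."""
    for relative_path, content in LAYOUT.items():
        path = os.path.join(root, *relative_path.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as f:
            f.write(content)


def run_tests(project_dir):
    """Run the package's tests directly from the source tree."""
    # The import path is extended explicitly, with no import hacks.
    sys.path.insert(0, project_dir)
    importlib.invalidate_caches()
    try:
        suite = unittest.defaultTestLoader.loadTestsFromName(
            "layoutdemo.test.test_greeting")
        runner = unittest.TextTestRunner(stream=io.StringIO())
        return runner.run(suite)
    finally:
        sys.path.remove(project_dir)


project_dir = os.path.join(tempfile.mkdtemp(), "LayoutDemo")
write_layout(project_dir)
result = run_tests(project_dir)
print(result.testsRun, result.wasSuccessful())
```

Because the tests live inside the package, the same dotted name (layoutdemo.test.test_greeting) would also work against an installed copy, with no src or lib directory in the way.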

Monday, December 3, 2007

Incompatibilities between classic and new-style Python classes

radix asked me if I had a blog post about why changing a class from classic to new-style is a bad idea. After I told him that I didn't, he insisted that I should write one. Since doing so will take less work than finding something else to distract his attention, here it is:






  • Attribute lookup


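    Attributes on instances of classic classes always override attributes of the same name on their class. On a new-style class, a descriptor that defines __set__ takes precedence over the instance's __dict__ instead. For example: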


    >>> class SimpleDescriptor(object):
    ...     def __get__(self, instance, type):
    ...         print 'getting'
    ...         return instance.__dict__['simple']
    ...     def __set__(self, instance, value):
    ...         print 'setting'
    ...         instance.__dict__['simple'] = value
    ...
    >>> class classic:
    ...     simple = SimpleDescriptor()
    ...
    >>> x = classic()
    >>> x.simple = 10
    >>> x.simple
    10
    >>> class newstyle(object):
    ...     simple = SimpleDescriptor()
    ...
    >>> x = newstyle()
    >>> x.simple
    getting
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
      File "<stdin>", line 4, in __get__
    KeyError: 'simple'
    >>> x.simple = 10
    setting
    >>> x.simple
    getting
    10

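    As you can see, the descriptor on the new-style class both handles the attribute assignment and keeps operating after a value of the same name has been stored in the instance's __dict__. On the classic class, the assignment bypasses the descriptor entirely and the instance attribute shadows it.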
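
The new-style half of this can be checked on a current Python 3, where every class is new-style and the classic half can no longer be reproduced. The sketch below is only an illustration under that assumption. It records calls in a list instead of printing, and the names calls, GetOnlyDescriptor and Shadowable are hypothetical additions: a descriptor without __set__ is the nearest Python 3 analogue of the classic behaviour, because the instance attribute shadows it.

```python
calls = []  # records which descriptor hooks ran, in order


class SimpleDescriptor(object):
    """Data descriptor: defines both __get__ and __set__."""

    def __get__(self, instance, owner):
        calls.append('get')
        return instance.__dict__['simple']

    def __set__(self, instance, value):
        calls.append('set')
        instance.__dict__['simple'] = value


class GetOnlyDescriptor(object):
    """Non-data descriptor: defines __get__ only (hypothetical helper)."""

    def __get__(self, instance, owner):
        calls.append('get-only')
        return 'from descriptor'


class NewStyle(object):
    simple = SimpleDescriptor()


class Shadowable(object):
    simple = GetOnlyDescriptor()


x = NewStyle()
x.simple = 10      # routed through __set__, which stores 10 in x.__dict__
value = x.simple   # still routed through __get__, despite x.__dict__['simple']

y = Shadowable()
y.simple = 10      # no __set__, so 10 goes straight into y.__dict__
shadowed = y.simple  # the instance attribute wins; __get__ is not called

print(calls, value, shadowed)
```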




  • Special methods


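    A particularly interesting consequence of the previous point is that the behavior of special methods changes in some cases, because new-style classes look special methods up on the class and ignore same-named attributes set on the instance: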


    >>> class x:
    ...     def __init__(self):
    ...         self.__eq__ = lambda other: True
    ...
    >>> x() == 10
    True
    >>> class x(object):
    ...     def __init__(self):
    ...         self.__eq__ = lambda other: True
    ...
    >>> x() == 10
    False
    >>>
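
A minimal sketch of the new-style side, assuming a current Python 3 (where only new-style classes exist). The class names InstanceEq and ClassEq are hypothetical. It shows that the == operator ignores an __eq__ stored on the instance, that an explicit attribute lookup still finds it, and that defining the method on the class is what the operator honors.

```python
class InstanceEq(object):
    def __init__(self):
        # Stored in the instance's __dict__; the == operator never sees it.
        self.__eq__ = lambda other: True


class ClassEq(object):
    # Defined on the class, which is where == looks for it.
    def __eq__(self, other):
        return True


via_operator = InstanceEq() == 10        # falls back to identity comparison
via_attribute = InstanceEq().__eq__(10)  # ordinary lookup finds the lambda
via_class = ClassEq() == 10              # uses ClassEq.__eq__

print(via_operator, via_attribute, via_class)
```

Code that relied on per-instance special methods needs them moved onto the class (or onto a per-instance subclass) before the class can safely become new-style.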




  • MRO


    The rules for computing the method resolution order (MRO) reject some lists of base classes once a base becomes new-style, even though the same list is accepted while that base is classic. Consider:


    >>> class x: pass
    ...
    >>> class y(object, x): pass
    ...
    >>> class x(object): pass
    ...
    >>> class y(object, x): pass
    ...
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    TypeError: Error when calling the metaclass bases
        Cannot create a consistent method resolution
    order (MRO) for bases object, x
    >>>
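
The failing case still fails on a current Python 3, so it can be sketched there. The example below is an illustration under that assumption, with hypothetical class names X, Bad and Good. The base list (object, X) asks for object to come before X, while X's own MRO requires X before object. Listing the more specific base first resolves the conflict.

```python
class X(object):
    pass


# object is listed before one of its own subclasses, so no consistent
# linearization exists and class creation raises TypeError.
try:
    class Bad(object, X):
        pass
except TypeError as e:
    mro_error = str(e)
else:
    mro_error = None

# Putting the more specific base first gives a consistent order.
class Good(X, object):
    pass


print(mro_error)
print([cls.__name__ for cls in Good.__mro__])
```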







Hopefully that's enough to satisfy radix. Know of other incompatibilities? Please comment.