Wednesday, October 12, 2005

Divmod Xapwrap Released

Xapian is a fairly popular full-text indexing system ("Probabilistic Information Retrieval library"). It's got Python bindings, but they're not so fun to use. Xapwrap is a layer on top of these bindings which tries to simplify matters:

exarkun@boson:~/xapwrap-demo$ cat > a-m
animal
bongo
car
delicate
effigy
fantastic
gorilla
humble
internet
jump
kaleidoscope
laughter
massive
exarkun@boson:~/xapwrap-demo$ cat > n-z
noisy
octothorp
pie
quartz
restful
sate
turtle
umbrage
vorpal
winter
xylophone
yak
zoo
exarkun@boson:~/xapwrap-demo$ python
Python 2.4.2 (#2, Sep 30 2005, 21:19:01)
[GCC 4.0.2 20050808 (prerelease) (Ubuntu 4.0.1-4ubuntu8)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from xapwrap import index, document
>>> idx = index.Index(name='demo.db', create=True)
>>> idCounter = 1
>>> docIDs = {}
>>> for docName in 'a-m', 'n-z':
...     docIDs[idCounter] = docName
...     text = document.TextField(file(docName).read())
...     idx.index(document.Document(textFields=[text], uid=idCounter))
...     idCounter += 1
...
>>> res = idx.search('zoo')
>>> print res
[{'score': 100, 'uid': 2}]
>>> print docIDs[res[0]['uid']]
n-z
>>> idx.search('animal')
[{'score': 100, 'uid': 1}]
>>>
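
Since a search comes back as a list of hits, each a dict with 'score' and 'uid' keys, turning hits back into file names is a small step that can be sketched without Xapwrap at all. This is only an illustration: the helper name `names_for_hits` is made up here and is not part of Xapwrap, and the hard-coded `hits` list stands in for what `idx.search('zoo')` returned in the session above.

```python
# Hypothetical helper, not part of Xapwrap: map search hits to document names.
def names_for_hits(hits, doc_ids):
    """Return (name, score) pairs for the given hits, highest score first."""
    ranked = sorted(hits, key=lambda hit: hit['score'], reverse=True)
    return [(doc_ids[hit['uid']], hit['score']) for hit in ranked]


# The uid -> file name mapping built up in the session above.
doc_ids = {1: 'a-m', 2: 'n-z'}

# Stand-in for the list returned by idx.search('zoo') in the transcript.
hits = [{'score': 100, 'uid': 2}]

print(names_for_hits(hits, doc_ids))
```

Keeping the uid-to-name mapping in a plain dict, as the session does, means the index only ever has to store integer ids.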

Get the Xapwrap 0.3.0 release.

1 comment:

  1. Yay, new bindings!

    I've integrated Xapian into the latest Roundup dev release (0.9) but the latest Xapian Python bindings are just plain broken. Will be interesting to see whether your bindings can be plugged in :)
