Inspired by http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/408859
While some people are busy worrying about how to make Python’s builtin sockets less efficient, one might be wondering if the reverse is possible - how do you make them more efficient? After all, you usually want your program to run more quickly, or tax your CPU less heavily, or consume fewer resources, not the reverse. Fortunately, I have just the solution for you.
Solution the first: readinto
    exarkun@boson:~$ python
    Python 2.4.1 (#2, Mar 30 2005, 21:51:10)
    [GCC 3.3.5 (Debian 1:3.3.5-8ubuntu2)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import socket, array, os
    >>> s = socket.socket()
    >>> s.bind(('', 4321))
    >>> s.listen(3)
    >>> c, a = s.accept()
    >>> buf = array.array('c', '\0' * 50)
    >>> os.fdopen(c.fileno()).readinto(buf)  # see note 2
    50
    >>> buf.tostring()
    'apiodjwoaidjaowidjalskdjlaksdjlaksjdawd\r\naiopjwdoa'
    >>> c.recv(10)
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    socket.error: (9, 'Bad file descriptor')
    >>>
As you can see, the handy readinto method of file objects can be used to provide a pre-allocated memory space for a read to use. Unfortunately, it is a file method, not a socket method (also, its documentation recommends strongly against its use, though I can’t imagine why!). We can get around this, though, since a file descriptor is just a file descriptor. os.fdopen will happily give us a file object wrapped around the socket we’re really interested in. Then it’s a simple matter of calling readinto on the resulting file object with an array we have previously allocated.
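For comparison, here is a minimal sketch of the same read-into-a-preallocated-buffer idea on a newer interpreter. Python 2.5 and later give socket objects a recv_into method, so no file object is needed. The sketch assumes a modern Python 3, and uses socket.socketpair() purely as a stand-in for the telnet session; the names left, right and payload are made up for illustration.

```python
import socket

# A connected pair of sockets stands in for the telnet session used in the post.
left, right = socket.socketpair()

# Allocate the receive buffer once, up front.
buf = bytearray(50)

left.sendall(b"hello\r\n")

# recv_into writes into the existing buffer and returns the number of bytes read.
received = right.recv_into(buf)

# A memoryview slice exposes the received bytes without copying them.
payload = memoryview(buf)[:received]
print(received, bytes(payload))

left.close()
right.close()
```

Unlike the os.fdopen route, nothing here takes ownership of the descriptor, so the socket stays usable until it is closed explicitly.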
“Great!” you say. “Why even bother with the other two examples?” you wonder. Well, there are a few problems. Even if we accept the os.fdopen hack, and even if we do not let the strong words in the file.readinto docstring dissuade us, there’s still a tiny problem. file.readinto closes the file descriptor before returning! Damn, there goes our socket. Maybe the next solution will fare better.
Solution the second: recv(2)
Okay, that stuff with file.readinto was just silly. Let’s get serious here. libc already provides the functionality we need here, and has for decades. This is basic BSD sockets 101. Stevens would cry (if he were still with us) if he saw us doing anything else. So let’s cut the funny business and just do what a C programmer would do: call recv.
    exarkun@boson:~$ python
    Python 2.4.1 (#2, Mar 30 2005, 21:51:10)
    [GCC 3.3.5 (Debian 1:3.3.5-8ubuntu2)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import dl
    >>> libc = dl.open('libc.so.6')
    >>> import socket, array
    >>> s = socket.socket()
    >>> s.bind(('', 4321))
    >>> s.listen(3)
    >>> c, a = s.accept()
    >>> buf = array.array('c', '\0' * 50)
    >>> libc.call('recv', c.fileno(), buf.buffer_info()[0], 50, 0)
    29
    >>> buf.tostring()
    'aldjiawoidjaskdjlacnwmoqawd\r\n\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
    >>>
    >>> libc.call('recv', c.fileno(), buf.buffer_info()[0], 50, 0)
    30
    >>> buf.tostring()
    'ncbnczmnxbcmznxcbzmnxbcu7wyw\r\n\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
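An aside for anyone following along on a newer interpreter: the dl module was removed in Python 3, but ctypes can make the same call. What follows is only a sketch under assumptions: a POSIX system where ctypes.CDLL(None) exposes libc's symbols, a socket.socketpair() standing in for telnet, and a helper named raw_recv that is invented here for illustration.

```python
import ctypes
import socket

# Assumption: a POSIX system where CDLL(None) exposes the C library's symbols.
libc = ctypes.CDLL(None, use_errno=True)
# ssize_t recv(int sockfd, void *buf, size_t len, int flags);
libc.recv.argtypes = [ctypes.c_int, ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int]
libc.recv.restype = ctypes.c_ssize_t

left, right = socket.socketpair()
buf = ctypes.create_string_buffer(50)  # 50 zero bytes, allocated once


def raw_recv(sock, buffer):
    """Call recv(2) directly, filling `buffer` in place (hypothetical helper)."""
    n = libc.recv(sock.fileno(), ctypes.addressof(buffer), len(buffer), 0)
    if n < 0:
        raise OSError(ctypes.get_errno(), "recv failed")
    return n


left.sendall(b"first\r\n")
n1 = raw_recv(right, buf)
first = buf.raw[:n1]  # copying out is only for display; buf itself is reused

left.sendall(b"second!\r\n")
n2 = raw_recv(right, buf)  # same buffer, previous contents overwritten
second = buf.raw[:n2]

left.close()
right.close()
```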
Sweet. We open libc so we can call recv in it, create a socket as usual, and another array object to act as our pre-allocated memory location. Note we use the buffer_info method this time, because recv() does not expect a “read-write buffer object” (like file.readinto did), but a pointer to a location in memory, which is exactly what buffer_info()[0] gives us. Then we just call recv. Easy as eatin’ pancakes. We can even do it twice, demonstrating that recv isn’t doing anything ridiculous, like closing the socket for us (I did it with the same array object, overwriting the previous contents, demonstrating that our no-allocation trick is working just fine).
I know what you’re thinking, though. array objects? What the hell can you do with an array object? Well, here’s what. All kinds of stuff! Why, you can build one from a string. Or build a string from one. Or, uh, swap the byte order… umm, oh yea you can reverse them too. Cool deal, eh? Err, no, maybe not actually… None of those cool string methods are around, unfortunately. You can create a string from the array but that kind of defeats the purpose… in doing so you’ve just allocated a pile of memory. Nuts. Well, wait, don’t give up yet, we may be able to improve upon this situation…
Solution the ultimate: recv(2) (uh yea, again).
The only problem we really have with recv isn’t actually with recv: it’s with array! Let’s not throw the baby out with the bathwater, then. Solution: drop array, keep recv. We want a string. Well, let’s use a string.
    exarkun@boson:~$ python
    Python 2.4.1 (#2, Mar 30 2005, 21:51:10)
    [GCC 3.3.5 (Debian 1:3.3.5-8ubuntu2)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import socket, dl
    >>> libc = dl.open('libc.so.6')
    >>> s = socket.socket()
    >>> s.bind(('', 4321))
    >>> s.listen(3)
    >>> c, a = s.accept()
    >>> buf = '\0' * 50
    >>> libc.call('recv', c.fileno(), id(buf) + 20, 50, 0)
    36
    >>> buf
    'aodijaacnwuihaiuwdhkasjnbkawuhdawd\r\n\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
    >>>
It’s the perfect solution. No wasted memory allocation, but the same level of convenience as a normal call to socket.recv. Rarely are we lucky enough to find such elegant and flawless solutions in computer science. The astute reader might object to the magical 20 in the recv call as being inelegant or flawed; however, the value can easily be computed at runtime. The code to do so is extremely simple and only omitted because it is slightly too large to fit in the margin.
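For readers with wider margins, here is a minimal sketch of how such an offset might be found on a current CPython 3, with a bytes object playing the part of the string. It leans entirely on CPython implementation details (id() returning the object's address, and the data living at the end of the object header), so treat it as an assumption-laden illustration rather than anything portable. DATA_OFFSET and peek are names made up here, and the sketch only reads the memory back instead of letting recv scribble into it.

```python
import ctypes

# Assumption: CPython, where id(obj) is the object's memory address and a
# bytes object stores its data in a trailing field after the header.
# bytes.__basicsize__ counts the header plus one byte for the terminating NUL,
# so subtracting 1 yields the offset of the first data byte.
DATA_OFFSET = bytes.__basicsize__ - 1


def peek(data):
    """Read a bytes object's contents back through its raw address."""
    return ctypes.string_at(id(data) + DATA_OFFSET, len(data))


buf = b"\0" * 50
message = b"Happy networking."

# If the offset is right, the raw read matches the object itself.
print(DATA_OFFSET, peek(message) == message)
```

The value differs between builds (pointer width, interpreter version), which is rather the point of computing it instead of hard-coding 20.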
So there you have it. Happy networking.
1 Sorry, it’s way too late to post something useful. Especially when I could post something fun instead.
2 Note: in each example where socket IO occurs, I have launched telnet in another terminal and typed in some random bytes.