Strings are the data structure of last resort 2009-08-20

34. The string is a stark data structure and everywhere it is passed there is much duplication of process. It is a perfect vehicle for hiding information. - Alan Perlis



There is a mistake which is often made when developing objects meant to present an abstract interface for operating on some concrete set of data. The mistake is to develop the object’s constructor such that it accepts a string representation of the data instead of a structured representation. This is particularly common when the string representation is frequently read or written by humans. One of the most common examples (there are many others) is classes representing URLs/URIs/IRIs. There are many URL classes which accept a string of the form http://www.livejournal.com/update.bml as the primary (or only) argument to their constructor. This is probably due to the assumption (generally implicit, unconsidered, and almost always invalid) that the class will only ever be instantiated from string input.



Stop doing this!



The signature of the constructor strongly influences how data will move through the program. By creating a constructor which accepts a string, you are encouraging users of your class to move their data around as a string. You are also forcing them to take any structured data they might already have and degrade it into a string before using your class. This is a recipe for inefficiency, insecurity, duplication, and quite a few other problems.
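To make the problem concrete, here is a small sketch in Python. StringOnlyURL is a made-up class for illustration, not any real library's API; it leans on the standard library's urlsplit to do its parsing. The caller already holds a scheme, a host, and a list of path segments, and has to flatten them into a string just to get an object back.

```python
from urllib.parse import urlsplit


class StringOnlyURL:
    """Hypothetical URL class whose only constructor takes a string."""

    def __init__(self, text):
        parts = urlsplit(text)
        self.scheme = parts.scheme
        self.host = parts.hostname
        # Empty segments are dropped to keep the sketch short.
        self.path_segments = [s for s in parts.path.split("/") if s]
        self.query = parts.query


# The caller already has structured data...
scheme = "http"
host = "www.livejournal.com"
segments = ["users", "a/b?c"]  # one segment happens to contain '/' and '?'

# ...but must degrade it into a string to use the class.
naive = StringOnlyURL(scheme + "://" + host + "/" + "/".join(segments))

# The structure did not survive the round trip.
print(naive.path_segments)  # three segments instead of the original two
print(naive.query)          # part of a path segment has become the query
```

The caller could escape each segment before joining, but then every caller has to know the escaping rules, which is exactly the duplication of process Perlis is complaining about.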



To be explicit, and continuing with the URL example, your constructors should accept each piece of structured data which they would have been parsing out of the string; the correct signature for a URL class might go something like this: URL(scheme, username, password, host, pathSegments, querySegments).
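As a rough sketch of that signature in Python, assuming a frozen dataclass is an acceptable way to spell it: the field names follow the signature above, while the types (tuples of strings for the path, key/value pairs for the query) are my own guesses at a reasonable representation, not a requirement.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class URL:
    """Structured URL; the constructor never sees a flattened string."""

    scheme: str
    username: Optional[str]
    password: Optional[str]
    host: str
    # Assumed representations: one string per path segment,
    # one (key, value) pair per query segment.
    path_segments: Tuple[str, ...] = ()
    query_segments: Tuple[Tuple[str, str], ...] = ()


# Callers hand over the pieces they already have, with nothing to escape.
url = URL("http", None, None, "www.livejournal.com", ("update.bml",), ())
print(url.host, url.path_segments)
```

Nothing here has to guess where one piece ends and the next begins, because the pieces never get glued together in the first place.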



If you’re worried about how much harder this makes it for developers to use your class, don’t be. It only makes it easier. You can still provide APIs for parsing strings into objects. In a language like Python, you can provide a class method (e.g. URL.fromString("http://www.livejournal.com/update.bml")). In a language with signature-based method overloading (such as Java), you can even make both URL("http://www.livejournal.com/update.bml") and URL("http", null, null, "www.livejournal.com", new String[] {"update.bml"}, null) work (although you might want to avoid that for other reasons).
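Here is one way the Python class method might look. It is only a sketch: from_string is the Python-style spelling of the fromString mentioned above, the parsing is delegated to the standard library's urlsplit, unquote, and parse_qsl, and details such as dropping empty path segments are simplifications I chose for brevity.

```python
from urllib.parse import parse_qsl, unquote, urlsplit


class URL:
    """Structured constructor plus a separate, explicit string parser."""

    def __init__(self, scheme, username, password, host,
                 path_segments, query_segments):
        self.scheme = scheme
        self.username = username
        self.password = password
        self.host = host
        self.path_segments = list(path_segments)
        self.query_segments = list(query_segments)

    @classmethod
    def from_string(cls, text):
        """Parse a URL string once, at the edge, into structured fields."""
        parts = urlsplit(text)
        # Simplification: empty segments (e.g. a trailing slash) are dropped.
        segments = [unquote(s) for s in parts.path.split("/") if s]
        query = parse_qsl(parts.query, keep_blank_values=True)
        return cls(parts.scheme, parts.username, parts.password,
                   parts.hostname, segments, query)


parsed = URL.from_string("http://www.livejournal.com/update.bml")
direct = URL("http", None, None, "www.livejournal.com", ["update.bml"], [])
print(parsed.path_segments == direct.path_segments)
```

String input is still supported, but it is one entry point among several rather than the only door into the class.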