Friday, January 25, 2008

Expressiveness of languages

Jeff Atwood over at Coding Horror asks in his new post "What Can You Build in 600 Lines of Code?". It reminded me of Babel-17, a novel by Samuel R. Delany. I must have read that novel 10 to 15 years ago and by now I have forgotten most of the details in it (and I'm now again grateful to Wikipedia for having reminded me of some the details.) However, among several things that I do remember (some strong images, transformation of one of the characters) there is one particular passage that stuck with me even now after all these years and readily sprang to my mind when I read Jeff's post. Here it goes (heroin Rydra talking to character named Butcher about the expressiveness of languages):
"... Now: there is a huge solar-energy conversion plant that supplies all the electrical energy for the Court. The heat amplifying and reducing components take up an area a little bigger than Tarik. One (,'iribian can slither through that plant and then go describe it to another ^iribian who never saw it before so that the-second can build an exact duplicate, even to the color the walls are painted—and this actually happened, because they thought we'd done something ingenious with one of the circuits and wanted to try it themselves—where each piece is located, how big it is, in short completely describe the whole business, in nine words. Nine very small words, too,"

The Butcher shook his head. "No. A solar-heat conversion system is too complicated. These hands dismantle one, not too long ago. Too big. Not—"

"Yep, Butcher, nine words. In English it would take a couple of books full of schematics and electrical and architectural specifications. They have the proper nine words- We don't."

"Impossible."
I always remember that passage by the crucial part: "They have the proper nine words- We don't." I have always found "nine words" to be really exaggerated but then it's a really compelling idea that stuck with me all these years. It goes to the core of expressiveness of languages in their proper context and is an example of domain specific languages: for ^iribians from the novel "their whole culture is based on heat and changes in temperature" so they have nine very short words to describe a whole solar-energy conversion plant.

Going back to Jeff's example of a production application in less than 600 lines: Ruby leveraged by Ruby On Rails is really expressive for certain types of problems. It's a web application framework after all and it's really expressive for problems of web applications. And it is this narrowing of context that allows it to be so expressive. It is built on a whole stack of software/hardware to get to that point in which less than 600 lines of code can become a full blown application:
1. Ruby On Rails is written in Ruby.
2. Ruby's interpreter is written in C and relies on C run-time library.
3. C run-time library is compiler- and OS- specific and is written in C itself, OS-specific calls and probably some assembly.
4. C compiler needed to build Ruby interpreter and C run-time library was probably written in C itself and some assembly.
5. OS was also most probably written in C and some assembly.
6. Assembler is needed to translate assembly into machine code.
7. Generated machine specific code (either from C compiler or assembler) is executed on CPU's microcode.
8. CPU's microcode was designed in software tools which are themselves written in some computer language running on an OS and on some previously designed hardware. And so on.

Not to mention that you also have a whole set of protocols (HTTP, TCP/IP), software, drivers, network cards, switches, networks and so on that allow you to transfer data to and from Ta-Da (the application in question.) At the top of all this stack is a very specific context of web applications in which Ruby On Rail excels.

Now consider the following passage from the same novel (same conversation, immediately prior to the passage I quoted above):
Take the ^iribians, who have enough knowledge to sail their triple-yoked poached eggs from star to star: they have no word for 'house', 'home', or 'dwelling'. 'We must protect our families and our homes.' When we were preparing the treaty between the (^iribians and ourselves at the Court of Outer Worlds, i remember that sentence took forty-five minutes to say in ^iribian. Their whole culture is based on heat and changes in temperature. We're just lucky that they do know what a 'family' is, because they're the only ones besides humans who have them. But for'house'you have to end up describing'. . .an enclosure that creates a temperature discrepancy with the outside environment of so many degrees, capable of keeping comfortable a creature with a uniform body temperature of ninety-eight-point-six, the same enclosure being able to lower the temperature during the months of the warm season and rise it during the cold season, providing a location where organic sustenance can be refrigerated in order to be preserved, or warmed well above the boiling point of water to pamper the taste mechanism of the indigenous habitant who, through customs that go back through millions of hot and cold seasons, have habitually sought out this temperature changing device . . .' and so forth and so on. At the end you have given them some idea of what a 'home' is and why it is worth protecting. Give them a schematic of the air-conditioning and central heating system and things begin to get through.
So whereas a ^iribians language can express "solar-energy conversion plant" in "nine very small words" it took "forty-five minutes" to say "We must protect our families and our homes." in it. (again I find these examples exaggerated... but I haven't yet had to communicate with an extraterrestrial culture so for all I know this could really turn out to be close to truth.) So even though you can write a whole application in Ruby On Rails in under 600 lines of code, it only works in the context for which Ruby On Rails was designed. Change the context and it would probably become incredibly inefficient and/or impossible in expressing the solution. How would you write an OS driver in Ruby On Rails? I would probably build a web C compiler and linker, then write the driver in C and compile and link it with Ruby On Rails!

Disclaimer: this is most definitely not a critique of Ruby On Rails! We use different languages in different contexts and today I wouldn't try to write a web application in C. I'm just discussing the expressiveness of languages in their contexts and outside of them.

Update: fixed some minor typos and layout errors.

Wednesday, January 23, 2008

CAM world *intersecting* SQL Server world

I never thought that my current area of work (in general auditing and recovery of SQL Server databases) would somehow *intersect* (pun intended - see below) with my previous efforts in 2D CAM (Computer Aided Manufacturing.) But SQL Server 2008: Spatial indexes article by Paul Randal proved me wrong. In the article Paul explains how SQL Server 2008 spatial indexes work but what was most interesting to me was that the solution he described matches a solution I applied in a 2.5D CAM application some 10 or more years ago. Don't get me wrong, that doesn't make me rocket scientist since and anyone who had to work with intensive 2D geometry calculations sooner or later figures out that calculating intersection of curves is expensive (sometimes more, sometimes less: depends on the curves) so avoiding it in advance by using "bounding boxes" is, in general case, much better than blindly trying to calculate every intersection. I remember that one great advantage in my case was that we were able to use comparison of integers instead of floating points which further sped up the algorithm. Also, in our case there was just one "bounding box" per curve which was more coarse but didn't use that much memory and was easily calculated.

The 2.5D CAM application that I refer to above was called "Impakt!" It was used to generate G-code programs for 2.5D CNC (Computer Numerical Control) mills. Half of D means that mill could only position itself in height without really orienting the tool in three dimensions. Thus the geometrical problems were reduced to creating optimal 2D cutting paths and then additionally repositioning the tool between different 2D depth levels.

We started development of Impakt! in Delphi 2.0 sometime in 1996. Impakt! worked like this:
1. You would load 2D geometry from a file in Autodesk's DXF format.
2. You would define depth of each contour by choosing each area and then setting the depth (this could then be saved in our own file format that kept depth and other information)
3. You would define your tool set for the job (we kept a DB of tools and users were able to define their own tools)
4. Application would then generate tool paths by going through the following process:
4.1 Take the tool with the largest diameter in the tool set and start with contours on the highest level of the defined geometry.
4.2 Analyze the contours and identify "lakes" (outward contours) and "islands" (inner contours)
4.3 Grow "islands" outward and shrink lakes inward by the radius of the current tool.
4.4 Solve intersections of the grown/shrank contours thus defining tool path. This really solves one particular path through Voronoi diagram of the 2D contours available on the given depth level (I can't remember why we didn't just calculate Voronoi diagrams and then generate paths... I think that there were cases we couldn't or didn't know how to handle with Voronoi diagrams and I don't think we even had math for it, at first anyway)
4.5 Continue with the process in 4.2 and 4.3 until all contours would fold on themselves.
4.6 Take next smaller tool in tool set, take left over contours from the last successful iteration of 4.5 and then start again at 4.3.
4.7 Once there are no more available tools in the tool set go to the next depth as defined by geometry and start again at 4.2.
4.8 Once you have generated geometry for all depth levels you are finished.
5. You would then define cutting parameters for the job (this only influenced G-code generation)
6. G-code (or some other type of CNC code depending on loaded modules) would be generated from tool paths and cutting parameters. Unlike geometry generation where paths for all tools were calculated at once for each depth level, G-code was generated by following the paths of the largest tool on all depth levels, then changing the tool to the next smaller one and following all its paths on all depth levels and so on until the smallest tool in the tool set.
7. If I remember correctly we had a separate tool for transferring G-code (through RS-232!) to CNC machines.

Heh, more and more stuff is starting to crawl out from the back of my brain. I just remembered that once a contour was identified as an "island" or a "lake" we would reorient it if necessary so that it would go in counter-clock-wise direction if it was an "island" or in clock-wise direction if it was a "lake" (or vice-versa, it doesn't really matter so long as they had different orientations.) This allowed us to grow/shrink the contours with the same parameters ignoring the orientation as they would then "naturally" either grow or shrink depending on the orientation we gave it.

I hope that someday we will make open code source for "Impakt!" I would probably be rather embarrassed with the quality of the code but maybe somebody could find some use for it.