The Shoreline'14 April, 2014 | Page 56

Software Design Lessons from the Command Line
by Shashi Gowda

“The material object of observation, the bicycle [or any tool], can’t be right or wrong. Molecules are molecules. They don’t have any ethical codes to follow except those people give them. The test of the machine is the satisfaction it gives you. There isn’t any other test. If the machine produces tranquility it’s right. If it disturbs you it’s wrong until either the machine or your mind is changed. The test of the machine is always your own mind. There isn’t any other test.”
- Zen and the Art of Motorcycle Maintenance, Robert M. Pirsig

Programming, in essence, is the act of instructing a computer to do things. It is what all users of computers do, even if they don’t want to call it programming. If all goes well, you and the computer are both successful, and thus attain peace of mind. User interfaces can be thought of as programming languages, because they give you a medium to instruct the computer. This idea is dear to Unix users.

Much “user friendly” software today comes with a ton of features exposed through graphical user interfaces, arranged in a myriad of menus and toolbars. And yet, these interfaces are rigid in most cases. Power users of such software often encounter repetitive tasks that involve performing the same steps over and over. Often, this involves using more than one program in a sequence. With no way to automate these steps, we are left to perform these routines manually, which clearly defeats the purpose of a computer. Most of these interfaces (or languages) lack composability.

I met the Unix shell 5 years ago[1]. I was looking for ways to make myself a better programmer: faster at building things, and building them better. I had been following the free software movement, to which Linux (a family of free Unix-like operating systems; also the name of the kernel they use) was central.
Great programmers in the community were all of the opinion that an efficient programmer must be an excellent user of the Unix shell[2]. And for good reason: this cannot be overstated. I decided to bite the bullet and try to use the shell for as much of my interaction with the computer as possible. What followed was a profound shift in the way I approached the computer as a user. I quickly learned to stop worrying and love the command line.

Talk of the Unix philosophy gets bounced around a lot. The authors of The Unix Programming Environment, Brian Kernighan and Rob Pike, put it thus:

“Although the [Unix] philosophy can’t be written down in a single sentence, at its heart is the idea that the power of a system comes more from the relationships among programs than from the programs themselves. Many UNIX programs do quite trivial things in isolation, but, combined with other programs, become general and useful tools.”

Doug McIlroy, who played a key role in the conception of Unix pipes, puts this more concretely:

“This is the Unix philosophy: Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because text is the universal interface.”

During my first Engineer in college, my friend Mohak and I took part in a competition called Contrive. We were asked to build a search engine that would search a bunch of HTML files for a given search query. We were given 24 hours. The search engine was to take the form of an executable file, taking the search query as its argument and printing at most 15 file names in descending order of their relevance to the query. The team with the most relevant results would win.

We worked our donkeys off. We were trying to use a Python library to index the files and perform the search. The library gave us all sorts of trouble, ranging from segmentation faults to highly irrelevant results (once we got it to actually work). No sleep was had that night.
In the last 30 minutes or so of the competition, we decided to give up. Not doing anything in those last several minutes made us uneasy, so we casually whipped out the old manual pages for some Unix programs. We submitted our engine at pretty much the last minute. It was 2 lines of Unix shell code. The first one was a comment and read #phew. The second was

    grep "$*" -icr . | sort -rt ':' -k 2 -n | head -15 | sed s/:.*$//

That single line of shell was a decent search engine, and it also formatted the output according to the problem specification.

grep is a program used to search inside text files. It works much like the Find command in Windows Explorer, except that it outputs plain text: file names and matches. The option -i makes the search case-insensitive, while -r recursively searches all the files in the directory. The -c option tells grep to output only a count for each file, namely the number of lines that match the search term, instead of the matches themselves. It's not as if we had memorized any of that; we just read the manual page (man grep[3]). So with this grep command, we had a match count for the search query against each file name. Something like

    file1.html: 2
    file2.html: 6
    file3.html: 0
    file4.html: 3
    …

Next, we piped this output to the sort program, i.e. the above output now became the input to sort; this is the semantics of the | syntax[4]. sort -rt ':' -k 2 -n translates to, “sort the input in reverse order (-r), using ':' as the field separator (-t ':'), and sort by the second field (-k 2), treating the values as numbers (-n)”. This sorted grep’s output, ordering it from the file with the most matches to those with the fewest, giving output that looks something like this:

    file12.html: 16
    file7.html: 13
    file6.html: 9
    file2.html: 6
    …

Piping this to head -15 kept only the first 15 lines of the output and discarded the rest. Finally, sed s/:.*$// removed everything from the ':' onward, leaving us with just the file names.
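The pipeline walked through above can be tried on a tiny made-up corpus. What follows is only a sketch, not the script we submitted: the three sample files and the function name search are hypothetical, the options are moved ahead of the pattern (which tends to be more portable than the original ordering), and it assumes a grep that supports -r, as GNU and BSD grep do.

```shell
#!/bin/sh
# Hypothetical sample corpus in a temporary directory (not the contest data).
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

printf 'unix pipes\nUnix tools\nthe unix shell\n' > "$workdir/a.html"
printf 'one unix line\nsomething else\n'          > "$workdir/b.html"
printf 'nothing relevant here\n'                  > "$workdir/c.html"

# search QUERY...: print at most 15 file names, most matching lines first.
search() {
    # grep -c prints "path:count", where count is the number of matching lines.
    grep -icr -- "$*" . |
        sort -t ':' -k 2 -n -r |   # order by the count field, largest first
        head -n 15 |               # keep at most 15 results
        sed 's/:.*$//'             # drop ":count", leaving only the path
}

# Run the search from inside the corpus directory.
(cd "$workdir" && search unix)
```

With this corpus the counts are 3, 1 and 0, so the script should print ./a.html, ./b.html and ./c.html, in that order. Note that a file with no matches at all still appears in the list, just as it would have with the original one-liner.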
We had reached our required output, a maximum of 15 file names:

    file12.html
    file7.html
    file6.html
    file2.html
    …

The programs grep, sort, head and sed do very simple things in isolation. But what is still a thing of beauty to me is that we could combine their functionality to obtain something that did what we wanted; something that these programs were not originally designed to do. It probably doesn’t say much about the competition, but we came in third! Not bad for a single line of code, I’d say!

Now, imagine doing what the shell script above does using Windows Explorer, Notepad, Excel and whatnot in place of these apparently arcane and archaic Unix programs. Instead of telling the computer to do these tasks, you’d have to do most of them yourself, albeit in a click-and-drag interface. Down that path is no tranquility, only distress.