SSPA Blog: One-Liners: Not always fun and games

By Rick Wicklin posted 09-27-2011 09:09

  

There are two kinds of one-liners: those that are funny, and those that are not. The funny one-liners are pithy zingers made famous by comedians such as Groucho Marx and Don Rickles. For example, “I might be agnostic, but I’m not sure.”

The not-funny one-liners occur when someone writes an abstruse sequence of nested function calls in order to brag that they can compute an answer with a single executable statement. These one-liners are difficult to read, difficult to debug, and difficult to modify. Also, they sometimes (but not always) sacrifice efficiency for the sake of compactness.

Readability is an important criterion for professional programmers. Obfuscation is the kiss of death to those who write, debug, maintain, and improve computer programs. Well-named temporary variables make it easier to understand an algorithm, even if they are not strictly necessary.  (For example, I might create temporary variables named Numerator and Denominator and write Ratio = Numerator / Denominator.)  Simple subexpressions are often helpful when you have to modify the code to support an additional parameter or to handle a special case.  
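To illustrate the point about temporary variables, here is a small Python sketch. The function names, the threshold parameter, and the rule for an empty input are all hypothetical; they are there only to show how named subexpressions leave room for a special case.

```python
def ratio_one_liner(values, threshold):
    # Compact, but the reader must work out what is divided by what.
    return sum(v for v in values if v > threshold) / sum(values)


def ratio_readable(values, threshold):
    # Named temporaries document the roles of the two sums.
    numerator = sum(v for v in values if v > threshold)
    denominator = sum(values)

    # Hypothetical special case: treat a zero total as a ratio of 0.
    # Adding this to the one-liner would mean computing sum(values) twice
    # or restructuring the whole expression.
    if denominator == 0:
        return 0.0

    ratio = numerator / denominator
    return ratio
```

Both functions agree on ordinary input, but only the second one has an obvious place to add the zero-total check.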

One-liners can be inefficient. Suppose that x is a large array and that f1, f2, f3, and f4 are functions that return arrays of the same size. Then the expression y = f1(f2(f3(f4(x)))) has two weaknesses. First, each nested call allocates and returns a new large array, so the expression creates several temporary copies. Second, if you discover that y is incorrect, it is tedious to debug the expression as written because the intermediate results are never named. In languages that permit passing parameters by reference, both weaknesses can be addressed by having each function modify a single array in place, at the cost of writing four separate statements instead of one nested call. Not compact, but it’s efficient.
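To make the f1 through f4 example concrete, here is a minimal Python sketch. The arithmetic inside each function is invented for illustration, and the "_in_place" variants are hypothetical names. Python passes references to objects, so a function that assigns to the elements of a list changes the caller's list, which is close enough to pass-by-reference for this purpose.

```python
# Copying versions: each call builds and returns a brand-new list.
def f4(xs):
    return [v + 1 for v in xs]

def f3(xs):
    return [v * 2 for v in xs]

def f2(xs):
    return [v * v for v in xs]

def f1(xs):
    return [-v for v in xs]


# In-place versions: each call overwrites the list it is given.
def f4_in_place(xs):
    for i, v in enumerate(xs):
        xs[i] = v + 1

def f3_in_place(xs):
    for i, v in enumerate(xs):
        xs[i] = v * 2

def f2_in_place(xs):
    for i, v in enumerate(xs):
        xs[i] = v * v

def f1_in_place(xs):
    for i, v in enumerate(xs):
        xs[i] = -v


def nested(x):
    # One statement, four freshly allocated lists.
    return f1(f2(f3(f4(x))))


def stepwise(x):
    y = list(x)       # the only copy, made so the caller's x is preserved
    f4_in_place(y)    # y can be printed or inspected after any step
    f3_in_place(y)
    f2_in_place(y)
    f1_in_place(y)
    return y
```

The stepwise version is longer, but it allocates one list instead of four, and a wrong answer can be traced by looking at y between statements.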

Of course, one-liners can be fun. Some people enjoy one-liners so much that there are contests and Web sites dedicated to obfuscated code, including the International Obfuscated C Code Contest and a Web page that seeks the shortest algorithm for the Fibonacci sequence in one line. (The Obfuscated C Code Contest is no longer held. The joke in programming circles is that the contest was no longer needed after Perl became popular.)
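For a taste of what a one-line Fibonacci entry might look like, here is a Python sketch of my own (it is not taken from any contest), next to a plainer version. Both names are hypothetical.

```python
from functools import reduce

# One statement: returns the first n Fibonacci numbers, but only for n >= 2.
# Each step builds a new list with s + [...], so the copying grows with n.
fib_one_liner = lambda n: reduce(lambda s, _: s + [s[-1] + s[-2]], range(n - 2), [0, 1])


def fib_readable(n):
    """Return the first n Fibonacci numbers, starting from 0 and 1."""
    sequence = []
    current, following = 0, 1
    for _ in range(n):
        sequence.append(current)
        current, following = following, current + following
    return sequence
```

The two agree for n of 2 or more. The one-liner quietly gives the wrong answer for n = 1, which is the kind of edge case that is easy to miss in dense code.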

These contests are harmless recreations, although the Fibonacci contest seems strange to me because it invites comparisons between different languages.  Low-level languages such as C are good at certain tasks, whereas high-level languages (such as R, SAS, Mathematica) are more convenient for other tasks.  Mathematica has a built-in Fibonacci function, but does it follow that Mathematica should win the shortest algorithm contest?  The core of Mathematica is written in C/C++, and that “one” statement in Mathematica executes many lines of C code. Convenience is important, but so is efficiency.

So, next time someone brags, “I can do that in a one-liner,” think about your goals as a programmer. Analyze the one-liner in terms of clarity and efficiency. One-liners can be fun, but sometimes the compactness is not worth the cost.

Comments

09-29-2011 14:07

From what I've seen of R and Mathematica, the object of a package or function is to bundle the steps needed to do something somewhat complicated, thus reducing work on the programming end - does that not mean an increase in efficiency?
I think of efficiency as saving programming time or saving programming resources. I guess I don't necessarily think of it in terms of computing resources (memory, CPU cycles, etc.).
Perhaps this is because my datasets are relatively small, so if a SAS macro or R package quadruples the computing time, say from 1 CPU second to 4 CPU seconds, that doesn't practically matter to me. I'd say the same even for 10 seconds to 40 seconds.

09-27-2011 14:58

Just found out (from http://www.johndcook.com/blog/2011/09/27/sed-one-liners/) about two e-books by Peteris Krumins: "Awk One-Liners Explained" and "Sed One-Liners Explained." For example, see http://www.catonmat.net/blog/sed-book/