Solving a computational problem in advanced statistics, I recently experienced an epiphany that led me from Alan Turing to Noam Chomsky.
Alan Turing (1912 – 1954) is considered the father of modern computer science. Around 1936 he proposed a theoretical computing machine that could solve general classes of computing problems. The first such machines were purely conceptual. They consisted of a data storage medium from which data could be read sequentially, and a multi-state engine that would predictably change state depending on the data read. During his war time work as a code breaker in Bletchley Park, Turing gradually turned these theoretical ideas into actual data processing machines. Initially, the logic of Turing machines was hard-wired. But both Turing and John von Neumann extended the concept to stored logic computing machines, where the computing logic itself was considered data: a program.
For the past 70 years, computer science has largely followed the Turing paradigm – a conceptual engine that changes its state depending on data entered. This state transition is typically represented in programming languages by “if…then…” switches. A variety of programming techniques were developed to tame the ever sprouting vegetation of “if…then…” constructs, with various degrees of success.
Another scientific father figure – Noam Chomsky, a theoretical linguist – developed his model of generative grammar in the 1950s. He probably wasn’t thinking of computers at all, when he came up with the concept of “transformational syntax”. Where Turing was concerned with computation, Chomsky’s focus was on language. I will define language here to suit my purpose: “a linear sequence of symbols, drawn repeatedly from a finite symbol set to consistently express and communicate a potentially unlimited number of meanings”.
A colleague had asked me to help him solve a statistical problem, which his fancy toolbox of statistical software could not handle. The problem was clearly defined, his need was acute and I designed a program to meet his specific need all in the conventional Turingian paradigm. He liked the program and wanted to apply it to a set of related problems. As sets are prone to do, this one too kept expanding and expanding. I added more and more branches into my state change logic. It becam almost impossible to keep track of the logical landscape. I was ready to toss in the proverbial towel.
That was, when it hit me: the description of his psychological experiments formed a language. Even though the descriptive terms of individual facets would span the alphabet, each facet could be classified according to a tree with 10 final branches. Suddenly, the problem had become almost trivial. The spaghetti salad of “if…then…” constructs changed into a four step algorithm: (i) the original problem formulation is encoded according to facet classes; (ii) the encoded problem formulation is transformed according to well defined syntactic rules; (iii) the results of the transformations form a finite set of problem signatures; (iv) for a given problem signature, a corresponding formula calculates the desired results. Now, my colleague’s desires for expanding the problem set even further rarely meets with more than a yawn.
Software designers might find it useful, to look for an implied language in a given problem formulation and employ syntactic transformation to arrive at an economical solution..