Sanskrit, Self-referentiality and Computational Linguistics
Christopher Aaron Handy, Religion
Sanskrit is a language that developed from Proto-Indo-European. Although once widely spoken on the Indian subcontinent, Sanskrit is now used primarily as a ritual language by various Indian religions. The most notable formalization of Sanskrit occurred around the fifth century BCE, when the grammarian Pānini created his famous Ashtādhyāyī (eight chapters), which aims to describe systematically the set of all possible Sanskrit utterances.
The Ashtādhyāyī is deceptively tiny, as its sūtra format allows a great deal of information to be packed into a small space. A finite set of rules, coupled with a self-referential property, allows the work to be used as a sort of “grammar computer.” By directing its reader to different places in the text, the Ashtādhyāyī generates a desired target string through appropriate transformations applied to axiomatic substrings. Theoretically, this method is capable of generating any legitimate Sanskrit phrase, and its resemblance to modern theories of generative grammar has caused Sanskrit to attract the attention of linguists, mathematicians and computer programmers in addition to Indologists. In fact, the Ashtādhyāyī has been claimed to be equivalent in computing power to a Turing machine.
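The idea of reaching a target string by transforming substrings can be sketched as a toy string-rewriting system. The sketch below is only an illustration under stated assumptions: the rule table, the "+" boundary marker, and the function name derive are hypothetical, and the three vowel rules are loosely modeled on familiar Sanskrit vowel sandhi (a + a → ā, a + i → e, a + u → o) rather than taken from the Ashtādhyāyī's own rule ordering or metalanguage.

```python
from typing import List, Tuple

# Hypothetical toy rule table: (pattern, replacement).
# "+" is an assumed marker for a boundary between two elements.
# The order of the list is the order in which rules are tried.
RULES: List[Tuple[str, str]] = [
    ("a+a", "ā"),  # similar vowels merge into a long vowel
    ("a+i", "e"),
    ("a+u", "o"),
    ("+", ""),     # fallback: simply erase any remaining boundary
]


def derive(start: str, rules: List[Tuple[str, str]] = RULES) -> List[str]:
    """Rewrite `start` one rule application at a time.

    At each step the first rule whose pattern occurs in the current
    string is applied to its leftmost match. The function returns every
    intermediate string, so the whole derivation can be inspected.
    Each rule removes one "+", so the loop always terminates.
    """
    steps = [start]
    current = start
    while True:
        for pattern, replacement in rules:
            if pattern in current:
                current = current.replace(pattern, replacement, 1)
                steps.append(current)
                break
        else:
            # No rule applies any more: the derivation is complete.
            return steps


if __name__ == "__main__":
    for query in ("deva+indra", "na+asti", "sūrya+udaya"):
        print(" -> ".join(derive(query)))
```

Even in this reduced form, the outcome depends on the order in which rules are tried and on which match is rewritten first, which hints at why a purely mechanical application of a much larger rule set is difficult.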
Beginning with an explanation of the sūtra form, I will demonstrate how the Ashtādhyāyī generates phrases from simple input queries, comparing this process to generative grammars and computer programming in general. Using freely available software, I will then illustrate some of the problems encountered when digital computers attempt a purely mechanical application of the methods given in the Ashtādhyāyī.