CHAPTER 4
LANGUAGES FOR DATA ANALYSIS
Computing languages for data analysis and statistics must cover the entire spectrum from improvisation and fast prototyping to the implementation of streamlined, specialized systems for routine analyses. Such languages must be not only interactive but also programmable, and the distinctions between language, operating system and user interface become blurred. The issues are discussed in the context of natural and computer languages, and of the different types of user interfaces (menu, command language, batch). It is argued that while such languages must have a completely general computing language kernel, they will contain surprisingly few items specific to data analysis; the latter items more properly belong to the ‘literature’ (i.e. the programs) written in the language.1
4.1 GOALS AND PURPOSES
Over the past 30 years or so, data analysis has emerged as the single most demanding application of interactive computing. The main challenge is that it covers a wide spectrum ranging from research to repetitive routine. Standard tasks should be offered in canned form, in particular to novice users. Non-standard ones should be easy to improvise, either by putting them together from standard building blocks, or by modifying suitable templates, in the style pioneered by the PostScript Cookbook (1985). It should be easy to streamline such improvised tasks and to add them to the system, either in interpreted (‘soft’) or in compiled (‘hard’) form (extensibility). ...
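To make the notion of extensibility concrete, the following is a minimal sketch in Python, not a description of any actual system. It assumes a hypothetical command table (`COMMANDS`) holding a few canned building blocks, an improvised task (`trimmed_mean`) put together from them, and a hypothetical `extend` function that adds the improvised task to the system in interpreted (‘soft’) form. All names are illustrative assumptions.

```python
from statistics import mean, median
from typing import Callable, Dict, Sequence

# Hypothetical command table: the canned building blocks offered to the user.
# The names and the table itself are illustrative assumptions.
COMMANDS: Dict[str, Callable[..., object]] = {
    "mean": mean,
    "median": median,
    "sort": sorted,
}


def run(name: str, data: Sequence[float]) -> object:
    """Look up a command by name and apply it to the data."""
    return COMMANDS[name](data)


def trimmed_mean(data: Sequence[float], k: int = 1) -> float:
    """Improvised task assembled from the standard blocks 'sort' and 'mean':
    drop the k smallest and k largest values, then average the rest."""
    ordered = run("sort", data)
    return run("mean", ordered[k:len(ordered) - k])


def extend(name: str, func: Callable[..., object]) -> None:
    """Add an improvised task to the system under a new command name."""
    if name in COMMANDS:
        raise ValueError(f"command {name!r} already exists")
    COMMANDS[name] = func


# Once registered, the improvised task is invoked like any canned command.
extend("trimmed_mean", trimmed_mean)
sample = [2.0, 3.0, 4.0, 5.0, 100.0]
result = run("trimmed_mean", sample)
print(result)
```

In this sketch the new command is indistinguishable, from the user's side, from the canned ones. Replacing `trimmed_mean` by a compiled routine behind the same name would correspond to the ‘hard’ form.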