Kernel Methods for Textual Information Access 1
4.1. Kernel methods: context and intuitions
It is striking to note that, in the communities working on statistical modeling and machine learning, the past 50 years have seen a pendulum swing between approaches favoring rigor and simplicity on the one hand, and more experimental approaches on the other, which aim to explore new territory and to overcome the limitations of an overly rigid framework, in particular that of linear models. Over the last 20 years, the field has moved from more or less heuristic methods, offering great flexibility but sometimes poorly controlled (notably in the structure and number of parameters of the underlying models), to much more “controlled”, statistically and mathematically principled methods, in which the flexibility and the generalization quality of the models are kept in check as far as possible. A flagrant example of a method that was very popular during the earlier period is the artificial neural network [PER 03], whose modeling power seemed almost unlimited and which, it must be admitted, gave rise to numerous abuses and excesses, whether through a lack of theoretical foundations or, more simply, through the modeler’s inability to introduce effective means of controlling the generalization capacity of such systems and of avoiding “overfitting”.
Motivated in part by these excesses and the disillusionment that followed, new theories (of which [VAP 98] is a leading reference) have emerged ...