July 2018
Beginner to intermediate
406 pages
9h 55m
English
The title of this section is a bit of jargon, which you will learn now. In the 1990s, first in the biomedical domain, and then on the web, problems started to appear where P was greater than N. What this meant was that the number of features, P, was greater than the number of examples, N (these letters were the conventional statistical shorthand for these concepts).
For example, if your input is a set of written documents, a simple way to approach it is to consider each possible word in the dictionary as a feature and regress on those (we will later work on one such problem ourselves). In the English language, you have over 20,000 words (this is if you perform some stemming and only consider common words; it is ...
Read now
Unlock full access