Kernel methods
Kernel methods exploit similarity between documents (for example in length, topic, or language) to extract patterns from them. Inner products between data items can reveal a lot of latent information; in fact, many standard algorithms can be expressed purely in terms of inner products between data items in a potentially complex feature space. Kernel methods suit high-dimensional data because their complexity depends only on the choice of kernel, not on the number of features of the data in use. Kernels address the computational issues by implicitly mapping the data into a richer, non-linear feature space and then applying a linear classifier to the transformed data, as shown ...
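To make the inner-product idea concrete, here is a minimal sketch in Python (chosen only for illustration). It assumes a homogeneous degree-2 polynomial kernel and two made-up term-count vectors; the function names `poly2_kernel` and `poly2_features` are hypothetical helpers, not part of any library. The sketch checks that evaluating the kernel on the original vectors gives the same value as an ordinary inner product in the expanded feature space, without ever building that space.

```python
def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def poly2_kernel(x, y):
    """Homogeneous degree-2 polynomial kernel: k(x, y) = (x . y)^2."""
    return dot(x, y) ** 2


def poly2_features(x):
    """Explicit feature map for the same kernel: every product x_i * x_j."""
    return [a * b for a in x for b in x]


# Toy term-count vectors for two documents (illustrative data only).
doc_a = [2, 0, 1]
doc_b = [1, 3, 1]

# Kernel evaluated directly on the 3-dimensional inputs.
k_implicit = poly2_kernel(doc_a, doc_b)

# Same quantity computed as an inner product in the 9-dimensional feature space.
k_explicit = dot(poly2_features(doc_a), poly2_features(doc_b))

print(k_implicit, k_explicit)
```

The kernel needs only one inner product over the original features, whereas the explicit map grows quadratically with the number of features. This is why a linear method written in terms of inner products can work in the richer space at little extra cost.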