Document retrieval and Support Vector Machine

Support Vector Machine (SVM) is a classification algorithm applicable to both linear and nonlinear data classification. It is based on an assumption: if two classes of data cannot be divided by a hyper-plane, then after mapping the source dataset to sufficient higher dimension spaces, the optimal separating hyper-plane must exist.

Here are two concepts that need to be clearly defined:

  • Linearly separable: This means that a dataset can be divided into the target classes with a linear equation with the input of a training tuple.
  • Nonlinearly separable: This means that none of the linear equations exist in the space with the same dimension as that of the training tuple.

The linear hyper-plane can be represented ...

Get R: Mining Spatial, Text, Web, and Social Media Data now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.