Chapter 2

Learnable Ranking Models for Automatic Text Summarization and Information Retrieval

2.1. Introduction

Until recently, the two main problems studied in Machine Learning (ML) were classification and regression. In both cases, the goal is to learn a prediction function that assigns to each entry a value in accordance with a desired output. The problem of ranking, in which the aim is to learn an ordering of observations, has lately gained much attention in ML. A ranking function considers several entries, compares them, and returns them as an ordered list. The predicted order must agree with a preferred ordering specific to the problem being treated. In the literature, there are two types of ranking problems:

– Ranking of alternatives, which concerns the problem where the elements of a given collection (named alternatives) must be ranked with respect to an entry example. A typical example is automatic text summarization (in this case, an entry example represents a document and the alternatives correspond to the sentences of the document). Another example is information retrieval, where the documents of a given collection must be ranked in response to a user’s query (the entry is a query, and the alternatives are the documents of the collection). For this type of problem, we assume that the entry samples are independently and identically distributed (i.i.d.) and the aim is to return, for each sample, a ranked list of alternatives where the relevant ...
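
The ranking of alternatives setting can be illustrated with a minimal Python sketch. It assumes a scoring function that assigns a real value to each (entry, alternative) pair and orders the alternatives by decreasing score. The names `rank_alternatives` and `overlap_score` are hypothetical, and the word-overlap score is only a stand-in for a learned scoring function; it is not the method developed in this chapter.

```python
from typing import Callable, List, Sequence


def rank_alternatives(
    entry: str,
    alternatives: Sequence[str],
    score: Callable[[str, str], float],
) -> List[str]:
    """Return the alternatives sorted by decreasing score with respect to the entry."""
    return sorted(alternatives, key=lambda alt: score(entry, alt), reverse=True)


def overlap_score(entry: str, alternative: str) -> float:
    """Hypothetical stand-in for a learned scorer: number of distinct shared words."""
    entry_words = set(entry.lower().split())
    alternative_words = set(alternative.lower().split())
    return float(len(entry_words & alternative_words))


# Information retrieval flavor: the entry is a query, the alternatives are documents.
query = "learning to rank"
documents = [
    "cooking with garlic",
    "learning to rank documents",
    "rank aggregation",
]
ranked = rank_alternatives(query, documents, overlap_score)
print(ranked)
```

For summarization, the same function would be called with a document as the entry and its sentences as the alternatives; only the scoring function changes.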
