Book description
This book presents statistical models that have recently been developed within several research communities to access information contained in text collections. The problems considered are linked to applications aiming at facilitating information access:
- information extraction and retrieval;
- text classification and clustering;
- opinion mining;
- comprehension aids (automatic summarization, machine translation, visualization).
To give the reader as complete a description as possible, the book focuses on the probabilistic models used in these applications, highlighting the relationship between models and applications and illustrating the behavior of each model on real collections.
Textual Information Access is organized around four themes: information retrieval and ranking models, classification and clustering (logistic regression, kernel methods, Markov fields, etc.), multilingualism and machine translation, and emerging applications such as information exploration.
Contents
Part 1: Information Retrieval
1. Probabilistic Models for Information Retrieval, Stéphane Clinchant and Eric Gaussier.
2. Learnable Ranking Models for Automatic Text Summarization and Information Retrieval, Massih-Réza Amini, David Buffoni, Patrick Gallinari, Tuong Vinh Truong and Nicolas Usunier.
Part 2: Classification and Clustering
3. Logistic Regression and Text Classification, Sujeevan Aseervatham, Eric Gaussier, Anestis Antoniadis, Michel Burlet and Yves Denneulin.
4. Kernel Methods for Textual Information Access, Jean-Michel Renders.
5. Topic-Based Generative Models for Text Information Access, Jean-Cédric Chappelier.
6. Conditional Random Fields for Information Extraction, Isabelle Tellier and Marc Tommasi.
Part 3: Multilingualism
7. Statistical Methods for Machine Translation, Alexandre Allauzen and François Yvon.
Part 4: Emerging Applications
8. Information Mining: Methods and Interfaces for Accessing Complex Information, Josiane Mothe, Kurt Englmeier and Fionn Murtagh.
9. Opinion Detection as a Topic Classification Problem, Juan-Manuel Torres-Moreno, Marc El-Bèze, Patrice Bellot and Frédéric Béchet.
Table of contents
- Cover
- Title Page
- Copyright
- Introduction
- Part 1: Information Retrieval
- Chapter 1: Probabilistic Models for Information Retrieval
- Chapter 2: Learnable Ranking Models for Automatic Text Summarization and Information Retrieval
- 2.1. Introduction
- 2.1.1. Ranking of instances
- 2.1.1.2. Classification of critical pairs
- 2.1.1.3. Application with a linear model
- 2.1.1.4. Ranking induced by the output of a classifier
- 2.1.1.5. Other criteria
- 2.1.1.6. Special cases: bipartite ranking
- 2.1.2. Ranking of alternatives
- 2.1.3. Relation to existing frameworks
- 2.2. Application to automatic text summarization
- 2.3. Application to information retrieval
- 2.4. Conclusion
- 2.5. Bibliography
- Part 2: Classification and Clustering
- Chapter 3: Logistic Regression and Text Classification
- Chapter 4: Kernel Methods for Textual Information Access
- 4.1. Kernel methods: context and intuitions
- 4.2. General principles of kernel methods
- 4.3. General problems with kernel choices (kernel engineering)
- 4.4. Kernel versions of standard algorithms: examples of solvers
- 4.5. Kernels for text entities
- 4.6. Summary
- 4.7. Bibliography
- Chapter 5: Topic-Based Generative Models for Text Information Access
- Chapter 6: Conditional Random Fields for Information Extraction
- Part 3: Multilingualism
- Chapter 7: Statistical Methods for Machine Translation
- 7.1. Introduction
- 7.2. Probabilistic machine translation: an overview
- 7.3. Phrase-based models
- 7.4. Modeling reorderings
- 7.5. Translation: a search problem
- 7.6. Evaluating machine translation
- 7.7. State-of-the-art and recent developments
- 7.8. Useful resources
- 7.9. Conclusion
- 7.10. Acknowledgments
- 7.11. Bibliography
- Part 4: Emerging Applications
- Chapter 8: Information Mining: Methods and Interfaces for Accessing Complex Information
- Chapter 9: Opinion Detection as a Topic Classification Problem
- 9.1. Introduction
- 9.2. The TREC and TAC evaluation campaigns
- 9.3. Cosine weights - a second glance
- 9.4. Which components for opinion vectors?
- 9.5. Experiments
- 9.6. Extracting opinions from speech: automatic analysis of phone polls
- 9.7. Conclusion
- 9.8. Bibliography
- Appendix A: Probabilistic Models: An Introduction
- A.1. Introduction
- A.2. Supervised categorization
- A.3. Unsupervised learning: the multinomial mixture model
- A.4. Markov models: statistical models for sequences
- A.5. Hidden Markov models
- A.6. Conclusion
- A.7. A primer of probability theory
- A.8. Bibliography
- List of Authors
- Index
Product information
- Title: Textual Information Access: Statistical Models
- Author(s):
- Release date: May 2012
- Publisher(s): Wiley
- ISBN: 9781848213227