Automatic Text Simplification

Book description


Thanks to the availability of texts on the Web in recent years, increased knowledge and information have been made available to broader audiences. However, the way in which a text is written (its vocabulary, its syntax) can make it difficult to read and understand for many people, especially those with poor literacy, cognitive or linguistic impairment, or limited knowledge of the language of the text. Texts containing uncommon words or long and complicated sentences can be difficult for people to read and understand, as well as difficult for machines to analyze. Automatic text simplification is the process of transforming a text into another text which, while ideally conveying the same message, will be easier to read and understand by a broader audience. The process usually involves the replacement of difficult or unknown phrases with simpler equivalents and the transformation of long and syntactically complex sentences into shorter and less complex ones. Automatic text simplification, a research topic that started 20 years ago, has now taken on a central role in natural language processing research, not only because of the interesting challenges it poses but also because of its social implications. This book presents past and current research in text simplification, exploring key issues including automatic readability assessment, lexical simplification, and syntactic simplification. It also provides a detailed account of machine learning techniques currently used in simplification, describes full systems designed for specific languages and target audiences, and presents available resources for research and development together with text simplification evaluation techniques.

Table of contents

  1. Cover
  2. Copyright
  3. Title Page
  4. Abstract
  5. Dedication
  6. Contents
  7. Acknowledgments
  8. 1 Introduction
    1. 1.1 Text Simplification Tasks
    2. 1.2 How are Texts Simplified?
    3. 1.3 The Need for Text Simplification
    4. 1.4 Easy-to-read Material on the Web
    5. 1.5 Structure of the Book
  9. 2 Readability and Text Simplification
    1. 2.1 Introduction
    2. 2.2 Readability Formulas
    3. 2.3 Advanced Natural Language Processing for Readability Assessment
      1. 2.3.1 Language Models
      2. 2.3.2 Readability as Classification
      3. 2.3.3 Discourse, Semantics, and Cohesion in Assessing Readability
    4. 2.4 Readability on the Web
    5. 2.5 Are Classic Readability Formulas Correlated?
    6. 2.6 Sentence-level Readability Assessment
    7. 2.7 Readability and Autism
    8. 2.8 Conclusion
    9. 2.9 Further Reading
  10. 3 Lexical Simplification
    1. 3.1 A First Approach
    2. 3.2 Lexical Simplification in LexSiS
    3. 3.3 Assessing Word Difficulty
    4. 3.4 Using Comparable Corpora
      1. 3.4.1 Using Simple English Wikipedia Edit History
      2. 3.4.2 Using Wikipedia and Simple Wikipedia
    5. 3.5 Language Modeling for Lexical Simplification
    6. 3.6 Lexical Simplification Challenge
    7. 3.7 Simplifying Numerical Expressions in Text
    8. 3.8 Conclusion
    9. 3.9 Further Reading
  11. 4 Syntactic Simplification
    1. 4.1 First Steps in Syntactic Simplification
    2. 4.2 Syntactic Simplification and Cohesion
    3. 4.3 Rule-based Syntactic Simplification using Syntactic Dependencies
    4. 4.4 Pattern Matching over Dependencies with JAPE
    5. 4.5 Simplifying Complex Sentences by Extracting Key Events
    6. 4.6 Conclusion
    7. 4.7 Further Reading
  12. 5 Learning to Simplify
    1. 5.1 Simplification as Translation
      1. 5.1.1 Learning Simple English
      2. 5.1.2 Facing Strong Simplifications
    2. 5.2 Learning Sentence Transformations
    3. 5.3 Optimizing Rule Application
    4. 5.4 Learning from a Semantic Representation
    5. 5.5 Conclusion
    6. 5.6 Further Reading
  13. 6 Full Text Simplification Systems
    1. 6.1 Text Simplification in PSET
    2. 6.2 Text Simplification in Simplext
      1. 6.2.1 Rule-based “Lexical” Simplification
      2. 6.2.2 Computational Grammars for Simplification
      3. 6.2.3 Evaluating Simplext
    3. 6.3 Text Simplification in PorSimples
      1. 6.3.1 An Authoring Tool with Simplification Capabilities
    4. 6.4 Conclusion
    5. 6.5 Further Reading
  14. 7 Applications of Automatic Text Simplification
    1. 7.1 Simplification for Specific Target Populations
      1. 7.1.1 Automatic Text Simplification for Reading Assistance
      2. 7.1.2 Simplification for Dyslexic Readers
      3. 7.1.3 Simplification-related Techniques for People with Autism Spectrum Disorder
      4. 7.1.4 Natural Language Generation for Poor Readers
    2. 7.2 Text Simplification as NLP Facilitator
      1. 7.2.1 Simplification for Parsing
      2. 7.2.2 Simplification for Information Extraction
      3. 7.2.3 Simplification in and for Text Summarization
      4. 7.2.4 Simplifying Medical Literature
      5. 7.2.5 Retrieving Facts from Simplified Sentences
      6. 7.2.6 Simplifying Patent Documents
    3. 7.3 Conclusion
    4. 7.4 Further Reading
  15. 8 Text Simplification Resources and Evaluation
    1. 8.1 Lexical Resources for Simplification Applications
    2. 8.2 Lexical Simplification Resources
    3. 8.3 Corpora
    4. 8.4 Non-English Text Simplification Datasets
    5. 8.5 Evaluation
    6. 8.6 Toward Automatically Measuring the Quality of Simplified Output
    7. 8.7 Conclusion
    8. 8.8 Further Reading
  16. 9 Conclusion
  17. Bibliography
  18. Author’s Biography

Product information

  • Title: Automatic Text Simplification
  • Author(s): Horacio Saggion, Graeme Hirst
  • Release date: April 2017
  • Publisher(s): Morgan & Claypool Publishers
  • ISBN: 9781681731865