Book description
Taming Text is a hands-on, example-driven guide to working with unstructured text in the context of real-world applications. This book explores how to automatically organize text using approaches such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization. The book guides you through examples illustrating each of these topics, as well as the foundations upon which they are built.
About the Technology
There is so much text in our lives, we are practically drowning in it. Fortunately, there are innovative tools and techniques for managing unstructured information that can throw the smart developer a much-needed lifeline. You'll find them in this book.
About the Book
Taming Text is a practical, example-driven guide to working with text in real applications. This book introduces you to useful techniques like full-text search, proper name recognition, clustering, tagging, information extraction, and summarization. You'll explore real use cases as you systematically absorb the foundations upon which they are built.
Written in a clear and concise style, this book avoids jargon, explaining the subject in terms you can understand without a background in statistics or natural language processing. Examples are in Java, but the concepts can be applied in any language.
What's Inside
- When to use text-taming techniques
- Important open-source libraries like Solr and Mahout
- How to build text-processing applications
About the Reader
About the Authors
Grant Ingersoll is an engineer, speaker, and trainer, a Lucene committer, and a cofounder of the Mahout machine-learning project. Thomas Morton is the primary developer of OpenNLP and Maximum Entropy. Drew Farris is a technology consultant, software developer, and contributor to Mahout, Lucene, and Solr.
Quotes
Takes the mystery out of very complex processes.
- From the Foreword by Liz Liddy, Dean, iSchool, Syracuse University
Text analysis and processing as it should be: clear, practical, and open source!
- David Weiss, Carrot Search s.c.
Shows how to unlock and exploit information locked up in text documents.
- Rick Wagner, Red Hat
Teaches text concepts with examples ... makes text search easy.
- Doug Warren, Java Web Services
A great overview of tools and techniques for text processing.
- Julien Nioche, DigitalPebble, Ltd.
Table of contents
- Copyright
- Brief Table of Contents
- Table of Contents
- Foreword
- Preface
- Acknowledgments
- About this Book
- About the Cover Illustration
- Chapter 1. Getting started taming text
- Chapter 2. Foundations of taming text
- Chapter 3. Searching
- Chapter 4. Fuzzy string matching
- Chapter 5. Identifying people, places, and things
- Chapter 6. Clustering text
- Chapter 7. Classification, categorization, and tagging
- Chapter 8. Building an example question answering system
- Chapter 9. Untamed text: exploring the next frontier
- Index
- List of Figures
- List of Tables
- List of Listings
Product information
- Title: Taming Text
- Author(s): Grant Ingersoll, Thomas Morton, Drew Farris
- Release date: December 2012
- Publisher(s): Manning Publications
- ISBN: 9781933988382