Natural Language Annotation for Machine Learning

by James Pustejovsky, Amber Stubbs
October 2012
Beginner to intermediate
342 pages
9h 55m
English
O'Reilly Media, Inc.

Chapter 12. Afterword: The Future of Annotation

In this book we have tried to give you a taste of what it’s like to go through the entire process of annotating data for training machine learning (ML) algorithms. The MATTER development cycle provides a tested and well-understood methodology for every step of that process, but it doesn’t tell you everything there is to know about annotation. In this chapter we look toward the future of annotation projects and ML algorithms, and show you some ways that the field of Natural Language Processing (NLP) is changing, as well as how those changes can help (or hurt) your own annotation and ML projects.

Crowdsourcing Annotation

As you have learned from working your way through the MATTER cycle, annotation is an expensive and time-consuming task. Therefore, you want to maximize the utility of your corpus to make the most of the time and energy you put into your task.

One way that people have tried to reduce the cost of large annotation projects is crowdsourcing. Making the task available to a large group of (usually untrained) people makes it both cheaper and faster to obtain annotated data, because the annotation is no longer done by a handful of selected annotators but by many people working in parallel.
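When many untrained people label the same item, their judgments have to be combined into a single label before the data can be used for training. The text here does not prescribe a method; the sketch below assumes simple majority voting, one common choice, and uses made-up item names and labels purely for illustration. The function name `majority_label` and the `crowd_labels` data are hypothetical.

```python
from collections import Counter


def majority_label(labels):
    """Return the most frequent label and the share of annotators who chose it.

    Assumption: plain majority voting. Ties are not handled specially;
    whichever tied label Counter reports first is returned.
    """
    counts = Counter(labels)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(labels)


# Hypothetical crowd judgments: item id -> labels from different annotators.
crowd_labels = {
    "review_1": ["positive", "positive", "negative", "positive", "positive"],
    "review_2": ["negative", "positive", "negative"],
}

# Collapse each item's crowd labels into one label plus an agreement score.
aggregated = {item: majority_label(labels) for item, labels in crowd_labels.items()}

for item, (label, agreement) in aggregated.items():
    print(f"{item}: {label} (agreement {agreement:.2f})")
```

The agreement share gives a rough signal of which items the crowd found hard: items with low agreement are candidates for review by a trained annotator.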

If the concept of crowdsourcing seems strange, think about asking your friends on Facebook to recommend a restaurant, or consider what happens when a famous person uses Twitter to ask her followers for ...

ISBN: 9781449332693