Natural Language Annotation for Machine Learning

by James Pustejovsky, Amber Stubbs
October 2012
Beginner to intermediate
342 pages
9h 55m
English
O'Reilly Media, Inc.

Chapter 9. Revising and Reporting

Finally, we’re at the “R” of the MATTER cycle: revising your project. Of course, you’ve probably been revising your project all along, as you worked your way through the MAMA cycle and refit your algorithms through the training, testing, and evaluation stages of machine learning. However, while making adjustments at each step, you may have focused only on the step at hand. In the first part of this chapter, we take a step back and examine some of the “big picture” items that you may want to reconsider about your project. To that end, we’ll discuss:

  • Corpus modification

  • Model and specs

  • Annotation task and annotators

  • Algorithm implementation

In the second part of the chapter, we discuss what information you should include about your task when you are writing papers, giving presentations, or just putting together a website so that people can learn about your project. Creating annotated corpora and leveraging those corpora into good machine learning (ML) algorithms are difficult tasks. Because so many variables affect the outcome of a project, the more open you are about the choices you made, the more other people will be able to learn from your example. Some of the aspects of your project that you should consider reporting on are:

  • Corpus size, content, and creation

  • Annotation methods and annotator qualifications

  • ML modifications and training adjustments

  • Revisions to your project, both implemented and planned

Revising Your ...

ISBN: 9781449332693