2. Crowdsourcing Annotation

Crowdsourcing represents a dramatic evolution in language resource development and in data production in general. Not only does the change in scale push annotation methodologies and tools to their limits, but it also modifies our relationship, as researchers, with citizens, who become our employees or our partners.

These characteristics make crowdsourcing an ideal testbed for manual annotation “engineering”.

2.1. What is crowdsourcing and why should we be interested in it?

2.1.1. A moving target

Defining the term “crowdsourcing” has become a research subject in its own right [EST 12]. As new applications appear, the definition evolves and shifts with the chosen focus. Rather than adding our own definition to the pile, we will comment on the most widely used ones and give examples, examining them in depth to deconstruct the myths.

The term “crowdsourcing” was coined by Jeff Howe of Wired Magazine as “the act of a company or institution taking a function once performed by employees and outsourcing it to an undefined (and generally large) network of people in the form of an open call” [HOW 06b]. The crowdsourced definition of crowdsourcing in Wikipedia is now that of the Merriam-Webster dictionary: “the practice of obtaining needed services, ideas, or content by soliciting contributions from a large group of people and especially from the online community rather than from traditional employees or suppliers”. But it used to be closer to the original definition:

DEFINITION 2.1 ...
