9.1. Ten Pounds of News in a Five-Pound Bag

The tagline in the corner of the paper version of the New York Times is "All the News That's Fit to Print." In fact, this was never true. The size of the paper is limited by many factors: press speeds, cost, and the limits of physical delivery. Editors have to pick and choose. This is true for all paper-and-ink publications. The Wall Street Journal index of companies mentioned rarely includes more than 300 names. But on the same day, Web sources will have news on thousands of firms. International and specialized news sources used to be costly and difficult to come by. Now they are as accessible as the local paper.

News is a time-honored source for investment information, and there is more of it than ever before, more than a person can handle. With the relentless march of technology to the beat of Moore's law, previously impractical, computationally intensive approaches to natural language can be used to parse, categorize, and understand the onslaught of news. Reporters help the process along by tagging story elements at their point of origin. They inject some valuable wetware into the mix of hardware and software involved in the modern production, dissemination, and consumption of news. There is a great deal of commercial effort in this area, applying language and Web technologies to gather, filter, and rank individual news items by type, sentiment, or intensity. Some of these services are available to try on the Web.
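To make the idea of filtering and ranking news by type and sentiment concrete, here is a toy sketch in Python. It is not how any commercial system works. The word lists, the Story class, the tag names, and the sample headlines are all hypothetical, invented for illustration, and counting words is only a crude stand-in for real natural language processing.

```python
from dataclasses import dataclass

# Hypothetical word lists, for illustration only; a real system would use
# far richer language models than a handful of keywords.
POSITIVE = {"beats", "upgrade", "record", "growth", "profit"}
NEGATIVE = {"misses", "downgrade", "lawsuit", "loss", "recall"}


@dataclass
class Story:
    headline: str
    tags: frozenset  # topic tags, such as a reporter might attach at the source


def sentiment(story):
    """Return the count of positive words minus negative words in the headline."""
    words = [w.strip(".,!?").lower() for w in story.headline.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def rank(stories, tag):
    """Keep stories carrying the tag, strongest sentiment (either sign) first."""
    selected = [s for s in stories if tag in s.tags]
    return sorted(selected, key=lambda s: abs(sentiment(s)), reverse=True)


# Made-up headlines about made-up companies.
stories = [
    Story("Acme beats estimates on record profit", frozenset({"earnings"})),
    Story("Acme faces lawsuit over recall", frozenset({"legal"})),
    Story("Bolt misses forecast", frozenset({"earnings"})),
]

ranked = rank(stories, "earnings")
for s in ranked:
    print(sentiment(s), s.headline)
```

Here the tags play the part of the reporter's wetware, the sign of the score is a rough proxy for sentiment, and its absolute value is a rough proxy for intensity.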

The purest of efficient market purists once ...
