James Pustejovsky

Computational Linguist

Boston, Massachusetts

James Pustejovsky holds the TJX/Felberg Chair in Computer Science at Brandeis University, where he directs the Lab for Linguistics and Computation, and chairs both the Program in Language and Linguistics and the Computational Linguistics MA Program. He has conducted research in computational linguistics, AI, lexical semantics, temporal reasoning, and corpus linguistics and language annotation. He is currently head of a working group within ISO/TC37/SC4 to develop a Semantic Annotation Framework, and is chief architect of TimeML and ISO-TimeML, a newly adopted ISO standard for temporal information in language, as well as the draft specification for spatial information, ISO-Space. Pustejovsky was PI of a large NSF-funded effort, "Towards a Comprehensive Linguistic Annotation of Language," that involved merging several diverse linguistic annotations (PropBank, NomBank, the Discourse Treebank, TimeBank, and Opinion Corpus) into a unified representation. Currently, he is Co-PI of a major project funded by the NSF to address interoperability for NLP data and tools. He has taught computational linguistics to both graduates and undergraduates for 20 years, and corpus linguistics for eight years. He has authored numerous books, including Interpreting Motion (with I. Mani, Oxford University Press, 2012), Recent Advances in Generative Lexicon Theory (Springer, 2012), Generative Lexicon (MIT, 1995), The Problem of Polysemy (with B. Boguraev, Cambridge, 1996), The Language of Time (Oxford, with I. Mani and R. Gaizauskas, 2005), and Semantics and the Lexicon (Kluwer, 1993). He is currently finishing a textbook for Cambridge University Press, entitled Lexicon, to appear in 2013.

Natural Language Annotation for Machine Learning
by James Pustejovsky, Amber Stubbs
October 2012
Print: $39.99
Ebook: $33.99

Webcast: How to Develop Language Annotations for Machine Learning Algorithms
October 16, 2012
Text-based data mining and information extraction systems that make use of machine learning techniques require annotated datasets for training the algorithms. In this webcast we will discuss the steps involved in creating your own training corpus for...

"Natural Language Annotation for Machine Learning is not light reading. But it is well structured, well written and offers detailed examples."
--Si Dunn, Sagecreek Productions

"I highly recommend it, particularly for developers who want to build their machine learning tool set. While you won't have a complete understanding of computer science when you are done with this book, you will be equipped with most of the knowledge of programming that you will need to do most language processing tasks."
--David H., Book Bargains and Previews