The Art and Science of Analyzing Software Data by Thomas Zimmermann, Tim Menzies, Christian Bird

Chapter 6

Latent Dirichlet Allocation

Extracting Topics from Software Engineering Data

Joshua Charles Campbell*; Abram Hindle*; Eleni Stroulia*

* Department of Computing Science, University of Alberta, Edmonton, AB, Canada

Abstract

Topic analysis is a powerful tool that extracts “topics” from document collections. Unlike manual tagging, which is effort-intensive and requires expertise in the documents’ subject matter, topic analysis (in its simplest form) is an automated process. Relying on the assumption that each document in a collection refers to a small number of topics, it extracts bags of words attributable to these topics. These topics can be used to support document retrieval or to relate documents to each other through their associated ...
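To make the idea of "bags of words attributable to topics" concrete, here is a minimal sketch of topic extraction on a toy corpus. It uses a small collapsed Gibbs sampler, one common way to fit Latent Dirichlet Allocation; the abstract does not say which inference method the chapter uses, so treat this as an illustration only. The function name `fit_lda`, the hyperparameters `alpha` and `beta`, and the four sample documents are all assumptions introduced for this example.

```python
import random
from collections import Counter


def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA (illustrative, not optimized).

    docs is a list of token lists. Returns (doc_topic, topic_word), where
    doc_topic[d][k] counts tokens of document d assigned to topic k and
    topic_word[k][w] counts how often word w is assigned to topic k.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1

                # Weight each topic by how much this document uses it and
                # how strongly the topic is associated with this word.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]

                # Record the newly sampled assignment.
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    return doc_topic, topic_word


# Hypothetical mini-corpus: two crash reports and two UI requests.
docs = [
    "crash null pointer exception stack trace".split(),
    "null pointer crash on startup exception".split(),
    "button color layout font ui".split(),
    "ui layout button alignment font".split(),
]

doc_topic, topic_word = fit_lda(docs, num_topics=2)

for k, counts in enumerate(topic_word):
    top_words = [w for w, _ in counts.most_common(4)]
    print(f"topic {k}: {top_words}")
```

Each printed topic is a bag of words, and each row of `doc_topic` shows how a document's tokens are split across topics, which is what makes it possible to relate documents to one another. On a corpus this small the result depends on the random seed, so the grouping should be read as a demonstration of the mechanics rather than a meaningful finding.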
