13

The Ensemble LDA for Model Stability

One success criterion for topic modeling is that it produces a reliable set of topics. However, many experiments with Latent Dirichlet Allocation (LDA) have shown that its topics can be unstable and hard to reproduce, which seriously limits the applications of LDA. The instability arises partly because training settles at a local maximum that depends on the random initialization. Even when a random seed is fixed to control the initialization, the modeling process can still generate noisy topics, which may degrade the quality of the outcome.
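One rough way to make this instability concrete is to compare the top words of the topics produced by two training runs. The sketch below is an illustration only, not the method used in this chapter: the helper names `jaccard` and `topic_stability` are hypothetical, and the two word lists are made-up toy data standing in for the top words you might collect from two LDA runs with different seeds (for example, via Gensim's `LdaModel.show_topic`).

```python
def jaccard(words_a, words_b):
    """Jaccard similarity between two collections of top words."""
    a, b = set(words_a), set(words_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def topic_stability(run_a, run_b):
    """Mean, over topics in run_a, of the best Jaccard match found in run_b.

    A score of 1.0 means every topic in run_a reappears with identical top
    words in run_b; scores near 0.0 mean the two runs barely agree.
    """
    if not run_a or not run_b:
        return 0.0
    best_matches = [max(jaccard(t, u) for u in run_b) for t in run_a]
    return sum(best_matches) / len(best_matches)


# Toy data (assumed, not real model output): top words per topic for two runs.
run_seed_1 = [
    ["price", "market", "stock", "trade"],
    ["game", "team", "season", "player"],
]
run_seed_2 = [
    ["team", "game", "player", "coach"],
    ["market", "price", "bank", "rate"],
]

score = topic_stability(run_seed_1, run_seed_2)
print(f"stability between runs: {score:.3f}")
```

Topic order is ignored on purpose, because LDA assigns topic indices arbitrarily from run to run. A low score across many pairs of runs would suggest that some topics are not reproducible.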

The root cause of the instability is that a single LDA model identifies the “true” topics and “pseudo” topics and produces noisy ...
