December 2018
Using the BBC data as before, we use sklearn.decomposition.LatentDirichletAllocation to train an LDA model with five topics (see the sklearn documentation for details on the parameters, and the notebook lda_with_sklearn for implementation details):
from sklearn.decomposition import LatentDirichletAllocation

# Five topics, batch variational Bayes; perplexity is evaluated every 5 iterations
lda = LatentDirichletAllocation(n_components=5,
                                n_jobs=-1,
                                max_iter=500,
                                learning_method='batch',
                                evaluate_every=5,
                                verbose=1,
                                random_state=42)
# train_dtm is the document-term matrix built from the BBC training set
lda.fit(train_dtm)

LatentDirichletAllocation(batch_size=128, doc_topic_prior=None,
             evaluate_every=5, learning_decay=0.7, learning_method='batch',
             learning_offset=10.0, max_doc_update_iter=100, max_iter=500,
             mean_change_tol=0.001, n_components=5, n_jobs=-1,
             n_topics=None, perp_tol=0.1, random_state=42,
             topic_word_prior=None, ...
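To make the fitted model's outputs more concrete, the following is a minimal sketch that fits the same estimator on a tiny corpus and inspects the two arrays of interest: the document-topic weights returned by fit_transform and the topic-word weights stored in components_. The four-sentence corpus, the names docs, toy_dtm, toy_lda, and the helper top_words are hypothetical and only stand in for the BBC data and the notebook's own code; the two-topic setting is likewise an assumption chosen to suit the toy data.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy corpus standing in for the BBC articles
docs = [
    "the team won the football match",
    "the striker scored a late goal in the match",
    "the bank raised interest rates again",
    "markets fell as the bank cut growth forecasts",
]

# Build a document-term matrix of raw counts, as LDA expects
vectorizer = CountVectorizer(stop_words="english")
toy_dtm = vectorizer.fit_transform(docs)

# Two topics are assumed here only because the toy corpus is so small
toy_lda = LatentDirichletAllocation(n_components=2,
                                    learning_method="batch",
                                    max_iter=50,
                                    random_state=42)

# Rows are documents, columns are topics; each row sums to 1
doc_topics = toy_lda.fit_transform(toy_dtm)

# vocabulary_ maps each term to its column index; sort terms by that index
terms = np.array(sorted(vectorizer.vocabulary_, key=vectorizer.vocabulary_.get))


def top_words(model, terms, n=3):
    """Return the n highest-weighted terms for each topic (hypothetical helper)."""
    return [terms[np.argsort(row)[::-1][:n]].tolist() for row in model.components_]


for k, words in enumerate(top_words(toy_lda, terms)):
    print(f"topic {k}: {words}")
```

The same two attributes are available on the five-topic model above: lda.transform(train_dtm) would give one row of topic weights per article, and lda.components_ would have one row per topic and one column per vocabulary term.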