Chapter 2: A simple, robust, and efficient ensemble method

Application to natural language processing

Abstract

This chapter illustrates ensemble methods on a real-life NLP problem: ranking articles published on a website in order to predict the performance of blog posts yet to be written, and to help choose titles and other features that maximize traffic volume and quality, and thus revenue. The method, called hidden decision trees (HDT), implicitly builds a large number of small, usable, and possibly overlapping decision trees. Observations that do not fit in any usable node are classified with an alternate method, typically simplified logistic regression.

This hybrid procedure offers the best of both worlds: decision tree ...
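To make the idea in the abstract concrete, here is a minimal Python sketch of one way such a hybrid could be wired together. It is not the book's implementation, and it rests on several assumptions: a node is taken to be a combination of up to two (feature, value) pairs, a node counts as "usable" when it holds at least min_count observations, predictions from all matching usable nodes are averaged with weights equal to their observation counts, and the fallback is a plain callable standing in for the simplified logistic regression mentioned above. The feature names (has_number, topic), the toy data, and the function names build_nodes and predict are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_nodes(rows, labels, max_features=2, min_count=2):
    """Group observations by every combination of up to max_features
    (feature, value) pairs. Assumption: a node is 'usable' when it
    contains at least min_count observations."""
    stats = defaultdict(lambda: [0, 0.0])  # node key -> [count, sum of labels]
    for row, y in zip(rows, labels):
        items = sorted(row.items())  # fixed order so keys are comparable
        for k in range(1, max_features + 1):
            for key in combinations(items, k):
                stats[key][0] += 1
                stats[key][1] += y
    # Keep only usable nodes, stored as (count, mean label).
    return {key: (n, s / n) for key, (n, s) in stats.items() if n >= min_count}


def predict(row, nodes, fallback, max_features=2):
    """Score one observation from the usable nodes it falls into;
    defer to the fallback model when it fits no usable node."""
    items = sorted(row.items())
    matches = [
        nodes[key]
        for k in range(1, max_features + 1)
        for key in combinations(items, k)
        if key in nodes
    ]
    if not matches:
        return fallback(row)
    # Assumption: weight each matching node by its observation count.
    total = sum(n for n, _ in matches)
    return sum(n * mean for n, mean in matches) / total


# Hypothetical toy data: one row per article, label 1 = good traffic.
rows = [
    {"has_number": 1, "topic": "python"},
    {"has_number": 1, "topic": "python"},
    {"has_number": 0, "topic": "stats"},
    {"has_number": 0, "topic": "stats"},
]
labels = [1, 1, 0, 1]

nodes = build_nodes(rows, labels)

# Stand-in for the alternate method: here simply the global mean label.
global_mean = sum(labels) / len(labels)


def fallback(row):
    return global_mean


seen = predict({"has_number": 1, "topic": "python"}, nodes, fallback)
unseen = predict({"has_number": 2, "topic": "r"}, nodes, fallback)
```

In this sketch, seen is scored from the usable nodes the article falls into, while unseen matches no usable node and is routed to the fallback. No tree is ever built explicitly: the nodes are just keys in a dictionary, which is one reading of "hidden" in the method's name.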

Synthetic Data and Generative AI by Vincent Granville
