Chapter 15. The classic search engine example



What better way to close out the book than the way we started it, with a classic search engine example?

You’re in for a treat. We interviewed Ken Krugler and his team from Bixo Labs about their recent Public Terabyte Dataset Project and how Tika served as a core component of a large-scale series of tests that shed light on variations among languages, charsets, and other content available on the internet.

This chapter will show you even more of Tika in action, especially how you can leverage Tika inside of a ...
