On the Efficient Determination of Most Near Neighbors, 2nd Edition by Mark S. Manasse

CHAPTER 6

A Few Applications

In this chapter, we look, or in some cases look yet again, at a few of the applications we have made of these and other sampling techniques.

In the first section (Section 6.1), we once again consider web page deduplication for a search engine; this time, we delve a little deeper into some of the practical issues involved.

In the second section (Section 6.2), we turn to a different domain in which sampling has been of use: deduplication and single-instance storage in file systems. Here, our concern is not to avoid annoying the user with multiple nearly identical copies, but to conserve disk space and network bandwidth when dealing with file systems that contain copies and differing versions of files.

6.1 WEB DEDUPLICATION ...
