Big Data

by Fei Hu
April 2016
Beginner to intermediate
463 pages
18h 53m
English
Auerbach Publications
Content preview from Big Data
Challenges in Crawling the Deep Web
increase of P, the cost increases at a faster rate. If the goal is to harvest as much data as possible
from many data sources, rather than to exhaustively siphon every data record from a single
data source, Equation 4.2 gives a guideline for when to jump to another
data source, given fixed crawling resources.
Since d can be calculated easily from the crawling history, Equation 4.6 is particularly
useful for estimating how much data has been downloaded and when the crawling process
will stop. Another surprising observation is that large queries induce the same
overlapping rate


Publisher Resources

ISBN: 9781498734875