In Section 5.6, we made a simple web crawler
that explored the link graph of the web in breadth-first order.
In this section, we’ll make it concurrent so that independent calls
to crawl can exploit the I/O parallelism available in
the web.
The crawl function remains exactly as it was in
gopl.io/ch5/findlinks3:
    func crawl(url string) []string {
        fmt.Println(url)
        list, err := links.Extract(url)
        if err != nil {
            log.Print(err)
        }
        return list
    }
The main function resembles breadthFirst (§5.6). As before, a worklist records the queue of items that need processing, each item being a list of URLs to crawl, but this time, instead of representing the queue using a slice, we use a channel.
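To make that concrete, here is a minimal sketch of such a main function. It assumes the crawl function above, the os package for reading command-line arguments as the initial URLs, and it starts one goroutine per newly discovered link with no bound on concurrency.

    func main() {
        worklist := make(chan []string) // lists of URLs, possibly with duplicates

        // Seed the worklist with the command-line arguments.
        // The send runs in its own goroutine because nothing receives
        // from the channel until the loop below begins.
        go func() { worklist <- os.Args[1:] }()

        // Crawl the web concurrently.
        seen := make(map[string]bool)
        for list := range worklist {
            for _, link := range list {
                if !seen[link] {
                    seen[link] = true
                    // Pass link as an argument so each goroutine
                    // works on its own copy of the loop variable.
                    go func(link string) {
                        worklist <- crawl(link)
                    }(link)
                }
            }
        }
    }

The seen map is read and written only by the main goroutine, so it needs no locking. As written, though, the loop never terminates even once every reachable link has been crawled, and nothing limits the number of concurrent calls to links.Extract, so a large crawl may open too many connections at once; addressing those two problems is the natural next step.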