July 2017
Beginner to intermediate
715 pages
17h 3m
English
In this chapter, we will continue with our running example: building a search engine. In Chapter 2, Data Processing Toolbox, we extracted some data from HTML pages returned by a search engine. This dataset included some numerical features, such as the length of the title and the length of the content.
For the purposes of storing these features, we created the following class:
public class RankedPage {
    private String url;
    private int position;
    private int page;
    private int titleLength;
    private int bodyContentLength;
    private boolean queryInTitle;
    private int numberOfHeaders;
    private int numberOfLinks;
    // setters, getters are omitted
}
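As a rough illustration of how these fields could be used as numerical features, the sketch below flattens a page into an array of doubles. The `RankedPageFeatures` class and its `toFeatureVector` method are hypothetical names introduced here, not part of the chapter's code, and the trimmed `RankedPage` copy carries only the accessors the sketch needs. Encoding `queryInTitle` as 1.0 or 0.0 and leaving out `url`, `position`, and `page` are assumptions made for the example.

```java
// Trimmed copy of RankedPage with only the fields and accessors used below.
// The accessor names are assumed, since the chapter omits setters and getters.
class RankedPage {
    private int titleLength;
    private int bodyContentLength;
    private boolean queryInTitle;
    private int numberOfHeaders;
    private int numberOfLinks;

    int getTitleLength() { return titleLength; }
    void setTitleLength(int titleLength) { this.titleLength = titleLength; }

    int getBodyContentLength() { return bodyContentLength; }
    void setBodyContentLength(int bodyContentLength) { this.bodyContentLength = bodyContentLength; }

    boolean isQueryInTitle() { return queryInTitle; }
    void setQueryInTitle(boolean queryInTitle) { this.queryInTitle = queryInTitle; }

    int getNumberOfHeaders() { return numberOfHeaders; }
    void setNumberOfHeaders(int numberOfHeaders) { this.numberOfHeaders = numberOfHeaders; }

    int getNumberOfLinks() { return numberOfLinks; }
    void setNumberOfLinks(int numberOfLinks) { this.numberOfLinks = numberOfLinks; }
}

// Hypothetical helper (not from the book): turns a page into a numeric vector.
class RankedPageFeatures {
    // Order: titleLength, bodyContentLength, queryInTitle, numberOfHeaders, numberOfLinks.
    static double[] toFeatureVector(RankedPage page) {
        return new double[] {
            page.getTitleLength(),
            page.getBodyContentLength(),
            page.isQueryInTitle() ? 1.0 : 0.0, // boolean encoded as 1.0 / 0.0
            page.getNumberOfHeaders(),
            page.getNumberOfLinks()
        };
    }
}
```

Keeping the features in a fixed order means every page maps to an array of the same length, which is the shape most numerical libraries expect as input.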
It would be interesting to see whether this information can be useful for the search engine. ...