January 2018
Intermediate to advanced
340 pages
8h 6m
English
This program prints a list of every word used on a web page, along with how many times each word appears. It searches only the paragraph tags: searching the whole body would treat the HTML markup itself as words, which clutters the data and does little to help you understand the content of the site. The program trims spaces, commas, periods, tabs, and newlines from each string, and converts every word to lowercase in an attempt to normalize the data.
For each paragraph it finds, the program splits the text content into words. Each word is stored in a map from the string to an integer count. In the end, the map is printed out, listing each word and how many ...
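The counting approach described above can be sketched roughly as follows. This is a minimal illustration, not the book's listing: it works on an in-memory HTML string instead of fetching a page, and it pulls out paragraph contents with regular expressions purely to stay within the standard library, where a real program would more likely use a proper HTML parser. The names `countWords`, `paragraphPattern`, and `tagPattern` are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// paragraphPattern captures the contents of <p>...</p> elements.
// Assumption: a regular expression stands in for a real HTML parser.
// (?is) makes the match case-insensitive and lets . span newlines.
var paragraphPattern = regexp.MustCompile(`(?is)<p\b[^>]*>(.*?)</p>`)

// tagPattern matches inline tags left inside a paragraph, such as <em>.
var tagPattern = regexp.MustCompile(`<[^>]*>`)

// countWords (hypothetical helper) maps each lowercased word found
// inside paragraph tags to the number of times it appears.
func countWords(page string) map[string]int {
	counts := make(map[string]int)
	for _, match := range paragraphPattern.FindAllStringSubmatch(page, -1) {
		// Replace nested tags with spaces so adjacent words stay separate.
		text := tagPattern.ReplaceAllString(match[1], " ")
		for _, word := range strings.Fields(text) {
			// Trim the characters named in the text, then normalize case.
			word = strings.ToLower(strings.Trim(word, " \t\n,."))
			if word != "" {
				counts[word]++
			}
		}
	}
	return counts
}

func main() {
	// Sample input; the heading is outside any paragraph and is skipped.
	page := `<html><body><h1>Ignored heading</h1>
<p>Go is fun. Go is fast,</p>
<p>and <em>fun</em> to read.</p></body></html>`

	counts := countWords(page)

	// Map iteration order is random in Go, so sort the keys for stable output.
	words := make([]string, 0, len(counts))
	for word := range counts {
		words = append(words, word)
	}
	sort.Strings(words)
	for _, word := range words {
		fmt.Printf("%d | %s\n", counts[word], word)
	}
}
```

Sorting the keys before printing is a choice made here for repeatable output; printing straight from the map would list the same counts in an unpredictable order.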