#13 Word Histogram (most_common_words.rb)

And now for something that most word processors don’t do: finding the most commonly used words in a document. Like the previous script, this one adds a “helper” method to an existing built-in class to simplify the job for its main method. Let’s take a look.

The Code

  #!/usr/bin/env ruby
  #most_common_words.rb

  class Array

❶   def count_of(item)
❷     grep(item).size
❸     #inject(0) { |count,each_item| item == each_item ? count+1 : count }
    end
  end

❹ def most_common_words(input, limit=25)
    freq = Hash.new()
    sample = input.downcase.split(/\W/)
    sample.uniq.each do |word|
❺     freq[word] = sample.count_of(word) unless word == ''
    end
❻   words = freq.keys.sort_by do |word|
      freq[word]
    end.reverse.map do ...
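
Since the listing above is cut off, here is a minimal, self-contained sketch of the same idea for experimenting in irb. It is not the book's code: the method name top_words_sketch, the tie-breaking rule, and the shape of the return value (an array of word/count pairs) are all assumptions made for illustration. Only count_of and the split-and-count approach come from the listing.

```ruby
# Hypothetical sketch; top_words_sketch is not a name from the book.
class Array
  # Count the elements that match item. grep compares with ===,
  # which is plain equality when item is a String.
  def count_of(item)
    grep(item).size
  end
end

def top_words_sketch(input, limit = 25)
  # Split on non-word characters and drop the empty strings that
  # consecutive separators leave behind.
  sample = input.downcase.split(/\W/).reject(&:empty?)

  freq = Hash.new(0)
  sample.uniq.each { |word| freq[word] = sample.count_of(word) }

  # Assumed finishing step: highest counts first, ties in
  # alphabetical order, truncated to limit entries.
  freq.sort_by { |word, count| [-count, word] }.first(limit)
end

top_words_sketch("the cat and the hat and the bat", 2).each do |word, count|
  puts "#{word}: #{count}"
end
```

Calling count_of once per unique word rescans the whole array each time. That keeps the helper simple and is fine for short documents; for large inputs, a single pass that increments freq[word] would likely be faster.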
