Building Tag Clouds in Perl and PHP by Jim Bumgardner

Collecting Genesis Words in PHP

Here is a PHP script, getGenesisTags.php, which collects tags by counting the words that appear in the book of Genesis in the Bible. The data is retrieved from the copy of the book of Genesis at the Project Gutenberg web site. (This script is available at http://examples.oreilly.com/tagclouds/ .) Let's see what it does.

<?php
//
// Collect text from genesis

function getTags()
{
   global $tags;

The script contains a single function, called getTags(). This function will be invoked from another script, makeTagCloud.php, which we will invoke later. The purpose of the getTags() function is to populate the global associative array called $tags.
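To make the idea of a global associative array of counts concrete, here is a minimal sketch. It is not the code from getGenesisTags.php; the function name countWordsIntoTags() and the word-splitting rule are assumptions made for illustration only.

```php
<?php
// Hypothetical sketch: the script's own counting logic is not reproduced here.
$tags = array();

function countWordsIntoTags($txt)
{
    global $tags;   // the same kind of global array getTags() is described as filling

    // Assumed splitting rule: lowercase the text, then split on anything
    // that is not a letter or an apostrophe, dropping empty pieces.
    $words = preg_split("/[^a-z']+/", strtolower($txt), -1, PREG_SPLIT_NO_EMPTY);

    foreach ($words as $word) {
        if (!isset($tags[$word])) {
            $tags[$word] = 0;   // first time this word is seen
        }
        $tags[$word]++;         // key is the word, value is its count
    }
}

countWordsIntoTags("In the beginning God created the heaven and the earth.");
print_r($tags);
```

Each key of $tags is a word and each value is the number of times it occurred, which is the shape of data a tag cloud needs in order to size its entries.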

$url = 'http://www.gutenberg.org/dirs/etext05/bib0110.txt';

The previous line specifies the URL of the web page we are going to screen-scrape. This particular page contains the text of the book of Genesis. If you'd like to use some other text, go to the Project Gutenberg web site ( http://www.gutenberg.org/ ) to find what you want.

To see what this text looks like in its raw form, check out the web page we're grabbing in your browser:

    http://www.gutenberg.org/dirs/etext05/bib0110.txt

// $txt = file_get_contents($url);
$ch = curl_init();
$timeout = 30; // set to zero for no timeout
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$txt = curl_exec($ch);
curl_close($ch);

The previous lines retrieve the Bible text from the Project Gutenberg web site. ...
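The retrieval step does not check whether the download succeeded. As a hedged sketch, the same cURL calls could be wrapped in a helper that reports failure. The name fetchText() and its null-on-failure convention are hypothetical and do not come from the script. The fallback branch uses the file_get_contents() call that the script leaves commented out.

```php
<?php
// Hypothetical helper, not part of getGenesisTags.php: returns the body of
// $url as a string, or null if the transfer fails.
function fetchText($url, $timeout = 30)
{
    if (!function_exists('curl_init')) {
        // cURL extension not loaded: fall back to the simpler call that the
        // script leaves commented out.
        $txt = @file_get_contents($url);
        return ($txt === false) ? null : $txt;
    }

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);        // return the body instead of printing it
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout); // seconds to wait for the connection
    $txt = curl_exec($ch);                              // false if the transfer failed

    return ($txt === false) ? null : $txt;
}

// Possible use inside a function such as getTags():
//   $txt = fetchText($url);
//   if ($txt === null) { /* leave $tags empty or report the problem */ }
```

Returning null keeps the failure case distinct from an empty page, so the caller can decide whether to skip the counting step.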
