By Paul Bausch
Price: $24.95 USD
£17.50 GBP
http://www.yahoo.com), the My Yahoo! portal [Hack #34], and Yahoo! Mail [Hack #52]. The bar also indicates your login status by displaying your Yahoo! ID [Hack #3] or Guest, along with links that let you sign in to Yahoo! or sign out. You can also click the Help link at the far right of the navigation bar to read documentation about the site.
http://search.yahoo.com) is deceptively simple. You can type in any word or phrase and find matches in documents across the Web. The trade-off for this simplicity is having to look through hundreds, thousands, or millions of results to find the documents that are actually useful to you. By understanding how Yahoo! expects queries to be phrased, you can limit the results to include only those documents most relevant to you, saving you the time of looking through extraneous results.
grammar into the search form, Yahoo! will return documents that contain the word grammar. A search for grammar school will return documents that contain both words somewhere within the document, but not necessarily together.
"grammar school" will return documents that contain the complete phrase grammar school. You can combine keyword and phrase searches. To find documents that contain the phrase grammar school and also have the word Oregon somewhere in the document, you could search for "grammar school" Oregon.
OR between words. A search for grammar OR primary will return documents that contain either grammar or primary, but not necessarily both words.
Oregon school returns too many pages for schools in the city of Portland, you could type Oregon school -Portland to exclude any pages with the word Portland from the results.
http://search.yahoo.com, you find yourself in front of the search form, about to type. What's the best query? If you were asking a human being for the answer, you might be tempted to type in a complete question: what is the time in London?
time in London and you'll find the current time in London above the search results, as shown in Figure 1-3.
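The query forms described above (plain keywords, quoted phrases, OR alternatives, and minus-sign exclusions) can be assembled mechanically. The following is a minimal Perl sketch rather than anything from the book: the `build_query` function and its argument names are hypothetical, and it only composes the text you would type into the search form.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Compose a query string from four kinds of terms:
#   phrases => exact phrases, wrapped in double quotes
#   words   => keywords that must all appear
#   any     => alternatives, joined with OR
#   exclude => words to leave out, prefixed with a minus sign
# (build_query and these argument names are illustrative only.)
sub build_query {
    my (%args) = @_;
    my @any = @{ $args{any} || [] };
    my @parts;
    push @parts, map { qq("$_") } @{ $args{phrases} || [] };
    push @parts, @{ $args{words} || [] };
    push @parts, join(" OR ", @any) if @any;
    push @parts, map { "-$_" } @{ $args{exclude} || [] };
    return join " ", @parts;
}

print build_query(phrases => ["grammar school"], words => ["Oregon"]), "\n";
print build_query(any => ["grammar", "primary"]), "\n";
print build_query(words => ["Oregon", "school"], exclude => ["Portland"]), "\n";
```

The three calls reproduce the example queries from the text: a phrase plus a keyword, an OR search, and a search that excludes Portland.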
http://login.yahoo.com and click Sign Up Now for the new account form.
http://login.yahoo.com and entering your Yahoo! ID and password. From there, browse to http://www.yahoo.com or http://search.yahoo.com and look for the Preferences link to the right of the search form, like the one highlighted in Figure 1-7.
http://search.yahoo.com/preferences. From the Preferences page, you can set a number of options that Yahoo! will remember and apply to any search results in the future.
http://search.yahoo.com, Yahoo! offers an Advanced Web Search form at http://search.yahoo.com/web/advanced. This form lets you refine your search in a number of ways, so you can narrow the results to a more useful list.
astronomy into the search form, and find hundreds of sites related to the word. But if you want only a segment of those results, you can browse over to the Advanced Web Search form, type astronomy, and limit the results by top-level domain, as shown in Figure 1-9.
astronomy across .gov sites returns only pages at NASA's web site. The same search limited to .edu sites results in astronomy programs at various universities, and limiting to .com gives you astronomy magazines at the top of the results.
http://search.yahoo.com/search?_adv_prop=web&x=op&ei=UTF-8&va=astronomy&va_vt=any&vp_vt=any&vo_vt=any&ve_vt=any&vd=all&vst=.gov&vs=.gov&vf=all&vm=p&fl=0&n=20
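Because the advanced search is expressed entirely in the URL above, the domain limit can be changed by editing the query string. The sketch below is an illustration, not part of the original text: `swap_domain` is a hypothetical helper, and the guess that `vst` and `vs` carry the domain filter rests only on the fact that both hold `.gov` in the sample URL.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return a copy of $url with the vst and vs parameters set to $tld.
# Assumption: these two parameters hold the domain filter; that is
# inferred from the sample URL, not from any Yahoo! documentation.
sub swap_domain {
    my ($url, $tld) = @_;
    my ($base, $query) = split /\?/, $url, 2;
    return $url unless defined $query;
    my @pairs;
    foreach my $pair (split /&/, $query) {
        my ($key, $value) = split /=/, $pair, 2;
        $value = $tld if $key eq "vst" || $key eq "vs";
        push @pairs, defined $value ? "$key=$value" : $key;
    }
    return "$base?" . join("&", @pairs);
}

my $gov = "http://search.yahoo.com/search?_adv_prop=web&x=op&ei=UTF-8"
        . "&va=astronomy&va_vt=any&vp_vt=any&vo_vt=any&ve_vt=any&vd=all"
        . "&vst=.gov&vs=.gov&vf=all&vm=p&fl=0&n=20";

print swap_domain($gov, ".edu"), "\n";
```

All other parameters pass through unchanged, so the same helper could produce the .edu and .com variants mentioned in the text.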
http://tools.search.yahoo.com/language) has some ways to help you work with other languages. Among them is a translation service that will translate any block of text to a different language. I copied the Russian text from Figure 1-10, pasted it into the text area labeled "Translate this web text," chose "From Russian to English" from the drop-down list of languages, and clicked Translate. Yahoo! responded with this:
Radio Dials. The gallery of the photographs of ancient, I will not be
afraid this word, radios-scale. The author of collection, photographer
Paul Bausch, decided thus to publish the paternal collection of radio
receivers. 3x, dreams about their own tsifrozerkalke with the
macro-objective become increasingly more importunately.
http://bookmarks.yahoo.com) and the option to download the latest Yahoo! Toolbar (
http://greasemonkey.mozdev.org and click the Install Greasemonkey link. Follow the Software Installation prompts and then restart your browser. You'll know the plug-in is working if you see a small monkey icon in the lower-right corner of Firefox. Once installed, you can move on to analyzing Yahoo! and building the Greasemonkey script.
yschttl. Yahoo! uses this for styling the links with CSS, but you can use it to find the links in the first place. A single XPath query can extract a list of all the links with the class yschttl, and the first one of those is the one we want to prefetch and cache.
http://rds.yahoo.com/S=2766679/K=gpl+compatible/v=2/SID=e/TID=F510_112/l=WS1/R=2/IPC=us/SHE=0/H=1/SIG=11sgv1lum/EXP=1116517280/*-http%3A//
australian shepherd, you'll find that the top few sites are the same across both Yahoo! and Google, but the two search engines quickly diverge into different results. At the time of this writing, both sites estimate exactly 1,030,000 total results for this particular query, but estimated result counts are sometimes a way to spot differences between the sites.
http://video.yahoo.com (or type video search! into any Yahoo! Search form), enter a word or phrase, and click Search Video. Say you're interested in learning more about NASA's robotic vehicle for exploring Mars and you'd like to see the rover in action. You can find thumbnails of videos from across the Web by searching with the phrase mars rover, as shown in Figure 1-30.
http://video.search.yahoo.com/video/advanced). You can use the advanced search form to limit results to specific video formats, to one of three video sizes (small, medium, or large), and to videos that are longer or shorter than one minute. As with the Web Search, you can also limit your search to a specific site or adjust the SafeSearch features for the search.
http://toolbar.yahoo.com and click the orange Download button. From there, you'll find a page with instructions about downloading and installing the toolbar. At the time of this writing, the toolbar is available only for Internet Explorer, but there is a beta (i.e., testing) version available for Mozilla Firefox. Because the program is a browser extension rather than a traditional application, the download and installation will happen within the browser window. You'll need to approve some security requests along the way, and Yahoo! has laid out all of the steps to take on its site. Firefox requires you to restart the browser to see the toolbar, but Internet Explorer doesn't.
http://www.mozilla.org/products/firefox), you're probably already aware of the useful search box in the upper-right corner. From any page, at any time, you can simply type a query into the box and press Enter to bring the search page up in the browser. Though Google is the default search engine, you can click the arrow to choose another search engine, as shown in Figure 1-43.
http://mycroft.mozdev.org/quick/yahoo.html) full of over 30 different Yahoo!-related searches you can add to the Firefox search box. These are searches that others have found useful and decided to share with the larger Mozilla community (Mozilla is the technology behind Firefox). The specialty Yahoo! searches include everything from searching Yahoo! Auctions and searching Yahoo! in different countries, to the Yahoo! Oxford Shakespeare reference. If you find yourself constantly looking for pithy quotes from The Tempest, adding this option to the Firefox search box could be the stuff dreams are made of.
Gwen Stefani more often than for Britney Spears, that shows a shift in interest or popularity.
http://buzz.yahoo.com). When you browse the Buzz Index, you'll find several Top Movers Charts, separated into categories such as TV, Music, Sports, Movies, Actors, Video Games, and overall queries. Each chart has the top 15 search queries with the greatest percentage increase for that particular day. These charts are a quick snapshot of which queries are gaining the most ground.
American Idol on that day. With millions of users, that small percentage means thousands of people.
http://buzz.research.yahoo.com). But even if you can't tell an iPod from a Typepad, the Yahoo! Buzz Game can help you make sense of the technology landscape.
http://local.yahoo.com/results;_ylt=AvyPaC0wOiCme6J1PYb56tSHNcIF;_ylu=X3oDMTBtbGZ2dXFpBF9zAzk2NjEzNzY3BHNlYwNzZWFyY2g-?stx=coffee&csz=Sebastopol%2C+CA&fr=
?stx=coffee looks important, as does csz=Sebastopol%2C+CA. But the rest of the URL looks like gibberish.
http://local.yahoo.com/results?stx=coffee&csz=Sebastopol%2C+CA
http://local.yahoo.com/results?stx=coffee&csz=95472
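The manual cleanup shown above, keeping `stx` and `csz` and dropping everything else, can also be done in code. This is a hedged sketch: `simplify_url` is a hypothetical helper written for illustration, and it assumes the extra material always appears as `;name=value` segments before the `?` or as empty query parameters, as it does in the sample URL.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Keep only the named query parameters and drop any ";name=value"
# segments that sit between the path and the "?".
# (simplify_url is illustrative; it is not a Yahoo! or CPAN function.)
sub simplify_url {
    my ($url, @keep) = @_;
    my ($front, $query) = split /\?/, $url, 2;
    $query = "" unless defined $query;
    $front =~ s/;.*$//;                      # strip ;_ylt=... and ;_ylu=...
    my %wanted = map { $_ => 1 } @keep;
    # keep a pair only if it is wanted and has a non-empty value
    my @pairs = grep { /^([^=]+)=./ && $wanted{$1} } split /&/, $query;
    return @pairs ? "$front?" . join("&", @pairs) : $front;
}

my $long = "http://local.yahoo.com/results;_ylt=AvyPaC0wOiCme6J1PYb56tSHNcIF;"
         . "_ylu=X3oDMTBtbGZ2dXFpBF9zAzk2NjEzNzY3BHNlYwNzZWFyY2g-"
         . "?stx=coffee&csz=Sebastopol%2C+CA&fr=";

print simplify_url($long, "stx", "csz"), "\n";
```

Run against the long URL from the text, this prints the same short form given above, http://local.yahoo.com/results?stx=coffee&csz=Sebastopol%2C+CA.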
http://privacy.yahoo.com.
http://privacy.yahoo.com/privacy/us/adservers/details.html).
http://search.news.yahoo.com/search/news/?p=Yahoo%21
http://news.search.yahoo.com/news/rss?ei=UTF-8&p=Yahoo%21
http://finance.yahoo.com/q?s=YHOO
http://finance.yahoo.com/q/h?s=YHOO
http://finance.yahoo.com/rss/headline?s=YHOO
http://www.ysearchblog.com
http://www.ysearchblog.com/index.xml
http://developer.yahoo.net/blog
http://developer.yahoo.net/blog/index.xml
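The news and finance feed addresses listed above follow a visible pattern: a fixed prefix plus an encoded query or ticker symbol. A small Perl sketch of that pattern follows. The helper names are hypothetical, the URL prefixes are copied from the examples above, and whether other parameters are accepted is not something the text confirms.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Percent-encode every byte that is not an unreserved URL character.
# Assumes plain byte strings; wide characters are not handled here.
sub escape {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-_.~])/sprintf("%%%02X", ord($1))/ge;
    return $text;
}

# Hypothetical helpers that follow the URL patterns listed above.
sub news_rss_url {
    my ($query) = @_;
    return "http://news.search.yahoo.com/news/rss?ei=UTF-8&p=" . escape($query);
}

sub finance_rss_url {
    my ($ticker) = @_;
    return "http://finance.yahoo.com/rss/headline?s=" . escape($ticker);
}

print news_rss_url("Yahoo!"), "\n";
print finance_rss_url("YHOO"), "\n";
```

With the inputs Yahoo! and YHOO, the two calls reproduce the feed addresses shown in the list.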
if/elsif/else sections that are hard to maintain and need to be rewritten every time Yahoo! makes a small change to one of its sites. If you follow that route, you will soon discover that you need to write hundreds of lines of code to describe every kind of behavior you want to build into your spider.
http://search.yahoo.com with the query political weblog. You would find political weblogs in the search results, along with news articles about political weblogs, college papers about political weblogs, and even pages that just mention the terms political and weblogs. But browsing the Political Weblogs category in the Yahoo! Directory (http://dir.yahoo.com/Computers_and_Internet/Internet/World_Wide_Web/Weblogs/Politics) will give you hundreds of links that have been selected by Yahoo! employees as being political weblogs.
http://dir.yahoo.com/new), along with the Picks of the Day.
#!/usr/bin/perl -w
use strict;
use Date::Manip;
use LWP::Simple;
use Getopt::Long;
$ENV{TZ} = "GMT" if $^O eq "MSWin32";

# the homepage for Yahoo!'s "What's New".
my $new_url = "http://dir.yahoo.com/new/";

# the major categories at Yahoo!. each name doubles as a
# key into the hash that holds our counts strings.
my @categories = ("Arts & Humanities",   "Business & Economy",
                  "Computers & Internet", "Education",
                  "Entertainment",        "Government",
                  "Health",               "News & Media",
                  "Recreation & Sports",  "Reference",
                  "Regional",             "Science",
                  "Social Science",       "Society & Culture");
my %final_counts; # where we save our final readouts.

# load in our options from the command line.
my %opts; GetOptions(\%opts, "c|count=i");
die "Usage: $0 --count <days>\n" unless $opts{c}; # count sites from past $i days.

# if we've been told to count the number of new sites,
# then we'll go through each of our main categories
# for the last $i days and collate a result.

# begin the header
# for our import file.
my $header = "Category";

# from today, going backwards, get $i days.
for (my $i = 1; $i <= $opts{c}; $i++) {

    # create a Date::Manip time that will be used to
    # construct the query for Yahoo! retrieval.
    my $day;
    if ($i == 1) { $day = "yesterday"; }
    else         { $day = "$i days ago"; }
    my $date = UnixDate($day, "%Y%m%d");

    # add this date to
    # our import file.
    $header .= "\t$date";

    # and download the day.
    my $url  = "$new_url$date.html";
    my $data = get($url) or die "Could not fetch $url\n";

    # and loop through each of our categories.
    foreach my $category (sort @categories) {
        # take the first number after the category name; test the
        # match itself so a failed match can't reuse a stale $1.
        my $count = ($data =~ /\Q$category\E.*?(\d+)/) ? $1 : 0;
        $final_counts{$category} .= "\t$count"; # building our string.
    }
}

# with all our counts finished,
# print out our final file.
print $header . "\n";
foreach my $category (@categories) {
    print $category, $final_counts{$category}, "\n";
}
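The counting step in the script above hinges on one regular expression: find the category name, then capture the first run of digits that follows it on the same line. The sketch below isolates that step so it can be tried without a network connection. The `count_for` helper and the sample text are made up for illustration, and the real layout of the What's New pages may differ.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the first run of digits that follows $category on the same
# line of $data, or 0 when the category does not appear.
# (count_for is a hypothetical helper, not part of the script above.)
sub count_for {
    my ($data, $category) = @_;
    return $data =~ /\Q$category\E.*?(\d+)/ ? $1 : 0;
}

# Made-up sample text standing in for a downloaded page.
my $sample = "Education (12)\nScience (7)\nNews & Media (3)\n";

foreach my $category ("Education", "Science", "News & Media", "Health") {
    print "$category\t", count_for($sample, $category), "\n";
}
```

Because the pattern takes the first digits it sees after the name, any number that appears between the category name and its count (in a link, for example) would be picked up instead, so the pattern is worth rechecking against the live page.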