
SEO Warrior

Name: SEO Warrior
Author: John I Jerkovic
ISBN: 9781449383077

by John I Jerkovic

November 2009

Beginner

496 pages

13h 46m

English

O'Reilly Media, Inc.


SEO Warrior
SPECIAL OFFER: Upgrade this ebook with O’Reilly
Preface
Who This Book Is For
How This Book Is Organized
Conventions Used in This Book
Using Code Examples
We’d Like to Hear from You
Safari® Books Online
Acknowledgments
1. The Big Picture
SEO Benefits
SERP Real Estate; Popular keywords; Niche keywords; The Trust Factor; The Golden Triangle; Lower Cost of Ownership
SEO Challenges
Competition; No Guarantees; Ranking Fluctuations; Time Factors; Organizational Structure; Big companies and organizations; Virtual teams; Outsourcing; Large, complex sites; Small companies and individuals
The SEO Process
The Research Phase; Business research; Competitor analysis; Current state assessment; Keyword research; Output of the research phase; The Planning and Strategy Phase; Content strategy; Link-building strategy; Social media strategy; Search engine targeting strategy; SEM strategy; Technical strategy; Output of the planning and strategy phase; The Implementation Phase; Internal optimization; External optimization; Output of the implementation phase; The Monitoring Phase; Web spider activity; Website referrals; Search engine rankings; Website traffic; Conversions; Output of the monitoring phase; The Assessment Phase; Output of the assessment phase; The Maintenance Phase; Output of the maintenance phase
SEO Alternatives
Paid Ads (or Links); Traditional Marketing
Summary
2. Search Engine Primer
Search Engines That Matter Today: Google
Yahoo!; Bing
Types of Search Engines and Web Directories
First-Tier Search Engines; Second-Tier Search Engines; Regional Search Engines; Topical (Vertical) Search Engines; Web Spider–Based Search Engines; Hybrid Search Engines; Meta Search Engines
Web Directories
Search Engine Anatomy
Spiders, Robots, Bots, and Crawlers; Search engine web page viewer; The Search (or Query) Interface; Search Engine Indexing
Search Engine Rankings
Summary
3. Website Essentials
Domain Name Options
Domain Name Namespaces; Generic top-level domains; Country code top-level domains; Country code second-level domains; Buying Domain Names; Domain name size; Keyword-rich domain names; Nonsensical domain names; Domain registration period; Tapping into expired domain names; Buying existing domains; Utilizing the unsolicited approach; Domain name resellers; Parking domains; Transferring domains; Renewing domains
Hosting Options
Choosing Platforms; Operating systems; Web and application servers; Selecting database servers; Selecting the development platform; Hosting Types; Free hosting; Shared hosting; Dedicated server hosting; Collocation hosting; Comanaged and managed hosting; Internal hosting
Custom Site Design or Third-Party Software
Employing Custom Development; Benefits of custom development; Disadvantages of custom development; Site page layouts; SEO-friendly site layout; Building a dynamic site skeleton; Dealing with dynamic links; Utilizing Free or Paid Software; Advantages of using third-party software; Disadvantages of using third-party software; Free software; Paid software
Website Usability
General Considerations; Linking Considerations; Know Your Demographics; Ensure Web Browser Compatibility; Create Simple Page Layouts; Use Intelligent Page Formatting; Create Smart HTML Forms; Optimize Your Site for Speed; Test Your Interface Design for Usability
Website Accessibility
Summary
4. Internal Ranking Factors
Analyzing SERPs
On-Page Ranking Factors
Keywords in the <title> Tag; Titles in search results; Title keywords in the page copy; Keywords in the Page URL; Keywords in the Page Copy; Keywords in the <meta> Description Tag; Search engine alternative to <meta> description tags; Keywords in the Heading Tags; Keyword Proximity; Keyword Prominence; Keywords in the Link Anchor Text; Quality Outbound Links; Keywords in outbound links; Web Page Age; Web Page Size; Calculating the optimum number of words per page
On-Site Ranking Factors
Domain Name Keywords; Exact keyword matching; Partial keyword matching; Size or Quantity of Content; Estimating size of content; Linking Considerations; Internal link architecture; Pagination problems; Distributed internal link popularity; URL canonicalization; Freshness of Pages
Putting It All Together
Running the Script; Program directory structure; Final HTML Report; Report metrics summary
Summary
5. External Ranking Factors
External Links
Know Your Referrers; Utilizing Yahoo! Site Explorer; Parsing the TSV file; Quantity and Quality of External Links; Speed of backlink accumulation; Topical link relevance; Backlinks from expert sites and the Hilltop Algorithm; Backlinks from .edu and .gov domains; Backlinks from directories; Age of backlinks; Relative page position; Spilling PageRank effect; Relative popularity among peer sites
Broken Outbound Links
Handling Broken Links; Running linkchecker.pl
User Behavior Patterns
Analyzing the Search Engine Query Interface; First point of contact: the search form; Interactions with SERPs; Google Analytics; Google Toolbar; User Behavior Lessons
Website Performance and Website Age
Website Performance; Monitoring website performance; Website Age; Domain registration years; Expired domains; Plan for long-term benefits
Summary
6. Web Stats Monitoring
Web Server Logs and Formats
NCSA Common Format; NCSA Combined Format; NCSA Combined Format in Apache; Converting IIS W3C to NCSA Combined; Spotting a Web Spider in Web Logs
Popular Web Stats Tools
Using WebLog Expert
Number of Visitors; Unique Versus Total Visitors; Number of Hits; Number of Page Views; Referrers; Search Engine (Referral) Hits; Searches/Keywords; Web Spider Hits
Using AWStats
Using Webalizer
Tying Searches to Individual Web Pages
Web Spider Patterns
User Patterns
Filtering Specific Data
Types of Web Page Elements; Conversion tracking with web server logs
Summary
7. Google Webmaster Tools and Google Analytics
Google Webmaster Tools
Webmaster Tools Setup; Dashboard; The “Site configuration” Section; Sitemaps; Crawler access; Sitelinks; Change of address; Settings; The “Your site on the web” Section; Top search queries; Links to your site; Keywords; Internal links; Subscriber stats; The Diagnostics Section; Crawl errors; Crawl stats; HTML suggestions
Google Analytics
Installation and Setup; Navigating Google Analytics; Dashboard; Visitors Overview page; Benchmarking; Map Overlay page; New versus returning visitors; Languages; Visitor trending; Visitor pages; Browser capabilities; Network properties; Traffic Sources; Content; Goals; Defining goals; Defining funnels; Viewing goal stats; Google Analytics Shortcomings; Based on JavaScript; Metrics accuracy
Summary
8. Search Engine Traps
JavaScript Traps
JavaScript-Generated Content; JavaScript Dynamic Links and Menus; Ajax
Dynamic Widget Traps
Using Flash; Google’s support of Adobe Flash; Using Java Applets; Using ActiveX Controls
HTML Traps
Using Frames; Using Iframes; Using External DIVs; Using Graphical Text; Extremely Large Pages; Complex HTML and Formatting Problems
Website Performance Traps
Very Slow Pages; Web server compression; Web page caching
Error Pages
Session IDs and URL Variables
Splash or Doorway Pages
Robots.txt
Summary
9. Robots Exclusion Protocol
Understanding REP
Crawling Versus Indexing; Why Prohibit Crawling or Indexing?; New sites; Content duplication; REP and document security; Protecting directories with .htaccess; Website maintenance; Saving website bandwidth; Preventing website performance hits
More on robots.txt
Creation of robots.txt; Validation of robots.txt; Placement of robots.txt; Important Crawlers; Understanding the robots.txt Format; Robots.txt Directives; The Allow directive; The Disallow directive; The wildcard directives; The Sitemap location directive; The Crawl-delay directive; Case Sensitivity; Common robots.txt Configurations; Disallowing image crawling; Allowing Google and Yahoo!, but rejecting all others; Blocking Office documents; Blocking Internet Archiver; Summary of the robots.txt Directive
Robots Meta Directives
HTML Meta Directives; Mixing HTML meta directives; Targeting HTML meta tags; Yahoo!-specific directives; Google-specific directives; HTTP Header Directives
The nofollow Link Attribute
Dealing with Rogue Spiders
Reverse DNS Crawler Authentication
Summary
10. Sitemaps
Understanding Sitemaps
Why Use Sitemaps?; Crawl augmentation; Poor linking site structure; Crawling frequency; Content ownership; Page priority; Large sites; History of changes
HTML Sitemaps
HTML Sitemap Generators; Creating a custom HTML Sitemap generator
XML Sitemaps
XML Sitemap Format; Understanding <loc>; Understanding <lastmod>; Understanding <changefreq>; Understanding <priority>; XML Sitemap Example; Why Use XML Sitemaps?; XML Sitemap Auto-Discovery; Multiple XML Sitemaps; Sitemap Location and Naming; XML Sitemap Limitations; XML Sitemap Generators; XML Sitemap Validators; XML Sitemap Submissions; Using Google Webmaster Tools to submit Sitemaps; Using the ping method to submit Sitemaps; Automating ping submissions
Utilizing Other Sitemap Types
Pure Text (URL Listing) Sitemaps; News Sitemaps; RSS and Atom Sitemaps; Mobile Sitemaps; Video Sitemaps
Summary
11. Keyword Research
Keyword Strategy
Long Tail Keywords; Long tail keywords explained
Keywords and Language
The Importance of Word Stemming; Keyword Modifiers; Types of modifiers; Niche modifiers; Keyword combinations; Latent Semantic Indexing (LSI); Page and site theme
Keyword Research Process
Establish a Current Baseline; Compile a Draft List of Keywords You Wish to Target; Keyword brainstorming; Make use of localized terms; Utilize keyword stemming; Make use of generic keyword modifiers; Continue by finding related keywords; Finding related keywords; Make use of Microsoft Word; Using search engine keyword suggestions and related searches; Google keyword tools and resources; Google AdWords Keyword Tool; Google Sets; Google Trends; Google Insights for Search; Microsoft adCenter Labs; Entity association graph; Keyword group detection; Keyword mutation detection; Keyword forecast; Search funnels; Yahoo! keyword tools; Yahoo! Research; Yahoo! Search Marketing; Additional keyword research tools; Free keyword research tools; Commercially available keyword research tools; Evaluate Your Keywords; Estimating keyword competition, revisited; Estimating keyword search volume; Finalize Your Keyword List; Implement Your Strategy
Summary
12. Link Building
Precursors to Link Building
Start Building Your Reputation Early; Assess Your Current Situation; Emulate Your Competitors; Natural Link Acquisition; Link Sources and Link Quality
Elements of Link Building
Basic Elements; Take out the guesswork; Run a daily, weekly, or monthly email newsletter; Provide registered services; Link Bait; Website widgets in detail; Popular website widgets; Creating custom website widgets; Widget promotion and distribution; Social Bookmarking; Integrating social bookmarks on your site; Tracking your bookmarks; Stay away from too many bookmarking icons; Website Syndication; Understanding syndication formats; Feed readers; Understanding FeedBurner; FeedBurner features; Integrating FeedBurner with your site; Future of FeedBurner; Directories; Start with the top general directories; Continue with niche directories; Consider regional directories; Adding Your Links Everywhere; Submitting articles; Utilizing blog comments, newsgroups, and forum postings; Build a Complementary Site; Niche directory hubs; Awards websites; Site review websites; Site software
Summary
13. Competitor Research and Analysis
Finding Your Competition
Keyword-Based Competitor Research; The manual approach; Utilizing SEO tools and automation; Finding Additional Competitor Keywords; Using the Google AdWords Keyword Tool Website Content feature; Using Alexa keywords; Using Compete keywords; Additional competitive keyword discovery tools; Competitor Backlink Research; Basic ways to find competitor backlinks; Free backlink checkers; Commercially available backlink checkers
Analyzing Your Competition
Historical Analysis; Web Presence and Website Traffic Analysis; Number of sites; Finding subdomains; Hosting and ownership information; Geographical location information; Estimating domain worth; Determining online size; Estimating Website Traffic; Alexa Reach; Compete traffic estimates; Estimating Social Networking Presence; Technorati blog reactions and Authority; Digg.com (URL history); Additional tips
Competitor Tracking
Current State Competitor Auditing; Future State Tracking; Existing competitor tracking; New competitor detection; Automating Search Engine Rank Checking
Summary
14. Content Considerations
Becoming a Resource
Predictive SEO; Future events, buying cycles, and buzz information; Future events; Buying cycles; Buzz information; Short-Term Content; Unexpected buzz; Expected buzz; Long-Term Content; Content Balance; Organic content; Content Creation Motives; Engaging your visitors; Fortifying your web authority; Updating and supplementing existing information; Catching additional traffic
Content Duplication
Canonical Link Element; What is a canonical link?; Canonical link element format; The Catch-22; Possibility of an infinite loop; Multiple URLs; Trailing slash; Multiple slashes; WWW prefix; Domain misspellings; HTTP to HTTPS and vice versa; Fine-Grained Content Indexing; External Content Duplication; Mirror sites; Content syndication; Similar Pages; Deep-Linked Content; Sitemaps; Resurfacing; Navigation structure; Protecting Your Content; Preventing hot links
Content Verticals
Vertical Search; Google Images; Google News; Google Product Search
Summary
15. Social Networking Phenomenon
Social Platforms and Communities
Blogs; Twitter; Twitter tips; Twitter tools; Overcoming limitations; Content-Sharing Sites; YouTube; YouTube tools and resources; Flickr; Flickr tips; Flickr tools and resources; Podcasts; Podcast.com; ITunes podcast publishing; Social Bookmarking Sites; Digg; StumbleUpon; Social Networking Sites; Facebook; Facebook account types; Facebook tips; MySpace; LinkedIn
Social Media Strategy
Do Some Research; Understand the benefits; Understand the risks; Understand the process; Formulate Your Strategy; Implement Your Strategy; Reevaluate your strategy
Using Automation
Creating a Twitter Scheduler; Creating the database; Building the interface; Sending tweets; Scheduling tweets; Extending the application
Google and Social Media Sites
Real-Time Search
Twitter Real-Time Search; OneRiot Real-Time Search
Summary
16. Search Engine Marketing
The World of SEM
PPC Platforms; PPC platform selection; PPC Fundamentals; Know your variables; The SEM Process; The planning and research phase; The content creation phase; The campaign creation phase; The campaign monitoring and analysis phase; The campaign refinements phase
Google AdWords
AdWords Setup; Creating a Google account; Navigating through AdWords; Campaign Setup; Creating a campaign; Creating an ad group; Keyword Match Types; Broad match; Phrase match; Exact match; Negative match; Ad Setup; Anatomy of a text ad; Ad copy; Dynamic keyword insertion and keyword capitalization; AdWords Testing; Conversion and conversion rate; Improving conversions and ROI; A/B testing; Multivariate testing; AdWords Tips; Observe and track your competition; Experiment; Refine your ad copy; Try other platforms; Bidding; Keywords; Google Content Network or Google SERPs; Conversion rate
Google AdSense
AdSense Setup; AdSense Earnings; AdSense Website Setup; AdSense Tips; Unique content; Seamless blending; Strategic placement
SEM and SEO Unison
SEO Keyword Tuning with PPC Testing; Choosing Better Keywords
Summary
17. Search Engine Spam
Understanding Search Engine Spam
What Constitutes Search Engine Spam?; Google guidelines; Yahoo! guidelines; Bing guidelines; Search Engine Spam in Detail; Keyword stuffing; Hidden or small text; The <noscript> element; The text font color scheme; CSS; Tiny text; Cloaking; Doorway pages; Scraper sites; Link farms, reciprocal links, and web rings; Hidden links; Paid links; Blog, forum, and wiki spam; Acquiring expired domains; Acquiring misspelled domain names; Site hacking; What If My Site Is Penalized for Spam?; Requesting Google site reevaluation; Requesting Yahoo! site reevaluation; Requesting Bing site reevaluation
Summary
18. Industry Buzz
Bing
The Keyword Dashboard Tool
SearchWiki
SearchWiki in Action; Benefits of SearchWiki; Addressing SearchWiki Concerns
The nofollow Link Attribute
Format; Further Thoughts
Finding the Buzz
SEO-Related Sites Provided by Search Engines; Blog Sites
Summary
A. Script Listings
Chapter 2
spiderviewer.php
Chapter 3
layout1.html; layout2.html; layout3.html
Chapter 4
rankingfactors.pl
Chapter 5
linkchecker.pl; mymonitor.pl; inlinksAnalysis.pl
Chapter 6
searchPhraseReportGoogle.pl
Chapter 13
getRankings.pl
Chapter 15
sql.txt; config.php; index.php; add.php; delete.php; sendTweet.php; Crontab
Chapter 18
index.html; bParser.php; gParser.php; yParser.php
B. Ping Servers
Ping Server List
C. Programming Environment
Building Your Own Environment
Apache Web Server; Perl; PHP; MySQL
Utilizing Distribution Packages
Index
About the Author
Colophon
SPECIAL OFFER: Upgrade this ebook with O’Reilly

Content preview from SEO Warrior

Search Engine Anatomy

When I talk about important search engines, I am really talking about the “big three”: Google, Bing, and Yahoo! Search. At the time of this writing, all of these search engines are using their own search technologies.

Web spider–based search engines usually comprise three key components: the so-called web spider, a search or query interface, and underlying indexing software (an algorithm) that determines rankings for particular search keywords or phrases.
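
To make the three components concrete, here is a minimal Python sketch of a toy search engine. It is only an illustration under stated assumptions: the in-memory `PAGES` dictionary stands in for the Web, the function names `crawl`, `build_index`, and `search` are hypothetical, and the ranking rule (counting keyword occurrences) is far simpler than anything a real search engine uses.

```python
from collections import defaultdict

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
PAGES = {
    "http://example.com/": ("seo tips for beginners", ["http://example.com/a"]),
    "http://example.com/a": ("advanced seo link building tips", ["http://example.com/"]),
}


def crawl(start_url):
    """Spider: follow links from start_url and collect each page's text."""
    seen, queue, collected = set(), [start_url], {}
    while queue:
        url = queue.pop(0)
        if url in seen or url not in PAGES:
            continue
        seen.add(url)
        text, links = PAGES[url]
        collected[url] = text
        queue.extend(links)
    return collected


def build_index(collected):
    """Indexer: map each word to the URLs that contain it, with a count."""
    index = defaultdict(dict)
    for url, text in collected.items():
        for word in text.lower().split():
            index[word][url] = index[word].get(url, 0) + 1
    return index


def search(index, query):
    """Query interface: rank URLs by total occurrences of the query words."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for url, count in index.get(word, {}).items():
            scores[url] += count
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(scores, key=lambda u: (-scores[u], u))


index = build_index(crawl("http://example.com/"))
results = search(index, "link building")
```

Here `results` holds only the page that mentions "link building". A query such as "seo tips" matches both pages equally, so the tie is broken by URL.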

Spiders, Robots, Bots, and Crawlers

The terms spider, robot, bot, and crawler all refer to the same thing: automated programs that traverse the Internet so that their respective search engines can index as many websites, and their associated web documents, as possible.

Not all spiders are “good.” Rogue web spiders come and go as they please, and can scrape your content from areas you want to block. Good, obedient spiders conform to the Robots Exclusion Protocol (REP), which we will discuss in Chapter 9.
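
As a small illustration of what "obedient" means in practice, the following Python sketch uses the standard library's `urllib.robotparser` to check a URL against robots.txt rules before fetching it. The robots.txt content, the `ExampleBot` user agent, and the `may_fetch` helper are assumptions made up for this example; a real spider would download the file from the site it is crawling.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for www.example.com.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_fetch(user_agent, url):
    """Return True if the parsed rules allow user_agent to crawl url."""
    return parser.can_fetch(user_agent, url)


allowed = may_fetch("ExampleBot", "http://www.example.com/index.html")
blocked = may_fetch("ExampleBot", "http://www.example.com/private/report.html")
```

An obedient spider skips any URL for which the check returns `False`. A rogue spider simply never performs the check, which is why robots.txt is not a security mechanism.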

Web spiders in general, just like regular users, can be tracked in your web server logs or your web analytics software. For more information on web server logs, see Chapters 6 and 7.
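
The sketch below shows one way to spot spider visits in a web server log. It assumes log lines in the NCSA Combined format and matches a short, illustrative list of user-agent substrings; the sample lines, the `SPIDER_TOKENS` list, and the `spider_hits` function are hypothetical and would need adjusting for a real log.

```python
import re

# One NCSA Combined log line: host, identity, user, [time], "request",
# status, bytes, "referrer", "user agent".
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"$'
)

# Illustrative user-agent substrings only; real lists are much longer.
SPIDER_TOKENS = ("googlebot", "slurp", "msnbot", "bingbot")


def spider_hits(lines):
    """Return (host, path) pairs for requests whose user agent looks like a spider."""
    hits = []
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        if not match:
            continue  # skip lines that are not in the expected format
        agent = match.group("agent").lower()
        if any(token in agent for token in SPIDER_TOKENS):
            hits.append((match.group("host"), match.group("path")))
    return hits


# Hypothetical sample log entries.
SAMPLE = [
    '66.249.66.1 - - [10/Oct/2009:13:55:36 -0700] "GET /robots.txt HTTP/1.1" '
    '200 2326 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.10 - - [10/Oct/2009:13:56:01 -0700] "GET /index.html HTTP/1.1" '
    '200 5120 "http://www.example.com/" "Mozilla/5.0 (Windows; U; Windows NT 5.1) Firefox/3.5"',
]

hits = spider_hits(SAMPLE)
```

Matching on the user-agent string alone is a convenience, not proof of identity, since any client can claim to be a well-known spider.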

Web spiders crawl not only web pages, but also many other files, including robots.txt, sitemap.xml, and so forth. There are many web spiders. For a list of known web spiders, see http://www.user-agents.org/.
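
For a sense of what a spider gets out of a file such as sitemap.xml, here is a minimal Python sketch that lists the page URLs in an XML Sitemap. The sample document and the `sitemap_urls` function are assumptions for illustration; the namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Hypothetical sitemap.xml content for www.example.com.
SITEMAP_XML = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc><lastmod>2009-11-01</lastmod></url>
  <url><loc>http://www.example.com/about.html</loc></url>
</urlset>
"""

# Namespace prefix mapping used in the element path below.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the <loc> values listed in an XML Sitemap, in document order."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS)]


urls = sitemap_urls(SITEMAP_XML)
```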

These spiders visit websites randomly. Depending on the freshness and size of your website’s content, ...


Publisher Resources

ISBN: 9780596804749
