Web Caching

Name: Web Caching
Author: Duane Wessels
ISBN: 9781565925366

June 2001

Intermediate to advanced

320 pages

9h 18m

English

O'Reilly Media, Inc.

Web Caching
Preface
Audience
What You Will and Won’t Find Here
Caching Resources
Web Sites; Mailing Lists
Conventions Used in This Book
How To Contact Us
Acknowledgments
1. Introduction
Web Architecture; Clients and Servers; Proxies; Web Objects; Resource Identifiers
Web Transport Protocols
HTTP; FTP; SSL/TLS; Gopher
Why Cache the Web?
Latency; Bandwidth; Server Load
Why Not Cache the Web?
Types of Web Caches
Browser Caches; Caching Proxies; Surrogates
Caching Proxy Features
Meshes, Clusters, and Hierarchies
Products
2. How Web Caching Works
HTTP Requests; Origin Server Requests; Proxy Requests; Non-HTTP Proxy Requests
Is It Cachable?
Status Codes; Request Methods; Expiration and Validation; Cache-control; Authentication; Cookies; Dynamic Content
Hits, Misses, and Freshness
Hit Ratios
Validation
Last-modified Timestamps; Entity Tags; Weak and Strong Validators
Forcing a Cache to Refresh
The no-cache Directive; The max-age Directive; The min-fresh Directive
Cache Replacement
Least Recently Used (LRU); First In, First Out (FIFO); Least Frequently Used (LFU); Size; GreedyDual-Size (GDS); Other Algorithms
3. Politics of Web Caching
Privacy; Access Logs; Making Requests Anonymous
Request Blocking
Copyright
Does Caching Infringe?; Cases and Precedents; The DMCA; HTTP’s Role
Offensive Content
Dynamic Web Pages
Java Applets
Content Integrity
Cache Busting and Server Busting
Advertising
Trust
Effects of Proxies
4. Configuring Cache Clients
Proxy Addresses
Manual Proxy Configuration
Configuring Microsoft Internet Explorer; Configuring Netscape Navigator; NCSA Mosaic, Lynx, and Wget
Proxy Auto-Configuration Script
Writing a Proxy Auto-Configuration Function; Sample PAC Scripts; Setting the Proxy Auto-Configuration Script
Web Proxy Auto-Discovery
Other Configuration Options
The Bottom Line
5. Interception Proxying and Caching
Overview
The IP Layer: Routing
Inline Caches; Layer Four Switches; WCCP; Cisco Policy Routing
The TCP Layer: Ports and Delivery
Linux; ipchains; iptables; FreeBSD; Other Operating Systems
The Application Layer: HTTP
Debugging Interception
Issues
It’s Difficult for Users to Bypass; Packet Transport Service; Routing Changes; It Affects More Than Browsers and Users; No-Intercept Lists; Are Port 80 Packets Always HTTP?; HTTP Interoperation Problems; IP Interoperation Problems
To Intercept or Not To Intercept
6. Configuring Servers to Work with Caches
Important HTTP Headers; Date; Last-modified; Expires; Cache-control; Content-length
Being Cache-Friendly
Why?; Latency; Hiding network failures; Server load reduction; Ten Ways to be Cache-Friendly; Apache; The Expires header; General header manipulation; Setting headers from CGI scripts; How to Choose Expiration Times
Being Cache-Unfriendly
Other Issues for Content Providers
What About Dynamic Responses?; What About Advertisements?; Getting Accurate Access Counts
7. Cache Hierarchies
How Hierarchies Work
Why Join a Hierarchy?
Performance; Nondefault Routing
Why Not Join a Hierarchy?
Trust; Low Hit Ratios; Effects on Routing; Freshness; Large Families; Abuses, Real and Imagined; Error Messages; False Hits; Forwarding Loops; Failures and Service Denial
Optimizing Hierarchies
8. Intercache Protocols
ICP; History; Features; Hit prediction; Probing the network; Object data with hits; Source RTT measurements; Issues; Delays; Bandwidth; False hits; UDP; No request method; Queries for uncachable responses; Interoperation; Unwanted queries; Multicast ICP
CARP
HTCP
Issues
Cache Digests
Bloom Filters; Comparing Digests and ICP
Which Protocol to Use
9. Cache Clusters
The Hot Spare
Throughput and Load Sharing
Bandwidth
10. Design Considerations for Caching Services
Appliance or Software Solution; Appliances; Software
Disk Space
Memory
Network Interfaces
Operating Systems
High Availability
Intercepting Traffic
Load Sharing
Location
Using a Hierarchy
11. Monitoring the Health of Your Caches
What to Monitor?
Monitoring Tools
UCD-SNMP; RRDTool; Other Tools
12. Benchmarking Proxy Caches
Metrics; Throughput; Response Time; Hit Ratio; Connection Capacity; Cost
Performance Bottlenecks
Disk Throughput; CPU Power; NIC Bandwidth; Memory; Network State
Benchmarking Tools
Web Polygraph; Blast; Wisconsin Proxy Benchmark; WebJamma; Other Benchmarks
Benchmarking Gotchas
TCP Delayed ACKs; Port Number Exhaustion; NIC Duplex Mode; Bad Ethernet Cables; Full Caches; Test Duration; Long-Lived Connections; Small Working Sets; Clock Sync; MSL (TIME_WAIT) Values
How to Benchmark a Proxy Cache
Configure Systems; Test the Network; No-Proxy Test; Fill the Cache; Run the Benchmark
Sample Benchmark Results
Throughput; Response Time; Hit Ratio; Other Results
A. Analysis of Production Cache Trace Data
Reply and Object Sizes
Content Types
HTTP Headers
Client Request Headers; Client Reply Headers
Protocols
Port Numbers
Popularity
Size and Popularity
Cachability
Service Times
Hit Ratios
Object Life Cycle
Request Methods
Reply Status Code
B. Internet Cache Protocol
ICPv2 Message Format; Opcode; Version; Message Length; Reqnum; Options; Option Data; Sender Host Address; Payload
Opcodes
Option Flags
Experimental Features
Pointers; Object Advertisement; Request Notification; Object Removal and Invalidation; MD5 Object Keys; Eliminating URLs from Replies; Wiretapping; Prefetching
C. Cache Array Routing Protocol
Membership Table
Routing Function
Examples
D. Hypertext Caching Protocol
Message Format and Magic Constants; HEADER; DATA; AUTH
HTCP Data Types
COUNTSTR; SPECIFIER; DETAIL; IDENTITY
HTCP Opcodes
NOP; TST; TST request; TST response; MON; MON request; MON response; SET; SET request; SET response; CLR; CLR request; CLR response
E. Cache Digests
The Cache Digest Implementation; Keys; Hash Functions; Sizing the Filter; Selecting Objects for the Digest; False Hits and Digest Freshness; Exchanging Digests
Message Format
An Example
F. HTTP Status Codes
1xx Intermediate Status
2xx Successful Response
3xx Redirects
4xx Request Errors
5xx Server Errors
G. U.S.C. 17 Sec. 512. Limitations on Liability Relating to Material Online
List of Acronyms
H. Bibliography
Books and Articles
Request For Comments
Index
Colophon

Content preview from Web Caching

Proxy Auto-Configuration Script

The proxy auto-configuration (PAC) technique is designed to fix many of the manual configuration problems described previously. Instead of using static proxy addresses, the browser executes a function for every request. This function returns a list of proxy addresses that the browser tries until the request is successfully forwarded.

The PAC function is written in JavaScript. In theory, any browser that supports JavaScript can also support PAC. Netscape invented the PAC feature, and it was first available in Version 2 of their browser. Microsoft added PAC support to MSIE Version 3.
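As a rough illustration of the idea, here is a minimal sketch of a PAC function. The entry point `FindProxyForURL(url, host)` and the `PROXY host:port` / `DIRECT` return syntax are the standard PAC conventions; the proxy hostnames, the port, the `example.com` domain, and the `hostInDomain` helper are hypothetical placeholders chosen for this example. Real PAC environments also supply helper functions of their own, which this sketch deliberately avoids so that it stays self-contained.

```javascript
// Hypothetical proxy list: try proxy1, then proxy2, then connect directly.
var PROXY_LIST = "PROXY proxy1.example.com:3128; " +
                 "PROXY proxy2.example.com:3128; DIRECT";

// Hypothetical helper: true if host is the domain itself or a subdomain of it.
function hostInDomain(host, domain) {
    var suffix = "." + domain;
    return host === domain || host.slice(-suffix.length) === suffix;
}

// The browser calls this function for every request and uses the
// returned string as an ordered list of proxies to try.
function FindProxyForURL(url, host) {
    // Unqualified names (no dot) are assumed to be local machines.
    if (host.indexOf(".") === -1) {
        return "DIRECT";
    }
    // Assumed policy: hosts in our own domain bypass the proxy.
    if (hostInDomain(host, "example.com")) {
        return "DIRECT";
    }
    // Everything else goes through the proxies.
    return PROXY_LIST;
}
```

With this sketch, a request for `http://intranet/` or `http://www.example.com/` would go direct, while a request for any other host would be offered the two proxies in order.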

Both the Netscape and Microsoft browsers retrieve the PAC script from a URL, and getting that URL into the browser is perhaps the biggest drawback to proxy auto-configuration. Either someone must enter the PAC URL in a pop-up window, or the browser must be preconfigured with it.

The best thing about proxy auto-configuration is that it allows administrators to reconfigure the browsers without further intervention from the users. If the proxy address changes, the administrator simply edits the PAC script to reflect the change. Browsers fetch the PAC URL every time they are started, but apparently not while they are running, unless the user forces a reload.

Another very nice feature is failure detection, coupled with the ability to specify multiple proxy addresses. If the first proxy in the list is not available, the browser tries the next entry, and so on until the end of the ...
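The failover behavior described above can be sketched as a small simulation. This is not how any particular browser implements it; the function names `parseProxyList` and `chooseProxy`, and the `isAvailable` callback standing in for a real connection attempt, are assumptions made for illustration. Only the semicolon-separated `PROXY host:port` / `DIRECT` list format comes from the PAC convention.

```javascript
// Split a PAC result string into ordered entries,
// e.g. "PROXY a:3128; DIRECT" -> [{type:"PROXY", address:"a:3128"}, {type:"DIRECT", address:null}]
function parseProxyList(result) {
    return result.split(";")
        .map(function (s) { return s.trim(); })
        .filter(function (s) { return s.length > 0; })
        .map(function (entry) {
            var parts = entry.split(/\s+/);
            return { type: parts[0], address: parts[1] || null };
        });
}

// Walk the list in order and return the first usable entry.
// isAvailable(address) is a hypothetical stand-in for attempting a connection.
function chooseProxy(result, isAvailable) {
    var entries = parseProxyList(result);
    for (var i = 0; i < entries.length; i++) {
        // DIRECT needs no proxy, so this sketch treats it as always usable.
        if (entries[i].type === "DIRECT" || isAvailable(entries[i].address)) {
            return entries[i];
        }
    }
    return null; // every entry failed
}
```

For example, given `"PROXY a:3128; PROXY b:3128; DIRECT"` and a callback that reports only `b:3128` as reachable, `chooseProxy` skips the first entry and returns the second.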


Publisher Resources

ISBN: 156592536X
