This section focuses on design considerations related to infrastructure, including scalability, caching, and rate limiting.
How big should your API infrastructure be? The rule of thumb is “Design for the dream audience but provision for the expected load.” Few API programs go from zero to 500 million daily requests overnight, and it’s unwise to approach management with a large budget request unless there is a strong business case for the expense, or strong evidence that the API’s growth will justify the upfront cost. Even Twitter in its early days had frequent scaling problems (and frequent outages), but the service was so compelling and unique that users put up with the instability.
A growing number of new websites and APIs run on a scalable cloud platform, most often Amazon EC2, where the size of the initial hardware order is irrelevant: capacity can be scaled up or down quickly as needed (sometimes dynamically, depending on how the system is configured). Most API providers will do perfectly well running their systems in the cloud. That said, most of the biggest APIs (such as Google, Yahoo!, and Facebook) run out of their own data centers, on hardware they manage directly.
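To make the dynamic-scaling point concrete, here is a minimal sketch of an auto-scaling configuration in AWS CloudFormation syntax. The resource names are illustrative, the referenced launch configuration is assumed to be defined elsewhere in the template, and the specific sizes and CPU target are placeholder values, not recommendations:

```yaml
Resources:
  ApiServerGroup:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      MinSize: "2"          # floor: provision for the expected load
      MaxSize: "20"         # ceiling: headroom for the dream audience
      DesiredCapacity: "2"
      LaunchConfigurationName: !Ref ApiLaunchConfig   # assumed defined elsewhere
      AvailabilityZones: !GetAZs ''

  ScaleOnCpu:
    Type: AWS::AutoScaling::ScalingPolicy
    Properties:
      AutoScalingGroupName: !Ref ApiServerGroup
      PolicyType: TargetTrackingScaling
      TargetTrackingConfiguration:
        PredefinedMetricSpecification:
          PredefinedMetricType: ASGAverageCPUUtilization
        TargetValue: 60.0   # add or remove instances to hold ~60% average CPU
```

With a policy like this, the platform adds instances as traffic grows and removes them when it subsides, which is why the size of the initial order matters so little.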
Some companies have established data centers and very specific requirements around latency or audit compliance that can be satisfied only by having physical access to the hardware the API runs on. Many API providers’ biggest issue with latency ...