Shed Load
Services, microservices, websites, and open APIs all share one characteristic: they have zero control over their demand. At any moment, more than a billion devices could make a request. No matter how strong your load balancers or how fast you can scale, the world can always make more load than you can handle.
At the network level, TCP copes with a flood of connection attempts via the listen queue. Every incomplete connection goes into a queue per port. It’s up to the application to accept the connections. When the queue is full, new connection attempts are rejected with a TCP RST (reset) packet.
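To make the listen queue concrete, here is a minimal Python sketch of a listener with a deliberately small backlog. The helper names `open_listener` and `accept_one` are hypothetical, chosen only for illustration. The kernel treats the backlog value as a hint and may round or clamp it, and exactly what a client sees when the queue is full depends on the operating system, so the sketch shows only the queue-then-accept handoff, not the overflow case.

```python
import socket


def open_listener(backlog: int, host: str = "127.0.0.1") -> socket.socket:
    """Open a TCP listener whose pending-connection queue is sized by `backlog`."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, 0))   # port 0: let the OS pick a free port
    server.listen(backlog)   # a hint; the kernel may round or clamp it
    return server


def accept_one(server: socket.socket) -> bytes:
    """Take one connection off the listen queue and read what the client sent."""
    conn, _addr = server.accept()
    with conn:
        return conn.recv(1024)


if __name__ == "__main__":
    server = open_listener(backlog=2)
    port = server.getsockname()[1]

    # The kernel completes the handshake on the application's behalf, so
    # connect() succeeds here before accept() has been called.
    client = socket.create_connection(("127.0.0.1", port), timeout=2)
    client.sendall(b"ping")

    print(accept_one(server))
    client.close()
    server.close()
```

The client connects and sends data before the server ever calls `accept()`. The connection simply waits in the queue until the application gets around to it, which is why a slow application lets the queue fill up.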
TCP can’t save us entirely, though. Services often fall over before the connection queue fills up. When that happens, it’s almost always ...