Autoscaling is by far the most powerful form of scaling. Under this model, App Engine monitors key application metrics such as requests per second, latency, error rates, and resource utilization. As these metrics change, the App Engine scheduler decides whether to route additional requests to existing instances or to scale the service up or down. Beyond these performance metrics, the scheduler also takes into account factors such as request queue depth and application startup time, so that it can stay ahead of traffic spikes.
For more information on how the App Engine scheduler performs autoscaling, refer to https://cloud.google.com/appengine/docs/standard/python/how-instances-are-managed.
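To make the idea concrete, the following is a minimal toy sketch of the kind of decision described above. It is not App Engine's actual algorithm, which is internal to the platform; the `Metrics` and `Policy` classes, the threshold values, and the function names are all hypothetical and chosen only for illustration. The sketch sizes the service by throughput and by queue depth, takes the larger of the two, and clamps the result to configured bounds.

```python
import math
from dataclasses import dataclass


@dataclass
class Metrics:
    """A snapshot of observed load (hypothetical, for illustration only)."""
    requests_per_second: float
    pending_requests: int  # depth of the request queue


@dataclass
class Policy:
    """Illustrative scaling thresholds; these are not real App Engine settings."""
    target_rps_per_instance: float = 10.0
    max_pending_per_instance: int = 5
    min_instances: int = 1
    max_instances: int = 20


def desired_instances(metrics: Metrics, policy: Policy) -> int:
    """Return how many instances this toy model would like to have running."""
    # Instances needed to keep per-instance throughput at or below the target.
    by_throughput = math.ceil(
        metrics.requests_per_second / policy.target_rps_per_instance
    )
    # Instances needed to keep the per-instance queue at or below the limit.
    by_queue = math.ceil(
        metrics.pending_requests / policy.max_pending_per_instance
    )
    wanted = max(by_throughput, by_queue)
    # Clamp to the configured minimum and maximum.
    return max(policy.min_instances, min(policy.max_instances, wanted))


def decide(current: int, metrics: Metrics, policy: Policy) -> str:
    """Compare the desired count with the current one and name the action."""
    wanted = desired_instances(metrics, policy)
    if wanted > current:
        return "scale up"
    if wanted < current:
        return "scale down"
    return "hold"
```

For example, with the default thresholds, `decide(3, Metrics(45.0, 0), Policy())` asks for more capacity because 45 requests per second at a target of 10 per instance calls for five instances. A real scheduler would also weigh latency, errors, and instance startup time, which this sketch deliberately leaves out.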
Autoscaling ...