I am running a free application on GAE's Python runtime, with the maximum number of idle instances set to 1.
According to http://code.google.com/appengine/docs/adminconsole/instances.html,
"Your application's latency has the biggest impact on the number of instances needed to serve your traffic. If you service requests quickly, a single instance can handle a lot of requests."
This seems to suggest that setting the latency slider in 'Application Settings' to its minimum would be best.
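To make the quoted statement concrete, here is a rough back-of-the-envelope sketch. It assumes an instance handles requests one at a time (my assumption, not something the docs quoted above state), and the function name and parameters are made up for illustration:

```python
def max_requests_per_second(latency_seconds, concurrent_requests=1):
    """Rough upper bound on the throughput of one instance.

    Assumes every request takes `latency_seconds` to serve and that the
    instance serves `concurrent_requests` at a time (1 = strictly serial).
    Both parameters are illustrative assumptions, not GAE settings.
    """
    if latency_seconds <= 0:
        raise ValueError("latency_seconds must be positive")
    return concurrent_requests / latency_seconds


if __name__ == "__main__":
    # Compare a fast handler with a slow one on a single instance.
    for latency in (0.1, 0.5):
        print(latency, max_requests_per_second(latency))
```

Under these assumptions, the faster the handler responds, the more requests a single instance can serve per second, which is how I read the docs.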
However, according to http://code.google.com/appengine/docs/adminconsole/performancesettings.html#Setting_the_Minimum_Pending_Latency,
it seems like a high minimum pending latency is good for keeping load spikes from spinning up new instances.
So is latency basically a tradeoff between the ability to respond to request spikes (high latency) and the number of requests handled over a given time period (low latency)?
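To show what I mean by the spike case, here is a toy model of my understanding. It is not the real GAE scheduler: it assumes a single instance serving requests serially, and it assumes a new instance would be started as soon as any request waits in the pending queue longer than the minimum pending latency. All names are hypothetical.

```python
def pending_waits(arrival_times, service_time):
    """Return how long each request waits before a single serial instance
    starts serving it (toy model, not the real scheduler)."""
    free_at = 0.0  # time at which the instance next becomes idle
    waits = []
    for arrival in sorted(arrival_times):
        start = max(arrival, free_at)
        waits.append(start - arrival)
        free_at = start + service_time
    return waits


def would_start_new_instance(arrival_times, service_time, min_pending_latency):
    """True if any request waits longer than the (assumed) threshold."""
    waits = pending_waits(arrival_times, service_time)
    return any(wait > min_pending_latency for wait in waits)


if __name__ == "__main__":
    burst = [0.0] * 5  # five requests arriving at the same moment
    print(would_start_new_instance(burst, 0.1, min_pending_latency=0.05))
    print(would_start_new_instance(burst, 0.1, min_pending_latency=1.0))
```

In this model the same burst starts a new instance when the threshold is low and is absorbed by the existing instance when it is high, at the cost of the later requests waiting longer.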