How to Scale Down Safely Without Interrupting Traffic or Jobs?

When an app is scaled down (including when the old version is scaled down during a deploy), each replica being removed is sent the SIGTERM signal.

If your application stops accepting new requests¹ at this point, the replica is removed from the load balancer, so new requests are no longer routed to it. However, any requests already in flight on that replica still need to be handled and completed by it before the next event (SIGKILL).

We then wait 30 seconds for your app to shut down. If it hasn’t shut down after 30 seconds, it is sent SIGKILL and terminated.

If your application doesn’t stop accepting new requests, you have a race condition for requests that arrive around the SIGKILL, which means you could drop requests. Additionally, if a request takes longer than 30 seconds to complete, it can also be dropped.
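The sequence above (SIGTERM, stop accepting new requests, finish in-flight requests within 30 seconds) can be sketched in a language-agnostic way. The Python below is only an illustration under stated assumptions: RequestTracker, install_sigterm_handler, and GRACE_PERIOD_SECONDS are hypothetical names, not part of Gigalixir or of any framework, and how your app actually refuses new requests depends on your stack and is not shown.

```python
import signal
import threading

# Time between SIGTERM and SIGKILL, taken from the description above.
GRACE_PERIOD_SECONDS = 30


class RequestTracker:
    """Hypothetical helper: counts in-flight requests and refuses new ones
    once draining has started."""

    def __init__(self):
        self._idle = threading.Condition()
        self._in_flight = 0
        self.draining = False

    def start_draining(self, signum=None, frame=None):
        # Usable directly as a signal handler. It only flips a flag, so it
        # never waits on a lock that the interrupted code may already hold.
        self.draining = True

    def try_begin(self):
        """Return True if a new request may start, False once draining."""
        with self._idle:
            if self.draining:
                return False
            self._in_flight += 1
            return True

    def end(self):
        """Mark one request as finished and wake anyone waiting for idle."""
        with self._idle:
            self._in_flight -= 1
            if self._in_flight == 0:
                self._idle.notify_all()

    def wait_until_idle(self, timeout=GRACE_PERIOD_SECONDS):
        """Block until nothing is in flight; return False on timeout."""
        with self._idle:
            return self._idle.wait_for(lambda: self._in_flight == 0, timeout)


def install_sigterm_handler(tracker):
    # signal.signal must be called from the main thread.
    signal.signal(signal.SIGTERM, tracker.start_draining)
```

In this sketch, a server would call try_begin() before handling each request and end() afterwards. Once draining is set, it would call wait_until_idle() with a budget somewhat below 30 seconds and then exit on its own, so that the SIGKILL never has anything left to interrupt.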

You can test your application’s response to this by running a deploy and sending requests during the deployment while watching the output of the gigalixir ps command.
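One way to generate the requests for such a test is a small probe script. The following is a minimal sketch, not a Gigalixir tool: fetch_status and probe are hypothetical names, the URL is whatever endpoint your app exposes, and treating 5xx responses and connection errors as failures is an assumption you may want to adjust.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url, timeout=5.0):
    """Return the HTTP status code, or None if the request failed outright."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (4xx or 5xx).
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, reset, timed out, DNS failure, and so on.
        return None


def probe(url, attempts, interval=0.2, fetch=fetch_status):
    """Request `url` repeatedly; return a list of (attempt, status) failures.

    A failure is a request with no response at all (status None) or a 5xx
    status. `fetch` is injectable so the loop can be tested offline.
    """
    failures = []
    for attempt in range(attempts):
        status = fetch(url)
        if status is None or status >= 500:
            failures.append((attempt, status))
        time.sleep(interval)
    return failures
```

For example, you could start probe("https://your-app.example/", attempts=300) in one terminal (the URL is a placeholder), trigger the deploy, and watch gigalixir ps in another. An empty result suggests no requests were dropped during the scale down; any entries show which attempts failed and how.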

If the new pod shows HEALTHY status and your old pod is still responding to requests, then your app is still accepting requests after the SIGTERM.

¹ The article “Application stops accepting new requests” elaborates on this situation and gives some advice on how to gracefully shut down the pod.