Resilient Kubernetes Deployments: PreStop Hooks and Application Signals
Quick Points
- Kubernetes offers many tools for making your deployments rock solid
- However, these tools are only useful if your application is built to take advantage of them
- PreStop hooks give your program a chance to finish any work it needs to before it is shut down
- Handling SIGTERM properly, and finishing cleanup before SIGKILL arrives, can increase application stability
Kubernetes Container Lifecycle
When a pod is marked for deletion, either by a kubectl command or by some other means
(deployment rollout, HPA scale down, etc.), a number of things start to happen:
- The Kubernetes API updates the pod state
- Kubernetes determines if it should remove the pod from any service endpoints
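- Any preStop hooks defined on the pod are run
- The preStop hook completes or terminationGracePeriodSeconds passes
- A SIGTERM signal is sent to the main process of each container in the pod
- If the pod is still running when the grace period expires (terminationGracePeriodSeconds, 30s by default, counted from the start of termination), a SIGKILL signal is sent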
Because of this flow, we can use the preStop hook configuration and build our application
to handle SIGTERM and finish its work before SIGKILL arrives (SIGKILL itself cannot be
caught), so that our users are not interrupted during deployment rollouts or when a pod
shuts down during a scaling event.
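To make the timing concrete, here is a rough Python sketch of the flow described above. The name shutdown_timeline and its fields are made up for illustration. The model assumes the grace period clock starts when termination begins and covers the preStop hook as well, and it ignores edge cases such as any brief extension the kubelet may grant.

```python
from dataclasses import dataclass


@dataclass
class ShutdownTimeline:
    sigterm_at: float      # seconds after termination starts when SIGTERM is sent
    sigkill_at: float      # seconds after termination starts when SIGKILL is sent
    cleanup_budget: float  # time the app has between SIGTERM and SIGKILL


def shutdown_timeline(pre_stop_seconds: float,
                      grace_period_seconds: float = 30.0) -> ShutdownTimeline:
    """Approximate when each signal arrives (simplified model, see assumptions)."""
    # Assumption: SIGTERM follows the preStop hook, which is cut off if it
    # outlasts the grace period.
    sigterm_at = min(pre_stop_seconds, grace_period_seconds)
    # Assumption: SIGKILL is sent when the grace period runs out.
    sigkill_at = grace_period_seconds
    return ShutdownTimeline(sigterm_at, sigkill_at, sigkill_at - sigterm_at)
```

Under this model, a 20 second preStop hook with a 60 second grace period leaves the application about 40 seconds between SIGTERM and SIGKILL, while a 45 second hook with the default 30 second grace period leaves none.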
Define a PreStop Hook
Below is a minimal deployment manifest that configures a preStop hook. All it does is make the pod sleep for 20 seconds, delaying the shutdown. Oftentimes, this is enough to let your pod finish the work it's doing without impacting users.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-name
  labels:
    app: example-app-label
spec:
  selector:
    matchLabels:
      app: example-app-label
  template:
    metadata:
      labels:
        app: example-app-label
    spec:
      terminationGracePeriodSeconds: 60
      containers:
        - name: example-container-name
          image: example-image-name:latest
          lifecycle:
            preStop:
              exec:
                command:
                  - /bin/sh
                  - -c
                  - sleep 20
But you can also run a script you have included in your image:
exec:
  command:
    - /bin/sh
    - -c
    - ./custom-shutdown-script.sh
or make a curl request to an endpoint of your app:
exec:
  command:
    - /bin/sh
    - -c
    - |
      curl -X POST localhost:8080/custom-shutdown-endpoint
      sleep 20
All of these are valid. Which one you pick depends on the type of application, what a "graceful shutdown" means for it, and the tools you need to achieve it. If your application depends on some process you don't control (e.g. nginx), you can use this shutdown hook to run any scripts needed to shut that process down. An example of this can be found in the ingress-nginx deployment, which specifies the following preStop hook:
preStop:
  exec:
    command:
      - /wait-shutdown
which eventually reaches some Go code that executes this line:
err := exec.Command("bash", "-c", "pkill -SIGTERM -f nginx-ingress-controller").Run()
So the ingress controller process receives the SIGTERM signal before the pod does and can shut down
gracefully. The preStop hook is the last chance to prepare your application to be shut down.
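For the curl variant shown earlier, the application has to expose an endpoint for the hook to call. The following is only a sketch of what such an endpoint might do, written in Python with the standard library. The /ready path, the make_server helper, and the choice to fail readiness while draining are assumptions for illustration; only the /custom-shutdown-endpoint path and port 8080 come from the example above.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Set once the preStop hook has asked the application to start draining.
draining = threading.Event()


class Handler(BaseHTTPRequestHandler):
    def _reply(self, status: int, body: str) -> None:
        payload = body.encode()
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def do_POST(self) -> None:
        if self.path == "/custom-shutdown-endpoint":
            draining.set()  # stop accepting new work from here on
            self._reply(202, "draining\n")
        else:
            self._reply(404, "not found\n")

    def do_GET(self) -> None:
        # Hypothetical readiness path: report "not ready" while draining.
        if self.path == "/ready":
            if draining.is_set():
                self._reply(503, "draining\n")
            else:
                self._reply(200, "ok\n")
        else:
            self._reply(404, "not found\n")

    def log_message(self, format, *args) -> None:
        pass  # keep the sketch quiet


def make_server(host: str = "", port: int = 8080) -> ThreadingHTTPServer:
    """Build the server; call serve_forever() on the result to run it."""
    return ThreadingHTTPServer((host, port), Handler)
```

Running make_server().serve_forever() starts the server on port 8080. The sleep that follows the curl call in the hook then gives in-flight requests time to finish before SIGTERM is sent.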
SIGTERM and SIGKILL
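If your application shuts down cleanly, SIGTERM will be the last signal it receives before it exits with code 0. SIGKILL, by contrast, cannot be caught or handled, so all cleanup has to happen in response to SIGTERM. Below are examples in a few languages of how this could be done.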
C# APIs (terminal apps can use PosixSignalRegistration instead)
// Use IHostApplicationLifetime to handle the ApplicationStopping event.
using System;
using Microsoft.Extensions.Hosting;

public class MyService
{
    public MyService(IHostApplicationLifetime appLifetime)
    {
        // ApplicationStopping is triggered when the host starts a graceful shutdown
        appLifetime.ApplicationStopping.Register(() =>
        {
            // Code to run when SIGTERM is received
            Console.WriteLine("SIGTERM received. Cleaning up...");
        });
    }
}
Node
import { createServer, Server } from 'http';

const server: Server = createServer(/* your request handler here */);

function customShutdown() {
  console.log('Shut me down please');
  // Stop accepting new connections and wait for in-flight requests to finish
  server.close(() => {
    // Close any other connections or dispose of other resources here
    process.exit(0);
  });
  // Force close the server after 5 seconds
  setTimeout(() => {
    console.error('Timeout hit. Forcefully shutting down');
    process.exit(1);
  }, 5000);
}

process.on('SIGTERM', customShutdown);
Python
import signal
import sys
import time

shuttingDown = False


def shutdownRequested(signum, frame):
    global shuttingDown
    shuttingDown = True  # Signal main loop to stop


signal.signal(signal.SIGTERM, shutdownRequested)


def myApplication():
    while not shuttingDown:
        # Your application logic goes here
        time.sleep(1)  # Placeholder for one unit of work
    print("Graceful shutdown complete")
    sys.exit(0)


if __name__ == "__main__":
    myApplication()
Each language has its own way to bind to SIGTERM. No language can bind to SIGKILL, because the operating system terminates the process without giving it a chance to react, so all cleanup has to be triggered by SIGTERM. Doing the research is worth it to ensure that DB connections
are closed, files are closed/saved, and any other housekeeping tasks are completed by your
application when it is being terminated.
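The housekeeping described above can be sketched in Python as follows. GracefulShutdown, can_register, and run are hypothetical helpers, and the cleanup steps stand in for real work such as closing DB connections. The sketch also shows that the operating system refuses a handler for SIGKILL, which is why all cleanup hangs off SIGTERM. It assumes a Unix-like system and that handlers are installed from the main thread.

```python
import signal


class GracefulShutdown:
    """Records whether SIGTERM has arrived (hypothetical helper)."""

    def __init__(self) -> None:
        self.requested = False

    def handle(self, signum, frame) -> None:
        # Keep the handler tiny: only record that shutdown was requested.
        self.requested = True


def can_register(sig: int, handler) -> bool:
    """Try to install a handler; return False if the OS refuses."""
    try:
        signal.signal(sig, handler)
        return True
    except OSError:
        # Raised for signals that cannot be caught, such as SIGKILL.
        return False


def run(shutdown: GracefulShutdown, work_step, cleanup_steps) -> list:
    """Call work_step until SIGTERM is seen, then run every cleanup step in order."""
    while not shutdown.requested:
        work_step()
    return [step() for step in cleanup_steps]
```

A typical use would register shutdown.handle for signal.SIGTERM at startup and pass functions that close connections and flush files as cleanup_steps. Keeping each work step short matters here, because the loop only notices the request between steps.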
Conclusion
By combining preStop hooks with custom SIGTERM handling that finishes before SIGKILL arrives, you can help ensure your application saves user data properly, keeps connections from dropping, and performs
any cleanup necessary to keep your application and your business running well. Keep it simple: many
applications don't need any custom logic to shut down gracefully. But these
are another set of tools you can reach for when problems start to occur as you scale.