Handling SIGTERM in Gunicorn
Creating a custom worker to clean up the application before termination
The problem
I was working on a service that runs on Google Cloud Run. It is written in Python, exposes a basic API with Flask, and uses Gunicorn to start the application.
It is a simple application, but it needed some control over what happens when its memory limit is exceeded: it had to send a notification to another application.
Cloud Run, in its container runtime contract, defines that the application may gracefully handle shutdowns caused by normal or forced terminations (for example, when the container exceeds its memory limit).
Before Cloud Run terminates an instance, it sends signals to all of its containers. The first signal is SIGTERM, which starts the first stage of termination and has a timeout of 10 seconds. When that time expires, SIGKILL is sent and the container is terminated completely.
You can find a basic implementation of how to handle signals in the Google Cloud Run documentation.
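To make the two-stage shutdown concrete, here is a minimal sketch of catching SIGTERM in a plain Python process (outside Gunicorn, which installs its own handlers in each worker). The names on_sigterm, seconds_left and CLEANUP_BUDGET_SECONDS are illustrative assumptions, and the 8-second budget is just an arbitrary margin under the 10-second window.

```python
import signal
import time

# Hypothetical budget: stay safely under the 10-second window described above.
CLEANUP_BUDGET_SECONDS = 8.0

# Set when SIGTERM arrives; None while the process is running normally.
shutdown_deadline = None


def on_sigterm(signum, frame):
    """Record when cleanup must be finished; keep the handler itself short."""
    global shutdown_deadline
    shutdown_deadline = time.monotonic() + CLEANUP_BUDGET_SECONDS


def seconds_left():
    """Return the remaining cleanup time, or None if no SIGTERM was received."""
    if shutdown_deadline is None:
        return None
    return max(0.0, shutdown_deadline - time.monotonic())


def install_handler():
    # signal.signal must be called from the main thread of the main interpreter.
    signal.signal(signal.SIGTERM, on_sigterm)
```

After calling install_handler() at startup, cleanup code can check seconds_left() before each step and skip optional work once the budget runs low.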
The strategy
For the application I was working on, the following strategy was required:
A job request is sent by Service A (a third-party application) to Service B (my application).
If Service B is terminated while jobs are incomplete, I needed to finish the API request successfully and tell another service, Service C (a monitoring agent), that the jobs were incomplete.
So, I had to:
- Handle signals and notify Service C of pending/incomplete jobs.
- Close the current HTTP requests from Service A to the API (Service B) with a success code (HTTP 200).
The implementation
All of this was accomplished by creating a custom Gunicorn worker class for the Service B application. With the custom class, I was able to handle SIGTERM and also access the underlying network sockets.
Just make sure that the routine finishes before the SIGTERM timeout, which for Cloud Run is just 10 seconds.
The basic idea is to inherit from an existing worker, in this case SyncWorker.
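The original listing is not reproduced here, so the following is only a sketch of the idea under stated assumptions, not the author's code. It is written as a mixin so it can be read and tried without Gunicorn installed: it assumes the base class provides handle(listener, client, addr) and handle_exit(sig, frame), as Gunicorn's SyncWorker does. GracefulExitMixin, OK_RESPONSE and notify_incomplete_jobs are hypothetical names.

```python
import socket

# Minimal HTTP/1.1 response sent to clients that are still waiting at shutdown.
OK_RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Length: 0\r\n"
    b"Connection: close\r\n"
    b"\r\n"
)


class GracefulExitMixin:
    """Hypothetical mixin meant to be placed in front of a Gunicorn worker class."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.clients = []  # sockets of the requests currently being served

    def handle(self, listener, client: socket.socket, addr):
        # Remember the socket for as long as the request is in flight.
        self.clients.append(client)
        try:
            super().handle(listener, client, addr)
        finally:
            if client in self.clients:
                self.clients.remove(client)

    def handle_exit(self, sig, frame):
        # Runs on SIGTERM: report unfinished jobs, then answer open clients.
        self.notify_incomplete_jobs()
        for client in list(self.clients):
            try:
                client.sendall(OK_RESPONSE)
                client.close()
            except OSError:
                pass  # the peer may already be gone
        self.clients.clear()
        super().handle_exit(sig, frame)

    def notify_incomplete_jobs(self):
        # Placeholder: the call to Service C is application-specific.
        pass
```

With Gunicorn installed, the worker itself could then be declared as class CustomWorker(GracefulExitMixin, SyncWorker): pass, importing SyncWorker from gunicorn.workers.sync. Note that the interrupted request resumes after the signal handler returns and will find its socket already closed, so socket errors at that point are expected.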
After implementing the custom worker, start Gunicorn with the -k parameter pointing to the import path of the new worker.
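The same choice can also be made in a Gunicorn configuration file, which is plain Python. The sketch below assumes the worker class lives in a module named custom_worker.py; that module name is hypothetical, as is the fallback port. On the command line the equivalent would be passing custom_worker.CustomWorker to -k.

```python
# gunicorn.conf.py
import os

# Hypothetical import path: assumes CustomWorker is defined in custom_worker.py.
worker_class = "custom_worker.CustomWorker"

# Cloud Run passes the port to listen on in the PORT environment variable.
bind = "0.0.0.0:" + os.environ.get("PORT", "8080")
```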
Notes
- The implementation is not failure-proof, but it can be a starting point for distributed systems with a fallback system.
- We talk about Cloud Run here, but the same approach can be used anywhere a Gunicorn application runs; Cloud Run just happened to be where I needed such a solution.
Conclusion
When the CustomWorker, instantiated by Gunicorn, accepts a new client connection, it saves the client in a list. This client instance is used later to send the HTTP 200 response.
On every request to Service B, an id that identifies the job is stored inside the WSGI application. It is removed when the job finishes normally. When the worker receives SIGTERM, we do two things:
- Check for incomplete jobs, stored inside the WSGI application, and notify Service C.
- Try to send a successful response to any open client sockets and then close them.
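The job bookkeeping described above could be sketched as a small WSGI wrapper. This is an illustration under assumptions, not the code from the repository: JobTrackingApp, report_incomplete, the X-Job-Id header and the notify callable (standing in for the request to Service C) are all hypothetical. A Gunicorn worker keeps the loaded application in self.wsgi, which is how a SIGTERM handler could reach the pending set.

```python
import uuid


class JobTrackingApp:
    """WSGI wrapper that remembers which jobs have not finished yet."""

    def __init__(self, app):
        self.app = app
        self.pending_jobs = set()

    def __call__(self, environ, start_response):
        # Hypothetical convention: Service A sends the job id in X-Job-Id.
        job_id = environ.get("HTTP_X_JOB_ID") or uuid.uuid4().hex
        self.pending_jobs.add(job_id)
        result = self.app(environ, start_response)
        # Only reached when the job finishes normally; an interrupted or
        # failed job stays in pending_jobs so it can be reported later.
        self.pending_jobs.discard(job_id)
        return result


def report_incomplete(tracked_app, notify):
    """Pass the ids of unfinished jobs to notify (e.g. a call to Service C)."""
    pending = sorted(tracked_app.pending_jobs)
    if pending:
        notify(pending)
    return pending
```

Leaving the id in place when the wrapped application raises is deliberate here: anything that did not complete normally is treated as incomplete.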
For more details and a complete example check the repository:
Thank you.