When a server’s IP changes during runtime, HAProxy does not resolve the hostname again when using external Python health check scripts. It will hold on to the old IP forever.
Here’s a timeline of events that triggers an issue for us:
- HAProxy starts up and resolves the name of server.com to the IP 123.123.123.123. Backend is marked as UP.
- The backend runs a python external health check script for the server every 10 seconds. Python resolves server.com by itself, does it’s tests against the server and keeps it UP.
- The server.com IP changes to 234.234.234.234.
- Python health checks resolve to the new IP, run their tests and everything works fine. Server kept UP.
- A request we want to route to server.com comes from a client to HAProxy. HAProxy still has the old 123.123.123.123 IP configured. The request is routed there and we get a 404 as our expected service is no longer there (but it’s still a valid IP with a response). The 404 is returned to the client.
- Python health check runs again, resolves to the new IP again and passes the checks. Server kept UP.
We haven’t found a way to force HAProxy to resolve names again at set intervals. Instead it will hold on to valid IPs until either restarted or reloaded.
Do you know any approach we could utilize here to make HAProxy to re-resolve a hostname and take the new IP into use even if the old IP is still functional?
Running our own custom health check scripts is a hard requirement.
Here are some of our related HAProxy configuration snippets. We are using HAProxy 2.9.
resolvers default
parse-resolv-conf
hold other 15s
hold refused 15s
hold nx 15s
hold timeout 15s
hold valid 10s
hold obsolete 15s
backend server.com
option external-check
external-check command /health-check.py
server server.com server.com:443 init-addr libc,none
Looking forward to any recommendations!
4 posts - 2 participants