@tjungblut wrote:
Hello community,
We’re using HAProxy in Kubernetes as a sticky load balancer in front of a deployment of five total pods (real HAProxy, not the ingress controller version of it). Here’s our config:
```
global
    daemon
    maxconn 10000
    stats socket /usr/local/etc/haproxy/admin.sock mode 600 level admin
    log /dev/log local0

defaults
    mode http
    timeout connect 5000ms
    timeout client 30000ms
    timeout server 30000ms

resolvers kubernetes
    nameserver skydns kube-dns.kube-system:53
    resolve_retries 10
    timeout retry 2s
    hold valid 5s

frontend http-in
    bind *:80
    log /dev/log local0
    option httplog
    default_backend servers

backend servers
    balance roundrobin
    stick-table type string size 100m
    option httpchk GET /health
    http-check expect status 200
    option tcp-check
    stick on path, word(3,/)
    server-template pod 5 pod.namespace.svc.cluster.local:8080 check resolvers kubernetes inter 500
```

As you can see, we're leveraging server-templates and the Kubernetes DNS resolver to create the backend servers dynamically.
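For context, the sticky key comes from `path, word(3,/)`: HAProxy splits the request path on `/` and takes the third non-empty field, so all requests for the same document id land on one stick-table entry. A minimal Python sketch of that converter's behaviour (the `word` helper name mirrors the HAProxy converter but is our own illustration, not an HAProxy API):

```python
def word(n, value, sep="/"):
    """Mimic HAProxy's word(n,sep) converter: return the n-th
    non-empty field of value when split on sep ("" if absent)."""
    parts = [p for p in value.split(sep) if p]
    return parts[n - 1] if len(parts) >= n else ""

# The third path segment is the sticky document id:
print(word(3, "/v1/document/sticky_1556013348827_0fmbgvj50nmc/stepssince/983"))
# sticky_1556013348827_0fmbgvj50nmc
```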
We’ve been pretty happy with the stick-table approach so far, but we run into issues during a rolling upgrade of the backend servers. Kubernetes terminates old pods and starts up new ones, waiting until the old ones are completely wound down.
During this rolling-upgrade scenario we’ve observed an inconsistency in the stick table: our HAProxy instance happily forwards all requests to what it thinks is pod3, but in the backend request logs we see them ending up on three different backend servers.
Here’s the HAProxy request log for that period:
```
[29/Apr/2019:10:39:46.149] http-in servers/pod3 0/0/0/12812/12813 200 1031 - - ---- 303/303/235/3/0 0/0 "GET /v1/document/sticky_1556013348827_0fmbgvj50nmc/stepssince/983?_no_ie_cache=1556527174123 HTT
[29/Apr/2019:10:39:47.194] http-in servers/pod3 0/0/0/30/30 200 338 - - ---- 300/300/234/3/0 0/0 "GET /v1/document/sticky_1556013348827_0fmbgvj50nmc/stepssince/1006?_no_ie_cache=1556527187165 HTTP/1.1"
[29/Apr/2019:10:39:47.197] http-in servers/pod3 0/0/0/31/31 200 309 - - ---- 300/300/233/2/0 0/0 "POST /v1/document/sticky_1556013348827_0fmbgvj50nmc/steps HTTP/1.1"
[29/Apr/2019:10:39:47.277] http-in servers/pod3 0/0/0/34/34 200 338 - - ---- 300/300/233/2/0 0/0 "GET /v1/document/sticky_1556013348827_0fmbgvj50nmc/stepssince/1006?_no_ie_cache=1556527187248 HTTP/1.1"
```

HAProxy still thinks it is sending everything to pod3.
Here’s what we receive on the backend, where we can see requests for the same `sticky_id` path fragment hitting different pods at nearly the same time (request log -> pod identifier):
```
2019-04-29T10:39:46.962Z 'REQUEST-OK [GET] [/v1/document/sticky_1556013348827_0fmbgvj50nmc/stepssince/983]' -> pod-799496f69-dbp8s
2019-04-29T10:39:47.223Z 'REQUEST-OK [GET] [/v1/document/sticky_1556013348827_0fmbgvj50nmc/stepssince/1006]' -> pod-58fdfcc477-zjcrq
2019-04-29T10:39:47.227Z 'REQUEST-OK [POST] [/v1/document/sticky_1556013348827_0fmbgvj50nmc/steps]' -> pod-799496f69-dbp8s
2019-04-29T10:39:47.306Z 'REQUEST-START [GET] [/v1/document/sticky_1556013348827_0fmbgvj50nmc/stepssince/1006]' -> pod-58fdfcc477-zjcrq
```

And here are the proxy logs for that period, which show how the DNS resolver switches the IPs:
```
April 29th 2019, 10:39:19.000 [WARNING] 118/083919 (1) : Server servers/pod1 is going DOWN for maintenance (No IP for server ). 4 active and 0 backup servers left. 11 sessions active, 0 requeued, 0 remaining in queue.
April 29th 2019, 10:39:34.000 [WARNING] 118/083934 (1) : Server servers/pod1 ('pod.namespace.svc.cluster.local') is UP/READY (resolves again).
April 29th 2019, 10:39:34.000 [WARNING] 118/083934 (1) : Server servers/pod1 administratively READY thanks to valid DNS answer.
April 29th 2019, 10:39:34.000 [WARNING] 118/083934 (1) : Server servers/pod5 is going DOWN for maintenance (No IP for server ). 4 active and 0 backup servers left. 3 sessions active, 0 requeued, 0 remaining in queue.
April 29th 2019, 10:39:34.000 [WARNING] 118/083934 (1) : servers/pod1 changed its IP from 172.20.5.223 to 172.20.4.207 by kubernetes/skydns.
April 29th 2019, 10:39:34.000 [WARNING] 118/083934 (1) : servers/pod2 changed its IP from 172.20.4.248 to 172.20.4.245 by DNS cache.
April 29th 2019, 10:39:39.000 [WARNING] 118/083939 (1) : servers/pod3 changed its IP from 172.20.5.50 to 172.20.5.232 by DNS cache.
April 29th 2019, 10:39:49.000 [WARNING] 118/083949 (1) : Server servers/pod5 ('pod.namespace.svc.cluster.local') is UP/READY (resolves again).
April 29th 2019, 10:39:49.000 [WARNING] 118/083949 (1) : Server servers/pod5 administratively READY thanks to valid DNS answer.
April 29th 2019, 10:39:49.000 [WARNING] 118/083949 (1) : servers/pod4 changed its IP from 172.20.5.176 to 172.20.4.111 by DNS cache.
April 29th 2019, 10:39:49.000 [WARNING] 118/083949 (1) : servers/pod5 changed its IP from 172.20.5.98 to 172.20.5.164 by DNS cache.
```

HAProxy itself runs as a single pod that did not restart or do anything else that would wipe the stick table.
We suspect this might be due to connections being pooled and thus held open while the server underneath changes its IP via the DNS resolver. Does HAProxy have draining support for such a scenario when using server-template? What else could cause this behaviour?
I understand that this is probably also fairly specific to Kubernetes, but any helpful pointers on what’s going on here are appreciated.
Thanks a ton,
Thomas