@jba wrote:
Hi Willy & team - first off thank you for your amazing software - it’s been a life-saver.
Environment: We have a small cluster of HAProxy servers that have approximately 15k SSL certificates loaded. As certificates are added and removed, HAProxy is called to gracefully reload with the FINISH signal. This happens approximately 100 times a day and has worked perfectly across 1.6/1.7 and now 1.8.
Issue: we’ve recently enabled HTTP/2 for these HAProxy servers. HTTP/2 support has been great, but since enabling it we’ve seen a small fraction of HAProxy processes never exit despite receiving a FINISH signal. As these old processes pile up, they slowly lead to memory exhaustion on the HAProxy servers.
Upon examination of the wedged processes, they always have 1 or more external sockets in CLOSE_WAIT:
    tcp 26686 0 62.22.188.41:443 69.123.177.216:59710 CLOSE_WAIT 2335/haproxy
    udp 0 0 0.0.0.0:49277 0.0.0.0:* 2335/haproxy

Another one (different server):
    tcp 85 0 62.22.188.40:443 49.204.95.150:53001 CLOSE_WAIT 12032/haproxy
    tcp 43 0 62.22.188.40:443 43.248.55.131:56715 CLOSE_WAIT 12032/haproxy
    udp 0 0 0.0.0.0:30855 0.0.0.0:* 12032/haproxy

We are using ‘nbproc’ - and the wedged processes seem to often be the ‘head’ process (hence udp binding), but this is not always the case:
    tcp 841 0 62.22.188.41:443 132.170.15.255:42642 CLOSE_WAIT 18760/haproxy

Strace’ing the processes just shows a slow epoll_wait:
    epoll_wait(3, [], 200, 0) = 0
    epoll_wait(3, [], 200, 34) = 0
    epoll_wait(3, [], 200, 0) = 0
    epoll_wait(3, [], 200, 51) = 0
    epoll_wait(3, [], 200, 0) = 0
    epoll_wait(3, [], 200, 60) = 0
    epoll_wait(3, [], 200, 0) = 0

Our configuration is very straightforward:
    global
        user haproxy
        group haproxy
        daemon
        maxconn 21000
        tune.ssl.default-dh-param 2048
        tune.ssl.cachesize 1000000
        tune.maxrewrite 16384
        tune.bufsize 49152
        nbproc 4
        cpu-map 1 0
        cpu-map 2 1
        cpu-map 3 2
        cpu-map 4 3

    defaults
        mode http
        retries 5
        option redispatch
        maxconn 20000
        timeout connect 30s
        timeout client 30s
        timeout server 14400s
        timeout http-keep-alive 5s
        option httplog
        option dontlog-normal
        option http-ignore-probes
        log _ipaddr_ local3
        option httpchk GET /admStatus/si/2
        http-check expect status 200
        option forwardfor
        option http-keep-alive

The frontend just has a bind, set-header, and a default_backend.
Build options:
    HA-Proxy version 1.8.8 2018/04/19
    Copyright 2000-2018 Willy Tarreau <willy@haproxy.org>

    Build options :
      TARGET  = linux2628
      CPU     = generic
      CC      = x86_64-pc-linux-gnu-gcc
      CFLAGS  = -O2 -march=native -pipe -fno-strict-aliasing
      OPTIONS = USE_LIBCRYPT=1 USE_GETADDRINFO=1 USE_ZLIB=1 USE_THREAD=1 USE_OPENSSL=1 USE_PCRE=1 USE_TFO=1

    Default settings :
      maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

    Built with OpenSSL version : OpenSSL 1.0.2o 27 Mar 2018
    Running on OpenSSL version : OpenSSL 1.0.2o 27 Mar 2018
    OpenSSL library supports TLS extensions : yes
    OpenSSL library supports SNI : yes
    OpenSSL library supports : SSLv3 TLSv1.0 TLSv1.1 TLSv1.2
    Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
    Encrypted password support via crypt(3): yes
    Built with multi-threading support.
    Built with PCRE version : 8.40 2017-01-11
    Running on PCRE version : 8.41 2017-07-05
    PCRE library supports JIT : no (USE_PCRE_JIT not set)
    Built with zlib version : 1.2.11
    Running on zlib version : 1.2.11
    Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
    Built with network namespace support.

    Available polling systems :
          epoll : pref=300, test result OK
           poll : pref=200, test result OK
         select : pref=150, test result OK
    Total: 3 (3 usable), will use epoll.

    Available filters :
      [SPOE] spoe
      [COMP] compression
      [TRACE] trace

This is running on a vanilla Linux 4.9.6 kernel.
I confirmed that disabling HTTP/2 in both 1.8.7 and 1.8.8 makes the issue go away. I'm curious whether there’s anything else I might look at, or whether this could be a bug. Thanks much!
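
The manual check described above (looking for haproxy processes that still hold sockets in CLOSE_WAIT) could be automated roughly as follows. This is only a minimal sketch: it assumes whitespace-separated lines in the same column order as the socket listings quoted in this post (proto, recv-q, send-q, local, remote, state, pid/program), and the function name `close_wait_by_pid` is hypothetical, not part of HAProxy or any existing tool.

```python
from collections import defaultdict


def close_wait_by_pid(netstat_output, program="haproxy"):
    """Group CLOSE_WAIT TCP sockets by the PID that owns them.

    Returns {pid: [(recv_q, local_addr, remote_addr), ...]}.
    Assumes the 7-column layout of the listings in this post.
    """
    result = defaultdict(list)
    for line in netstat_output.splitlines():
        fields = line.split()
        # UDP rows have no state column (6 fields); header rows do not
        # start with "tcp". Both are skipped here.
        if len(fields) != 7 or not fields[0].startswith("tcp"):
            continue
        _proto, recv_q, _send_q, local, remote, state, owner = fields
        pid, _, name = owner.partition("/")
        if state == "CLOSE_WAIT" and name == program and pid.isdigit():
            result[int(pid)].append((int(recv_q), local, remote))
    return dict(result)


# Sample input copied from the second listing in this post.
sample = """\
tcp 85 0 62.22.188.40:443 49.204.95.150:53001 CLOSE_WAIT 12032/haproxy
tcp 43 0 62.22.188.40:443 43.248.55.131:56715 CLOSE_WAIT 12032/haproxy
udp 0 0 0.0.0.0:30855 0.0.0.0:* 12032/haproxy
"""

for pid, socks in close_wait_by_pid(sample).items():
    # PID, number of CLOSE_WAIT sockets, total bytes sitting unread in Recv-Q
    print(pid, len(socks), sum(q for q, _, _ in socks))
# prints: 12032 2 128
```

A non-zero Recv-Q on a CLOSE_WAIT socket means the peer has already closed while data it sent is still unread by the process, which matches what the listings above show for the wedged processes.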