
Maxconn causing close_wait


We have a load balancer with the configuration below:

frontend lb
        bind x.x.x.x:80 mss 1440 alpn h2,http/1.1
        maxconn 100
        mode http
        option httplog
        use_backend busy_backend if { fe_conn ge 100 } || { be_conn_free(actual_backend) le 0 }
        default_backend actual_backend

backend actual_backend
        mode  http
        option httpchk
        http-check send meth GET uri /some/path ver HTTP/1.1 hdr Host some.header.com expect status 200
        server my-server01 x.x.x.x:80 check port 80 maxconn 10000 enabled maxqueue 1
        server my-server02 x.x.x.x:80 check port 80 maxconn 10000 enabled maxqueue 1
        errorfile 503 /etc/path/busy/busy.http

backend busy_backend
        mode http
        option httpchk
        http-check send meth GET uri /path ver HTTP/1.1
        http-check expect status 200
        server busy-server unix@socket.sock enabled backup
        errorfile 503 /etc/path/busy/busy.http

We have configured the LB to accept a maximum of 100 concurrent connections at a time by setting the frontend maxconn value to 100.

I start hitting the LB with 120 concurrent connections. I expect HAProxy to accept 100 connections and answer the rest with a 503.

root@load_test_server:~# ./hey_linux_amd64 -n 1000000 -c 120 -q 30 http://lb-ip/path

The connection count for this listener reaches 100 in HAProxy.

Once it does, all we see is 503s.
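
A quick way to spot-check the returned status from another shell while the test runs is curl's write-out format (lb-ip is a placeholder, as in the hey command above); in our case it prints 503 once the listener is saturated:

    curl -s -o /dev/null -w '%{http_code}\n' http://lb-ip/path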

What we suspect is that HAProxy is not closing these connections and keeps them in CLOSE-WAIT state for a very long time.

Number of CLOSE-WAIT connections on the HAProxy host:

root@haproxy:~# date; ss -antp | grep listener-ip | grep -i close | wc -l
Fri Feb  5 15:47:46 UTC 2021
60
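
To see whether that count is stable or still growing, it can be sampled in a loop; a minimal sketch, using ss's kernel-side state filter instead of grepping the output (the :80 source port matches our bind line):

    # Print a timestamped CLOSE-WAIT count every 5 seconds.
    while true; do
        printf '%s %s\n' "$(date -u +%T)" \
            "$(ss -ant state close-wait '( sport = :80 )' | tail -n +2 | wc -l)"
        sleep 5
    done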

Take one CLOSE-WAIT connection as an example:

root@haproxy:~# date; ss -antp | grep listener-ip | grep -i close | tail -1
Fri Feb  5 15:48:00 UTC 2021
CLOSE-WAIT 126      0                                      listener-ip:80                                          load-generator-ip:59280

Now I look up this port's status on the load-generating server, and I do not see the port in use:

root@load-gen-server:~# date; ss -antp | grep 59280
Fri Feb  5 15:49:56 UTC 2021

But this connection stays around in HAProxy for much longer, until I stop the traffic to the frontend:

root@haproxy:~# date; ss -antp | grep listener-ip | grep 59280
Fri Feb  5 15:51:10 UTC 2021
CLOSE-WAIT 126      0                                      listener-ip:80                                          load-generator-ip:59280

By this point it has been in CLOSE-WAIT for almost 3 minutes.

Our understanding is that such a CLOSE-WAIT connection still counts against the maxconn limit we set, so the listener stops serving any traffic until a connection slot frees up.
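
One way to verify this assumption would be to compare the frontend's current session count (scur) against its session limit (slim) on the stats socket; a sketch, assuming socat is installed and using the stats socket path from our global section:

    # scur = current sessions, slim = configured maxconn for the frontend;
    # field positions follow the CSV layout of "show stat".
    echo "show stat" | socat stdio /path/stats | \
        awk -F, '$1 == "lb" && $2 == "FRONTEND" { print "scur=" $5, "slim=" $7 }'

If scur sits at 100 while the CLOSE-WAIT sockets linger, they are indeed consuming maxconn slots.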

We tested this on HAProxy 2.2.6 and 2.2.3; both behave the same way.

Can you kindly help debug this issue?

haproxy -vv

  root@haproxy:~# haproxy -vv
  HA-Proxy version 2.2.3-0e58a34 2020/09/08 - https://haproxy.org/
  Status: long-term supported branch - will stop receiving fixes around Q2 2025.
  Known bugs: http://www.haproxy.org/bugs/bugs-2.2.3.html
  Running on: Linux 4.15.0-42-generic #45-Ubuntu SMP Thu Nov 15 19:32:57 UTC 2018 x86_64
  Build options :
    TARGET  = linux-glibc
    CPU     = generic
    CC      = gcc
    CFLAGS  = -O2 -g -Wall -Wextra -Wdeclaration-after-statement -fwrapv -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter -Wno-clobbered -Wno-missing-field-initializers -Wno-stringop-overflow -Wtype-limits -Wshift-negative-value -Wshift-overflow=2 -Wduplicated-cond -Wnull-dereference
    OPTIONS = USE_PCRE=1 USE_LINUX_TPROXY=1 USE_LINUX_SPLICE=1 USE_LIBCRYPT=1 USE_OPENSSL=1 USE_ZLIB=1 USE_SYSTEMD=1

  Feature list : +EPOLL -KQUEUE +NETFILTER +PCRE -PCRE_JIT -PCRE2 -PCRE2_JIT +POLL -PRIVATE_CACHE +THREAD -PTHREAD_PSHARED +BACKTRACE -STATIC_PCRE -STATIC_PCRE2 +TPROXY +LINUX_TPROXY +LINUX_SPLICE +LIBCRYPT +CRYPT_H +GETADDRINFO +OPENSSL -LUA +FUTEX +ACCEPT4 +ZLIB -SLZ +CPU_AFFINITY +TFO +NS +DL +RT -DEVICEATLAS -51DEGREES -WURFL +SYSTEMD -OBSOLETE_LINKER +PRCTL +THREAD_DUMP -EVPORTS

  Default settings :
    bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

  Built with multi-threading support (MAX_THREADS=64, default=16).
  Built with OpenSSL version : OpenSSL 1.1.0g  2 Nov 2017
  Running on OpenSSL version : OpenSSL 1.1.1h  22 Sep 2020 (VERSIONS DIFFER!)
  OpenSSL library supports TLS extensions : yes
  OpenSSL library supports SNI : yes
  OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2
  Built with network namespace support.
  Built with zlib version : 1.2.11
  Running on zlib version : 1.2.11
  Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
  Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
  Built with PCRE version : 8.39 2016-06-14
  Running on PCRE version : 8.39 2016-06-14
  PCRE library supports JIT : no (USE_PCRE_JIT not set)
  Encrypted password support via crypt(3): yes
  Built with gcc compiler version 7.3.0
  Built with the Prometheus exporter as a service

  Available polling systems :
        epoll : pref=300,  test result OK
         poll : pref=200,  test result OK
       select : pref=150,  test result OK
  Total: 3 (3 usable), will use epoll.

  Available multiplexer protocols :
  (protocols marked as <default> cannot be specified using 'proto' keyword)
              fcgi : mode=HTTP       side=BE        mux=FCGI
         <default> : mode=HTTP       side=FE|BE     mux=H1
                h2 : mode=HTTP       side=FE|BE     mux=H2
         <default> : mode=TCP        side=FE|BE     mux=PASS

  Available services :
  	prometheus-exporter

  Available filters :
  	[SPOE] spoe
  	[COMP] compression
  	[TRACE] trace
  	[CACHE] cache
  	[FCGI] fcgi-app

Global

  global

  user haproxy
  group haproxy
  nbproc 1
  nbthread 16
  cpu-map auto:1/1-16 0-15
  log /dev/log local2
  log /dev/log local0 notice
  chroot /path
  pidfile /path/haproxy.pid
  daemon
  master-worker
  maxconn 200000
  hard-stop-after 1h
  stats socket /path/stats mode 660 level admin expose-fd listeners
  tune.ssl.cachesize 3000000
  tune.ssl.lifetime 60000
  ssl-default-bind-ciphers ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256
  ssl-default-bind-options ssl-min-ver TLSv1.2 ssl-max-ver TLSv1.2
  server-state-file /path/states
  tune.bufsize 40960

Defaults

defaults

  mode http
  log global
  retries 3
  timeout http-request 10s
  timeout queue 10s
  timeout connect 10s
  timeout client 1m
  timeout server 1m
  timeout tunnel 10m
  timeout client-fin 30s
  timeout server-fin 30s
  timeout check 10s
  option httplog
  option forwardfor except 127.0.0.0/8
  option redispatch
  load-server-state-from-file global
