@fhda-mrapczynski wrote:
Attempting my first use of the DNS SRV features in a non-production environment in AWS.
In general, my configuration works. What I am trying to diagnose now is a long delay of ~5 minutes when I re-deploy a service. HAProxy picks up the DNS change quickly, but seems to take several minutes before that change is completely applied to the backend, and health checks resume. This delay defeats the purpose of dynamic service discovery.
The log events below show what HAProxy reports after the service has been redeployed.
- Initially health checks fail because the container has restarted or moved to another host
- At 07:51 the updated SRV record is identified
- At 07:56 HAProxy finally marks the backend up
The key issue to diagnose is what causes the 5 minute waiting time.
Also, the DNS records currently have a TTL of 0.
Log Entries:
Jul 09 07:50:59 haproxy-internal-20190703-1136AM-579 haproxy[21084]: Health check for server bcm_test/bcm1 failed, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms, status: 2/3 UP. Jul 09 07:51:04 haproxy-internal-20190703-1136AM-579 haproxy[21084]: Health check for server bcm_test/bcm1 succeeded, reason: Layer4 check passed, check duration: 0ms, status: 3/3 UP. Broadcast message from systemd-journald@haproxy-internal-20190703-1136AM-579 (Tue 2019-07-09 07:51:19 PDT): haproxy[21084]: backend bcm_test has no server available! Jul 09 07:51:19 haproxy-internal-20190703-1136AM-579 haproxy[21084]: Server bcm_test/bcm1 is going DOWN for maintenance (DNS NX status). 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue. Jul 09 07:51:19 haproxy-internal-20190703-1136AM-579 haproxy[21084]: backend bcm_test has no server available! Jul 09 07:51:19 haproxy-internal-20190703-1136AM-579 haproxy[21084]: bcm_test/bcm1 changed its FQDN from (null) to 18571aeca1584999960dff1a17acf787._bcmmb-test._tcp.docker by 'SRV record' Jul 09 07:56:21 haproxy-internal-20190703-1136AM-579 haproxy[21084]: Server bcm_test/bcm1 ('18571aeca1584999960dff1a17acf787._bcmmb-test._tcp.docker') is UP/READY (resolves again). Jul 09 07:56:21 haproxy-internal-20190703-1136AM-579 haproxy[21084]: Server bcm_test/bcm1 administratively READY thanks to valid DNS answer. Jul 09 07:56:24 haproxy-internal-20190703-1136AM-579 haproxy[21084]: Health check for server bcm_test/bcm1 succeeded, reason: Layer4 check passed, check duration: 0ms, status: 3/3 UP.
HAProxy Configuration:
resolvers aws nameserver vpc 169.254.169.253:53 resolve_retries 256 timeout retry 5s timeout resolve 2s hold nx 15s hold other 15s hold refused 15s hold timeout 15s hold valid 15s hold obsolete 15s frontend bcm_test bind *:32001 timeout client 30m timeout client-fin 30s default_backend bcm_test backend bcm_test default-server check inter 5s rise 6 timeout connect 10s timeout server 30m server-template bcm 1 _bcmmb-test._tcp.docker check resolvers aws resolve-opts allow-dup-ip
Posts: 1
Participants: 1