Hi folks! We (pydantic) recently started using HAProxy, in particular for the consistent hashing with bounded loads feature (hash-balance-factor), and we're really enjoying it. It's well-designed, super reliable software! We run on GKE / Kubernetes and have HAProxy deployed internally, mainly to modulate and manage traffic between internal services.
We have some questions and ideas; I hope this is the right place to ask them.
For consistent hashing with bounded loads, I believe the load is calculated as the number of active connections. Would it be possible to use another metric, e.g. the number of requests processed in the last 30s, or something like that? The system we're using this for (data ingestion) is async, so there's a lot of processing going on after a request has finished that is not captured by the connection count.
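For context, this is roughly the setup we're running: a minimal sketch of a backend using consistent hashing with bounded loads (the backend name and server addresses are placeholders):

```
backend ingest
    balance uri                 # hash on the request URI (key could also be a header)
    hash-type consistent        # consistent hashing, required for hash-balance-factor
    hash-balance-factor 150     # no server takes more than 150% of the average load
    server pod-1 10.0.0.1:8080 check
    server pod-2 10.0.0.2:8080 check
    server pod-3 10.0.0.3:8080 check
```

As we understand it, the "load" that `hash-balance-factor` bounds here is the server's current number of active connections, which is exactly what doesn't reflect our post-request async work.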
We have another system (essentially a bunch of pods that run users' queries) where some requests are very small and cheap (e.g. loading a specific row of data by its primary key) and some are very expensive (a complex aggregation over 30 days of data). We currently have a complex system with work queues and Redis that ultimately tries to schedule small queries together while processing only one large query at a time per pod. We're hoping we can replace this with HAProxy.

The most obvious way would be to set maxconn per downstream server to some low number (say 3), but as far as I know there is no way to sync the current connection count across HAProxy replicas (which we'll need in order to scale network throughput). I saw some exciting new developments on load-balancing algorithms that might be usable for this ( New Balancing algorithm (Peak) EWMA · Issue #1570 · haproxy/haproxy · GitHub ), but it's not a perfect solution (slower to respond) and doesn't exist yet. FWIW, our service can give a very accurate measurement of current in-flight work, or of how expensive a request is going to be, via response headers. Any ideas much appreciated!
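To make the maxconn idea concrete, this is the kind of per-server cap we had in mind (a sketch; names and addresses are made up). The limitation described above applies: each HAProxy replica tracks its own connection counts, so with N replicas the effective cap per pod is N × maxconn:

```
backend queries
    balance leastconn           # send new requests to the least-loaded pod
    # Cap concurrent connections per pod at 3; excess requests queue
    # inside HAProxy (per replica) instead of piling onto a busy pod.
    server pod-1 10.0.0.1:8080 check maxconn 3
    server pod-2 10.0.0.2:8080 check maxconn 3
    server pod-3 10.0.0.3:8080 check maxconn 3
```

With a single HAProxy instance this would do what we want; the open question is how to get similar behavior across replicas without shared state.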
Thanks!