Hello,
We created an ACL to block bad user-agents/bots, but it returns a false-positive 403 in specific cases, which I explain below.
Here is the config:
acl is-blockedagent-http hdr_sub(user-agent) -f /etc/haproxy/agentblock.lst
http-request deny if is-blockedagent-http
The agentblock.lst file contains partial user-agent names rather than full user-agent strings. A sample:
Disco
Discobot
Discoverybot
ZumBot
ZyBorg
So the ACL above will, as expected, return a 403 when it finds “ZyBorg” anywhere within a user-agent, and will return a 200 for “ZyBor”, which is missing the final “g”.
But when the list contains similar entries such as “Disco”, “Discobot” and “Discoverybot”, the ACL wrongly returns a 403 when a user-agent includes “Discov”, “Discove”, “Discover”, “Discovery”, “Discoveryb”, “Discob” or “Discobo”.
In other words, it triggers a 403 for anything starting with “Disco” followed by any leading letters of “bot” or “very”, which appear in “Discobot” and “Discoverybot”.
We also tried adding a second ACL with an allow list to work around this, but it doesn’t work:
acl is-blockedagent-http hdr_sub(user-agent) -f /etc/haproxy/agentblock.lst
acl is-goodagent-http hdr_sub(user-agent) -f /etc/haproxy/goodagents.lst
http-request deny if is-blockedagent-http !is-goodagent-http
goodagents.lst:
Discob
Discobo
Discov
Discove
Discover
Discovery
Discoveryb
Discoverybo
Any ideas on how to resolve this? It only occurs when the list contains user-agents with similar names.
The only working alternative we can think of is req.fhdr(user-agent), but that would require a list of user-agents in their full form, which is hard to find.
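For reference, a minimal sketch of what that req.fhdr alternative might look like. It assumes a hypothetical file, /etc/haproxy/agentblock-full.lst, holding one complete user-agent string per line; that file name is not part of our current setup.

```
# Sketch only: exact, full-string match on the User-Agent header.
# /etc/haproxy/agentblock-full.lst is a hypothetical file (one full user-agent per line).
acl is-blockedagent-http req.fhdr(user-agent) -m str -f /etc/haproxy/agentblock-full.lst
http-request deny if is-blockedagent-http
```

With -m str, a request would be denied only when the whole header value equals one of the entries, which is why partial names would no longer be enough.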
Thank you