This discussion has been locked.

Slow DNS queries with parallel requests and IPv6

Hi,

We have been seeing slow DNS answers from the Sophos DNS proxy since parallel requests were introduced in glibc 2.9. When I take a network trace on the firewall with tcpdump, I can see that Sophos answers only the first request (the IPv4 A query). Five seconds later, the client resends both requests, this time sequentially, and Sophos answers both of them. Here is a trace from running "telnet www.sophos.com 80" on a client with a recent glibc:

19:40:15.103081 IP 192.168.23.1.35888 > 192.168.23.254.53: 40569+ A? www.sophos.com. (32)
19:40:15.103131 IP 192.168.23.1.35888 > 192.168.23.254.53: 62436+ AAAA? www.sophos.com. (32)
19:40:15.104087 IP 192.168.23.254.53 > 192.168.23.1.35888: 40569 3/8/8 CNAME www.sophos.com.edgekey.net., CNAME e6203.b.akamaiedge.net., A 172.229.195.18 (393)
19:40:20.107695 IP 192.168.23.1.35888 > 192.168.23.254.53: 40569+ A? www.sophos.com. (32)
19:40:20.122838 IP 192.168.23.254.53 > 192.168.23.1.35888: 40569 3/8/8 CNAME www.sophos.com.edgekey.net., CNAME e6203.b.akamaiedge.net., A 2.21.3.18 (393)
19:40:20.126727 IP 192.168.23.1.35888 > 192.168.23.254.53: 62436+ AAAA? www.sophos.com. (32)
19:40:20.127196 IP 192.168.23.254.53 > 192.168.23.1.35888: 62436 2/1/0 CNAME www.sophos.com.edgekey.net., CNAME e6203.b.akamaiedge.net. (163)
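
To make the 5-second gap in the trace above explicit, here is a minimal Python sketch that pairs each query with the first response carrying the same DNS transaction ID. The function name `first_answer_delays` and the regular expression are illustrative assumptions, and the parser only understands tcpdump lines shaped like the ones in this post.

```python
import re
from datetime import datetime

# The trace from the post, copied verbatim.
TRACE = """\
19:40:15.103081 IP 192.168.23.1.35888 > 192.168.23.254.53: 40569+ A? www.sophos.com. (32)
19:40:15.103131 IP 192.168.23.1.35888 > 192.168.23.254.53: 62436+ AAAA? www.sophos.com. (32)
19:40:15.104087 IP 192.168.23.254.53 > 192.168.23.1.35888: 40569 3/8/8 CNAME www.sophos.com.edgekey.net., CNAME e6203.b.akamaiedge.net., A 172.229.195.18 (393)
19:40:20.107695 IP 192.168.23.1.35888 > 192.168.23.254.53: 40569+ A? www.sophos.com. (32)
19:40:20.122838 IP 192.168.23.254.53 > 192.168.23.1.35888: 40569 3/8/8 CNAME www.sophos.com.edgekey.net., CNAME e6203.b.akamaiedge.net., A 2.21.3.18 (393)
19:40:20.126727 IP 192.168.23.1.35888 > 192.168.23.254.53: 62436+ AAAA? www.sophos.com. (32)
19:40:20.127196 IP 192.168.23.254.53 > 192.168.23.1.35888: 62436 2/1/0 CNAME www.sophos.com.edgekey.net., CNAME e6203.b.akamaiedge.net. (163)
"""

# timestamp, transaction ID (plus optional flag characters), next field.
# For a query the next field is the record type followed by "?".
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{6}) IP \S+ > \S+: (\d+)\S* (\S+)")


def first_answer_delays(trace):
    """Map each query type to the seconds between its first query and first answer."""
    first_query = {}  # transaction ID -> (time of first query, query type)
    delays = {}       # query type -> seconds until the first matching response
    for line in trace.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        stamp = datetime.strptime(match.group(1), "%H:%M:%S.%f")
        txid, field = match.group(2), match.group(3)
        if field.endswith("?"):
            # Query: remember only the first time this ID was sent.
            first_query.setdefault(txid, (stamp, field.rstrip("?")))
        elif txid in first_query:
            # Response: record only the first one seen for this ID.
            sent, qtype = first_query[txid]
            delays.setdefault(qtype, (stamp - sent).total_seconds())
    return delays
```

Calling `first_answer_delays(TRACE)` gives roughly 0.001 s for the A query and roughly 5.024 s for the AAAA query, which matches the behaviour described: the first AAAA query never gets an answer, and the retry 5 seconds later does.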

I found a workaround: add either of these lines to /etc/resolv.conf on the client:

options single-request

options single-request-reopen
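
One way to check whether a client is affected, before and after changing resolv.conf, is to time a dual-stack lookup. The sketch below assumes a Linux client where Python's `socket.getaddrinfo` goes through the glibc resolver, so an `AF_UNSPEC` lookup should trigger both the A and the AAAA query. The helper name `time_lookup` and its `resolver` and `clock` parameters are illustrative, added only so the function can be exercised without a network.

```python
import socket
import time


def time_lookup(host, port=80, resolver=socket.getaddrinfo, clock=time.monotonic):
    """Time one AF_UNSPEC lookup and return (seconds, sorted unique addresses)."""
    start = clock()
    # AF_UNSPEC asks for IPv4 and IPv6 results in a single call.
    infos = resolver(host, port, family=socket.AF_UNSPEC, type=socket.SOCK_STREAM)
    elapsed = clock() - start
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    addresses = sorted({info[4][0] for info in infos})
    return elapsed, addresses


# Example use on an affected client (needs working DNS):
#   elapsed, addresses = time_lookup("www.sophos.com")
#   print(f"{elapsed:.3f}s {addresses}")
```

On an affected client one would expect the elapsed time to be a little over 5 seconds when nothing is cached, and to drop to a few milliseconds once one of the options above is set. That expectation is inferred from the trace, not measured.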

But I would like to find a generic solution to apply on the firewall (I have the same problem on multiple firewalls with different customers).

I wonder why I seem to be the only one reporting this problem (I couldn't find any other thread about it).

Thanks,
Nicolas

0 sarabanjina over 11 years ago

Thanks.

I'm not sure there is an easy solution for this problem. It has existed in Astaro/Sophos since the introduction of glibc 2.9 (2010). I have never had time to track it down.

Could you keep us informed about the evolution/resolution of your support case?

Regards,
Nicolas

0 UrsWeiss over 11 years ago

Will keep you informed.

I think the quick solution, at least, would be to disable connection tracking for UDP traffic to port 53 on the UTM.
I'm not that familiar with netfilter, but I don't see this being a big (security) problem. I mean, it's UDP. It's stateless anyway.

I'm installing a new internal DNS based on PowerDNS at the moment (which is amazing btw.) and found this one in their docs:
A Recursor under high load puts a severe stress on any stateful (connection tracking) firewall, so much so that the firewall may fail.

Specifically, many Linux distributions run with a connection tracking firewall configured. For high load operation (thousands of queries/second), It is advised to either turn off iptables completely, or use the 'NOTRACK' feature to make sure DNS traffic bypasses the connection tracking.
Source: 4. PowerDNS Recursor performance

0 sarabanjina over 11 years ago

Thanks.

The commands I posted in my previous message do exactly that: they disable connection tracking on port 53. The LOCAL_TRAFFIC target is just a chain that disables connection tracking with the target CT --notrack (the equivalent of the deprecated NOTRACK target).

But be aware that this also disables every protection on port 53, such as IPS, spoofing protection, flood protection, and port scan detection.

Nicolas

0 BarryG over 11 years ago

Hi, my guess is that "high load" would mean hundreds or thousands of queries per second.

Barry

0 UrsWeiss over 11 years ago

Yes, but it results in the same problem, and the same "fix" also works on the UTM as Nicolas mentioned already.

0 UrsWeiss over 11 years ago

Received an answer from the Support Team:
They could reproduce it, but only with IPS activated. They also agree that it seems to be conntrack which is causing the problems. So, they are working on it.

Nicolas, are you able to verify the IPS thing? Can't do it in your live environment.

0 sarabanjina over 11 years ago

Hi,

I just verified this on my UTM (v9.109-1) by disabling IPS, UDP flood protection and portscan detection, and the problem is the same (I had already tried that before). I can still see the packet going into conntrack with IPS disabled.

Nicolas

0 t14u over 11 years ago in reply to sarabanjina

Hi, can anyone please explain how to create this rule in the web interface?
Thanks
T.

0 UrsWeiss over 11 years ago
You can't do it in the web interface, and it's good you cannot do it. This was just for debugging what exactly is causing the problem. It should be fixed in V9.3 btw.

The only workaround until 9.3 is released is to set one of these options in /etc/resolv.conf on all client machines (having the problem):
options single-request-reopen
options single-request

The first worked fine for me.
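
For anyone who has to roll the workaround out to many clients, here is a minimal Python sketch that adds the option to the text of a resolv.conf only when it is not already set. The function name `ensure_resolver_option` is hypothetical, the nameserver address in the usage note is taken from the trace earlier in this thread, and the sketch assumes that appending a separate `options` line is acceptable on the clients in question.

```python
def ensure_resolver_option(text, option="single-request-reopen"):
    """Return resolv.conf text with `options <option>` present exactly once."""
    lines = text.splitlines()
    for line in lines:
        fields = line.split()
        # An active options line starts with the bare keyword "options";
        # commented-out lines start with "#" or ";" and are ignored here.
        if fields[:1] == ["options"] and option in fields[1:]:
            return text  # already set, leave the file untouched
    lines.append(f"options {option}")
    return "\n".join(lines) + "\n"
```

For example, `ensure_resolver_option("nameserver 192.168.23.254\n")` returns the same text with `options single-request-reopen` appended, and calling it again on the result changes nothing. Note that tools such as DHCP clients may regenerate /etc/resolv.conf, so the change may need to be made wherever that file is generated.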

0 t14u over 11 years ago in reply to UrsWeiss

So this can only be done over SSH? (I am not able to SSH into my UTM.)