Slow DNS queries with parallel requests and ipv6

Hi,

We have seen slow DNS answers from the Sophos DNS proxy since glibc 2.9 introduced parallel requests. A tcpdump trace taken on the firewall shows that Sophos answers only the first request (the IPv4 A query). Five seconds later, the client sends both requests again, this time sequentially, and Sophos answers both. Here is the trace when I run "telnet www.sophos.com 80" on a client with a recent glibc:

19:40:15.103081 IP 192.168.23.1.35888 > 192.168.23.254.53: 40569+ A? www.sophos.com. (32)
19:40:15.103131 IP 192.168.23.1.35888 > 192.168.23.254.53: 62436+ AAAA? www.sophos.com. (32)
19:40:15.104087 IP 192.168.23.254.53 > 192.168.23.1.35888: 40569 3/8/8 CNAME www.sophos.com.edgekey.net., CNAME e6203.b.akamaiedge.net., A 172.229.195.18 (393)
19:40:20.107695 IP 192.168.23.1.35888 > 192.168.23.254.53: 40569+ A? www.sophos.com. (32)
19:40:20.122838 IP 192.168.23.254.53 > 192.168.23.1.35888: 40569 3/8/8 CNAME www.sophos.com.edgekey.net., CNAME e6203.b.akamaiedge.net., A 2.21.3.18 (393)
19:40:20.126727 IP 192.168.23.1.35888 > 192.168.23.254.53: 62436+ AAAA? www.sophos.com. (32)
19:40:20.127196 IP 192.168.23.254.53 > 192.168.23.1.35888: 62436 2/1/0 CNAME www.sophos.com.edgekey.net., CNAME e6203.b.akamaiedge.net. (163)
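
The stall can be measured directly from a capture like the one above. Here is a minimal Python sketch, assuming tcpdump's default one-line text output and a resolver listening on port 53; the function name `answer_delays` and the regular expression are illustrative only, not part of any tool.

```python
import re
from datetime import datetime

# Matches tcpdump's default one-line DNS output, for example:
# "19:40:15.103081 IP 192.168.23.1.35888 > 192.168.23.254.53: 40569+ A? www.sophos.com. (32)"
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}:\d{2}:\d{2}\.\d{6}) IP (?P<src>\S+) > (?P<dst>\S+):\s+"
    r"(?P<id>\d+)(?P<flags>\S*) (?P<rest>.*)$"
)


def answer_delays(trace: str) -> dict[int, float]:
    """Per DNS transaction ID, seconds between the first query and the first answer.

    Assumption: the capture does not cross midnight (timestamps carry no date).
    """
    first_query: dict[int, datetime] = {}
    delays: dict[int, float] = {}
    for line in trace.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a DNS line in the expected format
        ts = datetime.strptime(m["ts"], "%H:%M:%S.%f")
        txid = int(m["id"])
        if m["dst"].endswith(".53"):
            # client -> resolver: remember only the first transmission
            first_query.setdefault(txid, ts)
        elif m["src"].endswith(".53") and txid in first_query and txid not in delays:
            # resolver -> client: first answer seen for this ID
            delays[txid] = (ts - first_query[txid]).total_seconds()
    return delays
```

Fed the seven lines above, this gives about 0.001 s for ID 40569 (the A query) and about 5.024 s for ID 62436 (the AAAA query), which is the 5-second resolver timeout plus the retry.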

I found a workaround by adding these lines to /etc/resolv.conf on the client:

options single-request
options single-request-reopen

However, I would like a generic fix that can be applied on the firewall itself, because I have the same problem on multiple firewalls at different customers.

I wonder why I seem to be the only one reporting this; I could not find any other thread about it.

Thanks,
Nicolas

This thread was automatically locked due to age.

0 RFCat_vk_01 over 11 years ago

Hi,
What version of the UTM are you running?

Ian

0 sarabanjina over 11 years ago

Hi,

I took the trace on my UTM v9.109-1, but I don't think the problem is new.

I think nobody talks about it because it only shows up on "recent" Linux distributions, which are mostly workstations or laptops. I have never had the problem on Linux servers, where glibc is often older or there is a local DNS server or a caching daemon such as nscd.

Thanks,
Nicolas

0 sarabanjina over 11 years ago in reply to sarabanjina

Hi,

I ran the same test against my old firewall (also Linux-based) with exactly the same configuration, and there both requests go out and come back without any problem:
00:32:24.123475 IP 192.168.23.1.41484 > 192.168.23.254.53:  36529+ A? www.sophos.com. (32)
00:32:24.123552 IP 192.168.23.1.41484 > 192.168.23.254.53:  34095+ AAAA? www.sophos.com. (32)
00:32:24.176641 IP 192.168.23.254.53 > 192.168.23.1.41484:  36529 3/8/8 CNAME www.sophos.com.edgekey.net., CNAME e6203.b.akamaiedge.net., A 2.21.3.18 (393)
00:32:24.178892 IP 192.168.23.254.53 > 192.168.23.1.41484:  34095 2/1/0 CNAME www.sophos.com.edgekey.net., CNAME e6203.b.akamaiedge.net. (163)

I ran another test through the Sophos with the client using Google DNS directly (bypassing the DNS proxy). Both requests arrive on the internal interface, but only one leaves on the WAN link. Disabling IPS, portscan detection and so on did not change anything. Occasionally (rarely), when the second request is slightly delayed relative to the first, it works.

I suspect an overly aggressive setting in the Sophos kernel that drops a packet arriving from the same host and port at almost the same time as another one.

Does anybody know where I could find such a parameter to tweak? It's very frustrating to wait 5 seconds for each DNS request. Fortunately, it only happens with some programs: ssh, telnet, ...
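
The timing hypothesis above can be tested without glibc by sending the A and AAAA queries from a single UDP socket with an adjustable gap between them. This is only a rough Python sketch: `build_query` and `probe` are names made up for the example, the transaction IDs are arbitrary, and the timeout applies to each wait for a reply, not to the whole run.

```python
import socket
import struct
import time


def build_query(name: str, qtype: int, txid: int) -> bytes:
    """Build a minimal DNS query: RD flag set, one question, class IN."""
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def probe(server: str, name: str, gap: float = 0.0, timeout: float = 2.0) -> set[int]:
    """Send A then AAAA from the same source port, `gap` seconds apart.

    Returns the transaction IDs that received an answer.
    """
    queries = {0x1001: 1, 0x1002: 28}  # arbitrary txid -> qtype (A=1, AAAA=28)
    answered: set[int] = set()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        for i, (txid, qtype) in enumerate(queries.items()):
            if i:
                time.sleep(gap)  # delay only between the two queries
            sock.sendto(build_query(name, qtype, txid), (server, 53))
        try:
            while len(answered) < len(queries):
                data, _ = sock.recvfrom(4096)
                txid = struct.unpack("!H", data[:2])[0]
                if txid in queries:
                    answered.add(txid)
        except socket.timeout:
            pass  # at least one query went unanswered
    return answered


# Example (not run here): compare back-to-back with a 50 ms gap.
# print(probe("192.168.23.254", "www.sophos.com", gap=0.0))
# print(probe("192.168.23.254", "www.sophos.com", gap=0.05))
```

If the drop really depends on two packets from the same source port arriving almost simultaneously, the run with `gap=0.0` should often return only one ID while the run with a small gap returns both.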

Thanks,
Nicolas

0 UrsWeiss over 11 years ago

Hi,

I have had a support case open for over three months because of this. They don't want to fix it...
I just mentioned in another thread that I haven't found the energy to write a post about it yet, because it's one more annoying support case with Sophos...

https://community.sophos.com/products/unified-threat-management/astaroorg/f/54/t/41451

I'll try to write the post about the odyssey I have had so far over the next few days.

(For now they say it will be fixed in 9.2. Btw. 9.2 never was broken)

0 UrsWeiss over 11 years ago

Oh, btw. OPEN A SUPPORT CASE! The more people do it, the more likely they get their asses up and fix it.

I will provide all the information you need in the post I'll write later this week (it will be a long one...)

0 sarabanjina over 11 years ago

Hi,

I tried Sophos UTM v9.200-11 today and I see the same problem. I can reproduce it from a server running SLES11 SP3.

I found a thread from 2010 about a similar problem on the netfilter-devel mailing list, in which the author (from Astaro) explains the problem like this:
Normally parallel DNS lookups works fine, first packet is received and
forwarded, so conntrack is confirmed before second packet is received.

However in combination with NFQUEUE, the second DNS requests is
received while the first one is still in the queue and both DNS requests
have an unconfirmed conntrack. So the second one will be dropped
in nf_conntrack_confirm, which results in an DNS timeout and retransmit.
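
To make the quoted explanation concrete, here is a deliberately simplified Python toy model of the race. It is not kernel code: the class `ToyConntrack` and both helper functions are invented for illustration, and the only behaviour modelled is the one described in the quote (a packet that carries a new, unconfirmed entry is dropped at confirm time if another packet of the same flow was confirmed first).

```python
class ToyConntrack:
    """Toy model of a conntrack table keyed by flow tuple (illustration only)."""

    def __init__(self):
        self.table = set()  # confirmed flows

    def receive(self, flow):
        # A packet gets a new, unconfirmed entry only if its flow is unknown.
        return {"flow": flow, "new_entry": flow not in self.table}

    def confirm(self, pkt):
        """Return True if the packet is forwarded, False if it is dropped."""
        if not pkt["new_entry"]:
            return True   # belongs to an already confirmed flow
        if pkt["flow"] in self.table:
            return False  # another packet confirmed this flow first: clash
        self.table.add(pkt["flow"])
        return True


def forward_immediately(flow, count=2):
    """Each packet is confirmed before the next one is received."""
    ct = ToyConntrack()
    return [ct.confirm(ct.receive(flow)) for _ in range(count)]


def forward_via_queue(flow, count=2):
    """All packets are received (and queued) before any of them is confirmed."""
    ct = ToyConntrack()
    queued = [ct.receive(flow) for _ in range(count)]
    return [ct.confirm(pkt) for pkt in queued]


flow = ("udp", "192.168.23.1", 35888, "192.168.23.254", 53)
print(forward_immediately(flow))  # [True, True]
print(forward_via_queue(flow))    # [True, False]
```

In the model, the second query survives when packets are confirmed one at a time and is lost when both sit in the queue unconfirmed, which matches the single unanswered AAAA query in the traces.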

Here is a copy of the thread:
Linux Netfilter Devel -- [RFC PATCH] nfqueue: nf_conntrack_confirm race condition

Regards,
Nicolas

0 UrsWeiss over 11 years ago

Great, could not test it with 9.2 myself yet. Thanks for that.

Hmm... so it could be a packet filter issue and not the DNS proxy itself, as I thought. Interesting. I will forward that to the support team.

0 UrsWeiss over 11 years ago

Ha ha, funny, the mail you linked to is from Ulrich, and he has been working for Astaro for many, many years! I was already in contact with him myself some time ago. He's a great software engineer ;-)

0 sarabanjina over 11 years ago in reply to UrsWeiss
Hi,

Following up on my previous message, I ran a quick test on my UTM: I completely disabled connection tracking for port 53 with these commands:
iptables -t raw -I PREROUTING -p udp --destination-port=53 -j LOCAL_TRAFFIC
iptables -t raw -I PREROUTING -p udp --source-port=53 -j LOCAL_TRAFFIC
iptables -t raw -I OUTPUT -p udp --destination-port=53 -j LOCAL_TRAFFIC
iptables -t raw -I OUTPUT -p udp --source-port=53 -j LOCAL_TRAFFIC

And now everything is working fine (every time):
14:05:23.402808 IP 192.168.23.1.57865 > 192.168.23.254.53: 9401+ A? www.sophos.com. (32)
14:05:23.402875 IP 192.168.23.1.57865 > 192.168.23.254.53: 37400+ AAAA? www.sophos.com. (32)
14:05:23.403729 IP 192.168.23.254.53 > 192.168.23.1.57865: 9401 3/8/8 CNAME www.sophos.com.edgekey.net., CNAME e6203.b.akamaiedge.net., A 23.34.179.18 (393)
14:05:23.403777 IP 192.168.23.254.53 > 192.168.23.1.57865: 37400 2/1/0 CNAME www.sophos.com.edgekey.net., CNAME e6203.b.akamaiedge.net. (163)

I'm almost sure the problem is not in the DNS proxy, but in some interaction between iptables, connection tracking and NFQUEUE.

Regards,
Nicolas

0 UrsWeiss over 11 years ago

Nice one! +1

So, let's hope they find a solution for this issue.