This discussion has been locked.
You can no longer post new replies to this discussion. If you have a question you can start a new discussion

XG 19 SD WAN Application timeout

I have XG v19 firewalls and created an SD-WAN policy to handle traffic for a site-to-site, route-based IPsec VPN with xfrm interfaces.

It works great, except for one strange issue: many applications used over that VPN time out and crash after around 15 to 20 minutes.

For example, if a user has an RDP session open, it will suddenly crash.



  • Is 10.21.11.1 here the gateway or a private DNS server?

    Also, can you reduce the sample size for the SLA to 5...

  • Hello,

    Run a session again and wait for the disconnection to happen again.
    While it runs, capture the following:

    1.) Tcpdump again
    2.) Drop packet capture:
    Monitor dropped packets using CLI: https://support.sophos.com/support/s/article/KB-000036858?language=en_US
    3.) Monitor traffic using Packet Capture Utility: https://support.sophos.com/support/s/article/KB-000035761?language=en_US
    4.) Create and download a packet capture: https://support.sophos.com/support/s/article/KB-000037007?language=en_US
    5.) And run a live conntrack: conntrack -E -d <dst ip> | grep <src ip>

  • See the attached packet capture. It's possible that the capture stopped before the connection dropped.

    The connection was from 192.168.1.250 to 192.168.4.159, port 48991.

    totermw - 20220603-005359940-M0400.txt

    
    Sophos Firmware Version SFOS 19.0.0 GA-Build317 
    
    Main Menu 
    
        1.  Network  Configuration
        2.  System   Configuration
        3.  Route    Configuration 
        4.  Device Console 
        5.  Device Management
        6.  VPN Management
        7.  Shutdown/Reboot Device
        0.  Exit 
    
        Select Menu Number [0-7]: 5
    
    Sophos Firmware Version SFOS 19.0.0 GA-Build317 
    
    Device Management 
    
        1.  Reset to Factory Defaults
        2.  Show Firmware(s)
        3.  Advanced Shell
        4.  Flush Device Reports
        0.  Exit
    
        Select Menu Number [0-4]: 3
    
    
    Sophos Firewall
    ===============
    (C) Copyright 2000-2022 Sophos Limited and others. All rights reserved.
    Sophos is a registered trademark of Sophos Limited and Sophos Group.
    All other product and company names mentioned are trademarks or registered
    trademarks of their respective owners.
    
    For Sophos End User Terms of Use - https://www.sophos.com/en-us/legal/sophos-end-user-terms-of-use.aspx
    
    NOTE: If not explicitly approved by Sophos support, any modifications
          done through this option will void your support.
    
    
    SFVH_VM01_SFOS 19.0.0 GA-Build317# tcpdu
    SFVH_VM01_SFOS 19.0.0 GA-Build317# tcpdump fil
    SFVH_VM01_SFOS 19.0.0 GA-Build317# tcpdump f
    feedbackconfig  fontconfig/
    
    SFVH_VM01_SFOS 19.0.0 GA-Build317# tcpdump ffiledump 'host 192.168.4.159 -s0'
    tcpdump: can't parse filter expression: syntax error
    SFVH_VM01_SFOS 19.0.0 GA-Build317# exit
    
    Sophos Firmware Version SFOS 19.0.0 GA-Build317 
    
    Device Management 
    
        1.  Reset to Factory Defaults
        2.  Show Firmware(s)
        3.  Advanced Shell
        4.  Flush Device Reports
        0.  Exit
    
        Select Menu Number [0-4]: 0
    Exit
    
    Sophos Firmware Version SFOS 19.0.0 GA-Build317 
    
    Main Menu 
    
        1.  Network  Configuration
        2.  System   Configuration
        3.  Route    Configuration 
        4.  Device Console 
        5.  Device Management
        6.  VPN Management
        7.  Shutdown/Reboot Device
        0.  Exit 
    
        Select Menu Number [0-7]: 4
    Sophos Firmware Version SFOS 19.0.0 GA-Build317 
    
    console> tcpdump filedump 
    <text>     count      hex        interface  llh        no_time    quite      verbose    
    console> tcpdump filedump 'ho' st 192.168.4.159 s) 0'
    tcpdump: can't parse filter expression: syntax error
    console> tcpdump filedump 'host 192.168.4.159 s0' 
    tcpdump: can't parse filter expression: syntax error
    console> tcpdump filedump 'host 192.168.4.159 s0'-s0'
    tcpdump: listening on any, link-type LINUX_SLL (Linux cooked v1), capture size 262144 bytes
    1000 packets captured
    1131 packets received by filter
    0 packets dropped by kernel
    console> tcpdump filedump 'host 192.168.4.159 -s0'
    tcpdump: listening on any, link-type LINUX_SLL (Linux cooked v1), capture size 262144 bytes
    1000 packets captured
    1176 packets received by filter
    0 packets dropped by kernel
    console> 
    totermw - 20220603-005805005-M0400.txt
    
    Sophos Firmware Version SFOS 19.0.0 GA-Build317 
    
    Main Menu 
    
        1.  Network  Configuration
        2.  System   Configuration
        3.  Route    Configuration 
        4.  Device Console 
        5.  Device Management
        6.  VPN Management
        7.  Shutdown/Reboot Device
        0.  Exit 
    
        Select Menu Number [0-7]: 4
    Sophos Firmware Version SFOS 19.0.0 GA-Build317 
    
    console> drop-packet-capture ; ''p'o'r't' '4'8'9'9'1'
    ^Cconsole> 

  • Hello,

    Thank you for providing the packet captures...
    Looking at the captured conntrack output, we can see the initial TCP handshake:

    =============
    However, it looks like the packet capture was started late, so the pcap file does not contain the initial TCP handshake and only data packets were captured:


    There appears to be latency on one of the sides, especially for packets coming from 4.159...
    Looking at the delta time, the response took almost 16 s to arrive, and the time reference shows 120 s, i.e. 2 minutes. If the server side has its request timeout set to 2 minutes, the session may well disconnect.

    You can compare this non-working scenario with a working one, using packet captures in Wireshark on both sides, to determine whether the latency occurs.

    Depending on the packet loss, retransmissions, or duplicate ACKs you see from the server side, you can fine-tune the timeout settings on the server with the help of your server team.

    Additionally, you can toggle some of the advanced firewall settings to see whether that improves the situation.
    To check the current status, log in with the admin credentials over SSH (for example with PuTTY) and press 4 for the device console:

    console> show advanced-firewall
    ===============================

    Then try toggling the following:
    1.) Midstream Connection Pickup
    2.) TCP Seq Checking
    3.) TCP Window Scaling
    4.) TCP Selective Acknowledgements

    The commands, along with explanations, can be found here: https://docs.sophos.com/nsg/sophos-firewall/18.5/Help/en-us/webhelp/onlinehelp/CommandLineHelp/DeviceConsole/Set/index.html


  • This is definitely a change in behaviour since the upgrade to v19, as I have the same issue at other sites that worked perfectly fine until v19.

    I'm not sure what my next step should be, as I am not an expert in fine-tuning advanced TCP settings.

  • Your dump does not include the entire connection.

    It is missing both the handshake (connection establishment) and the timeout.

    But from the perspective of SD-WAN and conntrack, this looks fine. It matches SD-WAN rule 11 and moves the traffic to the xfrm10 interface.

    But in one of your plain-text dumps:

    01:10:39.264263 PortA, IN: In 00:50:56:90:7d:14 ethertype IPv4 (0x0800), length 1416: 192.168.1.250.53727 > 192.168.4.159.48991: Flags [P.], seq 5801:7161, ack 111552, win 4117, length 1360
    01:10:39.264300 xfrm10, OUT: Out ethertype IPv4 (0x0800), length 1416: 192.168.1.250.53727 > 192.168.4.159.48991: Flags [P.], seq 5801:7161, ack 111552, win 4117, length 1360
    01:10:45.392532 PortA, IN: In 00:50:56:90:7d:14 ethertype IPv4 (0x0800), length 56: 192.168.1.250.53727 > 192.168.4.159.48991: Flags [R.W], seq 7161, ack 111552, win 0, length 0
    01:10:45.392566 xfrm10, OUT: Out ethertype IPv4 (0x0800), length 56: 192.168.1.250.53727 > 192.168.4.159.48991: Flags [R.W], seq 7161, ack 111552, win 0, length 0

    Basically this means the connection is up, both sides are talking, and then 192.168.1.250 decides to close the connection for whatever reason. The R in [R.W] is a reset (RST), which closes the connection abruptly.
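To make flag strings such as [P.] and [R.W] in the dump above easier to read, here is a minimal Python sketch that expands tcpdump's one-character flag codes into names. The mapping follows tcpdump's usual output convention; the function name `decode_flags` is made up for this illustration.

```python
# tcpdump prints TCP flags as one character each; "." stands for ACK.
TCPDUMP_FLAGS = {
    "S": "SYN",
    "F": "FIN",
    "P": "PSH",
    "R": "RST",
    "U": "URG",
    "W": "CWR",  # ECN: Congestion Window Reduced
    "E": "ECE",  # ECN: ECN-Echo
    ".": "ACK",
}


def decode_flags(flags):
    """Translate a tcpdump flag string such as 'R.W' into a list of flag names."""
    return [TCPDUMP_FLAGS.get(ch, "unknown(%s)" % ch) for ch in flags]


if __name__ == "__main__":
    for flag_string in ("P.", "R.W"):
        print(flag_string, "->", decode_flags(flag_string))
```

Read this way, [R.W] is a reset that also carries the ACK and CWR bits; the reset is what ends the connection.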

    That looks odd to me, as this connection looks healthy and yet gets closed by the client for some reason.

    There also seems to be a long period of silence after some time.

    This is the last packet coming from the peer:

    01:04:00.997339 xfrm10, IN:  In ethertype IPv4 (0x0800), length 156: 192.168.4.159.48991 > 192.168.1.250.53727: Flags [P.], seq 111452:111552, ack 5801, win 4118, length 100

    After this, the client keeps sending data for about 6 minutes, but there is no response from the peer anymore. The packets are push/ack, which does not strictly require a reply, but a timeout could be involved here.

    I would recommend looking at the peer at this point. Why are no packets arriving from it for 6 minutes?
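The silence described above can be measured directly from a plain-text dump. The following Python sketch parses lines in the format shown in this thread and reports the time between the last packet seen from the peer and the first reset. The regular expression and the helper names (`parse`, `silence_before_reset`) are assumptions written for this illustration, and the timestamps are assumed not to cross midnight.

```python
import re

# Matches the plain-text tcpdump lines quoted in this thread (assumed format).
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}:\d{2}:\d{2}\.\d+)\s+(?P<iface>\S+),\s+(?P<dir>IN|OUT):"
    r".*?(?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\d+) > "
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<dport>\d+): Flags \[(?P<flags>[^\]]+)\]"
)

# Three lines copied from the dump above: last peer packet, client data, client reset.
SAMPLE = [
    "01:04:00.997339 xfrm10, IN:  In ethertype IPv4 (0x0800), length 156: 192.168.4.159.48991 > 192.168.1.250.53727: Flags [P.], seq 111452:111552, ack 5801, win 4118, length 100",
    "01:10:39.264263 PortA, IN: In 00:50:56:90:7d:14 ethertype IPv4 (0x0800), length 1416: 192.168.1.250.53727 > 192.168.4.159.48991: Flags [P.], seq 5801:7161, ack 111552, win 4117, length 1360",
    "01:10:45.392532 PortA, IN: In 00:50:56:90:7d:14 ethertype IPv4 (0x0800), length 56: 192.168.1.250.53727 > 192.168.4.159.48991: Flags [R.W], seq 7161, ack 111552, win 0, length 0",
]


def to_seconds(ts):
    """Convert HH:MM:SS.ffffff to seconds since midnight."""
    hours, minutes, seconds = ts.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def parse(lines):
    """Return the parsed packets, sorted by time; unparseable lines are skipped."""
    packets = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            packet = match.groupdict()
            packet["time"] = to_seconds(packet["ts"])
            packets.append(packet)
    return sorted(packets, key=lambda p: p["time"])


def silence_before_reset(packets, peer_ip):
    """Seconds between the last packet sent by peer_ip and the first RST, or None."""
    last_peer = max((p["time"] for p in packets if p["src"] == peer_ip), default=None)
    first_rst = min((p["time"] for p in packets if "R" in p["flags"]), default=None)
    if last_peer is None or first_rst is None:
        return None
    return first_rst - last_peer


if __name__ == "__main__":
    gap = silence_before_reset(parse(SAMPLE), "192.168.4.159")
    print("Peer silent for %.1f s before the reset" % gap)
```

Running the same script against dumps from both appliances would show whether the peer really stops sending or whether its packets are lost somewhere in between.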

  • The application I tested with is Radmin Viewer connecting to Radmin Server.

    I started a session and let it run.

    With v18 it never dropped, and this is not the only application where I see it happening.

    I have upgraded 2 firewalls at another site to v19, and the exact same drop/disconnect happens.

  • What about the other peer? We are currently looking at only one end.

  • Both peers are updated to v19.

  • Can you provide the same data as above from the other appliance as well? Ideally, run the same steps on both appliances.
