Issue with VLAN between 2 XG firewalls with V19 home edition

Hello,

I have been seeing odd communication issues on a network shared between two XG firewalls since upgrading to V19. Since a drawing is better than a long explanation, here is my network topology:

In short:

192.168.1.x is the LAN side of my ISP router.

192.168.2.x is my internal LAN, protected by XG1. Since some of my hosts cannot be plugged directly into XG1, I put a second XG (XG2) in another room, with some other hosts connected to it. XG1 and XG2 are interconnected by a trunk carrying VLAN 100 for my LAN (and some other VLANs). VLAN 100 is a member of the LAN bridge on both XG1 and XG2. In effect, XG2 is used as a simple manageable L2 switch extending XG1's LAN.

The issue:

This configuration used to work in V18.5. Since the migration to V19, however, communication between LAN hosts on XG1 and XG2 is blocked, with the following odd behavior:

- the desktop on the left can flawlessly reach the Internet

- the desktop fails to communicate with the server on the right. A Wireshark trace shows that the first few packets are exchanged, then communication stops.

- when pinging from desktop to server, the first packet gets a successful answer, but none of the following packets do. The behavior is the same when pinging from server to desktop.

- the desktop cannot connect to the XG1 admin interface on its LAN IP (even though admin access is allowed on the LAN zone)

Since communication seems to be blocked not immediately but after a few successful messages, I suspected the firewall of erroneously dropping packets. Unfortunately, both the firewall log and the drop-packet-capture command are silent. The IPS logs are silent too.

Any idea would be helpful.

Best Regards,

Matthieu



  • Where did you do the packet capture? On the firewalls? Webadmin or CLI?

    Maybe there is a routing issue? Try doing the packet capture in Webadmin and check the routes. If the packets arrive but there are no outbound packets, check the drop packet capture on the affected firewall.

  • Hello,

    I did the packet capture on XG1 directly through the Unix shell. Given the various tests I made, I don't think it is an L3 issue but rather an L2 issue (the desktop can access the Internet flawlessly, for instance). The drop packet capture doesn't provide any information on the traffic in question.

    Best Regards

    Matthieu

  • The issue is fixed in the latest firmware, but since it is still not working in your case, can you reboot the firewall with fsck-on-nextboot enabled from the Sophos XG SSH CLI console and check again? Have you contacted the Sophos Support team, and if so, do you have a case ID?

    console>system fsck-on-nextboot show
    console>system fsck-on-nextboot on

    Regards

    "Sophos Partner: Infrassist Technologies Pvt Ltd".

    If a post solves your question please use the 'Verify Answer' button.

  • Hello,

    The reboot with fsck enabled didn't solve the issue. I ran some additional tests and found interesting results when pinging the other way around, from server to desktop:

    If I run tcpdump icmp on XG1 (on the server side, the source of the ping), it solves the issue for as long as tcpdump runs.

    If I run tcpdump icmp on XG2 (on the desktop side, the target of the ping), it doesn't solve the issue.

    This behavior allowed me to take a tcpdump on XG2 for both a working scenario (tcpdump running on XG1) and a non-working scenario (tcpdump not running on XG1). Here are the results:

    tcpdump output on XG2 while tcpdump was also running on XG1 (working scenario)

    SFVH_SO01_SFOS 19.0.1 MR-1-Build365# tcpdump icmp
    tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
    listening on any, link-type LINUX_SLL (Linux cooked v1), capture size 262144 bytes
    10:23:33.711770 Port1, IN: ethertype IPv4, IP server > desktop: ICMP echo request, id 1, seq 62, length 40
    10:23:33.711770 Port1.100, IN: IP server > desktop: ICMP echo request, id 1, seq 62, length 40
    10:23:33.712112 Port2, OUT: IP server > desktop: ICMP echo request, id 1, seq 62, length 40
    10:23:33.712444 Port2, IN: IP desktop > server: ICMP echo reply, id 1, seq 62, length 40
    10:23:33.712537 Port1.100, OUT: IP desktop > server: ICMP echo reply, id 1, seq 62, length 40
    10:23:33.712543 Port1, OUT: ethertype IPv4, IP desktop > server: ICMP echo reply, id 1, seq 62, length 40
    10:23:34.725940 Port1, IN: ethertype IPv4, IP server > desktop: ICMP echo request, id 1, seq 63, length 40
    10:23:34.725940 Port1.100, IN: IP server > desktop: ICMP echo request, id 1, seq 63, length 40
    10:23:34.726117 Port2, OUT: IP server > desktop: ICMP echo request, id 1, seq 63, length 40
    10:23:34.726604 Port2, IN: IP desktop > server: ICMP echo reply, id 1, seq 63, length 40
    10:23:34.726669 Port1.100, OUT: IP desktop > server: ICMP echo reply, id 1, seq 63, length 40
    10:23:34.726673 Port1, OUT: ethertype IPv4, IP desktop > server: ICMP echo reply, id 1, seq 63, length 40

    In this trace we can see two successful ping requests going through XG2, with the sequence Port1 IN => Port1.100 IN => Port2 OUT => Port2 IN => Port1.100 OUT => Port1 OUT.

    Now the tcpdump output on XG2 while tcpdump was not running on XG1 (non-working scenario)

    10:22:27.215738 Port1, IN: ethertype IPv4, IP server > desktop: ICMP echo request, id 1, seq 58, length 40
    10:22:27.215738 Port1.100, IN: IP server > desktop: ICMP echo request, id 1, seq 58, length 40
    10:22:27.216043 Port2, OUT: IP server > desktop: ICMP echo request, id 1, seq 58, length 40
    10:22:27.216539 Port2, IN: IP desktop > server: ICMP echo reply, id 1, seq 58, length 40
    10:22:27.216656 Port1.100, OUT: IP desktop > server: ICMP echo reply, id 1, seq 58, length 40
    10:22:27.216667 Port1, OUT: ethertype IPv4, IP desktop > server: ICMP echo reply, id 1, seq 58, length 40
    10:22:28.238536 Port1, IN: IP server > desktop: ICMP echo request, id 1, seq 59, length 40
    10:22:32.943130 Port1, IN: IP server > desktop: ICMP echo request, id 1, seq 60, length 40
    10:22:37.942102 Port1, IN: IP server > desktop: ICMP echo request, id 1, seq 61, length 40

    The first ping request works and looks the same as in the previous scenario. The following requests fail: we only see the Port1 IN step, and never the next steps of the sequence. It looks like XG2 never passes the Port1 IN packet on to the Port1.100 interface.
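
    As a rough sketch of how this comparison could be automated (not something that was run on the firewalls), the Python below groups tcpdump lines by ICMP sequence number and reports the first expected hop that is missing. The names LINE_RE, EXPECTED_PATH, hops_by_seq and first_missing_hop are hypothetical helpers introduced for illustration, and the regular expression only assumes the line layout shown in the traces above.

```python
import re
from collections import defaultdict

# Matches lines in the layout shown above, for example:
# 10:22:27.215738 Port1.100, IN: IP server > desktop: ICMP echo request, id 1, seq 58, length 40
LINE_RE = re.compile(
    r"^\s*\S+\s+(?P<iface>[\w.]+),\s+(?P<dir>IN|OUT):.*?"
    r"ICMP echo (?P<kind>request|reply),\s+id\s+\d+,\s+seq\s+(?P<seq>\d+)"
)

# Hop sequence observed on XG2 for a successful ping (taken from the working trace).
EXPECTED_PATH = [
    ("Port1", "IN", "request"),
    ("Port1.100", "IN", "request"),
    ("Port2", "OUT", "request"),
    ("Port2", "IN", "reply"),
    ("Port1.100", "OUT", "reply"),
    ("Port1", "OUT", "reply"),
]


def hops_by_seq(trace: str) -> dict:
    """Group (interface, direction, kind) hops by ICMP sequence number."""
    hops = defaultdict(list)
    for line in trace.splitlines():
        m = LINE_RE.match(line)
        if m:
            hops[int(m["seq"])].append((m["iface"], m["dir"], m["kind"]))
    return dict(hops)


def first_missing_hop(hops):
    """Return the first expected hop that is absent, or None if the path is complete."""
    for hop in EXPECTED_PATH:
        if hop not in hops:
            return hop
    return None


# Seq 58 and seq 59 from the non-working trace above.
TRACE = """
10:22:27.215738 Port1, IN: ethertype IPv4, IP server > desktop: ICMP echo request, id 1, seq 58, length 40
10:22:27.215738 Port1.100, IN: IP server > desktop: ICMP echo request, id 1, seq 58, length 40
10:22:27.216043 Port2, OUT: IP server > desktop: ICMP echo request, id 1, seq 58, length 40
10:22:27.216539 Port2, IN: IP desktop > server: ICMP echo reply, id 1, seq 58, length 40
10:22:27.216656 Port1.100, OUT: IP desktop > server: ICMP echo reply, id 1, seq 58, length 40
10:22:27.216667 Port1, OUT: ethertype IPv4, IP desktop > server: ICMP echo reply, id 1, seq 58, length 40
10:22:28.238536 Port1, IN: IP server > desktop: ICMP echo request, id 1, seq 59, length 40
"""

for seq, hops in sorted(hops_by_seq(TRACE).items()):
    missing = first_missing_hop(hops)
    if missing is None:
        print(f"seq {seq}: complete")
    else:
        print(f"seq {seq}: stops before {missing[0]} {missing[1]}")
# seq 58: complete
# seq 59: stops before Port1.100 IN
```

    Fed with a full trace, this would list every sequence number whose request never reached Port1.100.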

    Moreover, when I download the trace from XG2 and open it in Wireshark, it seems that in the non-working scenario the "Port1 IN" packet of the first ping request arrives with the appropriate 802.1Q VLAN tag from XG1, but the "Port1 IN" packets of the following, failing ping requests arrive untagged, which could explain why the sequence breaks.
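
    For reference, here is a minimal sketch of what "tagged" versus "untagged" means at the byte level. It assumes a plain Ethernet II frame (for example, exported from a capture taken on the physical interface); the capture above was taken on "any" with Linux cooked headers, whose layout differs. The function vlan_id and the sample frames are hypothetical and built only for illustration.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def vlan_id(frame: bytes):
    """Return the VLAN ID of an Ethernet II frame, or None if it is untagged."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    # Bytes 12-13 follow the destination and source MAC addresses.
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != TPID_8021Q:
        return None
    if len(frame) < 18:
        raise ValueError("truncated 802.1Q tag")
    # Bytes 14-15 hold the TCI; its low 12 bits are the VLAN ID.
    (tci,) = struct.unpack_from("!H", frame, 14)
    return tci & 0x0FFF


# Hypothetical frames: zeroed MAC addresses, no payload.
MACS = bytes(12)
tagged = MACS + struct.pack("!HHH", TPID_8021Q, 100, 0x0800)  # VLAN 100, then IPv4
untagged = MACS + struct.pack("!H", 0x0800)                   # IPv4 directly

print(vlan_id(tagged), vlan_id(untagged))  # 100 None
```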

    Hope this helps understand the issue

    BTW, am I allowed to open a support case with the home licence?

    Best Regards

    Matthieu

  • If you have Enhanced Support you can raise a case on https://support.sophos.com/support/s/?language=en_US#t=AllTab&sort=relevancy 

    As you said earlier, it was working with version 18. That version is still available on the licensing portal, so if you have a backup from version 18 where it was working, you can install v18 and restore the backup.

    https://docs.sophos.com/nsg/sophos-firewall/18.5/Help/en-us/webhelp/onlinehelp/AdministratorHelp/BackupAndFirmware/Firmware/FirmwareDownloadFirmware/index.html#download-firmware 

    Please make sure the backup is from version 18.5.4 MR4-Build418 and not from a later version.

    Regards

    "Sophos Partner: Infrassist Technologies Pvt Ltd".

  • Sounds like you have a problem with your virtual firewalls.

    When you start a tcpdump, the NIC is essentially put into promiscuous mode: https://en.wikipedia.org/wiki/Promiscuous_mode 

    If this mode resolves your issue, it could be a driver issue or an issue within your network.

    So why is the tagging not working? Who is doing the tagging in your network? Without a dump, it will be hard to prove whether the firewall actually applies the tag or not. But maybe your switch is doing something.

  • Hello

    In my network (a home network), VLANs exist only on the XG1-XG2 link. This link is a direct Ethernet cable between XG1 and XG2. Apart from that, hosts are connected either directly to untagged XG physical ports (members of the bridge) or through basic unmanaged Ethernet switches (Netgear GS608). What I see in the trace in my previous post is that, after the first successful ping request, XG1 seems to "forget" to tag the next requests sent to XG2.

    XG1 and XG2 are not virtual; they are installed on Jetway mini PCs fitted with 10 LAN ports.

    Best Regards 

    Matthieu

  • Jetway mini PC fitted with 10 lan ports.

    May I know the current hardware, such as the network interface cards, RAM, HDD or SSD, and USB pen drive? Could it be End-of-Support or End-of-Life?

    Sophos XG works well on tested platforms as per the link: https://docs.sophos.com/nsg/sophos-firewall/18.0/Help/en-us/webhelp/onlinehelp/VirtualAndSoftwareAppliancesHelp/vs_VirtualSoftwareApplianceIntro/index.html 

    Also refer: https://docs.sophos.com/nsg/sophos-firewall/18.0/Help/en-us/webhelp/onlinehelp/VirtualAndSoftwareAppliancesHelp/SoftwareAppliance/index.html 

    Regards 

    "Sophos Partner: Infrassist Technologies Pvt Ltd".

  • Sure, below are the product sheets of the mini PC hosting XG1 (XG2 is similar but with a Celeron J1900 CPU). XG1 has 8 GB RAM and a Samsung 860 EVO 250 GB M.2 SSD. XG2 has a 256 GB Samsung Serial ATA SSD and 8 GB RAM too.

    Hope this helps

  • In this case, can you turn off system firewall-acceleration on both XGs, as per the snapshot below, test again, and share the result?

    Thanks and Regards

    "Sophos Partner: Infrassist Technologies Pvt Ltd".

  • Hi,

    I turned off firewall acceleration on both XGs and it solved the issue: the ping test succeeds and communication on the network shared between XG1 and XG2 seems to work fine now. So it looks like a bug in firewall acceleration. I will leave firewall acceleration disabled and see whether I encounter further issues. If it works fine, this workaround is OK for me, since the performance impact on my home network is unnoticeable, so I can wait for a fix. Do I need to open a case to have this issue fixed in a future maintenance release, or is this forum thread enough?

    Best Regards

    Matthieu

  • Hi Matthieu,

    Thank you for reporting this. We have already opened a bug and are investigating the issue. For reference, its ID is NC-102614.
