After Upgrade to v18.0 MR4, Auxiliary Appliance boots in Failsafe Mode - Reason "Unable to apply NAT Rules"

Hi,

Today I upgraded a Sophos XG cluster from v18.0 MR3 to v18.0 MR4.

Everything looked fine, so I did a failover check. Afterwards, not all outgoing WAN connections were possible.

After some checks we realized that the appliance had booted into Failsafe Mode.

After another failover the primary appliance also booted into Failsafe Mode, so the problem was persistent.

So I decided to rebuild the HA cluster. After I disabled HA and rebooted the now standalone appliance, everything worked correctly.

After a factory reset of the auxiliary device, I rebuilt the cluster.

Sadly, the now auxiliary appliance booted into Failsafe Mode again. The reason is:

"Sophos Firmware Version SFOS 18.0.4 MR-4

failsafe> show failure-reason
Unable to apply NAT Rules"

Does anyone have an idea how I can find out more details?

The cluster has two WAN interfaces.

There are still several auto-created, linked NAT rules and SD-WAN rules from the migration from SFOS 17.5 to v18.0 MR3.

Sincerely

Gordon Leisering



  • Hi: Are there any suspicious entries in nat_rule.log and applog.log when the appliance reports failsafe status? I would suggest opening a support case so the issue can be investigated further.

  • Hi,

    The NAT log was the solution.

    There were log entries like:

    2021-01-05 14:23:43: NAT - executing cmd : /bin/nat add --id 73 --position 11 --state 1 --family 0 --translated-src 445 --dst-vhost-type 4 --translated-dst 506 --original-src 457 --original-dst 456 --original-service 29 --in-interface LAG_2.101,LAG_2.104,LAG_2.120,LAG_2.180,reds1.1020,reds1.1040
    2021-01-05 14:23:43: NAT - nat add cmd returned error : Error: Unknown interface reds1.1020


    2021-01-05 15:19:17: NAT - executing cmd : /bin/nat update --id 73 --position 11 --state 1 --family 0 --translated-src 445 --dst-vhost-type 4 --translated-dst 506 --original-src 457 --original-dst 456 --original-service 29 --in-interface LAG_2.101,LAG_2.104,LAG_2.120,LAG_2.180
    2021-01-05 15:19:17: NAT - nat update cmd returned error : ERROR:Invalid ruleid or family

    So it seems that the auxiliary appliance has problems building NAT rules that involve RED VLAN interfaces.

    In total there were 4 rules I had to change.

    Sadly, the XG stops building the NAT rules at the first problem it encounters and goes into Failsafe Mode.

    That totally explains the behaviour.

    The problem is solved.
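For anyone who has to dig through a longer nat_rule.log, the pairing of "executing cmd" and "returned error" lines can be automated. The following is only a rough sketch in Python, assuming the log lines look exactly like the excerpts quoted in this post; the function name `failed_nat_commands` is made up for illustration and is not part of any Sophos tooling.

```python
import re

# Patterns are based only on the nat_rule.log excerpts quoted in this thread.
CMD_RE = re.compile(r"NAT - executing cmd : /bin/nat (\w+) (.*)$")
ERR_RE = re.compile(r"NAT - nat (\w+) cmd returned error : (.*)$")
ID_RE = re.compile(r"--id (\d+)")


def failed_nat_commands(lines):
    """Pair each 'returned error' line with the command logged just before it.

    Hypothetical helper: returns a list of dicts with the nat action,
    the rule id taken from --id, and the error text.
    """
    failures = []
    last_cmd = None  # (action, rule_id) of the most recent command line
    for line in lines:
        cmd = CMD_RE.search(line)
        if cmd:
            rule_id = ID_RE.search(cmd.group(2))
            last_cmd = (cmd.group(1), int(rule_id.group(1)) if rule_id else None)
            continue
        err = ERR_RE.search(line)
        # Only attribute the error if it names the same action (add/update/delete).
        if err and last_cmd and err.group(1) == last_cmd[0]:
            failures.append(
                {"action": last_cmd[0], "rule_id": last_cmd[1], "error": err.group(2).strip()}
            )
            last_cmd = None
    return failures


# The two lines below are copied from the log excerpt in this post.
sample = [
    "2021-01-05 14:23:43: NAT - executing cmd : /bin/nat add --id 73 --position 11 --state 1 --family 0 --translated-src 445 --dst-vhost-type 4 --translated-dst 506 --original-src 457 --original-dst 456 --original-service 29 --in-interface LAG_2.101,LAG_2.104,LAG_2.120,LAG_2.180,reds1.1020,reds1.1040",
    "2021-01-05 14:23:43: NAT - nat add cmd returned error : Error: Unknown interface reds1.1020",
]

for failure in failed_nat_commands(sample):
    print(f"{failure['action']} rule {failure['rule_id']}: {failure['error']}")
# prints: add rule 73: Error: Unknown interface reds1.1020
```

Because the appliance stops at the first failing rule, a single boot may log only one such error, so the output is a starting point rather than a full list of affected rules.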

  • Looks like we had the same issue, with a dangling reference to a missing RED interface.

    @sophos do you even software test?!

    .18,reds12,reds12.18,reds13,reds13.18,reds15,reds15.18,reds16,reds16.18,reds17,reds17.18,reds18,reds18.18,reds19,reds19.18,reds20,reds20.18,reds21,reds21.18,reds22,reds22.18,reds23,reds23.18,reds24,reds24.18 --out-interface E4_E5.997,PortE2,PortE2.999
    2021-02-26 23:36:47: NAT - nat add cmd returned error : Error: Unknown interface reds11.18


    2021-03-01 08:25:15: NAT - executing cmd : /bin/nat update --id 2 --position 10 --state 1 --family 0 --masq --dst-vhost-type 0 --out-interface E4_E5.997,PortE2,PortE2.999
    2021-03-01 08:25:15: NAT - nat update cmd returned error : ERROR:Invalid ruleid or family


    2021-03-01 08:25:41: NAT - executing cmd : /bin/nat update --id 4 --position 9 --state 1 --family 0 --masq --dst-vhost-type 0 --in-interface E4_E5,E4_E5.17,E4_E5.1911,E4_E5.1950,E4_E5.1951,E4_E5.1952,E4_E5.1954,E4_E5.1956,E4_E5.1957,PortE0,reds11,reds11.18,reds12,reds12.18,reds13,reds13.18,reds15,reds15.18,reds16,reds16.18,reds17,reds17.18,reds18,reds18.18,reds19,reds19.18,reds20,reds20.18,reds21,reds21.18,reds22,reds22.18,reds23,reds23.18,reds24,reds24.18 --out-interface E4_E5.997,PortE2,PortE2.999
    2021-03-01 08:25:41: NAT - nat update cmd returned error : ERROR:Invalid ruleid or family


    2021-03-01 08:28:24: NAT - executing cmd : /bin/nat add --id 6 --position 9 --state 1 --family 0 --masq --dst-vhost-type 0 --out-interface E4_E5.997,PortE2,PortE2.999
    2021-03-01 08:28:36: NAT - executing cmd : /bin/nat delete --id 4 --family 0
    2021-03-01 08:28:36: NAT - nat delete cmd returned error : ERROR:Invalid ruleid or family


    2021-03-01 08:28:45: NAT - executing cmd : /bin/nat delete --id 2 --family 0
    2021-03-01 08:28:45: NAT - nat delete cmd returned error : ERROR:Invalid ruleid or family
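
Since the NAT build apparently stops at the first unknown interface, it may help to check every logged command against the interfaces that actually exist, so all dangling references show up in one pass. Below is a minimal Python sketch of that idea. It assumes the `--in-interface` and `--out-interface` lists look like the ones in the log lines above, and that you supply the set of existing interface names yourself; `missing_interfaces` and `known` are hypothetical names, not Sophos tooling.

```python
import re

# Option and id formats are taken from the /bin/nat commands logged above.
IFACE_RE = re.compile(r"--(?:in|out)-interface (\S+)")
ID_RE = re.compile(r"--id (\d+)")


def missing_interfaces(lines, known):
    """Map rule id -> sorted interface names that are not in `known`.

    `known` is an assumption: a set of interface names that exist on the
    appliance, collected by whatever means you have available.
    """
    known = set(known)
    missing = {}
    for line in lines:
        if "NAT - executing cmd" not in line:
            continue  # skip error lines and anything unrelated
        rule_id = ID_RE.search(line)
        if not rule_id:
            continue
        referenced = set()
        for value in IFACE_RE.findall(line):
            referenced.update(value.split(","))
        dangling = referenced - known
        if dangling:
            # Accumulate across every logged command for the same rule.
            missing.setdefault(int(rule_id.group(1)), set()).update(dangling)
    return {rule: sorted(names) for rule, names in missing.items()}
```

Called with the lines of nat_rule.log and a set of existing interface names, this returns one entry per rule that references something unknown, for example a rule whose list still contains reds11.18 after that RED interface was removed.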