Today we had some strange behaviour on our central Sophos XG 550 firewall cluster.
The firewall became unresponsive on the web interface and from Sophos Firewall Manager. I was able to log in through SSH; however, most commands did not work. For example, it was not possible to get the HA status on the device console or to do a reboot; the session simply timed out or hung.
The physical display on the primary firewall was dark and the buttons were unresponsive.
The auxiliary looked good, and as a shutdown via SSH was also not possible, we powered down the primary firewall with the power button. For some reason the auxiliary did not take over; it became standalone and all services were interrupted.
After this we powered the old primary back on, and after the reboot the services were operable again. The web interface is still not operable. SSH is working now; however, I did not try a reboot or shutdown, for obvious reasons.
Sophos Firmware Version SFOS 17.5.8 MR-8
console> system ha show details
HA status : Enabled
Current Appliance Key : C51028XBTB23G97
Peer Appliance Key : C51028W2RPBGB4C
Current HA state : Primary
Peer HA state : Auxiliary
HA Config Mode : Active-Passive
Load Balancing : Not Applicable
Dedicated Port : PortA4
Current Dedicated IP : 10.255.254.1
Peer Dedicated IP : 10.255.254.2
Monitoring Port : PortB2,PortB1,PortB3
Auxiliary Admin Port : PortB1
Auxiliary Admin IP : 10.19.255.249
Auxiliary Admin IPv6 :
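In case someone wants to watch for this from a script: below is a minimal sketch, assuming the output of `system ha show details` has already been captured as plain text (for example from an SSH session). The function names `parse_ha_details` and `ha_roles_ok` are made up for illustration and are not part of SFOS, and the check only looks at the reported roles, so it would not by itself detect a hung primary like the one described above.

```python
# Hypothetical helpers (not part of SFOS): parse captured
# "system ha show details" output and check the reported roles.

SAMPLE = """\
console> system ha show details
HA status : Enabled
Current HA state : Primary
Peer HA state : Auxiliary
HA Config Mode : Active-Passive
Dedicated Port : PortA4
Monitoring Port : PortB2,PortB1,PortB3
Auxiliary Admin IPv6 :
"""


def parse_ha_details(text):
    """Turn 'Key : Value' lines into a dict; other lines are skipped."""
    details = {}
    for line in text.splitlines():
        # Split on the first " :" only, so values containing ':' survive.
        key, sep, value = line.partition(" :")
        if not sep:
            continue  # e.g. the echoed command line
        details[key.strip()] = value.strip()
    return details


def ha_roles_ok(details):
    """True if HA is enabled and the two nodes report Primary + Auxiliary."""
    roles = {details.get("Current HA state"), details.get("Peer HA state")}
    return details.get("HA status") == "Enabled" and roles == {"Primary", "Auxiliary"}


if __name__ == "__main__":
    info = parse_ha_details(SAMPLE)
    print(info["Current HA state"], info["Peer HA state"], ha_roles_ok(info))
```

Empty fields such as the IPv6 admin address come back as empty strings, so a caller can tell "not set" apart from "line missing".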
The last time I updated from MR-7 to MR-8, everything worked as expected (update the auxiliary, switch over to the auxiliary, update the primary, which is now the new auxiliary) with minimal interruption, apart from the SSL VPN and IPsec tunnels and the RED tunnels through an SG in the DMZ.
What is happening here? How often does this happen? This HA seems to be very unstable and unusable. Just have a look at the prerequisites and compare them to the HA of other products like Kemp, Fortigate or Sophos SG.