
Malfunction in Cluster

Today we had some strange behaviour on our central Sophos XG 550 firewall cluster.

 

The firewall became unresponsive on the web interface and from Sophos Firewall Manager. I was able to log in through SSH, but most commands did not work. For example, it was not possible to get the HA status on the device console or to reboot; the session timed out or hung.

The physical display on the master firewall was dark and the buttons were unresponsive.

 

The auxiliary looked good, and as a shutdown via SSH was also not possible, we powered down the master firewall with the power button. For some reason the slave did not take over; it became standalone and all services were interrupted.

 

After this we powered on the old master, and after it rebooted the services were operable again. The web interface is still not operable. SSH is working now, but I did not try a reboot or shutdown, for obvious reasons.

Sophos Firmware Version SFOS 17.5.8 MR-8

console> system ha show details
HA status : Enabled
Current Appliance Key : C51028XBTB23G97
Peer Appliance Key : C51028W2RPBGB4C
Current HA state : Primary
Peer HA state : Auxiliary
HA Config Mode : Active-Passive
Load Balancing : Not Applicable
Dedicated Port : PortA4
Current Dedicated IP : 10.255.254.1
Peer Dedicated IP : 10.255.254.2
Monitoring Port : PortB2,PortB1,PortB3
Auxiliary Admin Port : PortB1
Auxiliary Admin IP : 10.19.255.249
Auxiliary Admin IPv6 :
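
For anyone who wants to check this output from a script rather than by eye, here is a minimal sketch that turns the `Key : Value` lines above into a dictionary and pulls out the fields that matter for a failover check. It only assumes the text format shown in this post; the function names `parse_ha_details` and `ha_summary` are made up for illustration and are not part of SFOS.

```python
# Sketch only: parses the plain-text output of "system ha show details"
# as pasted in this post. Function names are hypothetical helpers.

SAMPLE = """\
console> system ha show details
HA status : Enabled
Current Appliance Key : C51028XBTB23G97
Peer Appliance Key : C51028W2RPBGB4C
Current HA state : Primary
Peer HA state : Auxiliary
HA Config Mode : Active-Passive
Load Balancing : Not Applicable
Dedicated Port : PortA4
Current Dedicated IP : 10.255.254.1
Peer Dedicated IP : 10.255.254.2
Monitoring Port : PortB2,PortB1,PortB3
Auxiliary Admin Port : PortB1
Auxiliary Admin IP : 10.19.255.249
Auxiliary Admin IPv6 :
"""


def parse_ha_details(text):
    """Return a dict built from every 'Key : Value' line in the text."""
    details = {}
    for line in text.splitlines():
        # Split on the first colon only, so a value containing colons
        # (for example an IPv6 address) stays intact.
        key, sep, value = line.partition(":")
        if not sep:
            continue  # lines without a colon, such as the prompt line
        details[key.strip()] = value.strip()
    return details


def ha_summary(details):
    """Pick out the fields relevant to a quick failover sanity check."""
    ports = [p.strip() for p in details.get("Monitoring Port", "").split(",")]
    return {
        "enabled": details.get("HA status") == "Enabled",
        "local_state": details.get("Current HA state"),
        "peer_state": details.get("Peer HA state"),
        "monitored_ports": [p for p in ports if p],
    }


details = parse_ha_details(SAMPLE)
summary = ha_summary(details)
print(summary)
```

Comparing `monitored_ports` against the list of WAN interfaces is one way to spot whether a WAN port is being monitored by HA.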


Last time, when I updated from MR-7 to MR-8, everything worked as expected (update the Auxiliary, switch to the Auxiliary, update the Primary, which is now the new Auxiliary) with minimal interruption, apart from the SSL VPN, IPsec tunnels and RED tunnels through an SG in the DMZ.



What is happening here? How often does this happen? This HA seems to be very unstable and unusable. Just have a look at the prerequisites and compare them to the HA of other products like Kemp, Fortigate or Sophos SG.



Parents
  • FormerMember
    0 FormerMember

    Hi, 

    Apologies for the inconvenience caused. Can you please check the HA monitoring ports: is there any WAN interface added there? If yes, please check dgd.log to see whether the HA-monitored WAN port was reported dead prior to this issue.

    Thanks,

     

Children
  • The WAN interface is in the HA monitoring ports.

    The entries in dgd.log stop on the 4th of November and started again yesterday after the reboot of the old master.

    As I was not onsite personally I must correct myself a little bit:

    After my colleague rebooted the old master (by pulling the plug and leaving the power off for 5 minutes), the slave went into standalone mode and services were interrupted.

    Currently the old slave is master and the old master is slave. The web interface is not working.


    Can I access dgd.log on the old master somehow?

  • FormerMember
    0 FormerMember in reply to BeEf

    Hi BeEf,

    Yes, you can, if the old master is still part of the HA configuration and still connected to the new master.

    SSH to the new master and use this command to access the old master's CLI: dbclient hauser@<IP address of the old master>. Once you have access to the old master's CLI, you should be able to review its logs.

    Thanks,
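
For the dgd.log check suggested earlier in the thread, a small filter can save scrolling once the log has been copied off the appliance. This is a minimal sketch, not a Sophos tool: it assumes the log is plain text and that the relevant entries contain the word "dead", which is an assumption about the wording, so adjust the keyword to whatever the log actually uses. `find_keyword_lines` and `scan_file` are hypothetical helper names.

```python
# Sketch only: case-insensitive keyword search over a copied log file.
# The keyword "dead" is an assumption about how dgd.log words the event.

def find_keyword_lines(lines, keyword="dead"):
    """Return (line_number, text) for each line that contains the keyword."""
    needle = keyword.lower()
    matches = []
    for number, line in enumerate(lines, start=1):
        if needle in line.lower():
            matches.append((number, line.rstrip("\n")))
    return matches


def scan_file(path, keyword="dead"):
    """Print every matching line of the file at path with its line number."""
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(path, errors="replace") as handle:
        for number, text in find_keyword_lines(handle, keyword):
            print(f"{number}: {text}")
```

Calling `scan_file("dgd.log")` on a local copy lists the matching lines; the line numbers make it easy to jump to the entries just before the log went quiet.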