Passive appliance in HA is in Faulty State

I recently upgraded our XG 3300 and moved it into full production mode. Since upgrading it to 19.01 MR1, I am seeing issues with the stability of the cluster. The main issue is that after the Auxiliary device joins the cluster, it stays in the Auxiliary state for a day or two and then changes state to Fault. Some time after that, the Report DB and Support Access services are also dead. We have already performed an RMA once, and the replacement device is now showing the same behavior. Sophos support says it is a DB corruption issue and that the device has to be re-imaged. Has anyone seen this kind of behavior?



This thread was automatically locked due to age.
  • I received a response from a Sophos Global Escalation Specialist. According to them, the DNS servers reachable behind the XFRM interface are responsible for the crash, and they asked us to switch to public DNS servers that are not reachable over the XFRM interface. Two things to note here:

    1. The claim that DNS servers behind XFRM interfaces cause the crash does not hold in our case: we have the same devices at other locations, configured with the same DNS servers behind an XFRM interface, and they are not showing this behaviour.

    2. Using public DNS servers will result in name resolution failures for internal devices that rely on internal DNS servers for name resolution.

    The solution provided points to a problem with basic Sophos firewall functionality. If, as Sophos says, a DNS server behind an XFRM interface is the root cause of the issue, then the device cannot safely be connected to any other location via a route-based VPN: an XFRM interface is created automatically when a route-based VPN is configured, so using a local DNS server behind it would result in device crashes.

    Also, when the device goes into the Fault state, why and how does it eventually take down all the services on the Primary node? This behavior is unclear to us, and Sophos has not been able to diagnose the problem either.

  • Hello Ankur,

    Thank you for the update.

    Checking the NC that the GES engineer assigned (NC-108226), it seems an RCA has been requested, as your case seems to match these symptoms.

    I have left a note in the case about your comments.

    Regards,


     
    Emmanuel (EmmoSophos)
    Technical Team Lead, Global Community Support