Passive appliance in HA is in Faulty State

I recently upgraded an XG 3300 and moved it into full-fledged production, and after upgrading it to 19.01 MR1 I am seeing issues with the stability of the cluster. The main issue is that after the Auxiliary device joins the cluster, it stays in Auxiliary state for a day or two and then changes state to Fault; some time after that, the Report DB and Support Access services are also dead. We have already performed an RMA once, and now the replacement device is showing the same behavior. Sophos support says it's a DB corruption issue and the device has to be reimaged. Has anyone seen this kind of behavior?



  • Hi Ankur,

    Thank you for reaching out to Sophos Community.

    Would you be so kind as to share the case ID related to this issue? Thank you 

    Erick Jan
    Community Support Engineer | Sophos Technical Support

  • 06051687. We reimaged the secondary node, but that did not restore the services, so we ended up restarting the primary node as well, after which the services came back online. However, about 5 hours later we received a notification that the secondary node had gone into Fault state; it appears to have recovered and is back in Auxiliary mode. What is unclear is why and how a secondary node in Fault state takes all the services on the primary node down.

  • Hello Ankur,

    Thank you for the case ID. It seems you have another session with an L2 on Friday, and after today's session you were given some steps to perform today after hours.

    We’ll keep monitoring the Case.

    Regards,


     
    Emmanuel (EmmoSophos)
    Technical Team Lead, Global Community Support
  • I received a response from a Sophos Global Escalation Specialist, according to whom the DNS servers reachable behind the XFRM interface are responsible for the crash. They asked us to switch to public DNS servers that are not reachable over the XFRM interface. Two things to note here:

    1. The claim that DNS servers behind XFRM interfaces cause the crash does not hold in our case: we have the same devices at another location, configured with the same DNS servers behind an XFRM interface, and they are not showing this behaviour.

    2. Using public DNS servers will result in name resolution failures for internal devices that rely on internal DNS servers for name resolution.

    The proposed solution points to a problem with basic Sophos Firewall functionality. If, as Sophos says, a DNS server behind an XFRM interface is the root cause of the issue, then the device cannot be connected to any other location via a route-based VPN, because an XFRM interface is created automatically when a route-based VPN is configured, and having a local DNS server behind it would result in device crashes.

    Also, when the device goes into Fault state, why and how does it eventually take all the services on the primary node down? This behavior is unclear to us, and Sophos has not been able to diagnose the problem either.
