XG Firewall freezes up completely every month or so, nothing in logs so I can't determine the cause

At one of our client sites, an XG firewall works flawlessly most of the time, but every month or so it just stops working. You can't ping it and it stops routing traffic entirely. When you physically look at the firewall, it looks fine and the activity lights blink. Every time I've tried to track down a cause, I haven't been able to find one. For example, the system logs look normal up to the point where it stops functioning, and they don't resume until after a power cycle. I have most of the logging enabled, so it should catch at least something. Any ideas as to the cause of the freezing, or why nothing is caught in the logs?



This thread was automatically locked due to age.
  • Interesting, we manage approx. 60 XG Firewalls, most running 17.10, and have similar symptoms on approx. 4-5 of them. We have a case open with GES at present. So far they seem to be focusing their attention on what's occurring between the firewall & Sophos Firewall Manager.

  • Below is what we have received from Sophos so far, and it appears to be the same issue across our affected firewalls:

    "Hello


    Development team found that 'DB is In deadlock due to the following command.'
    DEBUG     Mar 29 10:50:01  [apiExport:4847]: exec: argv[2] = '/bin/sh /scripts/API/cleanupdb.sh'

    To gather more cause analysis and then to proceed with resolution, we will need output of requested command when issue occurs. 

    The command should be executed when / during you face the issue and not after rebooting / restarting the appliance.

    Also please let us know immediately once you face the issue as we will have to collect /log/corpvaccum.log

    Let me know in case you have any query. 


    Regards,"

  • Hi  

    Thank you for sharing the details. Could you please PM us the service request number so that we can keep an eye on the progress and details?

  • Thanks for reaching out Keyur.

    PM has been sent.

    Many thanks

    Adam

  • Hi,

    Below is the update we received last evening.

    “Development team have found following 2 lines from psql which indicates 2 parallel vacuum process running.

    23531 | vacuum full | active | 2020-05-11 19:19:58.115152+10 | 2020-05-11 19:19:58.116417+10 | 2020-05-11 19:19:58.116417+10 | 2020-05-11 19:19:58.116418+10

    9580 | vacuum full | active | 2020-05-11 20:31:47.524481+10 | 2020-05-11 20:31:47.52581+10 | 2020-05-11 20:31:47.52581+10 | 2020-05-11 20:31:47.525812+10

    To solve we need to change content of script /scripts/API/cleanupdb.sh and monitor it.”
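
For anyone wanting to check their own captures for the same pattern, here is a minimal Python sketch that scans rows like the two quoted above and reports when more than one `vacuum full` session is active at once. The column order (PID, query, state, then timestamps) is only inferred from the pasted lines; the actual query Sophos ran was not shared, and the function names here are hypothetical.

```python
def parse_activity_row(line):
    """Split one pipe-delimited row into (pid, query, state).

    Assumes the layout seen in the pasted output: pid | query | state | timestamps...
    Returns None for anything that does not look like such a row.
    """
    fields = [f.strip() for f in line.split("|")]
    if len(fields) < 3 or not fields[0].isdigit():
        return None
    return int(fields[0]), fields[1].lower(), fields[2].lower()


def find_parallel_vacuums(lines):
    """Return the PIDs of sessions that are running VACUUM FULL and marked active."""
    pids = []
    for line in lines:
        row = parse_activity_row(line)
        if row is None:
            continue
        pid, query, state = row
        if query.startswith("vacuum full") and state == "active":
            pids.append(pid)
    return pids


# The two rows quoted in the support reply above.
rows = [
    "23531 | vacuum full | active | 2020-05-11 19:19:58.115152+10 | 2020-05-11 19:19:58.116417+10 | 2020-05-11 19:19:58.116417+10 | 2020-05-11 19:19:58.116418+10",
    "9580 | vacuum full | active | 2020-05-11 20:31:47.524481+10 | 2020-05-11 20:31:47.52581+10 | 2020-05-11 20:31:47.52581+10 | 2020-05-11 20:31:47.525812+10",
]

pids = find_parallel_vacuums(rows)
if len(pids) > 1:
    print("Parallel VACUUM FULL sessions:", pids)
```

This only flags the symptom in saved output; it does not touch the appliance or change /scripts/API/cleanupdb.sh.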

  • So after all of this, we received this reply:

     

    "Hello All,
    The solution for the reported issue is to disable 'CCL' for each firewall from SFM.

    Please note that 'End of Life' for SFM is June 30, 2021

    Central Management which will be used instead of SFM does not have 'CCL' feature.
    www.sophos.com/.../xg-firewall-in-central.aspx

    For finding another workaround/solution we need to put CSC service in debug mode and will need output of some more Postgres queries. In case you are looking for another solution please provide me support Access ID of 2-3 different firewalls on which issue had occurred in the past.

    I will put the services in debugging and provide you steps on how to collect the required information when issues occur."


    Really disappointed with the response.