This discussion has been locked.

Up2Date causing 100% CPU spikes, cured by a reboot for 3 or so weeks and then starts happening again.

For quite some time, my ASG320 UTM (currently on firmware version 9.411-3, and on prior versions as well) has regularly become unresponsive or very slow at about the same time every day, coinciding with an Up2Date pattern definitions update. In fact, if I manually force a definition update, it triggers the problem. It certainly seems to be related to the Up2Date attempt and/or the DB process that runs right after it.

What happens:

The CPU graph in the Dashboard spikes to 100%, the firewall becomes very sluggish, and users start complaining about timeouts on websites and services, typical of a failing internet connection. The disruption lasts about 10 minutes and then appears to resolve itself. While it is happening, WebAdmin is difficult to navigate and even atop is sluggish.

Running atop shows the CPUs very busy in wait states and sda busy at over 100%. It definitely shows a problem.

Calls to Sophos tech support have not resolved this issue. I reinstalled the ASG320 from scratch and restored the config as per their instructions, but that did not fix it.

My current solution is to reboot the firewall when it happens and then I am good for about 3 weeks.

Any suggestions?
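
To put a number on the wait states that atop shows, one option is to compare two samples of the aggregate `cpu` line from /proc/stat. This is only a minimal sketch: it assumes the standard Linux /proc/stat field order (user, nice, system, idle, iowait, ...), and `iowait_pct` is a hypothetical helper name, not a UTM tool.

```shell
#!/bin/sh
# Sketch: estimate the share of CPU time spent in iowait between two samples
# of the aggregate "cpu" line from /proc/stat.
# iowait_pct is a hypothetical helper; arguments are two "cpu ..." lines
# captured some seconds apart.
iowait_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        # Remember the counters from the first sample.
        NR == 1 { for (i = 2; i <= NF; i++) a[i] = $i }
        NR == 2 {
            total = 0
            # Sum the deltas of user..steal (fields 2-9); field 1 is "cpu".
            for (i = 2; i <= NF && i <= 9; i++) total += $i - a[i]
            if (total <= 0) { print 0; exit }
            # Field 6 is the iowait counter.
            printf "%d\n", (($6 - a[6]) * 100) / total
        }'
}

# Example use on a live system (commented out so the sketch has no side effects):
#   s1=$(head -n 1 /proc/stat); sleep 5; s2=$(head -n 1 /proc/stat)
#   echo "iowait: $(iowait_pct "$s1" "$s2")%"
```

A persistently high value during a pattern update would be consistent with the disk-bound behavior described above, though it would not by itself identify the process responsible.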



This thread was automatically locked due to age.
  • Harrison, I think this is a problem with cssd that I've written about here in the past.  When the UTM is busy and cssd starts applying antivirus pattern updates, it gobbles RAM and never releases it.  I tried getting someone in Support to escalate to the devs.  They may have, but I never heard back.

    The solution I've put in place for several of my clients is to limit pattern updates to non-work hours and lunch.  This involves setting the interval to "Manual" in WebAdmin after adding a line like the following to /etc/crontab-static:

    0 7,12,18 * * 0,1,2,3,4,5,6 root /sbin/audld.plx --nosys --trigger

    If you're not using Sandstorm, you might instead try using just single-scan in Email & Web.  Let us know if avoiding Sophos or Avira helps.

    Cheers - Bob
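
For readers less familiar with the system-crontab format used in Bob's line, the fields can be labeled with a small shell sketch. `explain_cron_line` is a hypothetical helper written only for illustration; it assumes the usual layout of five time fields, a user column, then the command.

```shell
#!/bin/sh
# Sketch: label the fields of a system-crontab style line (the format that
# includes a user column). explain_cron_line is a hypothetical helper.
explain_cron_line() {
    # read splits on whitespace without glob-expanding the "*" fields;
    # the last variable receives the rest of the line (the command).
    printf '%s\n' "$1" | {
        read -r minute hour dom month dow user command
        printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s user=%s command=%s\n' \
            "$minute" "$hour" "$dom" "$month" "$dow" "$user" "$command"
    }
}

explain_cron_line '0 7,12,18 * * 0,1,2,3,4,5,6 root /sbin/audld.plx --nosys --trigger'
```

Read this way, the entry runs at 07:00, 12:00 and 18:00. Since 0,1,2,3,4,5,6 lists every day of the week, it is equivalent to `*` in that field.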

  • Thanks Bob, I believe you mean a memory issue such as the attached image taken from my freshly rebooted firewall and run through 2 up2date cycles?

     

     

    One would think that the dev team would be all over this. The above image clearly shows the memory grab at the beginning of two Up2Date definition updates happening 12 hours apart. It looks like it grabs about 25% of total memory each time, does not release it, and forces memory to be swapped. This could easily explain why performance degrades if stuff keeps getting pushed to a swap file until critical processes get caught up.

    I am going to start rebooting weekly for now by scheduling a cron job, and I am also going to limit definition updates to once a day. I may also use your cron example and do the manual update.

    FYI, I manually entered a reboot command using crontab -e and it survived a reboot. Maybe though it will not survive a version update? I will stick my custom schedules into crontab-static still.
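
The memory behavior described in this post (a jump at each update that is not released, followed by swapping) can also be tracked from the shell by taking a snapshot before and after each pattern update. This is a sketch assuming the standard Linux /proc/meminfo format; `mem_snapshot` and the log path in the comment are hypothetical.

```shell
#!/bin/sh
# Sketch: print a one-line snapshot of free memory and free swap.
# mem_snapshot is a hypothetical helper; the optional argument is a file in
# /proc/meminfo format (defaults to /proc/meminfo itself).
mem_snapshot() {
    awk '
        $1 == "MemFree:"  { mem = $2 }
        $1 == "SwapFree:" { swap = $2 }
        END { printf "MemFree_kB=%s SwapFree_kB=%s\n", mem, swap }
    ' "${1:-/proc/meminfo}"
}

# Example: log one snapshot before and one after a pattern update, then compare.
#   echo "$(date '+%Y-%m-%d %H:%M:%S') $(mem_snapshot)" >> /tmp/mem_snapshots.log
```

Note that MemFree does not count reclaimable cache, so a falling SwapFree value across updates is probably the more telling of the two numbers.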

     

  • Thanks for the good research and presentation of those graphs, Harrison.  Those were pattern updates, I bet, instead of Up2Dates to firmware - right?

    Every time you change a schedule in WebAdmin, crontab is rebuilt by the configuration daemon.  It probably survives most Up2Dates and will survive all reboots.

    Cheers - Bob

  • I am having issues with crontab-static and also with getting the reboot to work as expected.

     

    Here is what is in my crontab-static file:

    0 6,19 * * 0,1,2,3,4,5,6 root /sbin/audld.plx --nosys --trigger
    0 18 * * 0 root /sbin/reboot
    SHELL=/bin/sh
    PATH=/usr/bin:/usr/sbin:/sbin:/bin
    MAILTO=""
     
    I am not certain why the environment variable assignment lines exist in this file, as I think they should be handled elsewhere.
     
    At any rate, no matter what I do, the cron jobs never appear when I run crontab -l at the command line. I have entered the update schedule manually via crontab -e and it is working as designed for now. I also entered the reboot command the same way, but it does not execute. (I was logged in as root via SSH when I did this.)
     
    Also, when I modify the schedule in webadmin, I see no evidence that the crontab is being refreshed.
     
    Any suggestions?  
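
One thing that may be worth checking, based on standard cron behavior rather than anything confirmed for the UTM: `crontab -l` lists only the per-user crontab edited with `crontab -e`, not system crontab files, so entries in crontab-static would not be expected to show up there. Per-user crontab lines also have no user column. If a line was pasted into `crontab -e` with `root` still in front of the command, cron would try to run a command named `root`. The sketch below drops that sixth field; `to_user_crontab` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: convert a system-crontab line (with a user column) to the per-user
# format expected by `crontab -e` by dropping the sixth field.
# to_user_crontab is a hypothetical helper used only for illustration.
to_user_crontab() {
    printf '%s\n' "$1" | {
        # read does not glob-expand "*"; command receives the rest of the line.
        read -r minute hour dom month dow user command
        printf '%s %s %s %s %s %s\n' "$minute" "$hour" "$dom" "$month" "$dow" "$command"
    }
}

to_user_crontab '0 18 * * 0 root /sbin/reboot'
```

For the reboot entry above this prints `0 18 * * 0 /sbin/reboot`. Whether this explains the reboot job not running depends on how the line was actually entered, which the post does not say.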