SG 430 (Home Licence) Crashing/Freezing

Hi, 

I have a second-hand SG 430 running with a home licence. Firmware version is 9.711-5. It has been running for over a year with no issues.

However, since the beginning of May, it has crashed/frozen 3 times (see attached picture of hardware usage; the 3rd time was today). When it freezes, WebAdmin, internet access, VPN, etc. are all unavailable. Even the joystick control on the front of the SG 430 doesn't respond. The only way to recover is to power off at the wall and then power back on.

There seems to be no regularity to the crashes. I have checked the SMART status of the hard disk, which appears to have passed.

How should I troubleshoot this? Is it more likely to be a hardware or a software issue? I'm thinking of completely re-installing the UTM software and then restoring the configuration from backup. Any advice would be greatly appreciated.

Many thanks 



This thread was automatically locked due to age.
  • Hello,

    I am glad that the database rebuild was a success. Continue to monitor the appliance; if the crashes do not recur as often as before, the issue has probably been resolved. Otherwise, the last option would be to re-image the appliance.

    Thanks & Regards,
    _______________________________________________________________

    Vivek Jagad | Team Lead, Global Support & Services 


    Sophos Community | Product Documentation | Sophos Techvids | SMS
    If a post solves your question please use the 'Verify Answer' button.

  • So SG430 has crashed - AGAIN!!! Going to go for the complete re-image, and then restore configuration from backup.

  • Yup, that would be the next plan of action...


  • Okay so after complete re-image and restoration from backup, the SG430 is still crashing. So, back to square one...! 

    Might it be a hardware issue - if so what's the best way to troubleshoot this?

  • Run a memory and storage diagnostic. Are you able to pin point a specific service that's crashing?

  • Hello,

    Use the following KBA to perform a disk health check: https://support.sophos.com/support/s/article/KB-000035535?language=en_US


  • I ran this early on in the troubleshooting, and it reported a pass:

    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED

    However, looking in more detail, it appears some attributes are marked Pre-fail:

    SMART Attributes Data Structure revision number: 10
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
      5 Reallocated_Sector_Ct   0x0032   100   100   000    Old_age   Always       -       0
      9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       8961
     12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       62
    170 Unknown_Attribute       0x0033   089   100   010    Pre-fail  Always       -       0
    171 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
    172 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
    174 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       22
    183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
    184 End-to-End_Error        0x0033   100   100   090    Pre-fail  Always       -       0
    187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
    190 Airflow_Temperature_Cel 0x0032   033   100   000    Old_age   Always       -       33 (Min/Max 12/51)
    192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       22
    199 UDMA_CRC_Error_Count    0x0032   100   100   000    Old_age   Always       -       0
    225 Unknown_SSD_Attribute   0x0032   100   100   000    Old_age   Always       -       794249
    226 Unknown_SSD_Attribute   0x0032   100   100   000    Old_age   Always       -       65535
    227 Unknown_SSD_Attribute   0x0032   100   100   000    Old_age   Always       -       29
    228 Power-off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       65535
    232 Available_Reservd_Space 0x0033   089   100   010    Pre-fail  Always       -       0
    233 Media_Wearout_Indicator 0x0032   032   100   000    Old_age   Always       -       0
    241 Total_LBAs_Written      0x0032   100   100   000    Old_age   Always       -       794249
    242 Total_LBAs_Read         0x0032   100   100   000    Old_age   Always       -       331787
    249 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       245932
    

    Could this be the cause?

  • Storage diagnostic results above. I have been running atop "live" whilst waiting for it to freeze/crash, and the top process is always syslog-ng at 100% cpu usage

  • If you read the column headers, you'd see that Pre-fail is the type of statistic being collected, not its status. The WHEN_FAILED column being empty should also tell you whether anything has failed (nothing has).

    https://www.linuxjournal.com/article/6983
    Each Attribute also has a Threshold value (whose range is 0 to 255) which is printed under the heading "THRESH". If the Normalized value is less than or equal to the Threshold value, then the Attribute is said to have failed. If the Attribute is a pre-failure Attribute, then disk failure is imminent.

    So as long as the normalized value is higher than the threshold value, there's nothing to worry about.
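To make the rule quoted above concrete, here is a minimal Python sketch. It is not a Sophos or smartmontools tool, and the helper names `parse_attributes` and `status` are made up for illustration. It assumes the ten-column attribute table layout posted earlier in the thread, uses a few rows from that output as sample data, and applies the quoted rule: an attribute counts as failed only if its normalized VALUE is less than or equal to THRESH.

```python
# Illustrative sketch only: applies the "VALUE <= THRESH means failed" rule
# to a smartctl-style attribute table. Helper names are hypothetical.

SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0032   100   100   000    Old_age   Always       -       0
170 Unknown_Attribute       0x0033   089   100   010    Pre-fail  Always       -       0
184 End-to-End_Error        0x0033   100   100   090    Pre-fail  Always       -       0
190 Airflow_Temperature_Cel 0x0032   033   100   000    Old_age   Always       -       33 (Min/Max 12/51)
232 Available_Reservd_Space 0x0033   089   100   010    Pre-fail  Always       -       0
"""


def parse_attributes(text):
    """Turn attribute-table lines into dicts; skip headers and blank lines."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 9)  # RAW_VALUE (last column) may contain spaces
        if len(parts) < 10 or not parts[0].isdigit():
            continue
        rows.append({
            "id": int(parts[0]),
            "name": parts[1],
            "value": int(parts[3]),   # normalized VALUE column
            "thresh": int(parts[5]),  # THRESH column
            "type": parts[6],         # Pre-fail or Old_age (a category, not a status)
            "raw": parts[9],
        })
    return rows


def status(row):
    """Return 'failed' if VALUE <= THRESH, else 'ok'."""
    return "failed" if row["value"] <= row["thresh"] else "ok"


rows = parse_attributes(SAMPLE)
for row in rows:
    margin = row["value"] - row["thresh"]
    print(f'{row["id"]:>3} {row["name"]:<24} {row["type"]:<9} {status(row)} (margin {margin})')
```

On the sample rows, every attribute comes out as ok, including the three of type Pre-fail; the smallest margin is End-to-End_Error, at 10 above its threshold.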





  • What if you turn off IPS and web filtering and then check the results?
