How can the RAID status of the standby member of an HA cluster be proactively monitored?

Hello,

It happened that a failed SSD RAID on the standby member was discovered only by chance, while watching the dashboard during an Up2Date process.

Specifically, the RAID status in the Resource Usage panel changed from OK to Not OK (or similar) when the standby member temporarily became active during the ongoing Up2Date process.

IMHO, it seems that the RAID drive pair of a standby member can fail without any notice. Am I wrong?



  • Hi Gabriele, and welcome to the UTM Community!

    Health monitoring of the Slave system sounds like an excellent feature suggestion!  

    Cheers - Bob

     
    Sophos UTM Community Moderator
    Sophos Certified Architect - UTM
    Sophos Certified Engineer - XG
    Gold Solution Partner since 2005
    MediaSoft, Inc. USA
  • It's really a problem to monitor the hardware status of a slave. I reboot the master every week to rotate the roles and force a regular restart of both nodes.
    If you want to have a look at the RAID, you must connect to the slave via "ha_utils ssh", as described here:
    community.sophos.com/.../21326

    and use the shell RAID commands, e.g. tw_cli-x86_64 (for the Areca RAID in the ASG525).
    It should (not) look like this:

    utm:/home/login # tw_cli-x86_64 info c0

    Unit   UnitType  Status      %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
    ------------------------------------------------------------------------------
    u0     RAID-1    REBUILDING  74%     -       -       465.651   RiW    ON

    VPort  Status    Unit  Size       Type  Phy  Encl-Slot  Model
    ------------------------------------------------------------------------------
    p0     DEGRADED  u0    465.76 GB  SATA  0    -          ST500DM002-1BD142
    p1     OK        u0    465.76 GB  SATA  1    -          ST500DM002-1BD142
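
    A minimal sketch of how such a check could be automated on each node. The command name and the logger-based notification are assumptions taken from the example above and from standard shell tools, and the exact status keywords may vary by controller and firmware; this is not an official Sophos mechanism:

    #!/bin/sh
    # Hedged sketch: check the local RAID controller and log a warning if
    # anything is not healthy. Assumes tw_cli-x86_64 is the controller CLI
    # (as in the ASG525 example above) and that the script is installed on
    # BOTH HA nodes, e.g. run from a cron entry.

    STATUS=$(tw_cli-x86_64 info c0 2>/dev/null)

    # Flag units or ports that are rebuilding, degraded or missing.
    if echo "$STATUS" | grep -Eq 'DEGRADED|REBUILDING|INOPERABLE|NOT-PRESENT'; then
        logger -t raidcheck -p user.warning "RAID not healthy on $(hostname)"
    fi

    Reaching the slave's RAID status from the master still requires "ha_utils ssh" as described above, so running the check locally on each node (or rotating the roles as mentioned) remains the practical approach.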

     

    Sophos Certified Architect (UTM + XG)