How can the RAID status of the standby member of an HA cluster be proactively monitored?

Hello,

It happened that a failed SSD RAID on the standby member was discovered only by chance, while watching the dashboard during an Up2Date process.

Specifically, the RAID status in the Resource Usage panel changed from OK to Not OK (or similar) when the standby member temporarily became active during the ongoing Up2Date process.

IMHO, it seems that the RAID drive pair of a standby member can fail without any notice. Am I wrong?



  • Hi Gabriele, and welcome to the UTM Community!

    Health monitoring of the Slave system sounds like an excellent feature suggestion!  

    Cheers - Bob

     
    Sophos UTM Community Moderator
    Sophos Certified Architect - UTM
    Sophos Certified Engineer - XG
    Gold Solution Partner since 2005
    MediaSoft, Inc. USA
  • It's really a problem to monitor the hardware status of a slave. I reboot the master every week to rotate the roles and force a regular restart of both nodes.
    If you want to have a look at the RAID, you must connect to the slave via "ha_utils ssh", as described here:
    community.sophos.com/.../21326

    and use the shell RAID commands, e.g. tw_cli-x86_64 (for the Areca RAID in the ASG525).
    It should (not) look like this:

    utm:/home/login # tw_cli-x86_64 info c0

    Unit   UnitType  Status      %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
    ------------------------------------------------------------------------------
    u0     RAID-1    REBUILDING  74%     -       -       465.651   RiW    ON

    VPort  Status    Unit  Size       Type  Phy  Encl-Slot  Model
    ------------------------------------------------------------------------------
    p0     DEGRADED  u0    465.76 GB  SATA  0    -          ST500DM002-1BD142
    p1     OK        u0    465.76 GB  SATA  1    -          ST500DM002-1BD142
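
    A minimal sketch of how such a check could be automated on each node. The command name and the logger-based notification are assumptions taken from the example above and from standard shell tools, and the exact status keywords may vary by controller and firmware; this is not an official Sophos mechanism:

    #!/bin/sh
    # Hedged sketch: check the local RAID controller and log a warning if
    # anything is not healthy. Assumes tw_cli-x86_64 is the controller CLI
    # (as in the ASG525 example above) and that the script is installed on
    # BOTH HA nodes, e.g. run from a cron entry.

    STATUS=$(tw_cli-x86_64 info c0 2>/dev/null)

    # Flag units or ports that are rebuilding, degraded or missing.
    if echo "$STATUS" | grep -Eq 'DEGRADED|REBUILDING|INOPERABLE|NOT-PRESENT'; then
        logger -t raidcheck -p user.warning "RAID not healthy on $(hostname)"
    fi

    Reaching the slave's RAID status from the master still requires "ha_utils ssh" as described above, so running the check locally on each node (or rotating the roles as mentioned) remains the practical approach.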

     

    Sophos Certified Architect (UTM + XG)