This discussion has been locked.
You can no longer post new replies to this discussion. If you have a question you can start a new discussion

HA Active-Passive Configuration on VMs (devices not accessible)

I'm attempting to spin up a new HA cluster of a couple of XG VMs for a new environment. This is hosted in Hyper-V.

I followed the basic setup here for active-passive mode: https://community.sophos.com/kb/en-us/123174

I confirmed that the devices can see each other on the HA link interface, and both respond properly if I assign addresses to their various interfaces.

I enable the HA settings on the aux device and it saves properly. However, as soon as I enable HA on the primary device, both devices become unreachable, either by ping or on the admin console. The VMs then seem to go into an endless loop of rebooting every 2-3 minutes. If I log in via the console to either device and run "system ha show details", they always show HA status enabled, with a current HA state of Standalone and a peer HA state of Fault.

I left the VMs running for several hours to see if they would sort themselves out, but nothing changed. I also tried powering off the secondary device entirely and rebooting the primary to determine whether the primary would at least come up in that state, but it showed the same behavior.

If I run "system ha disable" from the console on a device, it seems to start responding again immediately, so it doesn't appear that anything is corrupt or fully broken in the config itself; the problem only occurs when HA is enabled.

If I disable HA and look at the log viewer, the only relevant entry I see is "Appliance with appliance key XXXX becomes standalone at appliance startup".

Are there additional steps that need to be taken in order to deploy XG in an HA config on Hyper-V? Or is there a way to view detailed information as to why both devices report the other in fault mode, or why the primary won't even respond if the secondary is off/unplugged?



This thread was automatically locked due to age.
  • Hi,

    HA in a virtual environment is a little bit tricky because XG (like UTM HA) spoofs the MAC address in case of an HA takeover, and the Hyper-V vSwitch does not like this behavior.

    Check the Hyper-V forum for a workaround on MAC spoofing and how to enable it.
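
For anyone who prefers to script the MAC spoofing setting rather than tick the box in Hyper-V Manager, here is a minimal sketch. It only builds the PowerShell commands to run on the Hyper-V host and does not execute anything. It assumes the Hyper-V PowerShell module's `Set-VMNetworkAdapter` cmdlet with its `-MacAddressSpoofing` parameter; the VM names `XG-Primary` and `XG-Aux` and the helper name `mac_spoofing_commands` are hypothetical placeholders, not names from this thread.

```python
from typing import Iterable, List


def mac_spoofing_commands(vm_names: Iterable[str]) -> List[str]:
    """Build one PowerShell command per VM that enables MAC address spoofing.

    Without a -Name filter, Set-VMNetworkAdapter applies the setting to
    every virtual network adapter attached to the named VM.
    """
    commands = []
    for name in vm_names:
        # Single-quote the VM name for PowerShell; an embedded single
        # quote is escaped by doubling it.
        quoted = "'" + name.replace("'", "''") + "'"
        commands.append(
            f"Set-VMNetworkAdapter -VMName {quoted} -MacAddressSpoofing On"
        )
    return commands


if __name__ == "__main__":
    # Hypothetical VM names; replace with the names shown in Hyper-V Manager.
    for command in mac_spoofing_commands(["XG-Primary", "XG-Aux"]):
        print(command)
```

The printed lines would be pasted into an elevated PowerShell session on the Hyper-V host. Whether this alone is enough for XG HA on Hyper-V is not something this thread establishes.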

  • Thanks, that got me a little further :)

    I enabled MAC spoofing on the NICs for both VMs, configured the secondary/aux device, and then enabled HA on the primary. I receive an error message that says "HA has been enabled successfully, but it is recommended to check the physical connectivity of peer monitoring ports". The secondary device then seems to go into an endless reboot loop, although the primary device seems to work and respond properly.

    Interestingly enough, the secondary device seems to stop rebooting if I disconnect the peer port in Hyper-V (set it to not connected to a network).

     

    Also, if I disable HA, I can confirm that both devices are accessible via SSH on the monitoring port, so it doesn't look like an actual link issue.

    Are there any other issues that might cause that error? Or additional steps I can take in troubleshooting?

  • I have exactly the same issue on Hyper-V and cannot figure it out. I can SSH into each appliance.

  • Has anyone figured this out yet? HA deployment in VMware appears to work fine, but Hyper-V is a no-go. Interestingly enough, an HA deployment in Azure doesn't appear to use the built-in HA functionality; it uses external load balancers (community.sophos.com/.../127934).

    @SophosSupport: does this mean HA in Hyper-V is not supported? What's the official position?

    Thx in advance!
    M.

  • I ended up giving up without solving it, figuring I could schedule downtime for things behind this device when I needed to reboot for updates, etc.

    I thought it may have had something to do with the MAC spoofing tripping up the physical layer (either host or switch) in this environment, because we had a different issue with the ARP cache not updating properly in some cases. It was around that time I decided the headaches weren't worth it, and I stopped trying as I had other projects I needed to move on to.

    If someone can determine an easy way to configure this on Hyper-V (especially in a replicable manner that doesn't tie up a real license key, so I can play with it in a test environment), I'd love to have it solved.

  • Multicast is not supported in public cloud environments like Azure and AWS, which is why the Azure/AWS load balancers are used to achieve HA in those environments.

  • That would explain Azure/AWS, but not on-prem Hyper-V environments.

    The lack of any kind of statement in this regard (HA on Hyper-V) is a bit disappointing...
