ASG 8.2 HA under vSphere 5 -> WAN issues

Okay, so this is probably a bit special...

I am running two virtual ASG 8.202 instances in HA (hot standby) on two ESXi 5.0 hosts.
There are two WAN Uplinks - both "Cable Modem" DHCP.
Due to previous issues on the LAN I have disabled HA MACs.

The two WAN DHCP leases are bound to MAC addresses.
So I have spoofed the WAN MACs in VMware: the WAN1 MAC is identical on both ASGs, and the WAN2 MAC is identical on both ASGs.

The setup worked fine on a combination of ESXi 4.1 and ASG 8.1.
But I have not tested failover since upgrading ASG and ESXi - until now.

When I power down the master, the slave does not bring up the two WAN interfaces.

In the Dashboard WAN1 and WAN2 display as "State=Error" and "Link=Down".
But if I go to "Advanced" -> "Support" -> "Interfaces Table", both interfaces are listed as "up" and seem to have assigned IP-addresses.

If I power on the master, it will become active again and everything will run fine.

If I instead reboot the slave, the interfaces will come up, and everything runs fine.

I am not sure whether this is an ESXi 5 issue or an ASG 8.2 issue.
Maybe something to do with the spoofed MACs?

Any ideas about where to start looking?

Best regards
Martin

This thread was automatically locked due to age.

0 da_merlin over 13 years ago
Probably this one, will be fixed in 8.300. In the meantime, you have to replace dhcp_updown.plx:

If you are affected by this issue, you can run the following commands as root on your ASG:
wget http://people.astaro.com/uweber/mantis_19139/dhcp_updown.plx
mv /var/chroot-dhcpc/usr/sbin/dhcp_updown.plx /var/chroot-dhcpc/usr/sbin/dhcp_updown.plx.org
mv dhcp_updown.plx /var/chroot-dhcpc/usr/sbin/dhcp_updown.plx
chmod a+x /var/chroot-dhcpc/usr/sbin/dhcp_updown.plx

Cheers
Ulrich
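
The replacement steps in the reply above can also be wrapped so that an empty or failed download never overwrites the working script. This is only a sketch: `replace_with_backup` is a hypothetical helper name, not something shipped with the ASG, and it assumes a POSIX shell with `cp` and `chmod` available. It keeps the original as `<target>.org`, the same backup name used in the commands above.

```shell
#!/bin/sh
# Sketch only: replace_with_backup is a hypothetical helper, not an ASG tool.
# It installs NEW over TARGET and keeps the first original as TARGET.org.
replace_with_backup() {
    target=$1
    new=$2

    # Refuse to continue if the downloaded file is missing or empty.
    if [ ! -s "$new" ]; then
        echo "replacement $new is missing or empty" >&2
        return 1
    fi

    # Back up the original once; a second run must not overwrite the backup
    # with an already patched file.
    if [ -e "$target" ] && [ ! -e "$target.org" ]; then
        cp -p "$target" "$target.org" || return 1
    fi

    # Install the replacement and make it executable, as in the original steps.
    cp "$new" "$target" && chmod a+x "$target"
}

# Usage with the paths from the post (run as root, after the wget step):
#   replace_with_backup /var/chroot-dhcpc/usr/sbin/dhcp_updown.plx ./dhcp_updown.plx
# Roll back by copying the .org file over the target again.
```

Copying instead of moving leaves the downloaded file in place, so the step can be repeated if something goes wrong.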
0 martinh_dk over 13 years ago in reply to da_merlin

Hi Ulrich,

Thank you for the answer.
The fix changed HA from "not working at all" to "flaky" :)

Both interfaces still come up as "Error" and "Down" in the dashboard.
But WAN2 was actually up and running.

A manual DHCP Renew on both interfaces brought them up and corrected the status in the dashboard.

I will run a few tests and report if the behavior is consistent.

Best regards
Martin
0 martinh_dk over 13 years ago in reply to martinh_dk

Strange indeed:

Both interfaces seem to bind to DHCP-assigned IPs.
But WAN1 is only up for about 20 seconds, and then it stops responding.

If I renew the lease on WAN1 manually, it will come up shortly after.
But simultaneously WAN2 will stop responding - until I renew the lease on WAN2 as well.

Here is a log of the initial DHCP bind:
2011:11:14-12:31:25 ASTARO-2 dhclient: Listening on LPF/eth1/00:50:56:11:22:33
2011:11:14-12:31:25 ASTARO-2 dhclient: Sending on   LPF/eth1/00:50:56:11:22:33
2011:11:14-12:31:25 ASTARO-2 dhclient: Sending on   Socket/fallback
2011:11:14-12:31:25 ASTARO-2 dhclient: DHCPREQUEST on eth1 to 255.255.255.255 port 67
2011:11:14-12:31:25 ASTARO-2 dhclient: DHCPACK from 87.55.253.***
2011:11:14-12:31:28 ASTARO-2 dhclient: Listening on LPF/eth2/00:50:56:33:44:55
2011:11:14-12:31:28 ASTARO-2 dhclient: Sending on   LPF/eth2/00:50:56:33:44:55
2011:11:14-12:31:28 ASTARO-2 dhclient: Sending on   Socket/fallback
2011:11:14-12:31:28 ASTARO-2 dhclient: DHCPREQUEST on eth2 to 255.255.255.255 port 67
2011:11:14-12:31:28 ASTARO-2 dhclient: DHCPACK from 172.27.0.***
2011:11:14-12:31:30 ASTARO-2 dhclient: bound to 176.21.36.*** -- renewal in 2940 seconds.
2011:11:14-12:31:31 ASTARO-2 dhclient: bound to 87.104.147.*** -- renewal in 30214 second

Here is a log of the subsequent manual renew on WAN1:
2011:11:14-12:42:29 ASTARO-2 dhclient: Listening on LPF/eth1/00:50:56:11:22:33
2011:11:14-12:42:29 ASTARO-2 dhclient: Sending on   LPF/eth1/00:50:56:11:22:33
2011:11:14-12:42:29 ASTARO-2 dhclient: Sending on   Socket/fallback
2011:11:14-12:42:29 ASTARO-2 dhclient: DHCPDISCOVER on eth1 to 255.255.255.255 port 67 interval 6
2011:11:14-12:42:29 ASTARO-2 dhclient: DHCPREQUEST on eth1 to 255.255.255.255 port 67
2011:11:14-12:42:29 ASTARO-2 dhclient: DHCPOFFER from 87.55.253.***
2011:11:14-12:42:29 ASTARO-2 dhclient: DHCPACK from 87.55.253.***
2011:11:14-12:42:30 ASTARO-2 dhclient: bound to 176.21.36.*** -- renewal in 2756 seconds.
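
To compare the two logs above at a glance, the DHCP message sequence can be pulled out with a small filter. This is a rough sketch, assuming the log format shown in this post (timestamp, host name, `dhclient:`, message); `summarize_dhclient` is a hypothetical helper name. It prints the time, the interface when the line names one (`-` otherwise), and the message type.

```shell
#!/bin/sh
# Sketch only: summarize_dhclient is a hypothetical helper that assumes the
# "DATE-TIME HOST dhclient: MESSAGE" layout of the log lines quoted above.
summarize_dhclient() {
    awk '
    $3 == "dhclient:" {
        # Field 1 looks like 2011:11:14-12:31:25; keep only the time part.
        split($1, ts, "-")

        # Lines such as "DHCPREQUEST on eth1 to ..." name the interface.
        iface = "-"
        for (i = 4; i < NF; i++)
            if ($i == "on" && $(i + 1) ~ /^eth[0-9]+$/)
                iface = $(i + 1)

        if ($4 ~ /^DHCP[A-Z]+$/)
            print ts[2], iface, $4            # DHCPDISCOVER, DHCPREQUEST, ...
        else if ($4 == "bound")
            print ts[2], iface, "BOUND", $6   # address from "bound to ADDR"
    }'
}

# Usage: feed it the saved log lines on standard input, for example
#   summarize_dhclient < saved-dhclient-lines.txt
```

Run over the two excerpts, this makes the difference visible: the initial bind after failover contains only DHCPREQUEST and DHCPACK, while the manual renew starts with a DHCPDISCOVER.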

Any ideas?

BTW: The issue only occurs with "forced" failover.
When the master node is up and running again, it gracefully takes over.

/Martin