Site-To-Site VPN to Azure stays down if WAN connection bounces

Hello,

I've had a VPN from an on-prem XG to Azure running for several months with no issues, until recently. There was a power outage a few months ago at the client's site. Once power was restored and all the devices were back online, I noticed that the VPN stayed down. I was not able to bring it up with standard troubleshooting (resetting the VPN on the XG and in Azure, resetting the Azure gateway, etc.). I ended up deleting and recreating the VPN connection in Azure and configuring a new VPN configuration on the XG, using this guide: https://community.sophos.com/sophos-xg-firewall/f/recommended-reads/118404/sophos-xg-firewall-how-to-configure-a-site-to-site-ipsec-vpn-with-multiple-sas-to-a-route-based-azure-vpn-gateway

A few weeks later, there was an issue with the Internet provider and the WAN connection bounced (it was down for several minutes). Again, once the connection was back online, the VPN stayed down. I tried the same standard troubleshooting steps again, with no luck, and again had to recreate everything the same way as before (the connection in Azure and the VPN on the XG).

So it looks like if the WAN connection used for the VPN to Azure goes down, the VPN struggles to come back up once the WAN is up again. I'm wondering what I can do to avoid this.

FYI - I have two XG 210 boxes on site in HA. The VPN connection was created via PORT2. There are two WAN connections (primary and backup), both DSL.

Would appreciate help on this.



This thread was automatically locked due to age.
  • FormerMember
    0 FormerMember

    Hi,

    Thank you for reaching out to Sophos Community.

    Did you check the strongswan.log events from the time of the incident?

    Log in via SSH, then choose 5. Device Management > 3. Advanced Shell

    Check live logs:

    # tail -f /log/strongswan.log

    Filter old logs with tunnel name:

    # cat /log/strongswan.log | grep -i "Tunnel_Name"

    Filter old logs by time:

    # cat /log/strongswan.log | grep -i "YYYY-MM-DD HH:MM"

    e.g. # cat /log/strongswan.log | grep -i "2021-07-27 10:03"
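
The tunnel-name and time filters above can also be combined into one step. The following is a minimal sketch, assuming a POSIX shell with a `grep` that supports `-F`; the function name `filter_tunnel_log` and its arguments are hypothetical and not part of any Sophos tooling.

```shell
# Hypothetical helper: show log lines for one tunnel within a given minute.
# Usage: filter_tunnel_log LOGFILE TUNNEL_NAME "YYYY-MM-DD HH:MM"
filter_tunnel_log() {
    logfile=$1
    tunnel=$2
    when=$3
    # -F treats both patterns as fixed strings, so characters such as '.'
    # in a tunnel name or timestamp are matched literally.
    # The first grep narrows by time, the second (case-insensitive) by tunnel.
    grep -F -- "$when" "$logfile" | grep -iF -- "$tunnel"
}
```

For example, `filter_tunnel_log /log/strongswan.log "Tunnel_Name" "2021-07-27 10:03"` would print only the entries for that tunnel logged during that minute.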

  • Hi Yash, I've just checked the logs; the trimmed output is below. The VPN is currently up and running fine.

    • 195.244.209.182 is the WAN IP of XG
    • 20.49.207.83 is the IP of VPN GW in Azure
    • AzureVPNv2 in the live logs is the name of the VPN that has been running between XG and Azure 

    # tail -f /log/strongswan.log

    2021-08-07 00:10:01 20[IKE] <AzureVPNv2-1|40717> sending DPD request
    2021-08-07 00:10:01 20[ENC] <AzureVPNv2-1|40717> generating INFORMATIONAL request 192 [ ]
    2021-08-07 00:10:01 20[NET] <AzureVPNv2-1|40717> sending packet: from 195.244.209.182[500] to 20.49.207.83[500] (80 bytes)
    2021-08-07 00:10:01 26[NET] <AzureVPNv2-1|40717> received packet: from 20.49.207.83[500] to 195.244.209.182[500] (80 bytes)
    2021-08-07 00:10:01 26[ENC] <AzureVPNv2-1|40717> parsed INFORMATIONAL response 192 [ ]
    2021-08-07 00:10:11 20[NET] <40801> received packet: from 185.41.232.4[500] to 195.244.210.238[500] (670 bytes)
    2021-08-07 00:10:11 20[ENC] <40801> parsed IKE_SA_INIT request 0 [ SA KE No V V V V N(NATD_S_IP) N(NATD_D_IP) ]
    2021-08-07 00:10:11 20[IKE] <40801> no IKE config found for 195.244.210.238...185.41.232.4, sending NO_PROPOSAL_CHOSEN
    2021-08-07 00:10:11 20[ENC] <40801> generating IKE_SA_INIT response 0 [ N(NO_PROP) ]
    2021-08-07 00:10:11 20[NET] <40801> sending packet: from 195.244.210.238[500] to 185.41.232.4[500] (36 bytes)
    2021-08-07 00:10:31 19[IKE] <AzureVPNv2-1|40717> sending DPD request
    2021-08-07 00:10:31 19[ENC] <AzureVPNv2-1|40717> generating INFORMATIONAL request 193 [ ]
    2021-08-07 00:10:31 19[NET] <AzureVPNv2-1|40717> sending packet: from 195.244.209.182[500] to 20.49.207.83[500] (80 bytes)
    2021-08-07 00:10:31 32[NET] <AzureVPNv2-1|40717> received packet: from 20.49.207.83[500] to 195.244.209.182[500] (80 bytes)
    2021-08-07 00:10:31 32[ENC] <AzureVPNv2-1|40717> parsed INFORMATIONAL response 193 [ ]
    2021-08-07 00:10:41 19[NET] <40802> received packet: from 185.41.232.4[500] to 195.244.210.238[500] (670 bytes)
    2021-08-07 00:10:41 19[ENC] <40802> parsed IKE_SA_INIT request 0 [ SA KE No V V V V N(NATD_S_IP) N(NATD_D_IP) ]
    2021-08-07 00:10:41 19[IKE] <40802> no IKE config found for 195.244.210.238...185.41.232.4, sending NO_PROPOSAL_CHOSEN
    2021-08-07 00:10:41 19[ENC] <40802> generating IKE_SA_INIT response 0 [ N(NO_PROP) ]
    2021-08-07 00:10:41 19[NET] <40802> sending packet: from 195.244.210.238[500] to 185.41.232.4[500] (36 bytes)


    --------------------------------------------------------

    # cat /log/strongswan.log | grep -i "YYYY-MM-DD HH:MM"

    XG210_WP03_SFOS 18.0.5 MR-5-Build586# cat /log/strongswan.log | grep -i "2021-07-21 13:00"
    2021-07-21 13:00:14 20[NET] <19839> received packet: from 20.49.207.83[500] to 195.244.209.182[500] (408 bytes)
    2021-07-21 13:00:14 20[ENC] <19839> parsed IKE_SA_INIT request 0 [ SA KE No N(NATD_S_IP) N(NATD_D_IP) V V V V ]
    2021-07-21 13:00:14 20[IKE] <19839> received MS NT5 ISAKMPOAKLEY v9 vendor ID
    2021-07-21 13:00:14 20[IKE] <19839> received MS-Negotiation Discovery Capable vendor ID
    2021-07-21 13:00:14 20[IKE] <19839> received Vid-Initial-Contact vendor ID
    2021-07-21 13:00:14 20[ENC] <19839> received unknown vendor ID: 01:52:8b:bb:c0:06:96:12:18:49:ab:9a:1c:5b:2a:51:00:00:00:02
    2021-07-21 13:00:14 20[IKE] <19839> 20.49.207.83 is initiating an IKE_SA
    2021-07-21 13:00:14 20[ENC] <19839> generating IKE_SA_INIT response 0 [ SA KE No N(NATD_S_IP) N(NATD_D_IP) N(MULT_AUTH) ]
    2021-07-21 13:00:14 20[NET] <19839> sending packet: from 195.244.209.182[500] to 20.49.207.83[500] (312 bytes)
    2021-07-21 13:00:15 12[NET] <19839> received packet: from 20.49.207.83[500] to 195.244.209.182[500] (408 bytes)
    2021-07-21 13:00:15 12[ENC] <19839> parsed IKE_SA_INIT request 0 [ SA KE No N(NATD_S_IP) N(NATD_D_IP) V V V V ]
    2021-07-21 13:00:15 12[IKE] <19839> received retransmit of request with ID 0, retransmitting response
    2021-07-21 13:00:15 12[NET] <19839> sending packet: from 195.244.209.182[500] to 20.49.207.83[500] (312 bytes)
    2021-07-21 13:00:16 22[NET] <19839> received packet: from 20.49.207.83[500] to 195.244.209.182[500] (408 bytes)
    2021-07-21 13:00:16 22[ENC] <19839> parsed IKE_SA_INIT request 0 [ SA KE No N(NATD_S_IP) N(NATD_D_IP) V V V V ]
    2021-07-21 13:00:16 22[IKE] <19839> received retransmit of request with ID 0, retransmitting response
    2021-07-21 13:00:16 22[NET] <19839> sending packet: from 195.244.209.182[500] to 20.49.207.83[500] (312 bytes)
    2021-07-21 13:00:24 20[JOB] <19837> deleting half open IKE_SA with 20.49.207.83 after timeout
    2021-07-21 13:00:24 20[DMN] <19837> [GARNER-LOGGING] (child_alert) ALERT: IKE_SA timed out before it could be established
    2021-07-21 13:00:29 26[NET] <19840> received packet: from 185.41.232.4[500] to 195.244.210.238[500] (670 bytes)
    2021-07-21 13:00:29 26[ENC] <19840> parsed IKE_SA_INIT request 0 [ SA KE No V V V V N(NATD_S_IP) N(NATD_D_IP) ]
    2021-07-21 13:00:29 26[IKE] <19840> no IKE config found for 195.244.210.238...185.41.232.4, sending NO_PROPOSAL_CHOSEN
    2021-07-21 13:00:29 26[ENC] <19840> generating IKE_SA_INIT response 0 [ N(NO_PROP) ]
    2021-07-21 13:00:29 26[NET] <19840> sending packet: from 195.244.210.238[500] to 185.41.232.4[500] (36 bytes)
    2021-07-21 13:00:36 29[NET] <19841> received packet: from 20.49.207.83[500] to 195.244.209.182[500] (408 bytes)
    2021-07-21 13:00:36 29[ENC] <19841> parsed IKE_SA_INIT request 0 [ SA KE No N(NATD_S_IP) N(NATD_D_IP) V V V V ]
    2021-07-21 13:00:36 29[IKE] <19841> received MS NT5 ISAKMPOAKLEY v9 vendor ID
    2021-07-21 13:00:36 29[IKE] <19841> received MS-Negotiation Discovery Capable vendor ID
    2021-07-21 13:00:36 29[IKE] <19841> received Vid-Initial-Contact vendor ID
    2021-07-21 13:00:36 29[ENC] <19841> received unknown vendor ID: 01:52:8b:bb:c0:06:96:12:18:49:ab:9a:1c:5b:2a:51:00:00:00:02
    2021-07-21 13:00:36 29[IKE] <19841> 20.49.207.83 is initiating an IKE_SA
    2021-07-21 13:00:36 29[ENC] <19841> generating IKE_SA_INIT response 0 [ SA KE No N(NATD_S_IP) N(NATD_D_IP) N(MULT_AUTH) ]
    2021-07-21 13:00:36 29[NET] <19841> sending packet: from 195.244.209.182[500] to 20.49.207.83[500] (312 bytes)
    2021-07-21 13:00:37 17[NET] <19841> received packet: from 20.49.207.83[500] to 195.244.209.182[500] (408 bytes)
    2021-07-21 13:00:37 17[ENC] <19841> parsed IKE_SA_INIT request 0 [ SA KE No N(NATD_S_IP) N(NATD_D_IP) V V V V ]
    2021-07-21 13:00:37 17[IKE] <19841> received retransmit of request with ID 0, retransmitting response
    2021-07-21 13:00:37 17[NET] <19841> sending packet: from 195.244.209.182[500] to 20.49.207.83[500] (312 bytes)
    2021-07-21 13:00:38 13[NET] <19841> received packet: from 20.49.207.83[500] to 195.244.209.182[500] (408 bytes)
    2021-07-21 13:00:38 13[ENC] <19841> parsed IKE_SA_INIT request 0 [ SA KE No N(NATD_S_IP) N(NATD_D_IP) V V V V ]
    2021-07-21 13:00:38 13[IKE] <19841> received retransmit of request with ID 0, retransmitting response
    2021-07-21 13:00:38 13[NET] <19841> sending packet: from 195.244.209.182[500] to 20.49.207.83[500] (312 bytes)
    2021-07-21 13:00:59 22[NET] <19842> received packet: from 185.41.232.4[500] to 195.244.210.238[500] (670 bytes)
    2021-07-21 13:00:59 22[ENC] <19842> parsed IKE_SA_INIT request 0 [ SA KE No V V V V N(NATD_S_IP) N(NATD_D_IP) ]
    2021-07-21 13:00:59 22[IKE] <19842> no IKE config found for 195.244.210.238...185.41.232.4, sending NO_PROPOSAL_CHOSEN
    2021-07-21 13:00:59 22[ENC] <19842> generating IKE_SA_INIT response 0 [ N(NO_PROP) ]
    2021-07-21 13:00:59 22[NET] <19842> sending packet: from 195.244.210.238[500] to 185.41.232.4[500] (36 bytes)
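
An excerpt like the one above can be easier to compare across incidents once the recurring events are counted. The following is a rough sketch, assuming a POSIX shell with `grep`; the function name `summarize_ike_events` is hypothetical, and the message strings are copied from the log lines in this thread. It only counts matching lines and does not diagnose the cause.

```shell
# Hypothetical helper: count recurring IKE events in a strongSwan log.
# Usage: summarize_ike_events LOGFILE
summarize_ike_events() {
    logfile=$1
    # Each pattern is copied verbatim from the log excerpts in this thread.
    for pattern in \
        "received retransmit of request" \
        "deleting half open IKE_SA" \
        "no IKE config found" \
        "sending DPD request"
    do
        # grep -c prints the number of matching lines; it exits non-zero
        # when the count is 0, so '|| true' keeps the loop going.
        count=$(grep -cF -- "$pattern" "$logfile" || true)
        printf '%s: %s\n' "$pattern" "$count"
    done
}
```

Running `summarize_ike_events /log/strongswan.log` prints one line per pattern with its count, which makes it quicker to see whether retransmits and half-open timeouts cluster around the time the WAN link bounced.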

  • FormerMember
    0 FormerMember in reply to Michal Sumega

    Please put the strongswan service into debug mode and capture the output again when the issue recurs.

    Enter the command below in the shell to enable debugging for the strongswan service.

    # service strongswan:debug -ds nosync

    Use the same command to stop debugging.

    Filter old logs by time:

    # cat /log/strongswan.log | grep -i "YYYY-MM-DD HH:MM"

    e.g. # cat /log/strongswan.log | grep -i "2021-07-27 10:03"
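
When the outage spans more than one minute, a single `grep` on "YYYY-MM-DD HH:MM" misses part of it. The following is a minimal sketch of extracting a whole time window instead, assuming a POSIX shell with `awk` and log lines that start with a "YYYY-MM-DD HH:MM:SS" timestamp as in the excerpts in this thread; the function name `log_window` is hypothetical.

```shell
# Hypothetical helper: print log lines whose timestamp falls in [START, END].
# Usage: log_window LOGFILE "YYYY-MM-DD HH:MM:SS" "YYYY-MM-DD HH:MM:SS"
log_window() {
    # Timestamps in this format sort lexicographically, so a plain string
    # comparison on the first 19 characters of each line is sufficient.
    awk -v start="$2" -v end="$3" '
        { ts = substr($0, 1, 19) }
        ts >= start && ts <= end
    ' "$1"
}
```

For example, `log_window /log/strongswan.log "2021-07-27 10:00:00" "2021-07-27 10:15:00" > /tmp/vpn_incident.log` would save the debug output for that quarter of an hour to a separate file (the output path here is only an example).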

  • May I enable debugging now? Or will it increase CPU/memory usage drastically, so that it is better to enable it only when the issue occurs again?

  • FormerMember
    0 FormerMember in reply to Michal Sumega
    A few weeks later, there was an issue with the Internet provider and the WAN connection bounced (it was down for several minutes). Again, once the connection was back online, the VPN stayed down. I tried the same standard troubleshooting steps again, with no luck, and again had to recreate everything the same way as before (the connection in Azure and the VPN on the XG).

    If this is what you've observed, then you can enable debugging for the strongswan service when the issue occurs again.
