Remote Access IPSec VPN disconnects

Our Remote Access IPsec VPN disconnects when the IKE SA lifetime expires.

The IPsec policy is set to the defaults (with strict policy checked):

IKE SA lifetime - 7800 seconds

IPsec SA lifetime - 3600 seconds

Sophos IPSec Client log:

9/15/2017 8:21:04 PM - ERROR - 4035: IKE(phase1):Disconnect due to rekey failure.

Sophos UTM IPSec log:

2017:09:15-20:21:02 hostname pluto[30292]: "IPSEC VPN-0"[2] 36.X.X.X:10954 #17: max number of retransmissions (2) reached STATE_XAUTH_R1

2017:09:15-20:21:02 hostname pluto[30292]: "IPSEC VPN-0"[2] 36.X.X.X:10954: deleting connection "IPSEC VPN-0"[2] instance with peer 36.X.X.X {isakmp=#0/ipsec=#0}

Is it normal for an IPsec Remote Access VPN to disconnect when the IKE SA lifetime expires? At the end of the IKE SA lifetime, isn’t it supposed to re-authenticate and compare policies? Why is it disconnecting instead of rekeying?



  • Hi

    I'm not sure if it helps, but we saw similar behaviour for some of our users who connect to our UTMs for Remote Access via IPsec. In our case, we tracked the issue down to a mismatch between the SA lifetimes configured on the UTM and the lifetimes configured within the Sophos IPsec client. Our UTM used the default values of 7800/3600s, whereas our clients were configured to use 1 day/8 hours (i.e. 86400/28800s).

    Once we adjusted these values to be the same number, the disconnect problem went away.

    Cheers
    Steve
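
The mismatch described above can be checked by comparing both sides' lifetimes side by side. The following is a minimal sketch only: `SaLifetimes` and `lifetime_mismatches` are hypothetical names made up for illustration, not part of any Sophos tool, and the values are the ones quoted in this reply.

```python
# Hypothetical helper: names and structure are illustrative, not a Sophos API.
from dataclasses import dataclass


@dataclass(frozen=True)
class SaLifetimes:
    ike_seconds: int    # IKE (phase 1) SA lifetime
    ipsec_seconds: int  # IPsec (phase 2) SA lifetime


def lifetime_mismatches(gateway: SaLifetimes, client: SaLifetimes) -> list[str]:
    """Return one human-readable line for each lifetime that differs."""
    problems = []
    if gateway.ike_seconds != client.ike_seconds:
        problems.append(
            f"IKE SA lifetime: gateway {gateway.ike_seconds}s, "
            f"client {client.ike_seconds}s"
        )
    if gateway.ipsec_seconds != client.ipsec_seconds:
        problems.append(
            f"IPsec SA lifetime: gateway {gateway.ipsec_seconds}s, "
            f"client {client.ipsec_seconds}s"
        )
    return problems


# Values quoted in the reply: UTM defaults vs. client set to 1 day / 8 hours.
utm = SaLifetimes(ike_seconds=7800, ipsec_seconds=3600)
client = SaLifetimes(ike_seconds=24 * 3600, ipsec_seconds=8 * 3600)

for line in lifetime_mismatches(utm, client):
    print(line)
```

An empty result means both lifetimes agree; it says nothing about other policy settings such as PFS or compression.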

  • We downloaded the configs from the User Portal soon after the VPN was set up, so the configs matched.  We then imported the configs into the IPsec client and verified that these settings matched as well.  One thing that I did notice is that when we start up our client machines, the very first attempt to connect to the VPN always fails.  All other connection attempts afterward succeed.  When the client machine is rebooted, the same thing happens: the first connection attempt fails and all later ones succeed.

    When the first connection fails the Sophos UTM IPSec log shows an error:

    hostname pluto[30292]: ERROR: asynchronous network error report on eth0 for message to 36.X.X.X port 10954, complainant 192.168.250.1: Message too long [errno 90, origin ICMP type 3 code 4 (not authenticated)]

    If I try to connect again it works, but I noticed that the same error message also appears right before the VPN disconnects at the end of the IKE SA lifetime:

    hostname pluto[30292]: ERROR: asynchronous network error report on eth0 for message to 36.X.X.X port 10954, complainant 192.168.250.1: Message too long [errno 90, origin ICMP type 3 code 4 (not authenticated)]

    I wonder if this is the issue.

  • I'm not a networking expert, but that sounds to me like you've got an issue with the MTU settings on your network. If that is the case, I'm not quite sure why the second connection would work where the first did not; however, the "Message too long" error is usually symptomatic of a packet that exceeds the network's allowed packet size.
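
For reference, ICMP type 3 code 4 is "Destination Unreachable: fragmentation needed and DF set", and errno 90 on Linux is EMSGSIZE ("Message too long"). A rough sketch of the size check a router on the path is effectively making is shown below. The function names, the 1500-byte path MTU, and the 3000-byte packet size are assumptions for illustration; only the 9001 interface MTU comes from this thread.

```python
# Illustrative only: helper names, the path MTU, and the packet size are assumptions.
INTERFACE_MTU = 9001     # value reported for the AWS-hosted UTM interface in this thread
ASSUMED_PATH_MTU = 1500  # assumption: a typical Ethernet/internet path MTU


def needs_fragmentation(packet_bytes: int, path_mtu: int) -> bool:
    """True if a packet of this size cannot cross a link with this MTU unfragmented."""
    return packet_bytes > path_mtu


def describe_icmp(icmp_type: int, code: int) -> str:
    """Decode the one ICMP type/code pair that appears in the UTM log."""
    if icmp_type == 3 and code == 4:
        return "Destination Unreachable: fragmentation needed and DF set"
    return f"ICMP type {icmp_type} code {code} (not decoded in this sketch)"


packet = 3000  # hypothetical packet size, larger than the assumed path MTU
print(describe_icmp(3, 4))
print("fits interface MTU:", not needs_fragmentation(packet, INTERFACE_MTU))
print("fits assumed path MTU:", not needs_fragmentation(packet, ASSUMED_PATH_MTU))
```

A packet that fits the sender's own interface MTU can still be too large for a hop further along the path, which is one way the sender ends up receiving this ICMP error.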

  • I'll try a WAG - does the client have DPD enabled?  If that wasn't the problem, please disable the IPsec Remote Access rule and power cycle the client.  When the client is ready to connect, start the IPsec Live Log and then have the client try to connect after the Live Log shows a few lines.  Show us the lines up to and including the ERROR above.

    Cheers - Bob

  • I apologize for just getting back to this.

    Stephen, our Sophos UTM is an instance within AWS and the IP on the interface is dynamic.  The MTU that populates with the dynamic IP is 9001.  If I try to change it to 1500, for example, it will just change itself back to 9001.

    BAlfson, I also thought that DPD might be causing the issue.  Yes it is enabled on the UTM and also on the client, but I made some changes elsewhere that really seem to improve the VPN uptime.

    I tested uptime with the settings below, and both the VPN and the RDP sessions from the client machine stayed up for 18+ hours.  I then manually disconnected the VPN because I really only need it to stay up for a maximum of about 12-14 hours.  I know that I set the IKE SA lifetime to 1 day, but at least it is staying up vs. disconnecting after 7800 seconds.  I have read a lot, and many say that the best practice setting for the IKE SA lifetime is 86400 in reference to site-to-site IPsec VPNs, but I really haven't seen anything different in terms of best practice for remote access IPsec VPNs.


    Policy:
    IKE SA lifetime - 86400 (this was originally 7800)
    IPsec SA lifetime - 3600 (no change)
    PFS - off/not used (this was originally enabled)
    Strict policy - checked
    Compression - checked (this was originally unchecked)
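
One possible reading of this result is that the longer IKE SA lifetime avoids the phase 1 rekey during a working session rather than fixing whatever made the rekey fail. The sketch below only does the arithmetic for a 14-hour session. It ignores the rekey margin (rekeying normally starts somewhat before expiry), and the function name is made up for illustration. Since PFS and compression were changed at the same time, it cannot say which change mattered.

```python
# Simplified arithmetic; ignores rekey margin and any jitter the daemons apply.
def expiries_during_session(session_seconds: int, lifetime_seconds: int) -> int:
    """Count how many times an SA lifetime elapses within one session."""
    return session_seconds // lifetime_seconds


SESSION_SECONDS = 14 * 3600  # upper end of the 12-14 hour requirement

cases = (
    ("IKE SA, original lifetime", 7800),
    ("IKE SA, changed lifetime", 86400),
    ("IPsec SA, unchanged lifetime", 3600),
)
for label, lifetime in cases:
    count = expiries_during_session(SESSION_SECONDS, lifetime)
    print(f"{label} ({lifetime} s): {count} expiries in 14 h")
```

With these numbers the IKE SA never reaches its lifetime within the session once it is set to 86400, while the IPsec SA still has to be renewed every hour.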
