
v17 MR5: VPN still unstable!

Hi,

 

I upgraded to MR5 yesterday and all went well, but this evening the tunnels suddenly started going up and down, and I am being "spammed" with notifications from my SFM that tunnels are terminated.

charon.log shows a lot of these:

invalid ID_V1 payload length, decryption failed?                                

I have read here:
Sophos XG Firewall: Cannot handle more than 2 concurrent Quick Mode exchanges per IKE_SA when using IKEv1

It says that there are issues in MR5 that will be resolved in MR6, but according to the KB above these errors should read:
"invalid HASH_V1 payload length, decryption failed?"
whereas mine mention ID_V1.

I have 4 tunnels on my XG.

Are others seeing this?

A little more log:
2018-01-29 19:54:58 10[ENC] <622> invalid ID_V1 payload length, decryption failed?
2018-01-29 19:54:58 10[ENC] <622> could not decrypt payloads
2018-01-29 19:54:58 10[IKE] <622> message parsing failed
2018-01-29 19:54:58 10[ENC] <622> generating INFORMATIONAL_V1 request 158523599 [ HASH N(PLD_MAL) ]
2018-01-29 19:54:58 10[NET] <622> sending packet: from x.x.x.x[500] to 5.103.12.171[500] (76 bytes)
2018-01-29 19:54:58 10[IKE] <622> ID_PROT request with message ID 0 processing failed
2018-01-29 19:54:58 10[DMN] <622> [GARNER-LOGGING] (child_alert) ALERT: parsing IKE message from x.x.x.x[500] failed
2018-01-29 19:54:58 19[JOB] <622> deleting half open IKE_SA with x.x.x.x after timeout
2018-01-29 19:54:58 19[DMN] <622> [GARNER-LOGGING] (child_alert) ALERT: IKE_SA timed out before it could be established
All tunnels are unstable while this is happening. Until yesterday, on MR3, everything had worked great for weeks!
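For anyone who wants to check whether their own charon.log shows the ID_V1 variant, the HASH_V1 variant from the KB, or both, here is a minimal Python sketch that tallies the "invalid ... payload length" messages by payload type. The function name and the sample list are made up for illustration; the second sample line is not from a real log, it only reuses the KB wording. The pattern matches the start of the message, so it should still work on logs that are wrapped at 80 columns.

```python
import re
from collections import Counter

# Matches the strongSwan message quoted in this thread, e.g.
# "invalid ID_V1 payload length, decryption failed?"
PAYLOAD_ERROR = re.compile(r"invalid (\w+) payload length")


def count_payload_errors(lines):
    """Count 'invalid <TYPE> payload length' messages by payload type."""
    counts = Counter()
    for line in lines:
        match = PAYLOAD_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample: line 1 is from the post above, line 3 only
# illustrates the HASH_V1 wording from the KB article.
sample = [
    "2018-01-29 19:54:58 10[ENC] <622> invalid ID_V1 payload length, decryption failed?",
    "2018-01-29 19:54:58 10[ENC] <622> could not decrypt payloads",
    "2018-01-29 19:55:01 10[ENC] <623> invalid HASH_V1 payload length, decryption failed?",
]

if __name__ == "__main__":
    for payload_type, count in count_payload_errors(sample).items():
        print(payload_type, count)
```

To run it against a real log, pass an open file object instead of `sample`; the path of charon.log on the appliance is not given in this thread, so that part is left to the reader.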



This thread was automatically locked due to age.
  • I've been using your XG firewalls for just a month now. I have XG-XG at 3 locations. I had to rebuild my first implementation because, unknown to me, the firmware that was already installed had faulty IPsec. It's been a damn headache ever since. I lose at least one site a week, one site reboots randomly, and one site doesn't re-establish the tunnel after an internet outage. One RED device randomly stops sending traffic. Even after 17 5. Do you guys have your stuff together over there, or did I make a terrible decision switching to Sophos? My last firewalls never had to be rebooted or fought with this much.

  • The group discussion linked from your knowledge base article brought this problem up back in 2015. It also appears that MR-5 uses an older version of strongSwan than the current 5.6 release. Just pointing this out.

     

    -Scott

  • Hi Sachin,

     

    Can you please advise what the config changes are and how they can be implemented on the current MR5 version? We have support ticket #7909029 open, but we have not been told about this config fix or asked to implement it.

     

    Thanks,

     

    Adrian

  • Hi Nick,

    If you have a case logged in support then please PM me the ID and I will take a look to investigate further.

    Thanks

  • Hi Sachin,

     

    We are facing the same problem with the MR6 firmware.

    SFOS 17.0.6 MR-6

     

    charon.log shows a lot of failure messages, and at random intervals all the VPN tunnels get disconnected.

     

    2018-04-25 15:46:10 07[ENC] <CSC_PEN_2-1|1> invalid HASH_V1 payload length, decryption failed?
    2018-04-25 15:46:10 07[IKE] <CSC_PEN_2-1|1> message parsing failed
    2018-04-25 15:46:10 07[IKE] <CSC_PEN_2-1|1> QUICK_MODE request with message ID 3138699598 processing failed
    2018-04-25 15:46:10 07[DMN] <CSC_PEN_2-1|1> [GARNER-LOGGING] (child_alert) ALERT: parsing IKE message from xx.xx.xx.xx[500] failed
    2018-04-25 15:46:10 30[ENC] <LDA_PEN_1-1|7> invalid HASH_V1 payload length, decryption failed?
    2018-04-25 15:46:10 30[IKE] <LDA_PEN_1-1|7> message parsing failed
    2018-04-25 15:46:10 30[IKE] <LDA_PEN_1-1|7> QUICK_MODE request with message ID 1031610501 processing failed
    2018-04-25 15:46:10 30[DMN] <LDA_PEN_1-1|7> [GARNER-LOGGING] (child_alert) ALERT: parsing IKE message from xx.xx.xx.xx[500] failed
    2018-04-25 15:46:13 29[ENC] <LDA_PEN_1-1|7> invalid HASH_V1 payload length, decryption failed?
    2018-04-25 15:46:13 29[IKE] <LDA_PEN_1-1|7> message parsing failed
    2018-04-25 15:46:13 29[IKE] <LDA_PEN_1-1|7> QUICK_MODE request with message ID 3103575795 processing failed
    2018-04-25 15:46:13 29[DMN] <LDA_PEN_1-1|7> [GARNER-LOGGING] (child_alert) ALERT: parsing IKE message from xx.xx.xx.xx[500] failed
    2018-04-25 15:46:15 15[ENC] <GPV_PEN_2-1|1328> invalid HASH_V1 payload length, decryption failed?
    2018-04-25 15:46:15 15[IKE] <GPV_PEN_2-1|1328> message parsing failed
    2018-04-25 15:46:15 15[IKE] <GPV_PEN_2-1|1328> QUICK_MODE request with message ID 2801021379 processing failed
    2018-04-25 15:46:15 15[DMN] <GPV_PEN_2-1|1328> [GARNER-LOGGING] (child_alert) ALERT: parsing IKE message from xx.xxx.xx.xx[500] failed
    2018-04-25 15:46:15 26[ENC] <GPV_PEN_2-1|1328> invalid HASH_V1 payload length, decryption failed?
    2018-04-25 15:46:15 26[IKE] <GPV_PEN_2-1|1328> message parsing failed
    2018-04-25 15:46:15 26[IKE] <GPV_PEN_2-1|1328> QUICK_MODE request with message ID 2246590669 processing failed
    2018-04-25 15:46:15 26[DMN] <GPV_PEN_2-1|1328> [GARNER-LOGGING] (child_alert) ALERT: parsing IKE message from xx.xx.xx.xx[500] failed
    2018-04-25 15:46:15 06[ENC] <GPV_PEN_2-1|1328> invalid HASH_V1 payload length, decryption failed?
    2018-04-25 15:46:15 06[IKE] <GPV_PEN_2-1|1328> message parsing failed
    2018-04-25 15:46:15 06[IKE] <GPV_PEN_2-1|1328> QUICK_MODE request with message ID 3372262754 processing failed
    2018-04-25 15:46:15 06[DMN] <GPV_PEN_2-1|1328> [GARNER-LOGGING] (child_alert) ALERT: parsing IKE message from xx.xx.xx.xx[500] failed
    2018-04-25 15:46:16 24[ENC] <MVD_PEN_1-1|5> invalid HASH_V1 payload length, decryption failed?
    2018-04-25 15:46:16 24[IKE] <MVD_PEN_1-1|5> message parsing failed
    2018-04-25 15:46:16 24[IKE] <MVD_PEN_1-1|5> QUICK_MODE request with message ID 2256274882 processing failed
    2018-04-25 15:46:16 24[DMN] <MVD_PEN_1-1|5> [GARNER-LOGGING] (child_alert) ALERT: parsing IKE message from xx.xx.xx.xx[500] failed

     

     

    Regards,

    Carlos
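A log like the one above is easier to read once the failures are grouped per tunnel. The following is a rough Python sketch, not a Sophos tool: it assumes the line layout seen in this thread (timestamp, thread number and subsystem, then `<connection|id>`), and the function name is made up. It maps each connection name to the timestamps at which a "processing failed" message was logged, which shows whether several tunnels fail in the same second.

```python
import re
from collections import defaultdict

# Layout assumed from the charon.log excerpts in this thread:
# "<date> <time> <thread>[<subsystem>] <<connection>|<id>> <message>"
LINE = re.compile(
    r"^\s*(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \d+\[(?P<sub>\w+)\] "
    r"<(?P<conn>[^|>]+)(?:\|\d+)?> (?P<msg>.*)$"
)


def failures_per_connection(lines):
    """Map connection name -> sorted timestamps of 'processing failed' messages."""
    result = defaultdict(set)
    for line in lines:
        match = LINE.match(line)
        if match and "processing failed" in match.group("msg"):
            result[match.group("conn")].add(match.group("ts"))
    return {conn: sorted(stamps) for conn, stamps in result.items()}


# Lines copied from the log excerpt above.
sample = [
    "2018-04-25 15:46:10 07[IKE] <CSC_PEN_2-1|1> message parsing failed",
    "2018-04-25 15:46:10 07[IKE] <CSC_PEN_2-1|1> QUICK_MODE request with message ID 3138699598 processing failed",
    "2018-04-25 15:46:10 30[IKE] <LDA_PEN_1-1|7> QUICK_MODE request with message ID 1031610501 processing failed",
    "2018-04-25 15:46:13 29[IKE] <LDA_PEN_1-1|7> QUICK_MODE request with message ID 3103575795 processing failed",
]

if __name__ == "__main__":
    for conn, stamps in failures_per_connection(sample).items():
        print(conn, stamps)
```

Lines that were wrapped at 80 columns would need to be joined first, since the pattern expects one log entry per line.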

  • Hello

    I do not want to depress you, but look at what XG does ... The next picture shows how badly networking is implemented on XG. I mean VERY VERY VERY badly. It shows a simultaneous ping. The main internal address of one of our firewalls is 10.30.1.1. On one of the other subnets on that same firewall, the firewall has the IP address 10.29.1.1 ... It is not only the VPN that freezes. All networking freezes. In other words, our ISP has nothing to do with it. It does this on all the XGs we have installed. Look at this :) :) :)

    The moment 10.29.0.0 reconnects, all other subnets and the internet reconnect too. Hard not to swear and remain polite. This happens many times a day, on all our firewalls, whether they are connected together via VPN or not.

    Imagine now if I had put IP telephony on that thing ... Oops, I forgot to mention: IP telephony has its own Mikrotik firewall on the same WAN subnet as the XG firewall. It NEVER (read my lips ... NEVER) goes down, which further proves that our ISP has nothing to do with this disgraceful situation. At our other location, IP telephony is set up much the same way. On the WAN side, both firewalls are within the same 8-address subnet. Again, read my lips ... it NEVER goes down.

    No.  XG is years away from being fixed.  And clearly not enterprise ready. 

  • I have the same issue. Also, from time to time I'm experiencing high latency when pinging the ISP next hop.

  • Are you doing VPN to other SFOS devices? If so, have you tried RED? It seems to do the same job (not native RED devices, just a RED VPN between XGs), which means you can still apply policies and firewall rules to each machine individually. It has been stable for me with 2 RED tunnels for quite some time, without timeouts. Telephony, on the other hand, is a whole other topic. But once you set it up, it works no matter what.

    Oh, btw I'm using Software XG. Maybe this counts too.

    My setup is one main XG and 2 other XGs that connect to the main one via RED. From outside those networks, I connect to the main one over SSL VPN and work as usual, communicating with all 3.

  • Question: What network adapter are you using? How many ports?

  • All sorts of cards. On the main machine the LAN port is an HP NC382i (it's a server with Sophos as a VM) and the 2 WAN ports are Realtek gigabit (Sophos sees all of them as Intel E1000 through the VM). On the others I think it is Intel gigabit. I will check for specific versions if you like. All devices have a basic setup with 2 WANs and 1 LAN.