
XG Virtual Appliance (18.0.3) kernel panic from VPN

I'm having this issue on multiple VMs running 18.0.3.

One of them reboots twice a day, even after a clean download and reinstall.

I also set up a second device and joined the two in an HA pair; now both devices reboot as well, about 20 minutes apart.

I changed the Ethernet adapter from VMXNET3 to e1000.
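
For reference, here is the change as it appears in the VM's .vmx file (a minimal sketch; the adapter name ethernet0 is an assumption, and the VM must be powered off before editing, otherwise use Edit Settings in the vSphere client):

ethernet0.virtualDev = "e1000"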

I see a lot of these messages:

packet dropped in ipsec0 device
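
To get a feel for the volume, I count the occurrences from the advanced shell (this assumes the live log is /log/syslog.log, as on my appliances):

grep -c "packet dropped in ipsec0" /log/syslog.log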

I finally found a kernel panic trace in syslog.log:

Nov  4 06:51:17 (none) user.warn kernel: [56361.459211] sched: RT throttling activated
Nov  4 06:51:18 (none) user.warn kernel: [56373.072100]  linux_netmap_change_mtu+0x506/0x610 [netmap]
Nov  4 06:51:18 (none) user.warn kernel: [56373.083980]  ? core_sys_select+0x15f/0x250
Nov  4 06:51:18 (none) user.warn kernel: [56373.083984]  ? core_sys_select+0x194/0x250
Nov  4 06:51:18 (none) user.warn kernel: [56373.083993]  ? SyS_sendto+0xae/0x130
Nov  4 06:51:18 (none) user.warn kernel: [56373.083996]  do_vfs_ioctl+0x88/0x5c0
Nov  4 06:51:18 (none) user.warn kernel: [56373.110154]  SyS_ioctl+0x36/0x70
Nov  4 06:51:18 (none) user.warn kernel: [56373.110159]  do_syscall_64+0x63/0x120
Nov  4 06:51:18 (none) user.warn kernel: [56373.112452]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Nov  4 06:51:18 (none) user.warn kernel: [56373.121487] RIP: 0033:0x7f95ed44c037
Nov  4 06:51:18 (none) user.warn kernel: [56373.121489] RSP: 002b:00007ffe755773e8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
Nov  4 06:51:18 (none) user.warn kernel: [56373.121491] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f95ed44c037
Nov  4 06:51:18 (none) user.warn kernel: [56373.121492] RDX: 0000000000000000 RSI: 0000000000006994 RDI: 000000000000002b
Nov  4 06:51:18 (none) user.warn kernel: [56373.121493] RBP: 00007f95aabaa000 R08: 0000000000000001 R09: 0000000000000060
Nov  4 06:51:18 (none) user.warn kernel: [56373.121493] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000015eb9c50
Nov  4 06:51:18 (none) user.warn kernel: [56373.121494] R13: 00000000000001ff R14: 000000000000014e R15: 0000000000000000

Nov  4 06:51:18 (none) user.err kernel: [56373.127479] VPN Panic, dst NULL in output
Nov  4 06:51:18 (none) user.warn kernel: [56373.127495] CPU: 0 PID: 4928 Comm: snort Tainted: G        W  O    4.14.38 #2
Nov  4 06:51:18 (none) user.warn kernel: [56373.127496] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 12/12/2018
Nov  4 06:51:18 (none) user.warn kernel: [56373.127516] Call Trace:
Nov  4 06:51:18 (none) user.warn kernel: [56373.127523]  dump_stack+0x5c/0x78
Nov  4 06:51:18 (none) user.warn kernel: [56373.127527]  ip_finish_output+0x21e/0x240
Nov  4 06:51:18 (none) user.warn kernel: [56373.142583]  nf_reinject+0x130/0x150
Nov  4 06:51:18 (none) user.warn kernel: [56373.142590]  0xffffffffa074b461
Nov  4 06:51:18 (none) user.warn kernel: [56373.142597]  ? __getnstimeofday64+0x36/0xc0
Nov  4 06:51:18 (none) user.warn kernel: [56373.142598]  ? do_gettimeofday+0x10/0x50
Nov  4 06:51:18 (none) user.warn kernel: [56373.142604]  ? netmap_ioctl+0x23d/0x11c0 [netmap]
Nov  4 06:51:18 (none) user.warn kernel: [56373.142606]  ? 0xffffffffa074b9d4
Nov  4 06:51:18 (none) user.warn kernel: [56373.142607]  0xffffffffa074b9d4
Nov  4 06:51:18 (none) user.warn kernel: [56373.142610]  netmap_pipe_txsync+0xca/0x5c0 [netmap]
Nov  4 06:51:18 (none) user.warn kernel: [56373.142614]  netmap_ioctl+0x298/0x11c0 [netmap]
Nov  4 06:51:18 (none) user.warn kernel: [56373.142619]  ? __alloc_skb+0x62/0x1b0
Nov  4 06:51:18 (none) user.warn kernel: [56373.142622]  ? __kmalloc_track_caller+0x1e/0x100
Nov  4 06:51:18 (none) user.warn kernel: [56373.142625]  linux_netmap_change_mtu+0x506/0x610 [netmap]
Nov  4 06:51:18 (none) user.warn kernel: [56373.142628]  ? core_sys_select+0x15f/0x250
Nov  4 06:51:18 (none) user.warn kernel: [56373.142629]  ? core_sys_select+0x194/0x250
Nov  4 06:51:18 (none) user.warn kernel: [56373.142633]  ? SyS_sendto+0xae/0x130
Nov  4 06:51:18 (none) user.warn kernel: [56373.142634]  do_vfs_ioctl+0x88/0x5c0
Nov  4 06:51:18 (none) user.warn kernel: [56373.142636]  SyS_ioctl+0x36/0x70
Nov  4 06:51:18 (none) user.warn kernel: [56373.142639]  do_syscall_64+0x63/0x120
Nov  4 06:51:18 (none) user.warn kernel: [56373.142641]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Nov  4 06:51:18 (none) user.warn kernel: [56373.142643] RIP: 0033:0x7f95ed44c037
Nov  4 06:51:18 (none) user.warn kernel: [56373.142644] RSP: 002b:00007ffe755773e8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
Nov  4 06:51:18 (none) user.warn kernel: [56373.142646] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f95ed44c037
Nov  4 06:51:18 (none) user.warn kernel: [56373.142647] RDX: 0000000000000000 RSI: 0000000000006994 RDI: 000000000000002b
Nov  4 06:51:18 (none) user.warn kernel: [56373.142648] RBP: 00007f95aabaa000 R08: 0000000000000001 R09: 0000000000000060
Nov  4 06:51:18 (none) user.warn kernel: [56373.142648] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000015eb9c50
Nov  4 06:51:18 (none) user.warn kernel: [56373.142649] R13: 00000000000001ff R14: 000000000000014e R15: 0000000000000000


In one member of the HA cluster:
Nov  4 00:00:04 (none) user.err kernel: [20469.855647] 215:appfiltermap_adt_parser:policy 3 max app order 7 max eac apporder 0
Nov  4 00:00:04 (none) user.err kernel: [20469.855651] 711:appdev_write:count 1599
Nov  4 00:00:04 (none) user.err kernel: [20469.855656] 758:appdev_release:dev open 3
Nov  4 00:00:04 (none) user.err kernel: [20469.855657] 771:appdev_release:counter 7 size 128
Nov  4 00:00:04 (none) user.err kernel: [20469.855658] 774:appdev_release:dev open 0
Nov  4 00:48:55 (none) user.info kernel: [23400.920469] IPsec XFRM device driver
Nov  4 02:42:35 (none) user.warn kernel: [30216.894582] sched: RT throttling activated
Nov  4 06:51:16 (none) user.err kernel: [45141.622466] arprep_proxy PTOA(prim): Dedicated interface PortE is found.
Nov  4 06:51:16 (none) user.err kernel: [45141.673819] packet dropped in ipsec0 device
Nov  4 06:51:16 (none) user.err kernel: [45141.674692] packet dropped in ipsec0 device
Nov  4 06:51:17 (none) user.err kernel: [45142.636231] packet dropped in ipsec0 device
Nov  4 06:51:17 (none) user.err kernel: [45142.970714] packet dropped in ipsec0 device
Nov  4 06:51:20 (none) user.err kernel: [45145.683923] packet dropped in ipsec0 device
Nov  4 06:51:22 (none) daemon.info init: System will reboot
Nov  4 06:51:22 (none) user.err kernel: [45147.488290] ND reply PTOA(prim): Dedicated interface PortE is found.
Nov  4 06:51:27 (none) daemon.info init: The system is going down NOW!
Nov  4 06:51:27 (none) syslog.info syslogd exiting
Nov  4 06:52:41 (none) syslog.info syslogd started: BusyBox v1.21.1

The dropped traffic on ipsec0 comes from the ipsec0 interface's default IP address:
SF01V_VM01_SFOS 18.0.3 MR-3# tcpdump |grep 169.254.234
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked v1), capture size 262144 bytes
12:15:47.401360 ipsec0, OUT: IP 169.254.234.5 > 192.168.1.138: ICMP host 192.168.3.94 unreachable, length 153
12:15:47.401376 ipsec0, OUT: IP 169.254.234.5 > 192.168.1.197: ICMP host 192.168.3.100 unreachable, length 153
12:15:47.401393 ipsec0, OUT: IP 169.254.234.5 > 192.168.1.104: ICMP host 192.168.3.100 unreachable, length 153
12:15:47.401403 ipsec0, OUT: IP 169.254.234.5 > 192.168.1.197: ICMP host 192.168.3.100 unreachable, length 153
12:15:47.401412 ipsec0, OUT: IP 169.254.234.5 > 192.168.1.104: ICMP host 192.168.3.100 unreachable, length 153
12:15:47.401421 ipsec0, OUT: IP 169.254.234.5 > 192.168.1.197: ICMP host 192.168.3.100 unreachable, length 153
12:15:47.401483 ipsec0, OUT: IP 169.254.234.5 > 192.168.1.197: ICMP host 192.168.3.94 unreachable, length 153
12:15:49.453354 ipsec0, OUT: IP 169.254.234.5 > 192.168.25.31: ICMP host 192.168.3.121 unreachable, length 78
12:15:49.453402 ipsec0, OUT: IP 169.254.234.5 > 192.168.25.31: ICMP host 192.168.3.121 unreachable, length 79
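
A capture filter keeps this to the relevant ICMP traffic without piping through grep (a sketch in standard tcpdump syntax, filtering on the ipsec0 source address seen above):

tcpdump -ni any 'icmp and host 169.254.234.5'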

  • FormerMember

    Hi,

    Thank you for reaching out to the Community!

    Could you please check if there are any core dumps on your firewall by running the following command from the advanced shell? 

    • ls -al /var/cores 

    Did you notice any spike in resource utilization? 
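
    If you want to catch a spike around the reboot time, you could also log the load average periodically from the advanced shell (a rough sketch using the standard BusyBox shell; adjust the interval as needed):

    while true; do date; cat /proc/loadavg; sleep 60; done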

    Thanks,

  • In the first node (the old one, before I created the HA pair):

    SF01V_VM01_SFOS 18.0.3 MR-3# ls -lah /var/cores
    drwxrwxrwt    2 root     0           4.0K Nov  4 19:49 .
    drwxr-xr-x   36 root     0           4.0K Nov  4 23:18 ..
    -rw-------    1 root     0          31.3K Nov  3 14:31 14a4443c-1b7b-46eb-7336df98-ae19039f.dmp
    -rw-------    1 root     0          31.3K Nov  3 15:03 1a7cfcb5-df50-4302-8565b9ba-e67b6c58.dmp
    -rw-------    1 root     0          39.8K Nov  4 15:14 4fbc2151-1bfe-4ede-58500095-d61793e0.dmp
    -rw-------    1 root     0          24.0K Nov  3 15:36 51af5e56-2533-4012-9f7d5cb9-4b2c03f2.dmp
    -rw-------    1 root     0          39.8K Nov  4 15:39 5550e57b-8cb3-4d1d-63d17c9d-7260e977.dmp
    -rw-------    1 root     0          35.3K Nov  3 14:49 7191679c-ede0-4a52-41492ead-ea87f74a.dmp
    -rw-------    1 root     0          31.8K Nov  4 19:49 7db36263-98bf-4f82-553da3ab-14733359.dmp
    -rw-------    1 root     0          35.8K Nov  4 16:03 bb8db6a1-4a79-46c9-cd9d41a4-1b336905.dmp
    -rw-------    1 root     0           4.9M Nov  4 15:23 core.awarrenhttp
    -rw-------    1 root     0           1.6M Nov  4 19:49 core.pktcapd
    -rw-------    1 root     0          31.8K Nov  4 15:23 d766be89-060f-4984-981129a5-65b3148c.dmp
    -rw-------    1 root     0          42.9K Nov  4 15:23 dde0ade3-bbc8-404b-e82a88a2-07dc8fef.dmp
    -rw-------    1 root     0          35.3K Nov  3 14:12 f380b04b-13ea-44a6-1c5defb1-a00c1679.dmp
    

    In the second node:

    SF01V_VM01_SFOS 18.0.3 MR-3# ls -alh /var/cores
    drwxrwxrwt    2 root     0           4.0K Nov  3 23:58 .
    drwxr-xr-x   38 root     0           4.0K Nov  4 23:19 ..
    -rw-------    1 root     0          56.2M Nov  3 18:20 core.garner

    On both nodes, the performance graphs in vSphere look normal.

    The issue arises during working hours, when there are more active users but less bandwidth; I transfer backups off-hours and have never had an issue during that time.
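
    To confirm the timing pattern, I pull the reboot markers out of the logs (a sketch; the rotated file name syslog.log.0 is an assumption based on the usual numbering):

    grep "System will reboot" /log/syslog.log /log/syslog.log.0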
