skb_warn_bad_offload Kernel error on packet routing

I am setting up a new Sophos UTM (Business Essentials), in addition to other licensed UTMs. This new one hangs off a licensed one as a VM guest. This is the second install of this kind I have done. The first one works fine. This one is having an issue.

The only difference between the two setups is that this new one has only a single bridged interface for the guest, the intention is to VLAN interfaces within Sophos.

I configured a subnet behind the UTM. Response packets from external hosts with a length under 1000 route through okay. When a response is large enough to be sent as full 1500-byte packets, the first few packets do not make it from the external interface to the subnet route, but the last packet, which is under 1000 in length, does.

I performed a tcpdump on both the external interface and internal subnet interface, and have verified this is true. Packets make it to the external interface, but the ones with a length of 1500 do not route to the internal subnet interface.

I am not having this issue on the first UTM setup, which has a similar topology. The configuration is the same between them, except that they are on different remote networks, and the newest setup uses a single untagged (KVM) virtio network card, whereas the prior one uses multiple bridged VLAN-tagged interfaces.

MTU is 1500 on all interfaces.

After watching the logs, I see the error below. Discussions on the Internet describe an issue similar to mine and raise the possibility of a kernel bug.

2014:09:04-13:41:35 net301ima kernel: [ 2375.772054] ------------[ cut here ]------------
2014:09:04-13:41:35 net301ima kernel: [ 2375.772067] WARNING: at net/core/dev.c:2033 skb_warn_bad_offload+0xb8/0xc0()
2014:09:04-13:41:35 net301ima kernel: [ 2375.772070] Hardware name: KVM
2014:09:04-13:41:35 net301ima kernel: [ 2375.772074] : caps=(0x0000000000005020, 0x0000000000000000) len=1500 data_len=1390 gso_size=1448 gso_type=5 ip_summed=1
2014:09:04-13:41:35 net301ima kernel: [ 2375.772076] Modules linked in: sd_mod xt_connmark xt_tcpudp xt_multiport xt_set xt_addrtype ip_set_hash_net ip_set_hash_ip nf_nat_pptp nf_nat_proto_gre nf_nat_irc nf_nat_ftp nf_conntrack_pptp nf_conntrack_proto_gre nf_conntrack_irc nf_conntrack_ftp af_packet 8021q ebtable_filter ebtables bridge stp llc redv2_netlink ip6table_ips ip6table_mangle ip6table_nat nf_nat_ipv6 iptable_ips iptable_mangle iptable_nat nf_nat_ipv4 nf_nat xt_NFLOG xt_condition(O) xt_logmark xt_confirmed xt_owner ip6t_REJECT ipt_REJECT xt_state ip_set red2 ip_scheduler red nfnetlink_log mperf nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6table_raw nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack iptable_filter iptable_raw xt_CT nf_conntrack_netlink nfnetlink nf_conntrack ip6_tables ip_tables x_tables ipv6 loop sg button virtio_net rtc_cmos sr_mod cdrom pcspkr i2c_piix4 evdev virtio_balloon microcode uhci_hcd ehci_hcd virtio_blk processor thermal_sys hwmon pata_acpi ata_generic edd ata_piix libata scsi_mod virtio_pci virtio_ring virtio hid_generic usbhid
2014:09:04-13:41:35 net301ima kernel: [ 2375.772166] Pid: 0, comm: swapper/0 Tainted: G        W  O 3.8.13.15-110.g4be5643-smp #1
2014:09:04-13:41:35 net301ima kernel: [ 2375.772168] Call Trace:
2014:09:04-13:41:35 net301ima kernel: [ 2375.772174]  [] ? skb_warn_bad_offload+0xb8/0xc0
2014:09:04-13:41:35 net301ima kernel: [ 2375.772179]  [] ? warn_slowpath_common+0x7b/0x90
2014:09:04-13:41:35 net301ima kernel: [ 2375.772183]  [] ? skb_warn_bad_offload+0xb8/0xc0
2014:09:04-13:41:35 net301ima kernel: [ 2375.772187]  [] ? warn_slowpath_fmt+0x33/0x37
2014:09:04-13:41:35 net301ima kernel: [ 2375.772191]  [] ? skb_warn_bad_offload+0xb8/0xc0
2014:09:04-13:41:35 net301ima kernel: [ 2375.772195]  [] ? skb_gso_segment+0x9b/0x1d9
2014:09:04-13:41:35 net301ima kernel: [ 2375.772199]  [] ? dev_hard_start_xmit+0x1d4/0x37f
2014:09:04-13:41:35 net301ima kernel: [ 2375.772203]  [] ? dev_queue_xmit+0x1d0/0x263
2014:09:04-13:41:35 net301ima kernel: [ 2375.772209]  [] ? ip_finish_output2+0x27a/0x2c3
2014:09:04-13:41:35 net301ima kernel: [ 2375.772212]  [] ? skb_dst+0x7/0x7
2014:09:04-13:41:35 net301ima kernel: [ 2375.772216]  [] ? ip_finish_output2+0x2c3/0x2c3
2014:09:04-13:41:35 net301ima kernel: [ 2375.772220]  [] ? NF_HOOK_COND+0x4f/0x56
2014:09:04-13:41:35 net301ima kernel: [ 2375.772224]  [] ? ip_finish_output2+0x2c3/0x2c3
2014:09:04-13:41:35 net301ima kernel: [ 2375.772227]  [] ? ip_output+0x82/0x88
2014:09:04-13:41:35 net301ima kernel: [ 2375.772231]  [] ? ip_finish_output2+0x2c3/0x2c3
2014:09:04-13:41:35 net301ima kernel: [ 2375.772235]  [] ? dst_output+0x9/0xa
2014:09:04-13:41:35 net301ima kernel: [ 2375.772239]  [] ? ip_rcv_finish+0x27d/0x293
2014:09:04-13:41:35 net301ima kernel: [ 2375.772243]  [] ? ip_rcv+0x277/0x277
2014:09:04-13:41:35 net301ima kernel: [ 2375.772246]  [] ? NF_HOOK+0x48/0x4f
2014:09:04-13:41:35 net301ima kernel: [ 2375.772263]  [] ? ip_rcv+0x277/0x277
2014:09:04-13:41:35 net301ima kernel: [ 2375.772272]  [] ? ip_rcv+0x240/0x277
2014:09:04-13:41:35 net301ima kernel: [ 2375.772275]  [] ? ip_rcv+0x277/0x277
2014:09:04-13:41:35 net301ima kernel: [ 2375.772279]  [] ? __netif_receive_skb+0x424/0x475
2014:09:04-13:41:35 net301ima kernel: [ 2375.772283]  [] ? build_skb+0x27/0xb5
2014:09:04-13:41:35 net301ima kernel: [ 2375.772287]  [] ? netif_receive_skb+0x63/0x68
2014:09:04-13:41:35 net301ima kernel: [ 2375.772297]  [] ? virtnet_poll+0x47d/0x563 [virtio_net]
2014:09:04-13:41:35 net301ima kernel: [ 2375.772303]  [] ? net_rx_action+0x91/0x1b1
2014:09:04-13:41:35 net301ima kernel: [ 2375.772308]  [] ? __do_softirq+0x84/0x143
2014:09:04-13:41:35 net301ima kernel: [ 2375.772312]  [] ? irq_enter+0x4d/0x4d
2014:09:04-13:41:35 net301ima kernel: [ 2375.772314]  [] ? irq_exit+0x2f/0x92
2014:09:04-13:41:35 net301ima kernel: [ 2375.772321]  [] ? do_IRQ+0x81/0x95
2014:09:04-13:41:35 net301ima kernel: [ 2375.772324]  [] ? irq_exit+0x91/0x92
2014:09:04-13:41:35 net301ima kernel: [ 2375.772329]  [] ? smp_apic_timer_interrupt+0x6f/0x7b
2014:09:04-13:41:35 net301ima kernel: [ 2375.772334]  [] ? common_interrupt+0x2c/0x31
2014:09:04-13:41:35 net301ima kernel: [ 2375.772339]  [] ? native_safe_halt+0x2/0x3
2014:09:04-13:41:35 net301ima kernel: [ 2375.772343]  [] ? default_idle+0x1c/0x31
2014:09:04-13:41:35 net301ima kernel: [ 2375.772346]  [] ? cpu_idle+0x52/0x71
2014:09:04-13:41:35 net301ima kernel: [ 2375.772350]  [] ? start_kernel+0x31d/0x322
2014:09:04-13:41:35 net301ima kernel: [ 2375.772354]  [] ? repair_env_string+0x4f/0x4f
2014:09:04-13:41:35 net301ima kernel: [ 2375.772357] ---[ end trace 49019babd7ff281d ]---
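
For anyone trying to read the fields on the `caps=...` line of the warning, here is a minimal Python sketch that decodes `gso_type` and `ip_summed`. The flag tables are an assumption: they are the values I believe `include/linux/skbuff.h` used in 3.x-era kernels, so check them against the source for your kernel. The function name `decode_bad_offload` is made up for this illustration.

```python
import re

# Assumption: bit values from include/linux/skbuff.h in 3.x-era kernels.
GSO_FLAGS = {
    0x01: "SKB_GSO_TCPV4",
    0x02: "SKB_GSO_UDP",
    0x04: "SKB_GSO_DODGY",    # segment data came from an untrusted source
    0x08: "SKB_GSO_TCP_ECN",
    0x10: "SKB_GSO_TCPV6",
}

# Assumption: checksum states as numbered in the same kernel generation.
IP_SUMMED = {
    0: "CHECKSUM_NONE",
    1: "CHECKSUM_UNNECESSARY",
    2: "CHECKSUM_COMPLETE",
    3: "CHECKSUM_PARTIAL",
}

# The field line copied from the trace above (timestamp prefix dropped).
WARNING_LINE = (
    ": caps=(0x0000000000005020, 0x0000000000000000) "
    "len=1500 data_len=1390 gso_size=1448 gso_type=5 ip_summed=1"
)


def decode_bad_offload(line):
    """Extract the decimal key=value fields and name the flag bits."""
    # caps=(0x...) is skipped on purpose: its value does not start with a digit.
    fields = {key: int(value) for key, value in re.findall(r"(\w+)=(\d+)", line)}
    gso_type = fields["gso_type"]
    return {
        "len": fields["len"],
        "gso_size": fields["gso_size"],
        "gso_flags": [name for bit, name in sorted(GSO_FLAGS.items()) if gso_type & bit],
        "ip_summed": IP_SUMMED.get(fields["ip_summed"], "unknown"),
    }


decoded = decode_bad_offload(WARNING_LINE)
print(decoded)
```

If those tables are right, the packet is a TCPv4 GSO packet flagged as coming from an untrusted source, with its checksum marked as already verified. As far as I understand the 3.x source, skb_gso_segment() warns when such a packet reaches segmentation with ip_summed other than CHECKSUM_PARTIAL, which would match what is logged here.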


  • Hi, RG, and welcome to the User BB!

    this new one has only a single bridged interface for the guest, the intention is to VLAN interfaces within Sophos.

    Sorry, you lost me.

    Cheers - Bob
  • In Linux you can create a bridged interface on a Host for qemu/KVM (or other type of) Guests to use as their virtual network device. For this Guest running Sophos, in which I am experiencing the issue, there is only one bridged interface that is untagged as a virtual network device. I am creating multiple tagged interfaces in Sophos.

    In the Guest running Sophos not having the issue, I have seven bridged interfaces that are tagged as virtual network devices. These are single untagged interfaces in Sophos.

    I am not speaking of bridging as implemented by Sophos, I am speaking of the bridged interface provided as a virtual network device provided to the Guest VM. This is a bridged interface as opposed to the plain hardware network device from the Host.

    So, regarding "this new one has only a single bridged interface for the guest":

    The Guest OS is running Sophos. The Guest OS has a virtual network device. That single virtual network device is a bridged device from the Host.

    And regarding "the intention is to VLAN interfaces within Sophos":

    Multiple interfaces in Sophos are created by tagging the single virtual network device.

    See the Linux bridge utilities (packaged as the bridge-utils rpm in Fedora/RHEL), or see the KVM Networking documentation.

  • Here is some more detail on my use of virtual devices in the Guests and the difference in routing between the two Sophos installs. Note the difference between the two Sophos installs is only the Interfaces.

    [Host Network Devices]
    p3p1 MASTER=bond0; SLAVE=yes
    p3p2 MASTER=bond0; SLAVE=yes
    bond0 BRIDGE=br-network

    [Host Bridges]
    Bridge Name Interfaces
    br-network bond0
    br-net-001 br-network.1
    br-net-002 br-network.2
    br-net-003 br-network.3
    br-net-004 br-network.4
    br-net-005 br-network.5
    br-net-006 br-network.6
    br-net-007 br-network.7

    [VM Guest 1 (is having the kernel error)]
    Host Bridge -> Device Type -> Sophos Interfaces
    br-network -> virtio -> eth0.1 eth0.2

    [VM Guest 2 (not having any routing issues)]
    Host Bridge -> Device Type -> Sophos Interfaces
    br-net-001 -> virtio -> eth0
    br-net-002 -> virtio -> eth1
    br-net-003 -> virtio -> eth2
    br-net-004 -> virtio -> eth3
    br-net-005 -> virtio -> eth4
    br-net-006 -> virtio -> eth5
    br-net-007 -> virtio -> eth6

    [Sophos on Guest 1 (is having the kernel error)]
    Client -> BES-UTM (eth0.2 -> eth0.1) -> LIC-UTM -> Server

    [Sophos on Guest 2 (not having any routing issues)]
    Client -> BES-UTM (eth5 -> eth0) -> LIC-UTM -> Server

    Where:
    BES-UTM = Business Essentials Sophos UTM 9.200-11, KVM Guest
    LIC-UTM = Our Licensed copy of Sophos UTM, Hardware
  • This is solved.

    I was running the Guest for Sophos on a CentOS 6.5 Host with libvirt-0.10.2-29.el6_5.2. I resolved the issue by updating to libvirt-0.10.2-29.el6_5.9.

    There was an issue in this version (CVE-2014-1447) that caused a crash if connections closed early. I am not sure if this is the exact cause of my issue. However, I did experience libvirtd crashing in my scenario when I made an HTTPS request to the firewall WebUI. Those packets were not dropped, but sent through to the client, which caused libvirtd to crash. So I assume this issue is related.

    My Procedure:

    I first updated the kernel from 3.10.26 to 3.10.48, but that alone did not fix the issue. I then upgraded libvirt on the Host to libvirt-0.10.2-29.el6_5.9, and my issue is now resolved.

    An rpm update of libvirt from release el6_5.2 to el6_5.9, followed by a 'service libvirtd restart' (while the guest was running), fixed the issue.

    Reference:

    For anyone else experiencing the skb_warn_bad_offload error in their Guest VM (seen in dmesg or kernel.log):

    First determine the version of the underlying libraries that provide your virtualization capabilities. I have seen this issue reported for VirtualBox as well as VMware. You probably need to update them to resolve this issue.

    If you are using libvirt, ensure you are using a version that fixes CVE-2014-1447
    http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-1447
    libvirt before 1.2.1 is vulnerable

    Here is the changelog for libvirt up to release el6_5.9 (libvirt-0.10.2-29.el6_5.9)
    http://www.rpmfind.net/linux/RPM/centos/updates/6.5/x86_64/Packages/libvirt-0.10.2-29.el6_5.9.x86_64.html

    Again, my issue with skb_warn_bad_offload error was resolved with an upgrade from libvirt-0.10.2-29.el6_5.2 to libvirt-0.10.2-29.el6_5.9
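
To check whether a host is already at or past the libvirt release that worked here, a rough Python sketch follows. It is a simplified comparison, not rpm's real version-comparison algorithm (no epochs, no tilde handling), and the names `release_key`, `at_least` and `KNOWN_GOOD` are made up for this example.

```python
import re

# The release that resolved the issue in this thread.
KNOWN_GOOD = "0.10.2-29.el6_5.9"


def release_key(version_release):
    """Split a version-release string into chunks that compare sensibly.

    Numeric chunks compare as integers and rank above alphabetic chunks,
    which roughly mirrors how rpm orders them. This is a simplification.
    """
    chunks = re.findall(r"\d+|[A-Za-z]+", version_release)
    return [(1, int(c)) if c.isdigit() else (0, c) for c in chunks]


def at_least(installed, wanted=KNOWN_GOOD):
    """Return True when installed is the same as or newer than wanted."""
    return release_key(installed) >= release_key(wanted)


print(at_least("0.10.2-29.el6_5.2"))  # the release that showed the problem
print(at_least("0.10.2-29.el6_5.9"))  # the release that resolved it
```

The installed string can be obtained on the Host with `rpm -q --qf '%{VERSION}-%{RELEASE}\n' libvirt`.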

    (This was from September 2014)

    ===== EDIT 2016-05-24 =====

    I now have a new hypervisor host on RHEL 7.1 with a new install of Sophos, and was experiencing the same issue again. This time the problem was solved not by updating libvirt, but by a change in the kernel version.

    I am using the same topology, except that this RHEL 7 host uses a team device (team0) where the earlier host used bond0.

    Apparently, the version reported for the ethernet driver matches the kernel version. You can see the ethernet driver version with ethtool -i as follows:

         ethtool -i team0

    Bug Reference: Intel Ethernet Drivers and Utilities, 2015-07-22
    https://sourceforge.net/p/e1000/bugs/481/

    I resolved this issue by migrating the Sophos VM away from HostA to HostB with a change in kernel versions as follows:

        HostA: 3.10.0-229.el7.x86_64
        HostB: 3.10.0-229.1.2.el7.x86_64

    This resulted in running Sophos on a host with different ethernet driver versions, as follows:

        HostA:
        driver: team
        version: 3.10.0-229.el7.x86_64

        HostB:
        driver: team
        version: 3.10.0-229.1.2.el7.x86_64

    This resolved the issue this time. I believe the resolution to my 2014 issue was both kernel and libvirt updates.
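
If you want to compare hosts the same way, here is a small Python sketch that parses saved `ethtool -i` output and reports which fields differ. The sample text is limited to the two fields quoted above; real output has more lines (firmware-version, bus-info and so on), and the function names are made up for this example.

```python
def parse_ethtool_i(text):
    """Turn 'key: value' lines from ethtool -i output into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, since values such as bus-info contain colons.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def differing_fields(a, b):
    """Return {field: (value_a, value_b)} for every field that differs."""
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


host_a = parse_ethtool_i("driver: team\nversion: 3.10.0-229.el7.x86_64\n")
host_b = parse_ethtool_i("driver: team\nversion: 3.10.0-229.1.2.el7.x86_64\n")

print(differing_fields(host_a, host_b))
```

With the values from this post, only the version field differs between HostA and HostB.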
