e1000 resetting

I'm having quite a bit of an issue with a "new" UTM 9.203 install:
the NIC keeps resetting, wreaking havoc on communications.
Kernel messages:
2014:06:26-15:17:31 utm kernel: [94499.808337] e1000e 0000:00:19.0 eth2: Detected Hardware Unit Hang:

2014:06:26-15:17:31 utm kernel: [94499.808337]   TDH                  
2014:06:26-15:17:31 utm kernel: [94499.808337]   TDT                  
2014:06:26-15:17:31 utm kernel: [94499.808337]   next_to_use          
2014:06:26-15:17:31 utm kernel: [94499.808337]   next_to_clean        
2014:06:26-15:17:31 utm kernel: [94499.808337] buffer_info[next_to_clean]:
2014:06:26-15:17:31 utm kernel: [94499.808337]   time_stamp           
2014:06:26-15:17:31 utm kernel: [94499.808337]   next_to_watch        
2014:06:26-15:17:31 utm kernel: [94499.808337]   jiffies              
2014:06:26-15:17:31 utm kernel: [94499.808337]   next_to_watch.status 
2014:06:26-15:17:31 utm kernel: [94499.808337] MAC Status             
2014:06:26-15:17:31 utm kernel: [94499.808337] PHY Status             
2014:06:26-15:17:31 utm kernel: [94499.808337] PHY 1000BASE-T Status  
2014:06:26-15:17:31 utm kernel: [94499.808337] PHY Extended Status    
2014:06:26-15:17:31 utm kernel: [94499.808337] PCI Status             
2014:06:26-15:17:33 utm kernel: [94501.808357] e1000e 0000:00:19.0 eth2: Detected Hardware Unit Hang:
2014:06:26-15:17:33 utm kernel: [94501.808357]   TDH                  
2014:06:26-15:17:33 utm kernel: [94501.808357]   TDT                  
2014:06:26-15:17:33 utm kernel: [94501.808357]   next_to_use          
2014:06:26-15:17:33 utm kernel: [94501.808357]   next_to_clean        
2014:06:26-15:17:33 utm kernel: [94501.808357] buffer_info[next_to_clean]:
2014:06:26-15:17:33 utm kernel: [94501.808357]   time_stamp           
2014:06:26-15:17:33 utm kernel: [94501.808357]   next_to_watch        
2014:06:26-15:17:33 utm kernel: [94501.808357]   jiffies              
2014:06:26-15:17:33 utm kernel: [94501.808357]   next_to_watch.status 
2014:06:26-15:17:33 utm kernel: [94501.808357] MAC Status             
2014:06:26-15:17:33 utm kernel: [94501.808357] PHY Status             
2014:06:26-15:17:33 utm kernel: [94501.808357] PHY 1000BASE-T Status  
2014:06:26-15:17:33 utm kernel: [94501.808357] PHY Extended Status    
2014:06:26-15:17:33 utm kernel: [94501.808357] PCI Status             
2014:06:26-15:17:35 utm kernel: [94503.808389] e1000e 0000:00:19.0 eth2: Detected Hardware Unit Hang:
2014:06:26-15:17:35 utm kernel: [94503.808389]   TDH                  
2014:06:26-15:17:35 utm kernel: [94503.808389]   TDT                  
2014:06:26-15:17:35 utm kernel: [94503.808389]   next_to_use          
2014:06:26-15:17:35 utm kernel: [94503.808389]   next_to_clean        
2014:06:26-15:17:35 utm kernel: [94503.808389] buffer_info[next_to_clean]:
2014:06:26-15:17:35 utm kernel: [94503.808389]   time_stamp           
2014:06:26-15:17:35 utm kernel: [94503.808389]   next_to_watch        
2014:06:26-15:17:35 utm kernel: [94503.808389]   jiffies              
2014:06:26-15:17:35 utm kernel: [94503.808389]   next_to_watch.status 
2014:06:26-15:17:35 utm kernel: [94503.808389] MAC Status             
2014:06:26-15:17:35 utm kernel: [94503.808389] PHY Status             
2014:06:26-15:17:35 utm kernel: [94503.808389] PHY 1000BASE-T Status  
2014:06:26-15:17:35 utm kernel: [94503.808389] PHY Extended Status    
2014:06:26-15:17:35 utm kernel: [94503.808389] PCI Status             
2014:06:26-15:17:37 utm kernel: [94505.808037] e1000e 0000:00:19.0 eth2: Reset adapter unexpectedly
2014:06:26-15:17:40 utm kernel: [94509.332884] e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None


I've googled this, and it seems to be a problem on a lot of Linux installations.

The most useful result so far: "Linux e1000e (Intel networking driver) problems galore, where do I start?" on Server Fault.
It suggests running the kernel with ASPM disabled. How do I do this? Or am I better off opening a support ticket? (Which brings me to another issue: this is running on a trial license on software until the HW SG arrives.)
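
On the ASPM question above, a minimal sketch for checking what the kernel is currently doing. It assumes a kernel that exposes the ASPM policy at /sys/module/pcie_aspm/parameters/policy, which is not confirmed for UTM 9.203 anywhere in this thread; the function name active_aspm_policy is illustrative. ASPM itself is normally turned off with the pcie_aspm=off kernel boot parameter; how to add boot parameters on UTM is not covered here.

```shell
#!/bin/sh
# Sketch: report the active PCIe ASPM policy.
# Assumption: the kernel exposes the policy in sysfs at the default path below.
# The optional argument is an illustrative override so the function can be tested.

active_aspm_policy() {
    policy_file="${1:-/sys/module/pcie_aspm/parameters/policy}"
    if [ ! -r "$policy_file" ]; then
        # No readable policy file: ASPM support may be absent or not exposed.
        echo "unknown"
        return 0
    fi
    # The kernel marks the active policy with square brackets,
    # e.g. "default performance [powersave]"; print only that name.
    sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$policy_file"
}
```

Running `active_aspm_policy` with no argument prints the bracketed policy name, or "unknown" if the file is not readable.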


  • If you have support make certain to get a case open.

    Google: site:astaro.org "Detected Hardware Unit Hang"
  • I'm seeing a ton of results from older betas, all of which have "apparently been fixed" in versions prior to the one I have.
    It seems disabling GSO/GRO can solve it. Trying that now.

    Edit: so far 10 minutes without drops, but there isn't enough load to tell.
  • Your pre-edit post referenced difficulty with the correct command-line syntax: please share what works (or doesn't).
  • I was using the full offload name ("generic-segmentation-offload off") and kept getting syntax errors. Instead it is:
     ethtool -K eth2 gro off
     ethtool -K eth2 gso off
  • To further help anyone else that finds this thread:

    What is the hardware?

    What is the output from "lspci -tv"?
  • HW is an HP DC5800 minitower (C2D E4600, 2GB RAM) with 3 NICs: the onboard Intel, a 3c905B, and a TP-Link gigabit PCIe card.

    utm:/home/login # lspci -tv
    
    -[0000:00]-+-00.0  Intel Corporation 82Q33 Express DRAM Controller
               +-02.0  Intel Corporation 82Q33 Express Integrated Graphics Controller
               +-03.0  Intel Corporation 82Q33 Express MEI Controller
               +-19.0  Intel Corporation 82566DM-2 Gigabit Network Connection
               +-1a.0  Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #4
               +-1a.1  Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #5
               +-1a.2  Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #6
               +-1a.7  Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #2
               +-1b.0  Intel Corporation 82801I (ICH9 Family) HD Audio Controller
               +-1c.0-[20]--
               +-1c.1-[30]----00.0  Realtek Semiconductor Co., Ltd. RTL8111/8168 PCI Express Gigabit Ethernet controller
               +-1d.0  Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #1
               +-1d.1  Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #2
               +-1d.2  Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #3
               +-1d.7  Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #1
               +-1e.0-[07]----04.0  3Com Corporation 3c905B 100BaseTX [Cyclone]
               +-1f.0  Intel Corporation 82801IB (ICH9) LPC Interface Controller
               +-1f.2  Intel Corporation 82801IB (ICH9) 2 port SATA Controller [IDE mode]
               \-1f.5  Intel Corporation 82801I (ICH9 Family) 2 port SATA Controller [IDE mode]


    extra detail on the e1000:
    utm:/home/login # lspci -vvv -s 00:19.0
    
    00:19.0 Ethernet controller: Intel Corporation 82566DM-2 Gigabit Network Connection (rev 02)
            Subsystem: Hewlett-Packard Company Device 281e
            Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
            Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- 


    [EDIT]

    Also:
    utm:/home/login # ethtool -d eth2
    
    MAC Registers
    -------------
    0x00000: CTRL (Device control register)  0x40100240
          Endian mode (buffers):             little
          Link reset:                        normal
          Set link up:                       1
          Invert Loss-Of-Signal:             no
          Receive flow control:              disabled
          Transmit flow control:             disabled
          VLAN mode:                         enabled
          Auto speed detect:                 disabled
          Speed select:                      1000Mb/s
          Force speed:                       no
          Force duplex:                      no
    0x00008: STATUS (Device status register) 0x00080283
          Duplex:                            full
          Link up:                           link config
          TBI mode:                          disabled
          Link speed:                        1000Mb/s
          Bus type:                          PCI
          Bus speed:                         33MHz
          Bus width:                         32-bit
    0x00100: RCTL (Receive control register) 0x04008002
          Receiver:                          enabled
          Store bad packets:                 disabled
          Unicast promiscuous:               disabled
          Multicast promiscuous:             disabled
          Long packet:                       disabled
          Descriptor minimum threshold size: 1/2
          Broadcast accept mode:             accept
          VLAN filter:                       disabled
          Canonical form indicator:          disabled
          Discard pause frames:              filtered
          Pass MAC control frames:           don't pass
          Receive buffer size:               2048
    0x02808: RDLEN (Receive desc length)     0x00001000
    0x02810: RDH   (Receive desc head)       0x0000009D
    0x02818: RDT   (Receive desc tail)       0x00000090
    0x02820: RDTR  (Receive delay timer)     0x00000000
    0x00400: TCTL (Transmit ctrl register)   0x3103F0FA
          Transmitter:                       enabled
          Pad short packets:                 enabled
          Software XOFF Transmission:        disabled
          Re-transmit on late collision:     enabled
    0x03808: TDLEN (Transmit desc length)    0x00001000
    0x03810: TDH   (Transmit desc head)      0x000000A8
    0x03818: TDT   (Transmit desc tail)      0x000000A8
    0x03820: TIDV  (Transmit delay timer)    0x00000008
    PHY type:                                unknown


    What is interesting is that the device flags mark it as a PCI device, yet everywhere else it looks like a PCIe one (although I don't see the PCIe registers in the -vvv output).
  • Final note: disabling GSO/GRO solved it; the NIC hasn't reset since.
  • Upon a system reboot the changes were lost. How can I make this setting apply permanently?
  • If you have a commercial license, contact Sophos Support; they likely have a kernel patch that will fix this "permanently". For some reason, while this issue appears to be mostly fixed in the latest releases of the 9.1xx code branch, the 9.2xx branch remains unfixed (completely).

    CTO, Convergent Information Security Solutions, LLC

    https://www.convergesecurity.com

    Advice given as posted on this forum does not construe a support relationship or other relationship with Convergent Information Security Solutions, LLC or its subsidiaries.  Use the advice given at your own risk.

  • And how can I change it myself? I'm on a trial license until the HW unit arrives.
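
The workaround reported earlier in the thread (ethtool -K eth2 gro off and ethtool -K eth2 gso off) does not survive a reboot, so it has to be re-applied at boot. A minimal sketch of a helper that does this, under stated assumptions: the function name disable_offloads and the DRY_RUN variable are illustrative, not anything UTM defines, and the thread does not establish which boot-time hook UTM offers for running it.

```shell
#!/bin/sh
# Sketch: re-apply the GRO/GSO workaround from this thread.
# disable_offloads and DRY_RUN are hypothetical names introduced for illustration.

disable_offloads() {
    iface="$1"
    for feature in gro gso; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Dry run: only show the command that would be executed.
            echo "ethtool -K $iface $feature off"
        else
            # Same commands as reported working earlier in the thread.
            ethtool -K "$iface" "$feature" off
        fi
    done
}
```

Calling `disable_offloads eth2` from whatever startup script the system runs at boot would re-apply the setting; setting DRY_RUN=1 first prints the commands instead of running them. `ethtool -k eth2` (lowercase k) lists the current offload state, which is a quick way to confirm the change took effect.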