This discussion has been locked.
You can no longer post new replies to this discussion. If you have a question you can start a new discussion

How to debug unexpected reboot of XG v18?

Dear all,

we're running SFOS 18.0.1 MR-1-Build396 in a virtual machine on VMware and experienced an unexpected reboot today. The system log does not show any indication of why the reboot took place.

The machine was up and running fine for two weeks before and we experienced no problems. Is there a logfile we can use to further investigate the cause of the unexpected reboot?

Thanks
Michael



  • Hi  

    For an unexpected reboot, you can collect the following logs and information.

    1) VM console logs.

    community.sophos.com/.../sophos-xg-firewall-how-to-capture-serial-console-logs-for-firewall-installed-on-a-virtual-platform

    2) Unexpected Reboot time and date for next instance.

    3) All log files from the XG VM, with the CSC service in debug mode.

    I have sent you a PM with detailed steps on how to collect the logs.

  • Hi Vishal,

    thanks for your help. Am I right in assuming that those steps only help in case of another unexpected reboot? Since the serial port was not available before the last reboot and CSC was not in debug either, I guess we don't have a chance to find what caused the last reboot?

    I am also not sure what "Unexpected Reboot time and date for next instance" actually means. 

    Thanks
    Michael

  • Hi  

    Yes, your assumption is correct: the steps will help with the next instance.

    For the most recent instance, you can check applog, syslog, errorr_log.log and csc.log (non-debug). If you find any suspicious or problematic entries, you can share them with us for confirmation.

    "Unexpected Reboot time and date for next instance" ==> After the second or any later instance, note the approximate time and date of the automatic reboot so that the log files can be checked around that time and date.

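Checking a log file around a noted reboot time can be sketched in a few lines of Python, run against a copy of the log on another machine. This is only a minimal sketch: the function names `parse_timestamp` and `lines_near` are made up for illustration, the timestamp layout is assumed to match the `Jul 8 16:41:08` prefix seen in syslog.log, and the year has to be supplied because the log lines do not contain one.

```python
from datetime import datetime, timedelta

def parse_timestamp(line, year):
    """Parse a leading 'Jul 8 16:41:08' prefix; return None if it is missing.

    Assumption: English month abbreviations and no year in the log line.
    """
    parts = line.split()
    if len(parts) < 3:
        return None
    try:
        return datetime.strptime(
            f"{year} {parts[0]} {parts[1]} {parts[2]}", "%Y %b %d %H:%M:%S"
        )
    except ValueError:
        return None

def lines_near(lines, when, minutes=15):
    """Return the log lines whose timestamp lies within `minutes` of `when`."""
    window = timedelta(minutes=minutes)
    selected = []
    for line in lines:
        ts = parse_timestamp(line, when.year)
        if ts is not None and abs(ts - when) <= window:
            selected.append(line)
    return selected

# Hypothetical usage on a local copy of the log file:
#   with open("syslog.log", errors="replace") as fh:
#       for line in lines_near(fh.read().splitlines(), datetime(2020, 7, 8, 16, 56)):
#           print(line)
```

The same filter can be applied to each of the log files mentioned above, so that entries from different files can be compared for the same time window.
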
  • Thank you, Vishal. The logs don't show any indication of a failure. For example, syslog.log only shows errors that don't seem to be related to the reboot at around 16:56 (and that occur on a regular basis):

    Jul 8 16:41:08 (none) local7.err wafgr[8420]: failed to convert duration to integer: 29284713992 is out of range
    Jul 8 16:41:16 (none) local7.err wafgr[8420]: failed to convert duration to integer: 26281490037 is out of range
    Jul 8 16:50:50 (none) local7.err wafgr[8420]: failed to convert duration to integer: 31717521096 is out of range
    Jul 8 16:50:50 (none) local7.err wafgr[8420]: failed to convert duration to integer: 31717295908 is out of range
    Jul 8 16:52:33 (none) daemon.info redctl[23270]: key length: 32
    Jul 8 16:56:24 (none) syslog.info syslogd started: BusyBox v1.21.1
    Jul 8 16:56:24 (none) user.notice kernel: klogd started: BusyBox v1.21.1 (2020-06-05 20:27:42 UTC)
    Jul 8 16:56:24 (none) user.notice kernel: [ 0.000000] Linux version 4.14.38 (jenkins@ci-16) (gcc version 7.3.0 (OpenWrt GCC 7.3.0 7340-gf2d738297)) #2 SMP Fri Jun 5 23:01:04 UTC 2020
    Jul 8 16:56:24 (none) user.info kernel: [ 0.000000] Command line: BOOT_IMAGE=/18_0_1_396 quiet console=tty0 console=ttyS0,38400n8 maxcpus=6 memlimit=8G
    Jul 8 16:56:24 (none) user.info kernel: [ 0.000000] Disabled fast string operations
    Jul 8 16:56:24 (none) user.info kernel: [ 0.000000] x86/fpu: x87 FPU will use FXSAVE
    Jul 8 16:56:24 (none) user.info kernel: [ 0.000000] e820: BIOS-provided physical RAM map:

    I'll keep an eye on it. Thank you for your help.
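
In the excerpt above, the restart shows up as the `syslogd started` line at 16:56:24, with the last entry before it at 16:52:33. To spot earlier or later restarts in a longer log, one option is to search for that marker and print the few lines preceding each hit. The following is a rough sketch, assuming the marker text is the same on every boot; the function name `context_before_boots` is hypothetical.

```python
def context_before_boots(lines, marker="syslogd started", context=5):
    """For every line containing `marker`, collect the lines just before it.

    Returns a list of (marker_line, preceding_lines) tuples, one per boot.
    """
    results = []
    for i, line in enumerate(lines):
        if marker in line:
            start = max(0, i - context)
            results.append((line, lines[start:i]))
    return results

# Hypothetical usage on a local copy of the log file:
#   with open("syslog.log", errors="replace") as fh:
#       for boot, before in context_before_boots(fh.read().splitlines()):
#           print("\n".join(before))
#           print(">>>", boot)
```

A gap of several minutes between the last preceding line and the marker, with no shutdown messages in between, suggests that the system stopped abruptly rather than rebooting cleanly, although that alone does not identify the cause.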
