This discussion has been locked.
You can no longer post new replies to this discussion. If you have a question you can start a new discussion

XG 125 - SFOS 18.0.2 MR-2 Random Rebooting

Hi,

I have an XG 125 running SFOS 18.0.2 MR-2 that is rebooting randomly.

I have already checked the infrastructure, power cords, UPS (no-break), and cables, but the device is still rebooting.

Is there a log file I can use to debug these events? I would like to check it before opening a case.

Regards

Carlos



  • There is a bug fix for this issue in MR3: https://community.sophos.com/xg-firewall/b/blog/posts/xg-firewall-v18-mr3

    • NC-58402 [Firewall] Firewall reboots randomly.
  • Hi,

    Even after applying the latest firmware, v18 MR3, the XG device rebooted again after 5 days :/

    Do you have any idea which log file would give me more information about these reboots?

    Regards

  • Hello Carlos,

    Please open a case with Support and provide me with the Case ID.

    The logs you want to check are the following:

    csc.log, applog.log, syslog.log, msync.log and networkd.log

    If possible, please also include the memory and CPU graphs, along with the exact date and time when the issue was observed.

    In addition, please check for any core dumps under /var/cores

    And please confirm the output of this command:

    console> system auto-reboot-on-stall show

    Then attach the logs mentioned above and the output of the command to the case.

    Regards,
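For anyone collecting the same evidence, here is a minimal Python sketch that pulls the lines around a known reboot time out of a log that has been copied off the firewall. It assumes each line starts with a syslog-style timestamp such as `Dec 6 19:08:16`; logs that use a different timestamp format would need a different pattern. The function name `lines_near` and its parameters are made up for illustration and are not part of any Sophos tooling.

```python
import re
from datetime import datetime, timedelta

# Assumed line prefix: "Dec 6 19:08:16 ..." (month, day, time, no year).
STAMP = re.compile(r"^(\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})")

def lines_near(lines, reboot_time, minutes=5, year=None):
    """Return the lines whose timestamp is within `minutes` of reboot_time."""
    # The timestamp carries no year, so borrow one (default: the reboot's year).
    year = year or reboot_time.year
    window = timedelta(minutes=minutes)
    selected = []
    for line in lines:
        match = STAMP.match(line.strip())
        if not match:
            continue  # skip lines without a recognizable timestamp
        stamp = datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")
        if abs(stamp - reboot_time) <= window:
            selected.append(line.rstrip("\n"))
    return selected
```

Calling it with the open file for each copied log and the same `reboot_time` gives a small, time-aligned extract per log that can be attached to the case.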

  • Hi,

    Thanks for your reply and the tips about the logs.

    I have already opened a case, and an RMA will be provided.

    I am going through the logs and have not found anything relevant. I will attach the parts of the logs from the exact moment the reboot happened.

    csc.log

    applog.log

    syslog.log

    msync.log and networkd.log: nothing relevant

    Nothing in /var/cores

     

    The command system auto-reboot-on-stall show does not exist,

    but this one does:

    console> system auto-reboot-on-hang show

    Auto reboot system when kernel gets into a hang state is enabled

    console>

    The graphs look normal. The reboot happened in the period I marked with a black line; the increase in CPU after that was caused by the processes starting up after the reboot.

    Best regards

    Carlos

  • Hello Carlos,

    Thank you for the logs. Judging by the graphs, this does not seem to be related to my original guess.

    May I have the Case ID, please?

    regards,

  • Hi folks.

    Even after the RMA process and hardware replacement, the device is still restarting randomly.
    There are several devices in the same datacenter using the same infrastructure, and only the XG restarts. Everything related to power and network has already been checked, and the device is still restarting randomly.

    I have already opened a ticket with Sophos, but maybe someone has already seen this behavior and has a solution or a debug or trace procedure.

    regards

    Carlos

  • Hello Carlos,

    Please provide me with the Case ID.

    Regards,

  • Hello,

    The Case ID is 03418309

    After spending a long, long time going through the log files, I found something that might help.

    In the syslog.log file I found these events at the exact moment of the reboot.

    Dec 6 19:08:16 (none) user.info kernel: [ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
    Dec 6 19:08:16 (none) user.info kernel: [ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x02] high edge lint[0x1])
    Dec 6 19:08:16 (none) user.info kernel: [ 0.058432] NMI watchdog: Enabled. Permanently consumes one hw-PMU counter.

    I have not done much research on it yet, but this could be a lead for identifying the problem.

    regards

    Carlos
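A note for anyone reading entries like these: kernel lines stamped `[ 0.000000]` are written at the start of a boot, so they show when the device came back up rather than why it went down. The lines logged just before that point are usually the more interesting ones. As a rough sketch, assuming syslog.log lines follow the format quoted above, the Python below lists the timestamps at which the kernel uptime counter restarts. The name `find_boot_markers` is hypothetical.

```python
import re

# Matches lines such as:
#   Dec 6 19:08:16 (none) user.info kernel: [ 0.000000] ACPI: ...
KERNEL_LINE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s.*?kernel:\s+\[\s*(?P<uptime>\d+\.\d+)\]"
)

def find_boot_markers(lines):
    """Return syslog timestamps where the kernel uptime counter restarts."""
    boots = []
    previous_uptime = None
    for line in lines:
        match = KERNEL_LINE.match(line.strip())
        if not match:
            continue  # not a kernel line with an uptime stamp
        uptime = float(match.group("uptime"))
        if previous_uptime is None:
            # First kernel line seen: count it only if it is from early boot.
            is_boot = uptime < 1.0
        else:
            # Uptime going backwards means the kernel started again.
            is_boot = uptime < previous_uptime
        if is_boot:
            boots.append(match.group("stamp"))
        previous_uptime = uptime
    return boots
```

Each returned timestamp is a boot. Comparing them against planned restarts leaves the unexpected ones, and the log lines immediately before each of those are where to look for a cause.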
