[INFO-152] Network Monitor not running - restarted

I've been receiving a few WebAdmin INFO-152 e-mails from the UTM.
I ran dmesg and found this:

[1123325.232580] nwd[10968]: segfault at 2e323931 ip 000000002e323931 sp 00000000ffd900c0 error 14
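One detail in that dmesg line may be worth a second look: the faulting address and the instruction pointer are the same value, and the bytes of that value happen to be printable ASCII. The sketch below parses the line and decodes the address. The name `parse_segfault` is hypothetical, and reading the result as a fragment of text (for example the start of an IP address) that overwrote a return address is only a guess, not a confirmed diagnosis.

```python
import re

# The dmesg line from the post, copied verbatim.
LINE = ("[1123325.232580] nwd[10968]: segfault at 2e323931 "
        "ip 000000002e323931 sp 00000000ffd900c0 error 14")

PATTERN = re.compile(
    r"(?P<proc>\w+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)


def parse_segfault(line):
    """Split a kernel segfault message into fields and decode the address."""
    match = PATTERN.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    addr = int(fields["addr"], 16)
    fields["addr_equals_ip"] = addr == int(fields["ip"], 16)
    # Interpret the address as raw bytes in little-endian (x86) order.
    length = max(1, (addr.bit_length() + 7) // 8)
    raw = addr.to_bytes(length, "little")
    printable = all(32 <= byte < 127 for byte in raw)
    fields["addr_as_ascii"] = raw.decode("ascii") if printable else None
    return fields


info = parse_segfault(LINE)
print(info["proc"], info["addr_equals_ip"], repr(info["addr_as_ascii"]))
```

If the decoded bytes look like text, that is a hint the process jumped to an address built from string data rather than a real code location.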


Current software version...: 9.310009
Hardware type..............: Software Appliance
Installation image.........: 9.308-16.1
Installation type..........: asg
Installed pattern version..: 78536
Downloaded pattern version.: 78536
Up2Dates applied...........: 2 (see below)
                             sys-9.308-9.309-16.3.1.tgz (Mar 15 19:20)
                             sys-9.309-9.310-3.9.4.tgz (Mar 27 05:29)
Up2Dates available.........: 1
Factory resets.............: 0
Timewarps detected.........: 1

Any ideas?


  • Service Monitor not running - restarted

     

    I had this problem appear on firmware version 9.506-2, pattern version 135858.
    It was also present on the previous version. A reboot and a disk check have failed to
    resolve the issue. The Service Monitor service stops roughly every 30 minutes.

    The log suggests that the service terminates when the system attempts a reverse DNS
    lookup, or when it checks whether a target responds to an ICMP ping.

    This is an excerpt from my service monitor log.

    ..............................................................................................................................

    2017:11:29-00:02:05 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="Starting real server checker with 17 threads"
    2017:11:29-00:02:05 router service_monitor[30184]: id="4002" severity="info" sys="System" sub="loadbalancing" name="Open ICMPv4 socket"
    2017:11:29-00:02:05 router service_monitor[30184]: id="4002" severity="info" sys="System" sub="loadbalancing" name="Open ICMPv6 socket"
    2017:11:29-00:02:05 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="REF_NetAvaTrsSecurTime ICMP 46.101.55.10 changed state to ONLINE"
    2017:11:29-00:02:05 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="Set Availability Group REF_NetAvaTrsSecurTime to 46.101.55.10"
    2017:11:29-00:02:05 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="REF_NetAvaSecurDnsResol ICMP 90.207.238.97 changed state to ONLINE"
    2017:11:29-00:02:05 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="Set Availability Group REF_NetAvaSecurDnsResol to 90.207.238.97"
    2017:11:29-00:02:05 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="REF_NetAvaSecurDnsResol ICMP 8.8.8.8 changed state to ONLINE"
    2017:11:29-00:02:05 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="REF_NetAvaSecurDnsResol ICMP 208.67.222.222 changed state to ONLINE"
    2017:11:29-00:02:05 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="REF_NetAvaSecurDnsResol ICMP 208.67.222.123 changed state to ONLINE"
    2017:11:29-00:02:05 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="REF_NetAvaMultiDnsResol ICMP 8.8.8.8 changed state to ONLINE"
    2017:11:29-00:02:05 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="Set Availability Group REF_NetAvaMultiDnsResol to 8.8.8.8"
    2017:11:29-00:02:05 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="REF_NetAvaMultiDnsResol ICMP 90.207.238.97 changed state to ONLINE"
    2017:11:29-00:02:05 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="REF_NetAvaSecurDnsResol ICMP 90.207.238.99 changed state to ONLINE"
    2017:11:29-00:02:05 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="REF_NetAvaTrsSecurTime ICMP 192.146.137.13 changed state to ONLINE"
    2017:11:29-00:02:05 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="REF_NetAvaSecurDnsResol ICMP 8.8.4.4 changed state to ONLINE"
    2017:11:29-00:02:05 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="REF_NetAvaSecurDnsResol ICMP 208.67.220.220 changed state to ONLINE"
    2017:11:29-00:02:05 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="REF_NetAvaSecurDnsResol ICMP 208.67.220.123 changed state to ONLINE"
    2017:11:29-00:02:05 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="REF_NetAvaMultiDnsResol ICMP 8.8.4.4 changed state to ONLINE"
    2017:11:29-00:02:05 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="REF_NetAvaMultiDnsResol ICMP 90.207.238.99 changed state to ONLINE"
    2017:11:29-00:02:05 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="Set Availability Group REF_NetAvaTrsSecurTime to 192.146.137.13"
    2017:11:29-00:02:05 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="Set Availability Group REF_NetAvaSecurDnsResol to 8.8.8.8"
    2017:11:29-00:02:05 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="Set Availability Group REF_NetAvaMultiDnsResol to 90.207.238.97"
    2017:11:29-00:02:06 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="Set Availability Group REF_NetAvaSecurDnsResol to 208.67.222.222"
    2017:11:29-00:02:06 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="Set Availability Group REF_NetAvaMultiDnsResol to 90.207.238.97"
    2017:11:29-00:02:06 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="Set Availability Group REF_NetAvaSecurDnsResol to 208.67.222.123"
    2017:11:29-00:02:06 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="Set Availability Group REF_NetAvaMultiDnsResol to 90.207.238.97"
    2017:11:29-00:02:06 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="Set Availability Group REF_NetAvaSecurDnsResol to 208.67.222.123"
    2017:11:29-00:02:07 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="Set Availability Group REF_NetAvaSecurDnsResol to 208.67.222.123"
    2017:11:29-00:02:07 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="Set Availability Group REF_NetAvaSecurDnsResol to 208.67.222.123"
    2017:11:29-00:02:08 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="Set Availability Group REF_NetAvaSecurDnsResol to 208.67.222.123"
    2017:11:29-00:02:10 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="REF_NetAvaTrsSecurTime ICMP 95.215.175.2 changed state to OFFLINE"
    2017:11:29-00:02:10 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="Set Availability Group REF_NetAvaTrsSecurTime to 192.146.137.13"
    2017:11:29-00:02:10 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="REF_NetAvaTrsSecurTime ICMP 139.143.5.31 changed state to OFFLINE"
    2017:11:29-00:02:10 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="REF_NetAvaTrsSecurTime ICMP 139.143.5.30 changed state to OFFLINE"
    2017:11:29-00:02:10 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="Set Availability Group REF_NetAvaTrsSecurTime to 192.146.137.13"
    2017:11:29-00:02:11 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="Set Availability Group REF_NetAvaTrsSecurTime to 192.146.137.13"
    2017:11:29-00:07:05 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="Exiting..."
    2017:11:29-00:07:05 router service_monitor[30184]: id="4002" severity="info" sys="System" sub="loadbalancing" name="Waiting for thread 3599"
    2017:11:29-00:07:06 router service_monitor[30184]: id="4002" severity="info" sys="System" sub="loadbalancing" name="Waiting for thread 3587"
    2017:11:29-00:07:07 router service_monitor[30184]: id="4002" severity="info" sys="System" sub="loadbalancing" name="Waiting for thread 3587"
    2017:11:29-00:07:07 router service_monitor[30184]: id="4002" severity="info" sys="System" sub="loadbalancing" name="Waiting for thread 3587"
    2017:11:29-00:07:08 router service_monitor[30184]: id="4002" severity="info" sys="System" sub="loadbalancing" name="Waiting for thread 3587"
    2017:11:29-00:07:09 router service_monitor[30184]: id="4002" severity="info" sys="System" sub="loadbalancing" name="Waiting for thread 3587"

    The next entry is the service restarting...

    2017:11:29-00:07:10 router service_monitor[30526]: id="4000" severity="info" sys="System" sub="loadbalancing" name="Starting real server checker with 17 threads"
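To see how long each service_monitor process actually runs before it exits, the start and "Exiting..." entries can be paired up by PID. This is a minimal sketch that assumes the log format shown in the excerpt above; the function name `run_durations` is hypothetical, and the sample lines are copied from the excerpt.

```python
import re
from datetime import datetime

TS_FORMAT = "%Y:%m:%d-%H:%M:%S"  # e.g. 2017:11:29-00:02:05
LINE_RE = re.compile(
    r'^\s*(\S+) \S+ service_monitor\[(\d+)\]: .*name="([^"]*)"'
)


def run_durations(lines):
    """Return {pid: seconds between the start entry and the exit entry}."""
    starts = {}
    durations = {}
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue
        stamp = datetime.strptime(match.group(1), TS_FORMAT)
        pid = int(match.group(2))
        name = match.group(3)
        if name.startswith("Starting real server checker"):
            starts[pid] = stamp
        elif name == "Exiting..." and pid in starts:
            durations[pid] = (stamp - starts[pid]).total_seconds()
    return durations


sample = [
    '2017:11:29-00:02:05 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="Starting real server checker with 17 threads"',
    '2017:11:29-00:07:05 router service_monitor[30184]: id="4000" severity="info" sys="System" sub="loadbalancing" name="Exiting..."',
    '2017:11:29-00:07:10 router service_monitor[30526]: id="4000" severity="info" sys="System" sub="loadbalancing" name="Starting real server checker with 17 threads"',
]
print(run_durations(sample))
```

For the excerpt above, PID 30184 runs for exactly five minutes; PID 30526 has no exit entry yet, so it is left out of the result.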


    I am going to try changing the NTP and DNS clients but I doubt that will make any difference.

    A rather irritating problem and it appears I am not alone in having this issue. Has anyone found a solution to this problem?

    Stuart.




  • Update.....

    I replaced the HDD in the system and reinstalled from the latest image. The system status is:

    Firmware version:9.506-2

    Pattern version:136417

    Configured with one LAN interface and one WAN interface.

    Standard NAT configured, with a single firewall rule of Any ----> Any ----> Any.

    No other services are active, not even DHCP.

     

    I also only just realised that I posted the wrong log file entries. Oops.

    This is the correct log AFAIK....

    2017:12:11-08:34:35 router selfmonng[3935]: I check Failed increment service_monitor_running counter 1 - 3
    2017:12:11-08:34:40 router selfmonng[3935]: I check Failed increment service_monitor_running counter 2 - 3
    2017:12:11-08:34:45 router selfmonng[3935]: W check Failed increment service_monitor_running counter 3 - 3
    2017:12:11-08:34:45 router selfmonng[3935]: [INFO-181] Service Monitor not running - restarted
    2017:12:11-08:34:45 router selfmonng[3935]: W NOTIFYEVENT Name=service_monitor_running Level=INFO Id=181 sent
    2017:12:11-08:34:45 router selfmonng[3935]: W triggerAction: 'cmd'
    2017:12:11-08:34:45 router selfmonng[3935]: W actionCmd(+):  '/var/mdw/scripts/service_monitor restart'
    2017:12:11-08:34:45 router selfmonng[3935]: W child returned status: exit='0' signal='0'
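To check whether the restarts really follow a fixed interval, the notification entries can be pulled out of the selfmonng log and the gaps between them measured. This is a rough sketch that assumes the line format shown above; `restart_events` and `gaps_in_minutes` are hypothetical names, and the sample holds only the lines quoted in this post, so it yields a single event.

```python
import re
from datetime import datetime

TS_FORMAT = "%Y:%m:%d-%H:%M:%S"  # e.g. 2017:12:11-08:34:45
RESTART_RE = re.compile(
    r"^\s*(?P<ts>\S+) \S+ selfmonng\[\d+\]: "
    r"\[INFO-(?P<id>\d+)\] (?P<service>.+) not running - restarted"
)


def restart_events(lines):
    """Return (timestamp, notification id, service name) for each restart."""
    events = []
    for line in lines:
        match = RESTART_RE.match(line)
        if match:
            stamp = datetime.strptime(match["ts"], TS_FORMAT)
            events.append((stamp, int(match["id"]), match["service"]))
    return events


def gaps_in_minutes(events):
    """Return the minutes between consecutive restart events."""
    times = [stamp for stamp, _, _ in events]
    return [(b - a).total_seconds() / 60 for a, b in zip(times, times[1:])]


sample = [
    "2017:12:11-08:34:45 router selfmonng[3935]: W check Failed increment service_monitor_running counter 3 - 3",
    "2017:12:11-08:34:45 router selfmonng[3935]: [INFO-181] Service Monitor not running - restarted",
]
events = restart_events(sample)
print(events)
print(gaps_in_minutes(events))
```

Fed a full day of the self-monitoring log, the list of gaps would show whether the restarts cluster around 30 minutes, 55 minutes, or some other period.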

    I also noticed this in the Up2Date log:

    2017:12:11-08:28:11 router audld[2722]: Could not connect to Authentication Server 79.125.21.244 (code=500 500 Internal Server Error).
    2017:12:11-08:28:20 router audld[2722]: id="3701" severity="info" sys="system" sub="up2date" name="Authentication successful"
    2017:12:11-08:43:01 router audld[4029]: no HA system or cluster node
    2017:12:11-08:43:01 router audld[4029]: Starting Up2Date Package Downloader
    2017:12:11-08:43:02 router audld[4029]: patch up2date possible
    2017:12:11-08:43:18 router audld[4029]: id="3701" severity="info" sys="system" sub="up2date" name="Authentication successful"

    I don't know if it's related, but this coincides with an NTP update.

    These emails are still coming in every 30 minutes to an hour and are starting to bear a close relationship with the mother-in-law.

    Any ideas, anyone?

    !! I used to think I was indecisive but now I am not so sure !!

  • After the latest update it stopped, until I updated the second machine (HA) a week later.
    Now it's all the same again, and I'm getting 5-6 "Network Monitor not running – restarted" e-mails every day.

    I have to say that the Sophos team has not been much help here.

    I've had this bug for nearly a year now.
    In my opinion, it's unacceptable that a big name like Sophos couldn't find and fix this issue in all those months, or at least send me an RPM to resolve it.
    That is really annoying, and all I get from Sophos is thanks for my patience.
    Come on guys, I'm sure you could do better.

  • Goldy, out of the dozens of UTMs I've worked on and from which I still receive notifications, I'm not seeing this from any of them.  I checked the logs since 1/1/2018.

    Cheers - Bob

     
    Sophos UTM Community Moderator
    Sophos Certified Architect - UTM
    Sophos Certified Engineer - XG
    Gold Solution Partner since 2005
    MediaSoft, Inc. USA
  • Hi Bob.

    Are any of them running in HA?

    Thanks.

  • At least 4 are in HA with SG appliances.

    Cheers - Bob

     
  • Same problem for me, once or twice every day.

    I first saw it after the firmware update to 9.508010 on two SG115w appliances in HA (Hot-Standby).

    Restart doesn't fix it for me. Currently we are on 9.509003 and still the same problem.

  • Same problem for me after the update to 9.509003. It happens nearly every hour on the slave in HA active/passive mode.

  • Same here: Sophos UTM 9.508-10 in HA - about once a day on HA Slave. This is getting old.

  • I've seen this on our lab machine last December and again in March and April.  It happens about every 55 minutes.  I can see no reason why it starts or stops.  Just for grins, would one of you try /etc/init.d/postgresql92 rebuild (you must do it on both devices if you're in HA)?  CAUTION - this re-initializes all of the PostgreSQL databases, so you lose all reporting data and (maybe) graphs.   Logs are not affected.

    Cheers - Bob

     