This discussion has been locked.

Weirdness with EC 5.0

Since moving to EC 5.0 (fresh build, fresh server), I've noticed that once a new IDE update is sent out to my workstations, they all report "Unknown" in the Up to date column, even though the Detection data and IDE columns show the most current values. After about 45 minutes to an hour, Up to date changes to "Yes".


This is annoying because I have my threshold for out-of-date alerts set to about 75%, and it always triggers when a new IDE is available. I'm only using one update location (Primary) and no secondary.

Thoughts?

:19953


  • Hello,

    what about the server itself? Does its state also lag behind?

    Christian

    :19957
  • I'll have to wait for the next IDE update, but as far as I remember it did. All systems report that way.

    This new server is a fresh Windows 2008 server using a SQL database on another server (in the same physical building).

    :19965
  • So an update to the IDEs is going on now, and here is what I am talking about:

    :19975
  • Hi,

    An "Unknown" state basically means the "Packages" table in the SOPHOS database (SOPHOS50 in this case) is not being updated by SUM before the client that shows as "Unknown" updates and sends in a status message.  This is possible depending on timing.

    That is to say, when SUM updates, it sends in a status message at the end of the update (which has to be successful). This is processed by the management service, and the packages table gets updated with the latest package information.  The update consists of the following combination of information: ProductID, SAV Version, VirusDataVersion, IDEChecksum.  You can identify records in the packages table as having come from SUM rather than a client because they have a rollout number of 99999999.

    Because SUM updates the share before the client can update, the SUM status usually beats the client's status.  If the client's status arrives first, there is no record in the packages table that matches what the client has, so a new record is created for the client's combination.  As the server can't compare it against the "authority" of a SUM-maintained record, it has to report 'Unknown', at least until the SUM status comes in, matches the client's combination and then "takes over" the record by setting the rollout number to 99999999.

    Now, if you have a SUM which updates a number of shares, the SUM only sends in a status update at the end of the deployment to all the shares. So it is possible for a client to update from share 1 before SUM has populated share 5 and sent in its status message.
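
The record matching described above can be sketched as a toy model. This is only an illustration of the reasoning in this post, not the real SOPHOS50 schema or management service code: the class names, the placeholder rollout number 0 for client-created records, and the version strings are all assumptions. Only the four key fields and the 99999999 rollout number come from the thread, and the case of a client holding an older package is deliberately left out.

```python
from dataclasses import dataclass

SUM_ROLLOUT = 99999999  # rollout number the thread says marks SUM-owned records


@dataclass(frozen=True)
class PackageKey:
    """The combination of fields the thread says identifies a package."""
    product_id: str
    sav_version: str
    virus_data_version: str
    ide_checksum: str


class PackagesTable:
    """Toy stand-in for the Packages table (hypothetical, not the real schema)."""

    def __init__(self):
        self.rollout = {}  # PackageKey -> rollout number

    def sum_status(self, key):
        # A SUM status creates the record, or takes over a client-created one.
        self.rollout[key] = SUM_ROLLOUT

    def client_status(self, key):
        # A client status creates a non-authoritative record if none matches.
        # The value 0 is an assumed placeholder, not a documented number.
        self.rollout.setdefault(key, 0)
        return self.up_to_date(key)

    def up_to_date(self, key):
        # Without a SUM-owned record to compare against, the state is Unknown.
        return "Yes" if self.rollout.get(key) == SUM_ROLLOUT else "Unknown"


table = PackagesTable()
new_ide = PackageKey("SAV", "9.5", "4.60", "abc123")  # made-up values
print(table.client_status(new_ide))  # client reported before SUM
table.sum_status(new_ide)            # SUM status arrives and takes over
print(table.up_to_date(new_ide))
```

In this model the first call reports "Unknown" and the second reports "Yes", which mirrors the behaviour seen in the console once the SUM status finally arrives.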

    If you think this is happening, you could create a registry DWORD called "UpToDateLatencyMins" under the registry key [HKEY_LOCAL_MACHINE\SOFTWARE\Sophos\EE\Management Tools], maybe set it to 60, and see how things change.  You would need to restart the Sophos Management Service after creating the value.

    Do you have multiple subscriptions and multiple distributions locations?

    Regards,

    Jak




     

    :19983
  • I have 1 SUM that updates about 10 shares globally. It can take up to 45min for an update to complete. Maybe I should have multiple SUMs.
    :19985
  • Hi,

    That's interesting and sounds like it could be the cause of the "Unknown" state. I suspect that the machines that show as unknown before being automatically corrected are the ones that update from the distribution points at the top of the distribution list in the SUM configuration (they are pushed sequentially), so those locations offer the latest software to the clients before SUM has finished on the later locations.  If you order the machines by their primary CID location, that could prove the theory.  Unless, of course, you have the updating policies set with a long interval, as this would randomise the update times more than if the clients all check for updates every 10 mins, for example.
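
To put rough numbers on that theory, here is a small sketch. It assumes the shares are pushed strictly one after another, that each takes an equal slice of the total deployment time, and that the SUM status message is sent only after the last share is done. The equal-slice assumption is a simplification and the function name is made up; the 10 shares and 45 minutes are the figures mentioned earlier in this thread.

```python
def exposure_minutes(num_shares, total_minutes):
    """Minutes each share offers the new package before SUM reports in.

    Assumes a sequential push with equal time per share (a simplification).
    """
    per_share = total_minutes / num_shares
    # Share i (0-based) is complete after (i + 1) slices; SUM reports at the end.
    return [total_minutes - per_share * (i + 1) for i in range(num_shares)]


windows = exposure_minutes(num_shares=10, total_minutes=45)
for position, minutes in enumerate(windows, start=1):
    print(f"share {position}: {minutes:.1f} min before SUM status")
```

Under these assumptions the first share is live for about 40 minutes before the SUM status arrives, while the last share has no window at all. With clients checking every 10 minutes, most clients on the early shares would be expected to update, and report in, inside that window.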

    That all being said, it sounds like you have a few options:

    1. Install an additional SUM, maybe at the SEC site, which maintains all the subscriptions but creates just one local distribution point, i.e. the local SophosUpdate share.  Then, make that new SUM the authoritative SUM using the registry key:
    http://www.sophos.com/support/knowledgebase/article/57638.html

    Updates on that SUM will happen quickly and the status messages will be sent soon after.  The only thing to consider with this approach is that this SUM would have to update from Sophos; if it updates from the other SUM, you start to add latency to the update and therefore to the status message.


    2. Install another SUM at the SEC site and make that SUM push to all the remote shares.

    Remove the distribution locations from the original SEC SUM, which would still remain authoritative.  The original SUM would complete quicker, guaranteeing that its status message arrives in the DB sooner, before any clients have had a chance to update from the CIDs.

    3. Install SUMs at the remote sites, subscribe them to the same subscriptions and let them maintain local distribution points for the clients at each site to use.  I'm not sure how your sites are configured, but each of these remote SUMs could point to Sophos as a secondary, at least for resilience when updating.  This way, if a site became cut off from the main site (I assume SEC is at the main site) but still had internet access, it could still fetch updates and provide them to the clients.  I also find that a pull is probably more reliable than a push, and you can pull over HTTP as well if needed.

    This would be my favored approach. The only downside is that it requires a Windows server at each site, which is fine if you have them, but if you currently push to a Linux box or filer this would not be an option.  The other benefit of this approach, if you have multiple products selected in a subscription, is that some of the files are common: SUM at the remote site would pull the files once and then distribute them locally.  Virus data is shared and quite large, so it adds up to quite a saving.
     

    4. Another option would be to increase the update interval on the clients so they don't find the update so quickly, but this doesn't seem the right approach, as you'd be sacrificing protection for reporting.

    5. Leave everything as it is. If it's just the dashboard/email alerts you want to correct, there is a registry value the management service can read to change the allowed latency.  It's a DWORD called "UpToDateLatencyMins" under the registry key:


    [HKEY_LOCAL_MACHINE\SOFTWARE\[Wow6432Node\]Sophos\EE\Management Tools]

    If it doesn't exist, the latency is 60 minutes, so you could create it, set it to 120 minutes and see how that changes the behaviour. It will offset the "Not since" times in SEC, but if you know that, it's fine.  If you were to create this value, you would need to restart the management service.  This should cause the dashboard to be more lenient.
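
As a hedged sketch of option 5, the following builds the text of a .reg file that sets the value. The function name is made up, and whether the Wow6432Node segment is needed depends on the server (the bracketed segment in the key above is optional), so check which key actually exists before importing anything. The management service still has to be restarted afterwards, as noted above.

```python
def latency_reg_file(minutes, wow6432node=False):
    """Return .reg file text that sets UpToDateLatencyMins to `minutes`.

    Hypothetical helper; it only builds text and does not touch the registry.
    """
    if not 0 <= minutes <= 0xFFFFFFFF:
        raise ValueError("a DWORD must fit in 32 bits")
    software = r"HKEY_LOCAL_MACHINE\SOFTWARE"
    if wow6432node:
        # Optional segment from the thread; assumed to apply on 64-bit servers.
        software += r"\Wow6432Node"
    key = software + r"\Sophos\EE\Management Tools"
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key}]",
        # .reg files store DWORD values as eight hexadecimal digits.
        f'"UpToDateLatencyMins"=dword:{minutes:08x}',
        "",
    ]
    return "\n".join(lines)


print(latency_reg_file(120, wow6432node=True))
```

For 120 minutes the value line reads `"UpToDateLatencyMins"=dword:00000078`, since .reg files write DWORDs in hexadecimal. Saving the output to a file and reviewing it before import keeps the change easy to back out.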

    I hope this offers some ideas.

    One thing to be mindful of is installing a new SUM at a remote site to take over the maintenance of an existing SophosUpdate share created by the main SUM.  This seems to cause problems, or at least it used to.  I would be tempted to take a backup of the SOPHOS50 database just in case, and maybe remove the distribution point from the first SUM before re-creating it with the new SUM.  Something to test.

    Regards,

    Jak

    :19987
  • Thanks for everything, Jak. I've redone our Sophos infrastructure.

    I now have 5 SUM servers globally, each updating local repositories in its region. All my update policies reflect the changes. Updates on my one SUM used to take 40-50 minutes to complete; they now take under 3 minutes (for all SUMs). I no longer have the problem.

    I configured each of my SUM servers to update from Sophos directly rather than from the primary SUM server as well. 

    :20155