This discussion has been locked.
You can no longer post new replies to this discussion. If you have a question, you can start a new discussion.

Opinion on the updater snafu from a Senior Admin

I've been in this business for over 30 years, and I have, believe me, gone through Hell with a lot of anti-virus products. Sophos was, and still is in my opinion, the best and easiest to administer. So it's amusing to read some of the posts on the Internet from rookie admins doing a good imitation of a Drama Queen. Knee-jerk reactions won't do you any good when something like this happens. I, too, could not get through to Sophos yesterday afternoon. But I understood that their phone lines were probably overtaxed, so I just waited for them to fix the problem, applied the fixes from other admins I found on the Net, and by the time I went home to a cold supper, things were more or less back to normal.

Now, having said all that, and holding Sophos in such high esteem for so long, I'm as disappointed as anyone over this. I expect, in fact I demand, more from Sophos. If they want to be held up as the gold standard in this business, they'd better review procedures and try to make sure this doesn't happen again. Furthermore, a supreme gesture of good faith would be some sort of discount on their loyal customers' next maintenance contracts.

:31353


  • It's no big deal if you administer 500 computers at a central location. I have a footprint of over 20,000 computers, at over 50 locations, spread across more than 600 square miles. 

    The point is, there's absolutely no way any testing was done for an update that IDENTIFIES ITSELF as a virus. I don't care if it's happened before; it's not an excuse for laziness and complacency. The fact is, any number of us would be held accountable, and put to the fire, by our colleagues if we let something like this slip through, as we should be.

    :31375
  • To a certain extent, agreed with both. I've spent roughly the same time in this business as Pagjsp.

    Some remarks though:

    The really bad part is not the detection of AutoUpdate itself but of all the other updaters. If you had Move or Delete in your policy, recovery will be very painful. But IMHO there was an overreaction by some (so-called) rookie admins.

    I've learnt that the first thing to do when disaster strikes is: unplug your phone (or turn off your mobile), get yourself a coffee, lean back and assess the situation. AutoUpdate is broken. Impact? No identity updates until it is fixed. Scanner? Works. Slightly increased risk that brand-new threats slip in, otherwise no immediate danger. Similar to your management server going belly up. No need to panic.

    "there's absolutely no way any testing was done"

    Agreed that generic detections have to be tested very, very carefully. I can imagine, though, that testing was done but that it was flawed. The phrase "for use with our Live Protection system" suggests a rather complex identity; furthermore, identities (or IDEs) are not independent. The slip-up might as well have happened when the update was finally released - guess many of us know such situations. May those of us who never made a (potentially disastrous) mistake speak up :smileyhappy:.

    I'm not cynical - I had my share of horror and anger when I came in at 8am and saw that about a third of my endpoints (not 20,000, but several thousand anyway) had sent in an alert - and we don't have administrative access to most of them ...

    Don't expect detailed explanations and heads rolling at this point - it'd be just to please the masses. And don't forget - it might be much worse for all the support staff and many others at Sophos than for us.

    Christian

    :31477
  • In our case we have over 10,000 endpoints affected, with deny and move enabled, so I spent a good portion of the night preparing a VBS script to deploy in the morning. It's certainly not the first time something like this has happened; other endpoint management software packages have done similar things to us, and this is certainly not the worst damage I've seen. At least people can pretty much still use their computers as normal. Though it will take months to clean up most of our workstations, and I doubt we will ever see 100% of them fixed since so many are mobile, life goes on.

    But this is a massive failure in procedure. It certainly could have been worse. This same type of failure in quality control could have caused massive damage to the OS, and we could have had 10,000 systems down. I'm grateful that didn't happen, but the point is it easily could have.

    So from here, whether we stay with Sophos as our contract comes up for renewal will depend entirely on how they smooth things over with their customers. I know this kind of thing could happen again with other AV products if we switch, so I won't switch for that reason. I will switch if they cannot redeem themselves. And so, I await their offer of amends.

    :31815
  • Is it just me, or does anyone else find it hard to believe that something like this (deleting your own program's files) made it through their QA? I mean really, they didn't see this?

    I'm starting to get paranoid and wondering if they are not telling us something major.

    :31837
  • Even though FPing their own files is bad, the fact that Sophos don't exempt their own files is actually a good thing IMO.

    At this point it's still mere speculation, and like probably many others I'm trying to imagine a probable scenario. Right now I'm leaning towards a slip-up during release of an "experimental generic but targeted detection". It looks like it is/was supposed to verify the integrity of various vendors' updating components (updaters, callers and supporting libraries), with additional emphasis on AutoUpdate. In addition, it's instructed to call home (viz. to use Live Protection). Now I think that it (at least the version distributed) had not passed QA. Presumably it wasn't even intended for release but "somehow" managed to slip in.

    What's rarely, if ever, seen from "our side" is the struggle to provide timely and effective protection. Malware is a highly organized and profitable business, and the "products" undergo regular and sophisticated development and testing. AV companies naturally don't wait for final versions to be released but prepare for expected (and sometimes announced) new features in the malware. New proactive detections might catch alpha or beta versions and thus facilitate further specific measures. Labs is not just sitting and waiting for samples to be submitted and writing a detection in response. That's no excuse for what we have seen, but maybe it puts the whole "mess" into perspective.

    Christian

    :32153
  • One thing I will say is that at least Sophos are publicising this well on their website. When McAfee did something similar (deleted half of the windows\system32 directory) there wasn't a shred of info on the front page of their site. 

    It didn't really affect me, as all our clients are set to block and not delete files. 

    Don't get me wrong, this is completely and utterly inexcusable, but from what I can see it's been handled as well as it possibly could have been!

    :32227
  • I would like to know how the problem actually occurred. Are Live Protection updates not sent to a test environment first? I would hope that, because of this problem, tighter controls are brought in on the Live Protection IDE updates (if that is what caused this issue).

    The problem was not that bad, because it just broke the update mechanism and on-access scanning was still working. Compared to a bad McAfee DAT update a few years ago that left me with a few thousand client PCs with blue screens that had to be manually fixed via Safe Mode, this was mild.

    But if Sophos starts suffering from these sorts of problems again I will be very disappointed, because there is no one else to turn to that is any good IMO. So if Sophos could please outline what they are going to do to prevent this from occurring again, then I can at least go to my boss with some sort of information that might restore confidence.

    :32229