
Sophos Duplicate IDs

I just wanted to share the Perl script I wrote to find duplicate unique IDs in Sophos. It scans the IIS logs looking for duplicate GUIDs.

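#Stephen
#Check for Duplicates
use strict;
use warnings;

# UNC path to the IIS log to scan
my $file = "\\\\sophos-c108-01\\W3SVC1\\u_ex110822.log";
my %hash = ();   # GUID => first client IP seen using it
my %hDup = ();   # client IPs already reported
open(my $fh, '<', $file) or die "Cannot open $file: $!";
while (<$fh>)
{
    # Capture server IP, client IP and the machine GUID from the log line
    my @data = ($_ =~ /(\b143\.55\.\d{1,3}\.\d{1,3}\b).*?(\b143\.55\.\d{1,3}\.\d{1,3}\b).*?(\{{0,1}[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\}{0,1})/);
    next unless @data;   # skip header lines and requests without a GUID
    #print $data[0] . "\n";
    if ((exists $hash{$data[2]}) && ($hash{$data[2]} ne $data[1]))
    {
        # Same GUID seen from a different client IP: report that IP once
        if (not exists $hDup{$data[1]})
        {
            print $data[1] . "\n";
            $hDup{$data[1]} = $data[1];
        }
    }
    else
    {
        $hash{$data[2]} = $data[1];
    }
}
close($fh);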

I have more info on my blog about it.

http://www.stephenjc.com/2011/08/23/sophos-duplicate-ids/

:20691


  • Thanks very much for that script.

    As I know no Perl, could you or someone explain (decipher!) what the script is looking for? I feel I could find those duplicate IDs by searching the log file with a text editor (sorting the IDs should do the trick).

    Thanks in advance !

    PJ

    :36865
  • Could someone give me a clue about which log file the script is using?

    I can't even find which log could contain the information !

    Thanks.

    PJ

    :36929
  • Hello PJ,

    it's using $file = "\\\\sophos-c108-01\\W3SVC1\\u_ex110822.log" - in other words, an IIS log (usually in \system32\LogFiles\W3SVC1). Basically (I did not try to understand it fully, just gave it a glance) it checks whether the same ID is used from different IP addresses (he is searching for 143.55.x.x) and, if so, prints the "offending" IP.

    Christian

    :36935
  • Thanks for your answer.

    I found some Sophos logs in my W3SVC1 folder (not with the same name, obviously) but I still can't figure out where the ID is supposed to be...

    As the analysis script is only one - crypted - line, I thought it would have been quite obvious when looking inside the log.:smileysad:

    :36947
  • Hello PJ,

    As the analysis script is only one - crypted - line

    it's not crypted, it's a RegEx - just extracting the client's IP and ID. The log is not too hard to decode, there's a line starting with #Fields:, their names describing the contents (s - means server, c - client, cs - client to server and sc, well, you probably guess). Here's a short explanation of the log format:

    2012-12-31 23:00:25 W3SVC1
    >>> timestamp and site
    111.222.333.444 
    >>> server IP
    GET /InterChk/SophosUpdate/MUW/CIDs/S000/SAVSCFXP/master.upd -
    >>> request, URI and (optional) query [- means: no query]  
    80 
    >>> server port
    SophosUpdate
    >>> authenticated user (- if there is none) [1]
    111.222.333.400 
    >>> client IP
    SophosAutoUpdate/2.7.8.335.....)
    >>> user agent string from the HTTP request header, details below 
    200 0 0
    >>> returncodes [2]

    [1] AutoUpdate makes each request first without authentication, so if authentication is required (as is usually the case) you get one line where this field is blank, followed by an almost identical line with the username from the updating policy.

    [2] The first request fails with 401.2 (Unauthorized); the authenticated one should return 200.

    Now let's look at the User Agent field in detail:

    SophosAutoUpdate/2.7.8.335+
    >>> Component (AutoUpdate or SUM) and version [1]
    SDDS/1.0+
    >>> SDDS version
    (u="SophosUpdate"+
    >>> update user
    c="a731ab2f-d2d5-41fb-8a1a-2fc9ab0fa58f")
    >>> and the computer's ID (machine_ID)

     [1] for client updates the component is SophosAutoUpdate, for a downstream SUM accessing the Warehouse it is SophosUpdateManager
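
As a rough illustration of the breakdown above, here is a minimal Perl sketch that splits such a User-Agent field into its parts. It assumes the four pieces are joined with "+" exactly as listed; the sub name parse_user_agent and the hash keys are made up for this example and are not part of any Sophos tool.

```perl
use strict;
use warnings;

# Split a Sophos User-Agent field, as it appears in the IIS log, into its
# parts. Returns an empty list if the field does not have the assumed layout.
sub parse_user_agent {
    my ($ua) = @_;
    my ($component, $version, $sdds, $user, $id) = $ua =~ m{
        ^(\w+)/([\d.]+)\+          # component and version
        SDDS/([\d.]+)\+            # SDDS version
        \(u="([^"]*)"\+            # update user
        c="([0-9a-fA-F-]{36})"\)   # machine ID
    }x or return;
    return (
        component  => $component,
        version    => $version,
        sdds       => $sdds,
        user       => $user,
        machine_id => lc $id,      # lower-cased so IDs compare consistently
    );
}
```

Calling it on the example string from this post should give a hash whose machine_id entry is the GUID after c=; a WebDAV user agent simply yields an empty list.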

    The last item is the machine_ID. Unfortunately it does not necessarily map to the IdentityTag column in the database (please see machine_ID.txt - guid in the enterprise console). Thus you might see different IPs using the same machine_ID while still having their own entries in the database. Also note that if clients change their IP during the day (e.g. switching from cable to WLAN) you'll also get "fake duplicates". Of course the IP is meaningless if a proxy is involved.
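
Because of those "fake duplicates", it can help to see every IP that used a given machine_ID instead of only the first mismatch. The following is a sketch only, not the original script: it looks up the c-ip and cs(User-Agent) columns from the #Fields: line rather than hard-coding an IP range, and it assumes fields are separated by single spaces with the ID appearing as c="..." in the user agent. The sub names are invented for the example.

```perl
use strict;
use warnings;

# Collect, per machine ID, the set of client IPs that used it.
# $lines is a reference to an array of IIS W3C log lines.
sub ips_by_machine_id {
    my ($lines) = @_;
    my (%col, %seen);
    for my $raw (@$lines) {
        (my $line = $raw) =~ s/\r?\n\z//;
        if ($line =~ /^#Fields:\s*(.*)/) {
            # Remember the column position of every field name
            my @names = split ' ', $1;
            %col = map { $names[$_] => $_ } 0 .. $#names;
            next;
        }
        next if $line =~ /^#/;
        next unless exists $col{'c-ip'} and exists $col{'cs(User-Agent)'};
        my @f  = split ' ', $line;
        my $ip = $f[ $col{'c-ip'} ];
        my $ua = $f[ $col{'cs(User-Agent)'} ];
        next unless defined $ip and defined $ua;
        next unless $ua =~ /\bc="?\{?([0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})/;
        $seen{ lc $1 }{$ip} = 1;
    }
    return \%seen;
}

# Print only the IDs that were used from more than one address
sub report {
    my ($seen) = @_;
    for my $id (sort keys %$seen) {
        my @ips = sort keys %{ $seen->{$id} };
        print "$id: @ips\n" if @ips > 1;
    }
}

# Usage sketch: pass the path of one IIS log as the first argument
if (@ARGV) {
    open(my $fh, '<', $ARGV[0]) or die "Cannot open $ARGV[0]: $!";
    my @lines = <$fh>;
    close($fh);
    report(ips_by_machine_id(\@lines));
}
```

An ID listed with two addresses may still be a single laptop that moved between networks, so the output is a list of candidates to check, not proof of cloning.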

    Dunno if this helps at all therefore - what exactly are you interested in?

    Christian

    :36955
  • Thanks VERY MUCH for this precise answer.

    We have a bunch of clones that keep appearing in the console under different names, at the same place, and we'd like to find all of them and apply the ID fix.

    When I said "crypted", I was joking that I can't decipher what the line of code is supposed to do (and I asked developer colleagues!).

    As for the log file, it seems what I have does not contain enough information. I searched through the whole log folder but there's no file that contains c="...".

    I even converted files to CSV to get data sorted by columns.

    The best piece of information I can get is :

    s-port  cs-username  c-ip           cs(User-Agent)                       sc-status
    80      -            192.168.1.173  Microsoft-WebDAV-MiniRedir/6.1.7601  404

    Do I need to activate full logs somewhere ? Am I looking in the wrong place ?

    Thanks.

    PJ

    :36963
  • Hello PJ,

    the explained format applies only if your clients are updating via HTTP. The WebDAV requests are often ... err ... noise :smileywink:

    You did not answer if you "suffer" from duplicate IDs

    Christian

    :36965
  • Er, yes I think I did ! :smileywink:

    We have machines that keep appearing with different names in the same folder in the console (out of place, then).

    If I look at the Web events in the log of those machines, they originate from different users, as if they were taking turns on the same machine although they work hundreds of kilometers apart.

    So, do I need to get ID info from the database then ?

    PJ

    :36967
  • Must have missed it, PJ

    Well, I had this duplicate problem for a long time (dunno if I still have but it appears I've got all of them). The procedure is a little bit arcane (I did not consult support). But first of all - do you observe the name of the machines changing or not? 

    Christian

    :36969
  • Yes, I created folders just for that purpose because I saw computers appearing and reappearing in the console although I was sure they were online.

    We have groups according to location and computers are named based on their location, so I got some folders where a computer changes its name during the day.

    It's the list of Web events that gave it away. I can see 5 different users, so it's at least 5 machines with the same ID.

    :36971