[9.171][BUG]AD Group Sync Fails

Not sure which log to grab and post, but I've configured a pair of AD servers as authentication sources (within the same domain). Both are local to the UTM. I've also configured group membership sync.

About every 2 hours I get the following alert:

There was an error synchronizing subscribed groups. The Sophos UTM will
continue to operate with a locally cached copy of the data but will be
unable to update from Directory Services until the issue is resolved.

Error was:
failed to run samba command on ACTIVE.DIRECTORY, exiting now

--
System Uptime      : 1 day 2 hours 51 minutes
System Load        : 0.83
System Version     : Sophos UTM 9.171-2

Please refer to the manual for detailed instructions.

I believe I can force the error if I go to Definitions & Users > Authentication Services > Advanced > Active Directory Group Membership Synchronization > Synchronize Now.

If someone can suggest logs / config data to post to diagnose, I'm more than happy to do so.

0 Tinkerbell over 11 years ago

Hi there,

Can you please provide the fallback.log?

Also did you get the errors via a notification mail?
We tried to reproduce it with the steps you mentioned but without success.
What happens in your case if you go to Definitions & Users > Authentication Services > Advanced > Active Directory Group Membership Synchronization > Synchronize Now. Is the error also thrown in frontend (webadmin)?
Thanks,
Bianca
Cancel
Vote Up 0 Vote Down

Sign in to reply

Verify Answer

Cancel
0 DavidRa over 11 years ago in reply to Tinkerbell

Thanks for the update. I'll post fallback.log tonight (approx now+9h).

I do receive the error as an email, about every 2 hours. I did run the Synchronise Now task in the Web UI, and I received the same alert email, but I'm not sure if the error popped in the UI. I'll check that one tonight.
Cancel
Vote Up 0 Vote Down

Sign in to reply

Verify Answer

Cancel
0 MarkMurphy over 11 years ago

I get this also. Here is the log with the errors

[daemon[:D]ebug] rrdcached[3674]:  rotating journals
2013:11:25-22:01:07 edge [daemon[:D]ebug] rrdcached[3674]:  started new journal /var/log/reporting/rrd/rrd.journal.1385434867.121781
2013:11:25-22:01:07 edge [daemon[:D]ebug] rrdcached[3674]:  removing old journal /var/log/reporting/rrd/rrd.journal.1385427667.121795
2013:11:25-22:01:38 edge [daemon:notice] rrdcached[3674]:  handle_request_update: stat (/var/log/reporting/rrd/atp_named.rrd) failed.
2013:11:25-22:01:44 edge [local0:info] [ctipd] [5096]: CEnginesContainer::UpdateSettings() - Updating
2013:11:25-22:01:44 edge [local0:info] [ctipd] [5096]: CEnginesContainer::UpdateSettings() - Updating
2013:11:25-22:01:46 edge [local0:info] [ctipd] [5096]: CIpRepCache::Save() - Saved to file /tmp/ctipd.cache
2013:11:25-22:01:46 edge [local0:info] [ctipd] [5096]: CIpRepCache::Save() - Saved to file /tmp/ctipd.cache_v6
2013:11:25-22:03:32 edge [daemon:info] admin-reporter[3971]:  Successful WebAdmin login
2013:11:25-22:06:38 edge [daemon:notice] rrdcached[3674]:  handle_request_update: stat (/var/log/reporting/rrd/atp_named.rrd) failed.
2013:11:25-22:11:38 edge [daemon:notice] rrdcached[3674]:  handle_request_update: stat (/var/log/reporting/rrd/atp_named.rrd) failed.
2013:11:25-22:16:38 edge [daemon:notice] rrdcached[3674]:  handle_request_update: stat (/var/log/reporting/rrd/atp_named.rrd) failed.
2013:11:25-22:18:15 edge [daemon:info] admin-reporter[3971]:  Successful WebAdmin login
2013:11:25-22:19:01 edge [daemon:notice] ad-sid-sync.pl[21318]:  [ad-sid-sync] Started in full mode
2013:11:25-22:19:01 edge [daemon:notice] ad-sid-sync.pl[21318]:  [ad-sid-sync] Syncing 1 server(s)
2013:11:25-22:19:01 edge [daemon:notice] ad-sid-sync.pl[21318]:  [ad-sid-sync] Syncing 2 adirectory group(s)
2013:11:25-22:19:55 edge [daemon:notice] ad-sync.pl[21415]:  [ad-sync] started
2013:11:25-22:20:25 edge [daemon:err] ad-sync.pl[21415]:  [ad-sync] error returned from samba command on HOMEUSE.LOCAL
2013:11:25-22:20:55 edge [daemon:err] ad-sync.pl[21415]:  [ad-sync] error returned from samba command on HOMEUSE.LOCAL
2013:11:25-22:20:55 edge [daemon:err] ad-sync.pl[21415]:  [ad-sync] failed to run samba command on HOMEUSE.LOCAL, exiting now
2013:11:25-22:21:38 edge [daemon:notice] rrdcached[3674]:  handle_request_update: stat (/var/log/reporting/rrd/atp_named.rrd) failed.
2013:11:25-22:26:38 edge [daemon:notice] rrdcached[3674]:  handle_request_update: stat (/var/log/reporting/rrd/atp_named.rrd) failed.
Cancel
Vote Up 0 Vote Down

Sign in to reply

Verify Answer

Cancel
0 DavidRa over 11 years ago in reply to MarkMurphy

jz - perhaps there's something specific in the groups we've selected for Sync. In particular, if you are synchronising Domain Users, it's a "special" group in AD (it's generally managed as a property of the user account, the Primary Group, and its member list is generated dynamically).

So if your primary group is Domain Users (99.99% of accounts) you're not actually listed as a group member of that group. Instead, IIRC, the group ID is written to a user account property.

I'll document tonight the groups I'm syncing but I believe offhand it's Domain Admins and Domain Users.
Cancel
Vote Up 0 Vote Down

Sign in to reply

Verify Answer

Cancel
0 Michael Dunn over 11 years ago
On the command line can you please run this and then post the output:

/var/chroot-http/usr/bin/run-two -w -l /var/run/ad-sync.lock /var/chroot-http/usr/bin/ad-sync.pl -v
Cancel
Vote Up 0 Vote Down

Sign in to reply

Verify Answer

Cancel
0 MarkMurphy over 11 years ago

This is funny, Put this is the output

edge:/home/login # /var/chroot-http/usr/bin/run-two -w -l /var/run/ad-sync.lock /var/chroot-http/usr/bin/ad-sync.pl -v
1314: I'm the waiter
1314: I'm now the runner
1314: Running
edge:/home/login #
Cancel
Vote Up 0 Vote Down

Sign in to reply

Verify Answer

Cancel
0 Michael Dunn over 11 years ago

Oops that put the -v on run-two command rather than the ad-sync.pl command. [:)]

Instead, try the following

1. put an "#" at the beginning of "ad-sync" line in the /etc/crontab file (line 8) to comment it out
2. then run "/var/chroot-http/usr/bin/ad-sync.pl -v" as root user.
3. then remove the "#" added in 1) to restore the crontab file

Steps 1 and 3 and the run-two are all about making sure that only one ad-sync is running. Best to disable it while running it separately.
Cancel
Vote Up 0 Vote Down

Sign in to reply

Verify Answer

Cancel
0 MarkMurphy over 11 years ago

Output below. I removed the PW of course but it is clear that there is an issue with KDC

edge:/etc # /var/chroot-http/usr/bin/ad-sync.pl -v
started
running: /usr/sbin/net ads -w 'HOMEUSE.LOCAL' -U '******************' dn '' 'defaultNamingContext' -p 3268
kerberos_kinit_password ******@HOMEUSE.LOCAL failed: Cannot contact any KDC for requested realm
ads_connect: Cannot contact any KDC for requested realm
kerberos_kinit_password *****@HOMEUSE.LOCAL failed: Cannot contact any KDC for requested realm
ads_connect: Cannot contact any KDC for requested realm
error returned from samba command on HOMEUSE.LOCAL
running: /usr/sbin/net ads -w 'HOMEUSE.LOCAL' -U '*************' dn '' 'defaultNamingContext' -p 389
kerberos_kinit_password *****@HOMEUSE.LOCAL failed: Cannot contact any KDC for requested realm
ads_connect: Cannot contact any KDC for requested realm
kerberos_kinit_password ******@HOMEUSE.LOCAL failed: Cannot contact any KDC for requested realm
ads_connect: Cannot contact any KDC for requested realm
error returned from samba command on HOMEUSE.LOCAL
failed to run samba command on HOMEUSE.LOCAL, exiting now
edge:/etc #
Cancel
Vote Up 0 Vote Down

Sign in to reply

Verify Answer

Cancel
0 MarkMurphy over 11 years ago

the firewall is resolving the correct ip for the DC, See below 192.168.21.5 is the only DC in the forest.

edge:/etc # ls krb*
krb5.conf  krb5.keytab
edge:/etc # cat krb5.conf
[libdefaults]
        default_realm = HOMEUSE.LOCAL

edge:/etc # nslookup homeuse.local
Server:         127.0.0.1
Address:        127.0.0.1#53

Non-authoritative answer:
Name:   homeuse.local
Address: 192.168.21.5

edge:/etc #
Cancel
Vote Up 0 Vote Down

Sign in to reply

Verify Answer

Cancel

0 DavidRa over 11 years ago

OK my apologies for tardiness. Herewith the relevant part of an unsanitised but relatively boring fallback.log:

2013:11:27-22:07:46 fortress [daemon:notice] rrdcached[3403]:  handle_request_update: stat (/var/log/reporting/rrd/atp_named.rrd) failed.

2013:11:27-22:12:46 fortress [daemon:notice] rrdcached[3403]:  handle_request_update: stat (/var/log/reporting/rrd/atp_named.rrd) failed.

2013:11:27-22:17:46 fortress [daemon:notice] rrdcached[3403]:  handle_request_update: stat (/var/log/reporting/rrd/atp_named.rrd) failed.

2013:11:27-22:19:01 fortress [daemon:notice] ad-sid-sync.pl[16709]:  [ad-sid-sync] Started in full mode

2013:11:27-22:19:01 fortress [daemon:notice] ad-sid-sync.pl[16709]:  [ad-sid-sync] Syncing 2 server(s)

2013:11:27-22:19:01 fortress [daemon:notice] ad-sid-sync.pl[16709]:  [ad-sid-sync] Syncing 4 adirectory group(s)

2013:11:27-22:19:01 fortress [daemon:notice] ad-sid-sync.pl[16709]:  [ad-sid-sync] Syncing 1 Active Directory sid(s)

2013:11:27-22:19:01 fortress [daemon:notice] ad-sid-sync.pl[16709]:  [ad-sid-sync] Found sid for CN=Domain Admins,CN=Users,DC=middle,DC=earth => S-1-5     -21-385347296-2415236108-505384329-512

2013:11:27-22:19:01 fortress [daemon:notice] ad-sid-sync.pl[16709]:  [ad-sid-sync] Syncing 4 adirectory group(s)

2013:11:27-22:19:01 fortress [daemon:notice] ad-sid-sync.pl[16709]:  [ad-sid-sync] Syncing 1 Active Directory sid(s)

2013:11:27-22:19:02 fortress [daemon:notice] ad-sid-sync.pl[16709]:  [ad-sid-sync] Found sid for CN=Domain Admins,CN=Users,DC=middle,DC=earth => S-1-5     -21-385347296-2415236108-505384329-512

2013:11:27-22:22:46 fortress [daemon:notice] rrdcached[3403]:  handle_request_update: stat (/var/log/reporting/rrd/atp_named.rrd) failed.

2013:11:27-22:27:46 fortress [daemon:notice] rrdcached[3403]:  handle_request_update: stat (/var/log/reporting/rrd/atp_named.rrd) failed.

2013:11:27-22:32:46 fortress [daemon:notice] rrdcached[3403]:  handle_request_update: stat (/var/log/reporting/rrd/atp_named.rrd) failed.

2013:11:27-22:35:50 fortress [daemon[:D]ebug] rrdcached[3403]:  flushing old values

2013:11:27-22:35:50 fortress [daemon[:D]ebug] rrdcached[3403]:  rotating journals

2013:11:27-22:35:50 fortress [daemon[:D]ebug] rrdcached[3403]:  started new journal /var/log/reporting/rrd/rrd.journal.1385552150.611337

2013:11:27-22:35:50 fortress [daemon[:D]ebug] rrdcached[3403]:  removing old journal /var/log/reporting/rrd/rrd.journal.1385544950.608813

2013:11:27-22:37:46 fortress [daemon:notice] rrdcached[3403]:  handle_request_update: stat (/var/log/reporting/rrd/atp_named.rrd) failed.

I have attempted to run the ad-sync script, but I note that the UTM has stored the password for the user account used to join the domain (AD SSO). That password has since changed (and I would suggest there's no mention of using a particular account to join the domain - most Samba systems maintain their own secure channel do they not?) - should I post that one as a separate bug/thread?

In any case, updating the SSO password leaves me in the same position as jz. NET ADS TESTJOIN may also suggest a K5 issue:

fortress:/etc # /usr/sbin/net ads testjoin

kerberos_kinit_password FORTRESS$@MIDDLE.EARTH failed: Cannot contact any KDC for requested realm

ads_connect: Cannot contact any KDC for requested realm

Join to domain is not valid: No logon servers

fortress:/etc #

DNS config - all DNS queries are directed to the two AD servers, and there's also a specific entry for the domain in Network Services > DNS > Request Routing. The UTM can resolve both DCs to IPs (10.14.6.11 / 10.14.6.12).

As a side note, something a little strange in this environment is that the NetBIOS name does not correspond to the DNS name (middle.earth --> HOME).

Now in terms of what I thought I was synchronizing - Domain Admins is defined as a group (Definitions & Users > Users & Groups > Groups): Dynamic membership:
Users in these Active Directory groups
CN=Domain Admins,CN=Users,DC=middle,DC=earth

No other groups are defined with AD membership.

Only 1 other group is referenced as far as I know, and that's Domain Users included under Definitions & Users > Authentication Services > Advanced > Prefetch Directory Users.

HTH. One day I'll figure out my VPN access problem and be able to get the info in a more timely manner [:)]

[9.171][BUG]AD Group Sync Fails

Submitted a Tech Support Case lately from the Support Portal?