This discussion has been locked.

You can no longer post new replies to this discussion. If you have a question you can start a new discussion

7.402 filling up the logs

Yesterday my firewall (7.401) saw the new release (7.402), and I authorized it to take it because it is supposed to fix some e1000 driver issues with my Intel NIC.

Now I notice my ha log is filling up like crazy. Every second, it puts 10 lines like below:

2009:04:24-09:29:17 fw-2 ctsyncd: Got SIGUSR1, set status to SLAVE
2009:04:24-09:29:17 fw-2 ctsyncd: Got SIGUSR1, set status to SLAVE
2009:04:24-09:29:17 fw-2 ctsyncd: Got SIGUSR1, set status to SLAVE

fw-2 is the node 2 (slave). Anybody else seeing that?

This thread was automatically locked due to age.

0 BAlfson over 16 years ago

Liug, what's the status on this problem? What did Support tell you?

Sophos UTM Community Moderator
Sophos Certified Architect - UTM
Sophos Certified Engineer - XG
Gold Solution Partner since 2005

MediaSoft, Inc. USA
Cancel
Vote Up 0 Vote Down

Cancel
0 liug over 16 years ago in reply to BAlfson

They are not sure and need to ssh to the box to do some more research. We will arrange for that and see.
Cancel
Vote Up 0 Vote Down

Cancel
0 tkiblin over 16 years ago in reply to liug

They are not sure and need to ssh to the box to do some more research. We will arrange for that and see.

We are seeing the same thing plus a few other issues since the upgrade, like:

FATAL main: Node is not initialized properly - sleep 10s
2009:04:25-01:35:48 dtg1212dtg1208k12cluster-2 slon[3811]: [1-1] CONFIG main: slon version 1.2.15 starting up
2009:04:25-01:35:48 dtg1212dtg1208k12cluster-2 slon[3810]: [1-1] CONFIG main: slon version 1.2.15 starting up
2009:04:25-01:35:48 dtg1212dtg1208k12cluster-2 slon[5604]: [2-1] ERROR cannot get sl_local_node_id - ERROR: schema "_asg_cluster" does not exist
2009:04:25-01:35:48 dtg1212dtg1208k12cluster-2 slon[5604]: [3-1] FATAL main: Node is not initialized properly - sleep 10s
2009:04:25-01:35:48 dtg1212dtg1208k12cluster-2 slon[5605]: [2-1] ERROR cannot get sl_local_node_id - ERROR: schema "_asg_cluster" does not exist
2009:04:25-01:35:48 dtg1212dtg1208k12cluster-2 slon[5605]: [3-1] FATAL main: Node is not initialized properly - sleep 10s
2009:04:25-01:35:58 dtg1212dtg1208k12cluster-2 slon[3811]: [1-1] CONFIG main: slon version 1.2.15 starting up
2009:04:25-01:35:58 dtg1212dtg1208k12cluster-2 slon[3810]: [1-1] CONFIG main: slon version 1.2.15 starting up
2009:04:25-01:35:58 dtg1212dtg1208k12cluster-2 slon[5611]: [2-1] ERROR cannot get sl_local_node_id - ERROR: schema "_asg_cluster" does not exist
2009:04:25-01:35:58 dtg1212dtg1208k12cluster-2 slon[5611]: [3-1] FATAL main: Node is not initialized properly - sleep 10s
2009:04:25-01:35:58 dtg1212dtg1208k12cluster-2 slon[5610]: [2-1] ERROR cannot get sl_local_node_id - ERROR: schema "_asg_cluster" does not exist
2009:04:25-01:35:58 dtg1212dtg1208k12cluster-2 slon[5610]: [3-1] FATAL main: Node is not initialized properly - sleep 10s

let me know if support is able to help you out, we've seen this on every HA pair we upgraded to 7.402 so far.
Cancel
Vote Up 0 Vote Down

Cancel
0 Kestutis over 16 years ago in reply to tkiblin

[FONT=Verdana]I have 2 x ASG320 v.7.402
My problem with HA Cluster is the same.

HA log:
[/FONT][FONT=monospace][FONT=Arial]2009:04:25-09:35:43 iti-vv-2 ctsyncd: Got SIGUSR1, set status to SLAVE [/FONT][/FONT]
[FONT=monospace][FONT=Arial]2009:04:25-09:35:44 iti-vv-2 ctsyncd: Got SIGUSR1, set status to SLAVE [/FONT][/FONT]
[FONT=monospace][FONT=Arial]2009:04:25-09:35:45 iti-vv-2 ctsyncd: Got SIGUSR1, set status to SLAVE [/FONT][/FONT]
[FONT=monospace][FONT=Verdana][FONT=Arial]2009:04:25-09:35:47 iti-vv-2 ctsyncd: Got SIGUSR1, set status to SLAVE[/FONT]

Slave unit not updating patterns.
Up2Date log:
[/FONT][FONT=Arial]2009:04:25-00:01:01 iti-vv-2 auisys[18503]: Starting Up2Date Package Installer (Version 1.65)
2009:04:25-00:01:01 iti-vv-2 auisys[18503]: Searching for available up2date packages for type 'ips'
2009:04:25-00:01:01 iti-vv-2 auisys[18503]: id="371D" severity="info" sys="system" sub="up2date" name="No up2date packages available for installation" status="failed" action="preinst_check" package="ips"
2009:04:25-00:01:06 iti-vv-2 auisys[18503]: Searching for available up2date packages for type 'avira'
2009:04:25-00:01:06 iti-vv-2 auisys[18503]: Installing up2date package file '/var/up2date//avira/u2d-avira-7.617-639.patch.tgz.gpg'
2009:04:25-00:01:06 iti-vv-2 auisys[18503]: Verifying up2date package signature
2009:04:25-00:01:07 iti-vv-2 auisys[18503]: Unpacking installation instructions
2009:04:25-00:01:07 iti-vv-2 auisys[18503]: Unpacking up2date package container
2009:04:25-00:01:07 iti-vv-2 auisys[18503]: Running pre-installation checks
2009:04:25-00:01:07 iti-vv-2 auisys[18503]: Starting up2date package installation
2009:04:25-00:01:07 iti-vv-2 auisys[18503]: >=========================================================================
2009:04:25-00:01:07 iti-vv-2 auisys[18503]: Failed testing RPM installation (command: 'rpm --test -U /var/up2date//avira-install/u2d-avira-7.639/rpms/u2d-avira-7.617-639.patch.rpm')
2009:04:25-00:01:07 iti-vv-2 auisys[18503]:
2009:04:25-00:01:07 iti-vv-2 auisys[18503]:  1. Internal::Systemstep::real_installation:2292() auisys.pl
2009:04:25-00:01:07 iti-vv-2 auisys[18503]:  2. main:[:P]erform_work:919() auisys.pl
2009:04:25-00:01:07 iti-vv-2 auisys[18503]:  3. main::auisys_prepare_and_work:496() auisys.pl
2009:04:25-00:01:07 iti-vv-2 auisys[18503]:  4. main::top-level:33() auisys.pl
2009:04:25-00:01:07 iti-vv-2 auisys[18503]: |=========================================================================
2009:04:25-00:01:07 iti-vv-2 auisys[18503]: Error details:
2009:04:25-00:01:07 iti-vv-2 auisys[18503]:  (stdout):$VAR1 = [];
2009:04:25-00:01:07 iti-vv-2 auisys[18503]:  (stderr):$VAR1 = [
2009:04:25-00:01:07 iti-vv-2 auisys[18503]:           'error: Failed dependencies:
2009:04:25-00:01:07 iti-vv-2 auisys[18503]: ',
2009:04:25-00:01:07 iti-vv-2 auisys[18503]:           '    u2d-avira = 7-617 | u2d-avira = 7-618 | u2d-avira = 7-619 | u2d-avira = 7-620 | u2d-avira = 7-621 | u2d-avira = 7-622 | u2d-avira = 7-623 | u2d-avira = 7-624 | u2d-avira = 7-625 | u2d-avira = 7-626 | u2d-avira = 7-627 | u2d-avira = 7-628 | u2d-avira = 7-629 | u2d-avira = 7-630 | u2d-avira = 7-631 | u2d-avira = 7-632 | u2d-avira = 7-633 | u2d-avira = 7-634 | u2d-avira = 7-635 | u2d-avira = 7-636 | u2d-avira = 7-637 | u2d-avira = 7-638 is needed by u2d-avira-7-639
2009:04:25-00:01:07 iti-vv-2 auisys[18503]: '
2009:04:25-00:01:07 iti-vv-2 auisys[18503]:         ];
2009:04:25-00:01:07 iti-vv-2 auisys[18503]:
2009:04:25-00:01:07 iti-vv-2 auisys[18503]:  1. Internal::Systemstep::real_installation:2293() auisys.pl
2009:04:25-00:01:07 iti-vv-2 auisys[18503]:  2. main:[:P]erform_work:919() auisys.pl
2009:04:25-00:01:07 iti-vv-2 auisys[18503]:  3. main::auisys_prepare_and_work:496() auisys.pl
2009:04:25-00:01:07 iti-vv-2 auisys[18503]:  4. main::top-level:33() auisys.pl
2009:04:25-00:01:07 iti-vv-2 auisys[18503]: |=========================================================================
2009:04:25-00:01:07 iti-vv-2 auisys[18503]: id="371O" severity="error" sys="system" sub="up2date" name="Fatal: Up2Date package installation failed: An error occured during the RPM pre-installation test (1)" status="failed" action="install" code="1" package="avira"
2009:04:25-00:01:07 iti-vv-2 auisys[18503]:
2009:04:25-00:01:07 iti-vv-2 auisys[18503]:  1. main::alf:71() auisys.pl
2009:04:25-00:01:07 iti-vv-2 auisys[18503]:  2. main:[:P]erform_work:965() auisys.pl
2009:04:25-00:01:07 iti-vv-2 auisys[18503]:  3. main::auisys_prepare_and_work:496() auisys.pl
2009:04:25-00:01:07 iti-vv-2 auisys[18503]:  4. main::top-level:33() auisys.pl
2009:04:25-00:02:02 iti-vv-2 audld[18536]: >=========================================================================
2009:04:25-00:02:02 iti-vv-2 audld[18536]: running on HA slave system or cluster node, exiting
[/FONT][FONT=Arial]
[/FONT][FONT=Arial]
[/FONT][/FONT]
Cancel
Vote Up 0 Vote Down

Cancel
0 WaMaR over 16 years ago in reply to Kestutis

[FONT=Verdana]I have 2 x ASG320 and yesterday upgrade to v.7.402
My problem with HA Cluster is the same.[/FONT]
2009:04:25-00:21:35 fw-2 ctsyncd: Got SIGUSR1, set status to SLAVE2009:04:25-00:21:36 fw-2 ctsyncd: Got SIGUSR1, set status to SLAVE2009:04:25-00:21:36 fw-2 ctsyncd: Got SIGUSR1, set status to SLAVE2009:04:25-00:21:37 fw-2 ctsyncd: Got SIGUSR1, set status to SLAVERegards,
WaMaR
Cancel
Vote Up 0 Vote Down

Cancel
0 trollvottel over 16 years ago in reply to WaMaR

Please ask support for a fix of ctsyncd (Mantis ID #10323). This will be fixed in v7.403.
Cancel
Vote Up 0 Vote Down

Cancel
0 liug over 16 years ago in reply to liug

They are not sure and need to ssh to the box to do some more research. We will arrange for that and see.

They sshed into the box, re-initialized the postgresql database, which got rid of the error:

ERROR cannot get sl_local_node_id - ERROR: schema "_asg_cluster" does not exist

They also installed the new version of ctsyncd, which quieted down the ha logs.

Now looking at the last few days of cpu graph, I can see it is constantly low, except that at around 2:15-2:30AM, it has one big spike node 1 (master), every day. My guess is it must be some scheduled job. Does anybody else notice this? My problem is when this happens, node 1 will miss some heartbeat from node 2 (slave) and thinks node 2 is dead, then it will get the beat again, and re-add node 2 into the cluster. So every night I am still getting those HA alerts.
Cancel
Vote Up 0 Vote Down

Cancel
0 BAlfson over 16 years ago in reply to liug

That is the nightly PostgreSQL database rollup. Have you considered adding RAM to your device?

Sophos UTM Community Moderator
Sophos Certified Architect - UTM
Sophos Certified Engineer - XG
Gold Solution Partner since 2005

MediaSoft, Inc. USA
Cancel
Vote Up 0 Vote Down

Cancel
0 liug over 16 years ago in reply to BAlfson

That is the nightly PostgreSQL database rollup. Have you considered adding RAM to your device?

I have enough RAM, and it is only 20% used, flat. Not even a single spike over the last few weeks.
Cancel
Vote Up 0 Vote Down

Cancel
0 liug over 16 years ago in reply to liug

Last night, it was crazy. I got 30+ email alerts.
I guess it must be the month end report generating that caused the cpu spike, and makes the packet loss for the heartbeat link.
Astaro is a security firewall firstly. Shouldn't the reporting work be in low priority? Letting it affect firewall/cluster is bad.
Maybe astaro should consider to make it run "nicer", or offer an option to offload the postgresql/reporting job to another "management machine", rather than on the live firewall.
Cancel
Vote Up 0 Vote Down

Cancel