We have a 3-node HA cluster that was running 7.502. On Tuesday I applied the 7.503 patch, and that went smoothly. After all nodes were synced, I applied 7.504. The master and worker report active, but the slave has been stuck in the "syncing" phase for two days now. The HS Live Log reports the errors below:
--------------------------------------------------------------
2010:03:18-08:49:15 secgate-an-2 slon[13662]: [3-1] FATAL main: Node is not initialized properly - sleep 10s
2010:03:18-08:49:16 secgate-an-3 slon[7669]: [25070-1] ERROR slon_connectdb: PQconnectdb("dbname=pop3 host=198.19.250.2 user=ha_sync password=slony") failed - could not create
2010:03:18-08:49:16 secgate-an-3 slon[7669]: [25070-2] socket: Too many open files
2010:03:18-08:49:16 secgate-an-3 slon[7669]: [25071-1] WARN remoteListenThread_2: DB connection failed - sleep 10 seconds
2010:03:18-08:49:25 secgate-an-2 slon[11691]: [1-1] CONFIG main: slon version 1.2.20 starting up
2010:03:18-08:49:25 secgate-an-2 slon[13904]: [2-1] ERROR cannot get sl_local_node_id - ERROR: schema "_asg_cluster" does not exist
--------------------------------------------------------------
Any recommendations for troubleshooting?