This discussion has been locked.
You can no longer post new replies to this discussion. If you have a question you can start a new discussion

High CPU-load with SFOS 17.1.2 MR-2 and HA

Hi,

since updating to SFOS 17.1.2 MR-2 a customers XG HA-cluster in A/P mode is ~90% cpu load.

Its a small XG125 setup. 

Anybody seen this before?:

 



This thread was automatically locked due to age.
Parents
  • do the logs provide any details?

  • Nothing, thats what makes me so helpless.

    I can also delete HA config, but the processes are still there.

    Reboot wont't help. Seems if got to go back to MR1.

     

     

     

    EDIT:

    After some investigation in seems like the Msyncd service causing this behavior.

    I never seen this before. 

     

    Log showing:

    Thu Sep 6 15:33:49 2018:519367:26908:MAST:MAST:DEBUG:sync.c:829sesid:901:readApiInterface
    Thu Sep 6 15:33:49 2018:519446:26908:MAST:MAST:DEBUG:sync.c:811sesid:901 readApiInterface 3
    Thu Sep 6 15:33:53 2018:626892:26908:MAST:MAST:DEBUG:event.c:492 ses_cnt :1
    Thu Sep 6 15:33:53 2018:627055:26908:MAST:MAST:DEBUG:sync_entity.c:938sesid:902: opcode getappkey
    Thu Sep 6 15:33:53 2018:895767:26908:MAST:MAST:DEBUG:sync.c:829sesid:902:getappkey
    Thu Sep 6 15:33:53 2018:895852:26908:MAST:MAST:DEBUG:sync.c:811sesid:902 getappkey 3
    Thu Sep 6 15:33:55 2018:572904:26908:MAST:MAST:DEBUG:event.c:492 ses_cnt :1
    Thu Sep 6 15:33:55 2018:573073:26908:MAST:MAST:DEBUG:sync_entity.c:938sesid:903: opcode lic_substatus
    Thu Sep 6 15:33:55 2018:834550:26864:MAST:MAST:DEBUG:worker.c:587 idle workers 10.
    Thu Sep 6 15:33:55 2018:834592:26864:MAST:MAST:DEBUG:worker.c:628 worker_num 14.
    Thu Sep 6 15:33:55 2018:841181:26908:MAST:MAST:DEBUG:sync.c:829sesid:903:lic_substatus
    Thu Sep 6 15:33:55 2018:841236:26908:MAST:MAST:DEBUG:sync.c:811sesid:903 lic_substatus 3
    Thu Sep 6 15:33:55 2018:867882:26908:MAST:MAST:DEBUG:event.c:492 ses_cnt :1
    Thu Sep 6 15:33:55 2018:868009:26908:MAST:MAST:DEBUG:sync_entity.c:938sesid:904: opcode readApiInterface
    Thu Sep 6 15:33:56 2018:478815:26908:MAST:MAST:DEBUG:sync.c:829sesid:904:readApiInterface
    Thu Sep 6 15:33:56 2018:478890:26908:MAST:MAST:DEBUG:sync.c:811sesid:904 readApiInterface 3
    Thu Sep 6 15:34:01 2018:015950:26908:MAST:MAST:DEBUG:event.c:492 ses_cnt :1
    Thu Sep 6 15:34:01 2018:194002:26908:MAST:MAST:DEBUG:event.c:492 ses_cnt :2
    Thu Sep 6 15:34:01 2018:194154:26908:MAST:MAST:DEBUG:sync_entity.c:938sesid:906: opcode garnerevent
    Thu Sep 6 15:34:01 2018:267592:26908:MAST:MAST:DEBUG:event.c:492 ses_cnt :3
    Thu Sep 6 15:34:01 2018:279379:26908:MAST:MAST:DEBUG:sync.c:829sesid:905:
    Thu Sep 6 15:34:01 2018:279438:26908:MAST:MAST:DEBUG:sync.c:811sesid:905 3
    Thu Sep 6 15:34:01 2018:457749:26908:MAST:MAST:DEBUG:sync.c:829sesid:906:garnerevent
    Thu Sep 6 15:34:01 2018:457814:26908:MAST:MAST:DEBUG:sync.c:811sesid:906 garnerevent 3
    Thu Sep 6 15:34:01 2018:530231:26908:MAST:MAST:DEBUG:sync.c:829sesid:907:
    Thu Sep 6 15:34:01 2018:530296:26908:MAST:MAST:DEBUG:sync.c:811sesid:907 3
    Thu Sep 6 15:34:07 2018:713782:26908:MAST:MAST:DEBUG:event.c:492 ses_cnt :1
    Thu Sep 6 15:34:07 2018:713929:26908:MAST:MAST:DEBUG:sync_entity.c:938sesid:908: opcode lic_substatus
    Thu Sep 6 15:34:07 2018:981884:26908:MAST:MAST:DEBUG:sync.c:829sesid:908:lic_substatus
    Thu Sep 6 15:34:07 2018:981959:26908:MAST:MAST:DEBUG:sync.c:811sesid:908 lic_substatus 3
    Thu Sep 6 15:34:08 2018:001467:26908:MAST:MAST:DEBUG:event.c:492 ses_cnt :1
    Thu Sep 6 15:34:08 2018:001626:26908:MAST:MAST:DEBUG:sync_entity.c:938sesid:909: opcode readApiInterface
    Thu Sep 6 15:34:08 2018:613194:26908:MAST:MAST:DEBUG:sync.c:829sesid:909:readApiInterface
    Thu Sep 6 15:34:08 2018:613264:26908:MAST:MAST:DEBUG:sync.c:811sesid:909 readApiInterface 3
    Thu Sep 6 15:34:13 2018:679027:26908:MAST:MAST:DEBUG:event.c:492 ses_cnt :1
    Thu Sep 6 15:34:13 2018:946962:26908:MAST:MAST:DEBUG:sync.c:829sesid:910:
    Thu Sep 6 15:34:13 2018:947032:26908:MAST:MAST:DEBUG:sync.c:811sesid:910 3
    Thu Sep 6 15:34:15 2018:874547:26864:MAST:MAST:DEBUG:worker.c:587 idle workers 10.
    Thu Sep 6 15:34:15 2018:874601:26864:MAST:MAST:DEBUG:worker.c:628 worker_num 14.
    Thu Sep 6 15:34:17 2018:002205:26908:MAST:MAST:DEBUG:event.c:492 ses_cnt :1
    Thu Sep 6 15:34:19 2018:705657:26908:MAST:MAST:DEBUG:sync.c:829sesid:911:
    Thu Sep 6 15:34:19 2018:705723:26908:MAST:MAST:DEBUG:sync.c:811sesid:911 3
    Thu Sep 6 15:34:19 2018:985850:26908:MAST:MAST:DEBUG:event.c:492 ses_cnt :1
    Thu Sep 6 15:34:19 2018:985982:26908:MAST:MAST:DEBUG:sync_entity.c:938sesid:912: opcode lic_substatus
    Thu Sep 6 15:34:20 2018:253087:26908:MAST:MAST:DEBUG:sync.c:829sesid:912:lic_substatus
    Thu Sep 6 15:34:20 2018:253165:26908:MAST:MAST:DEBUG:sync.c:811sesid:912 lic_substatus 3
    Thu Sep 6 15:34:20 2018:284612:26908:MAST:MAST:DEBUG:event.c:492 ses_cnt :1
    Thu Sep 6 15:34:20 2018:284731:26908:MAST:MAST:DEBUG:sync_entity.c:938sesid:913: opcode readApiInterface
    Thu Sep 6 15:34:20 2018:899570:26908:MAST:MAST:DEBUG:sync.c:829sesid:913:readApiInterface
    Thu Sep 6 15:34:20 2018:899614:26908:MAST:MAST:DEBUG:sync.c:811sesid:913 readApiInterface 3
    Thu Sep 6 15:34:23 2018:635882:26908:MAST:MAST:DEBUG:event.c:492 ses_cnt :1
    Thu Sep 6 15:34:23 2018:636027:26908:MAST:MAST:DEBUG:sync_entity.c:938sesid:914: opcode getappkey
    Thu Sep 6 15:34:23 2018:903331:26908:MAST:MAST:DEBUG:sync.c:829sesid:914:getappkey
    Thu Sep 6 15:34:23 2018:903372:26908:MAST:MAST:DEBUG:sync.c:811sesid:914 getappkey 3
    Thu Sep 6 15:34:24 2018:642114:26908:MAST:MAST:DEBUG:event.c:492 ses_cnt :1
    Thu Sep 6 15:34:24 2018:642254:26908:MAST:MAST:DEBUG:sync_entity.c:938sesid:915: opcode lic_substatus
    Thu Sep 6 15:34:24 2018:908636:26908:MAST:MAST:DEBUG:sync.c:829sesid:915:lic_substatus
    Thu Sep 6 15:34:24 2018:908705:26908:MAST:MAST:DEBUG:sync.c:811sesid:915 lic_substatus 3
    Thu Sep 6 15:34:24 2018:927888:26908:MAST:MAST:DEBUG:event.c:492 ses_cnt :1
    Thu Sep 6 15:34:24 2018:928042:26908:MAST:MAST:DEBUG:sync_entity.c:938sesid:916: opcode readApiInterface
    Thu Sep 6 15:34:25 2018:538136:26908:MAST:MAST:DEBUG:sync.c:829sesid:916:readApiInterface
    Thu Sep 6 15:34:25 2018:538223:26908:MAST:MAST:DEBUG:sync.c:811sesid:916 readApiInterface 3

     

     

     

    EDIT AGAIN:

    Tested with XG210 cluster -> no problems. No high cpu load.

    Tested SFOS 17.0.8-209 performance indicator and load is still high, but not 100% cpu like 17.1.2MR2

    Maybe something like a firmware update came with the latest MR2 and now slows down the appliances? 

    It really nice to see process GETHAINFO is so hungry in MR2 and in MR1 it's abot 30% less.

     

     

     

    This is the result of a fresh installed XG125Rev3 with default config:

     

    Now it gets really funny. this is the detailed load view of the newly installed machine:

     

     

     

    EDIT:

    Found the process, can't kill it because the monitor process is restarting it again and again.

    After I killed the PPID is load is ~0-2

Reply
  • Nothing, thats what makes me so helpless.

    I can also delete HA config, but the processes are still there.

    Reboot wont't help. Seems if got to go back to MR1.

     

     

     

    EDIT:

    After some investigation in seems like the Msyncd service causing this behavior.

    I never seen this before. 

     

    Log showing:

    Thu Sep 6 15:33:49 2018:519367:26908:MAST:MAST:DEBUG:sync.c:829sesid:901:readApiInterface
    Thu Sep 6 15:33:49 2018:519446:26908:MAST:MAST:DEBUG:sync.c:811sesid:901 readApiInterface 3
    Thu Sep 6 15:33:53 2018:626892:26908:MAST:MAST:DEBUG:event.c:492 ses_cnt :1
    Thu Sep 6 15:33:53 2018:627055:26908:MAST:MAST:DEBUG:sync_entity.c:938sesid:902: opcode getappkey
    Thu Sep 6 15:33:53 2018:895767:26908:MAST:MAST:DEBUG:sync.c:829sesid:902:getappkey
    Thu Sep 6 15:33:53 2018:895852:26908:MAST:MAST:DEBUG:sync.c:811sesid:902 getappkey 3
    Thu Sep 6 15:33:55 2018:572904:26908:MAST:MAST:DEBUG:event.c:492 ses_cnt :1
    Thu Sep 6 15:33:55 2018:573073:26908:MAST:MAST:DEBUG:sync_entity.c:938sesid:903: opcode lic_substatus
    Thu Sep 6 15:33:55 2018:834550:26864:MAST:MAST:DEBUG:worker.c:587 idle workers 10.
    Thu Sep 6 15:33:55 2018:834592:26864:MAST:MAST:DEBUG:worker.c:628 worker_num 14.
    Thu Sep 6 15:33:55 2018:841181:26908:MAST:MAST:DEBUG:sync.c:829sesid:903:lic_substatus
    Thu Sep 6 15:33:55 2018:841236:26908:MAST:MAST:DEBUG:sync.c:811sesid:903 lic_substatus 3
    Thu Sep 6 15:33:55 2018:867882:26908:MAST:MAST:DEBUG:event.c:492 ses_cnt :1
    Thu Sep 6 15:33:55 2018:868009:26908:MAST:MAST:DEBUG:sync_entity.c:938sesid:904: opcode readApiInterface
    Thu Sep 6 15:33:56 2018:478815:26908:MAST:MAST:DEBUG:sync.c:829sesid:904:readApiInterface
    Thu Sep 6 15:33:56 2018:478890:26908:MAST:MAST:DEBUG:sync.c:811sesid:904 readApiInterface 3
    Thu Sep 6 15:34:01 2018:015950:26908:MAST:MAST:DEBUG:event.c:492 ses_cnt :1
    Thu Sep 6 15:34:01 2018:194002:26908:MAST:MAST:DEBUG:event.c:492 ses_cnt :2
    Thu Sep 6 15:34:01 2018:194154:26908:MAST:MAST:DEBUG:sync_entity.c:938sesid:906: opcode garnerevent
    Thu Sep 6 15:34:01 2018:267592:26908:MAST:MAST:DEBUG:event.c:492 ses_cnt :3
    Thu Sep 6 15:34:01 2018:279379:26908:MAST:MAST:DEBUG:sync.c:829sesid:905:
    Thu Sep 6 15:34:01 2018:279438:26908:MAST:MAST:DEBUG:sync.c:811sesid:905 3
    Thu Sep 6 15:34:01 2018:457749:26908:MAST:MAST:DEBUG:sync.c:829sesid:906:garnerevent
    Thu Sep 6 15:34:01 2018:457814:26908:MAST:MAST:DEBUG:sync.c:811sesid:906 garnerevent 3
    Thu Sep 6 15:34:01 2018:530231:26908:MAST:MAST:DEBUG:sync.c:829sesid:907:
    Thu Sep 6 15:34:01 2018:530296:26908:MAST:MAST:DEBUG:sync.c:811sesid:907 3
    Thu Sep 6 15:34:07 2018:713782:26908:MAST:MAST:DEBUG:event.c:492 ses_cnt :1
    Thu Sep 6 15:34:07 2018:713929:26908:MAST:MAST:DEBUG:sync_entity.c:938sesid:908: opcode lic_substatus
    Thu Sep 6 15:34:07 2018:981884:26908:MAST:MAST:DEBUG:sync.c:829sesid:908:lic_substatus
    Thu Sep 6 15:34:07 2018:981959:26908:MAST:MAST:DEBUG:sync.c:811sesid:908 lic_substatus 3
    Thu Sep 6 15:34:08 2018:001467:26908:MAST:MAST:DEBUG:event.c:492 ses_cnt :1
    Thu Sep 6 15:34:08 2018:001626:26908:MAST:MAST:DEBUG:sync_entity.c:938sesid:909: opcode readApiInterface
    Thu Sep 6 15:34:08 2018:613194:26908:MAST:MAST:DEBUG:sync.c:829sesid:909:readApiInterface
    Thu Sep 6 15:34:08 2018:613264:26908:MAST:MAST:DEBUG:sync.c:811sesid:909 readApiInterface 3
    Thu Sep 6 15:34:13 2018:679027:26908:MAST:MAST:DEBUG:event.c:492 ses_cnt :1
    Thu Sep 6 15:34:13 2018:946962:26908:MAST:MAST:DEBUG:sync.c:829sesid:910:
    Thu Sep 6 15:34:13 2018:947032:26908:MAST:MAST:DEBUG:sync.c:811sesid:910 3
    Thu Sep 6 15:34:15 2018:874547:26864:MAST:MAST:DEBUG:worker.c:587 idle workers 10.
    Thu Sep 6 15:34:15 2018:874601:26864:MAST:MAST:DEBUG:worker.c:628 worker_num 14.
    Thu Sep 6 15:34:17 2018:002205:26908:MAST:MAST:DEBUG:event.c:492 ses_cnt :1
    Thu Sep 6 15:34:19 2018:705657:26908:MAST:MAST:DEBUG:sync.c:829sesid:911:
    Thu Sep 6 15:34:19 2018:705723:26908:MAST:MAST:DEBUG:sync.c:811sesid:911 3
    Thu Sep 6 15:34:19 2018:985850:26908:MAST:MAST:DEBUG:event.c:492 ses_cnt :1
    Thu Sep 6 15:34:19 2018:985982:26908:MAST:MAST:DEBUG:sync_entity.c:938sesid:912: opcode lic_substatus
    Thu Sep 6 15:34:20 2018:253087:26908:MAST:MAST:DEBUG:sync.c:829sesid:912:lic_substatus
    Thu Sep 6 15:34:20 2018:253165:26908:MAST:MAST:DEBUG:sync.c:811sesid:912 lic_substatus 3
    Thu Sep 6 15:34:20 2018:284612:26908:MAST:MAST:DEBUG:event.c:492 ses_cnt :1
    Thu Sep 6 15:34:20 2018:284731:26908:MAST:MAST:DEBUG:sync_entity.c:938sesid:913: opcode readApiInterface
    Thu Sep 6 15:34:20 2018:899570:26908:MAST:MAST:DEBUG:sync.c:829sesid:913:readApiInterface
    Thu Sep 6 15:34:20 2018:899614:26908:MAST:MAST:DEBUG:sync.c:811sesid:913 readApiInterface 3
    Thu Sep 6 15:34:23 2018:635882:26908:MAST:MAST:DEBUG:event.c:492 ses_cnt :1
    Thu Sep 6 15:34:23 2018:636027:26908:MAST:MAST:DEBUG:sync_entity.c:938sesid:914: opcode getappkey
    Thu Sep 6 15:34:23 2018:903331:26908:MAST:MAST:DEBUG:sync.c:829sesid:914:getappkey
    Thu Sep 6 15:34:23 2018:903372:26908:MAST:MAST:DEBUG:sync.c:811sesid:914 getappkey 3
    Thu Sep 6 15:34:24 2018:642114:26908:MAST:MAST:DEBUG:event.c:492 ses_cnt :1
    Thu Sep 6 15:34:24 2018:642254:26908:MAST:MAST:DEBUG:sync_entity.c:938sesid:915: opcode lic_substatus
    Thu Sep 6 15:34:24 2018:908636:26908:MAST:MAST:DEBUG:sync.c:829sesid:915:lic_substatus
    Thu Sep 6 15:34:24 2018:908705:26908:MAST:MAST:DEBUG:sync.c:811sesid:915 lic_substatus 3
    Thu Sep 6 15:34:24 2018:927888:26908:MAST:MAST:DEBUG:event.c:492 ses_cnt :1
    Thu Sep 6 15:34:24 2018:928042:26908:MAST:MAST:DEBUG:sync_entity.c:938sesid:916: opcode readApiInterface
    Thu Sep 6 15:34:25 2018:538136:26908:MAST:MAST:DEBUG:sync.c:829sesid:916:readApiInterface
    Thu Sep 6 15:34:25 2018:538223:26908:MAST:MAST:DEBUG:sync.c:811sesid:916 readApiInterface 3

     

     

     

    EDIT AGAIN:

    Tested with XG210 cluster -> no problems. No high cpu load.

    Tested SFOS 17.0.8-209 performance indicator and load is still high, but not 100% cpu like 17.1.2MR2

    Maybe something like a firmware update came with the latest MR2 and now slows down the appliances? 

    It really nice to see process GETHAINFO is so hungry in MR2 and in MR1 it's abot 30% less.

     

     

     

    This is the result of a fresh installed XG125Rev3 with default config:

     

    Now it gets really funny. this is the detailed load view of the newly installed machine:

     

     

     

    EDIT:

    Found the process, can't kill it because the monitor process is restarting it again and again.

    After I killed the PPID is load is ~0-2

Children
No Data