
SFOS 16.01.2 snort high cpu even with None in policy

Not sure if this is related to 16.01.2 or to a pattern update, but shortly after I updated on 11/29 my CPU usage more than doubled. Nothing else in the configuration changed apart from the 16.01.2 update (and probably some behind-the-scenes pattern updates).

I didn't even know the CPU was under load until yesterday, 12/7, when my traffic slowed to a crawl. When I logged on to the console, snort was taking 100% CPU!

I checked a few links from the board and found my maxpkts was set to 80, so I lowered it to 8. That has helped a lot, keeping snort at around 60-70% CPU, but the system is definitely running hotter than usual (compared to the previous SFOS 16.01.1).

It also seems that VLAN routing (zone-to-zone) policies influence snort (some sort of pre-filtering?) even though the IPS policy for that rule is set to None. Is there a way to exclude traffic from snort pre-filtering when the rule sets IPS to None?

Thanks



  • I have the same issue and already have a case open.

    Stopping the IPS service resolves the issue.

    But what should I do when a user wants to enable IPS on some firewall rules?

    I am waiting for a response from Sophos...

  • Dear All,

    After two months of work, the support team finally resolved my issue.

    They added the command 'set ips ips-instance add IPS cpu 1'.

    The command basically creates 2 instances of IPS, allowing the IPS to use the 2nd core of the appliance to process IPS traffic.

    I also found that the snort process doesn't appear in the CPU utility.

    The support team replied as follows:

    Ans-> There can be multiple reasons why snort is not listed in the CPU utility, but the fact that it is not listed doesn't mean there is a problem.

    The issue is finally resolved...

  • The command 'set ips ips-instance add IPS cpu 1' does resolve the issue, but I found that the setting disappears after a reboot.

    I can't ask my customer to do this every time the appliance reboots.

    Is there a way to set it permanently?

    It's so sad...

  • Hi Shunzeelee,

    If the changes are reverted after a reboot, contact support to inspect it and escalate it to the developers.

    Thanks

  • The support team gave us the following command:
    set ips ips-instance apply

    And it finally works!

    The setting is no longer lost after a reboot.

    Thanks~

  • Thanks for sharing your issue and for supporting everyone in the community. Many users will benefit from your commitment and patience!

    Regards

  • Hello ShunzeLee,

    Could you please let me know approximately how much traffic (Mbps/Gbps) passes through your XG and the approximate number of users? Sophos support also increased the number of CPUs in my XG-430 (up to 3 in my case), but I haven't had much luck so far; I've been waiting a little over 3 weeks...

    Thanks,

    R.

  • 30 users, and its throughput is as follows:


    It took the support team about 2 months to resolve the issue...

  • Same issue for 2+ months, but zero suggestions from support in our case. I don't understand why I should experiment on our in-production box (experiment = try different suggestions from forum guys). In my view, Sophos should have a test stand with all the SG/XG models and firmware versions available, to reproduce this issue (or not) and test their suggestions in the test lab.

    Last week they switched us to a new L1 support engineer (the 4th one, to be exact)... this is such bollocks.

    P.S. Every new L1 guy starts the same song: "Hello sir, let me connect to your box using your desktop/teamviewer or whatever ****". This is such an annoying situation; I don't understand what I'm paying for.

  • I agree with you all.

    I have seen several users and threads about IPS performance issues, and I really hope v17 will fix all of them.

    IPS will be improved in v17.

    Using XG without IPS in certain environments is almost useless.

    Apart from logging, XG suffers from poor IPS performance.

    Let's wait for v17!

  • I couldn't agree with you more.

  • Has anyone tried 16.05.1MR1? I was just told that "development" suspects it will solve my issues... (we can have up to 3K devices concurrently and push > 1Gbps)

    As per the release notes (community.sophos.com/.../sfos-16-05-1-mr1-released):

    • NC-14599 [IPS] IPS constantly taking high CPU when deployed in discover mode

    ... this is the only entry that is IPS related, and I am not using discover mode.

    Since we are in bridge mode, testing is disruptive, and we most likely won't have time to do it in the near future.

    If anyone gives it a try, please let us know how it went.

    R.

  • I have seen this IPS issue on every firmware version that has been released.  I have over 30 units deployed at various clients and I have tested and verified this issue on several models including the XG85W, XG105W, XG115W, XG125W, and the XG210.

    I have tested and re-tested every firmware version that has been released, and the only way to "fix" the issue is to either disable ATP or to stop the IPS service completely.

    I have a case open with support, so hopefully they can get this resolved soon.  

    I am interested to see if everyone can reproduce this issue on their units by following my testing process.  Please give it a shot and post if you have similar results.

    NOTE: All my testing was done with one LAN-to-WAN rule and no web filtering, application filtering, or IP rules configured.

    1. Install Google Chrome and the "Page Load Time" plugin found here - https://chrome.google.com/webstore/detail/page-load-time/fploionmjgeclbkemipmkogoaohcdbig?hl=en

    2. In Google Chrome, open a tab and open the developer tools by pressing F12.

    3. In the developer tools panel, click the menu button (the three vertical dots) and click Settings. Then scroll to the "Network" section and check the box next to "Disable cache (while DevTools is open)". This stops Chrome from caching any pictures, etc., which is crucial for testing correctly.

    4. Disable ATP on the Sophos firewall and wait 5 minutes or so to make sure all the services have stopped.

    5. Load and refresh the following websites to establish a baseline for how long the pages take to load. (Remember, this MUST be done with the developer tools open so that the images are not cached when you reload the pages.) The "Page Load Time" plugin will give you the load times. Take note of the load times for the following sample sites: nba.com, amc.com, msnbc.com, cnn.com, foxnews.com.

    6. Open an SSH session and go to the advanced shell prompt (option 5, then 3).

    7. Enter the "top -d 1" command to see the CPU usage refresh every second.

    8. Start the ATP service and wait 5 minutes or so to make sure the services have started. With the developer tools still open (F12) and "Disable cache (while DevTools is open)" still checked, reload those same web pages.

    9. In my experience, the pages hang for 3-4 seconds, start loading, hang again for 3-4 seconds, and then finish loading. Some pages that take 4 seconds to load with ATP turned off take 9-20 seconds to load with it turned on. And while the pages hang, you will see the snort process spike to 99-100%.

    Please share your findings.
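
    For anyone who would rather script the before/after comparison than read the numbers off the plugin, here is a rough Python sketch. It is not part of the procedure above and it is not equivalent to it: it only times the download of each site's main HTML document, so the absolute numbers will not match the "Page Load Time" plugin, and some sites may refuse a scripted client. The function names (fetch_seconds, measure, slowdown) are made up for this example.

```python
import time
import urllib.request
from statistics import median

# Sample sites named in the post above.
SITES = ["nba.com", "amc.com", "msnbc.com", "cnn.com", "foxnews.com"]


def fetch_seconds(url, opener=urllib.request.urlopen, timeout=30):
    """Return wall-clock seconds to download one URL (main document only)."""
    start = time.perf_counter()
    with opener(url, timeout=timeout) as resp:
        resp.read()
    return time.perf_counter() - start


def measure(sites, runs=3, fetch=fetch_seconds):
    """Return the median fetch time per site over several runs."""
    return {s: median(fetch("https://" + s) for _ in range(runs)) for s in sites}


def slowdown(baseline, with_atp):
    """Return ATP-on time divided by ATP-off time for each site in both runs."""
    return {s: with_atp[s] / baseline[s] for s in baseline if s in with_atp}
```

    Run measure(SITES) once with ATP disabled and once with it enabled, then pass both results to slowdown(). A page that goes from 4 seconds to 9 seconds, as described in step 9, comes out as a ratio of 2.25.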

  • That is a serious approach to this problem :)

    My 2 cents: a simple "ping" is enough to see that things are fu*** up. By the way, turning off ATP does not solve the issue completely. ATP is just a marketing "thing", and it still relies on Snort (the IPS software inside the Sophos box).
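
    If you want numbers from the ping check rather than eyeballing it, a small Python sketch like the following can summarize saved ping output. It assumes the usual "time=12.3 ms" reply format printed by common ping implementations; the function name summarize_ping is made up for this example.

```python
import re
from statistics import mean

# Matches the per-reply round-trip time, e.g. "time=12.3 ms" or "time<1ms".
RTT_RE = re.compile(r"time[=<]([\d.]+)\s*ms")


def summarize_ping(output):
    """Return count/min/avg/max of reply times in ms, or None if no replies."""
    rtts = [float(m.group(1)) for m in RTT_RE.finditer(output)]
    if not rtts:
        return None
    return {
        "count": len(rtts),
        "min": min(rtts),
        "avg": mean(rtts),
        "max": max(rtts),
    }
```

    Capture one ping run with the IPS service stopped and one with it running, and compare the "max" and "avg" values; large spikes in "max" are the kind of thing live traffic such as VoIP would notice first.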

  • Hi AleksandrIvanov,

    I am sorry for the experience you had with support. Please provide me with a case number so that I can look into it further. If you need any help with the case, please PM me so that I can assist you.

    Thanks

  • Hi Fitzroy,

    I think you have a different issue associated with IPS. Can you please PM me your query regarding this issue so that we can work together. 

    Please create a new thread on this query and let us help you.

    Thanks

  • Case 6697114

    I don't think that I need help with this case :) I think your L1 support needs help with it.

    I only need a proper firmware update in the stable branch (not development; I'm not hired by Sophos for testing) that will solve my problem with IPS.

    P.S. From my perspective the cause is simple (I could be wrong, of course): there was a massive IPS core update when you switched from v15 to v16, and the new IPS version doesn't work 'well' with most entry-level XG boxes. Also, most people who don't run live traffic (like VoIP or video conferencing) see no problem, because they never hit a situation where they would notice it.

    Thanks

     

    Regards,

    Aleksandr

  • Thanks,

    I have created a new thread and I will PM you as well.