
RegEx URL: Exactly what implementation of regular expressions is used on Sophos XG (SFOS15)?

Is it Posix, Extended RegExp, Perl, ECMAscript or other?
I have had a hard time finding the correct syntax for HTTP bypass rules, and the documentation does not make it clear...

It would also be very nice to have a RegEx tester built in, to check whether your syntax actually matches what you want, and does not by mistake match every URL!
(Is there somewhere in the logs to check this?)

- Martin

EDIT:
And what is the sane explanation for why it is not possible to use RegEx bypass rules for HTTPS scanning?!?
This does not make any sense...



This thread was automatically locked due to age.
  • Hi Martin,

    Here's an update-

    We have different checks for RegEx in multiple places. The RegEx should be both Perl and Java compatible, the maximum number of URLs in an exception list must be < 128, and the length of each URL must be < 100.

    HTTP Proxy:

    The proxy compiles the RegExes entered in the UI using pcre_compile, i.e. PCRE (“Perl-compatible regular expressions”).

    API:

    URL RegExs can’t start with ^https:// or ^http://


    RegExes are not automatically anchored; anchor them yourself if that is what you want. Example: ^microsoft\.com/ will match http://microsoft.com/ but not http://www.microsoft.com/. If the anchor is missing, as in microsoft\.com/, then both http://microsoft.com and http://www.microsoft.com will match.

    The max length of a URL RegEx is 100; this is restricted by the DB schema

    URL RegEx is validated by the Perl compiler
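The anchoring behavior described above can be tried outside the firewall. This is only a minimal sketch: it uses Python's `re` module as a stand-in for PCRE, and it assumes (based on the microsoft.com example) that the pattern is matched against the URL with the scheme removed. The helper name `matches` is hypothetical and not part of SFOS.

```python
import re

def matches(pattern: str, url: str) -> bool:
    """Return True if the pattern is found anywhere in the URL once the scheme is stripped."""
    # Assumption: matching happens against host + path, without "http://" or "https://".
    target = re.sub(r"^https?://", "", url)
    # re.search does not anchor on its own, mirroring the "not automatically anchored" rule.
    return re.search(pattern, target) is not None

anchored = r"^microsoft\.com/"
unanchored = r"microsoft\.com/"

for url in ("http://microsoft.com/", "http://www.microsoft.com/"):
    print(url, "anchored:", matches(anchored, url), "unanchored:", matches(unanchored, url))
```

Note that the unanchored pattern also matches when the text appears later in the URL, for example in a query string, which is one way such a rule can end up matching far more than intended.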

    UI:

    URL RegExs can’t start with ^https:// or ^http:// (there's a bug though, see NC-11547)

    The max length of urlregex is 100

    We use JavaScript's built-in RegExp to validate the syntax of the RegExes:
    1. Check # of groups (e.g. if \2 is used, there must be at least 2 groups)
    2. Check [] content (e.g. [] should not be allowed because it's empty)

    The total # of URL RegExes in an exception must be < 128
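The limits listed above can be checked before pasting a list of patterns into the UI. The following is a rough sketch, not the actual Sophos validation: it uses Python's `re` compiler as a stand-in for the Perl and JavaScript checks, and the function name `check_exception` and the two constants are hypothetical names chosen for illustration.

```python
import re

MAX_PATTERN_LENGTH = 100  # from the post; whether exactly 100 is allowed is not clear
MAX_PATTERNS = 128        # from the post: the count must stay below 128

def check_exception(patterns: list[str]) -> list[str]:
    """Return a list of human-readable problems; an empty list means no problem was found."""
    problems = []
    if len(patterns) >= MAX_PATTERNS:
        problems.append(f"{len(patterns)} patterns; the limit is fewer than {MAX_PATTERNS}")
    for p in patterns:
        if len(p) > MAX_PATTERN_LENGTH:
            problems.append(f"pattern longer than {MAX_PATTERN_LENGTH} characters: {p[:20]}...")
        if p.startswith(("^http://", "^https://")):
            problems.append(f"pattern must not start with ^http:// or ^https://: {p}")
        try:
            # Python's compiler rejects an empty [] and a \2 without a second group,
            # similar to the two UI checks listed above.
            re.compile(p)
        except re.error as exc:
            problems.append(f"pattern does not compile ({exc}): {p}")
    return problems

print(check_exception([r"^microsoft\.com/", r"^https://example\.com/", "[]", r"(a)\2"]))
```

Python's regex dialect differs from PCRE and JavaScript in some details, so a pattern that passes here may still be rejected by the firewall, and vice versa.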

    Hope that helps :)

  • "RegExes are not automatically anchored and must be if desired (example: ^microsoft\.com/ "

     

    We just implemented the XG, coming off the UTM, and not anchoring the regex made the appliance unusable; the processor would max out under any kind of load.

    With the current design, it should probably be required.

     

    Our support person is supposed to be writing a KBA about it shortly.

  • Please be aware of this KB

    https://community.sophos.com/kb/en-us/127270

    Summary: In URL Groups and in Categories there is no RegEx; the KB describes what substring matching is done.

     

    In XG Web \ Exceptions we do not automatically anchor on the left side. This gives more flexibility to admins. Yes, that includes the flexibility to be inefficient. We will not change this because it would affect existing customers.

     

    In both the XG and the UTM (and I would argue any computer system anywhere), when you create a new object you should copy the existing out-of-box (OOB) objects as much as possible.

    In the XG, one of the OOB exceptions is:

    ^([A-Za-z0-9.-]*\.)?apple\.com\.?/

    So if you want to create a new exception, match that style.
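To see what the out-of-box style buys you, the apple.com pattern can be run against a few hosts. This is a sketch using Python's `re` as a stand-in for PCRE, assuming the pattern is applied to the URL without its scheme; the sample hostnames and the helper name `is_excepted` are made up for illustration.

```python
import re

# The out-of-box pattern quoted above, unchanged.
OOB_APPLE = re.compile(r"^([A-Za-z0-9.-]*\.)?apple\.com\.?/")

def is_excepted(host_and_path: str) -> bool:
    """Return True if the scheme-less URL is matched by the out-of-box pattern."""
    return OOB_APPLE.search(host_and_path) is not None

samples = [
    "apple.com/",               # bare domain
    "www.apple.com/store",      # subdomain, via the optional group
    "apple.com./",              # trailing-dot form of the hostname
    "notapple.com/",            # different domain ending in the same letters
    "apple.com.evil.example/",  # apple.com used as a prefix of another host
    "evil.example/apple.com/",  # apple.com appearing only in the path
]
for s in samples:
    print(s, is_excepted(s))
```

The leading ^ together with the optional subdomain group covers subdomains while still rejecting lookalike hosts, so the pattern stays anchored without giving up the subdomain matching people usually want.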

  • The KB doesn't mention the performance hit of not using the ^.

    Coming from the UTM, we copied our existing regexes over but removed the ^https:// since it wasn't allowed, not realizing that dropping the ^ had a huge performance hit. I started reviewing everything to find what was maxing out the processor, and this thread in particular mentioned that omitting the ^ would allow subdomains, which is what we were trying to do anyway.

    I see your point about not requiring it, but I would caution people to use unanchored expressions very sparingly.

  • On Sophos XG 17/18, can I use a regular expression to block?

    For example, I need to block the word "Anime" in my environment; what is the best way to do this?
