HTTP and majority of HTTPS returns 502 error or will not connect

Hi there,

Twice in three days I've had a network issue caused by Sophos XG210 and I would like some feedback on how to prevent it re-occurring or, at the very least, how to solve it in the future.

In both cases, HTTP connections failed with intermittent 502 errors in Chrome and Safari, while only some HTTPS connections worked, including Google (and its derivatives) and Facebook. The HTTPS connections that failed also gave intermittent 502 errors. Pings and other protocols still worked as expected during the outage.

These issues occurred regardless of which policy governed the traffic, with HTTP and HTTPS decryption turned on or off, and with web and application filters turned on or off. I even set up a policy for a small Lab network with a *shudder* "Permit Any" policy, and even this didn't solve the issue.

The device was fully updated. The last successful update occurred around 45 minutes before the latest issue arose and involved the Avira and Sophos AV. This may be the place to start, as another Sophos-governed network in our company also experienced issues around the same time. Unfortunately there is no network admin at that location and I haven't been able to get hold of its logs yet, so I cannot compare the two.

The outages both resolved themselves after 1-1.5 hours. During this time I had to connect our internal network directly to the WAN links due to connectivity being more of an issue than security (not my words!). This situation cannot be allowed to become a common occurrence. 

What troubleshooting steps can I now follow to find the source of this problem? Any help is greatly appreciated!

Tom

  • Hi Tom,

    Check #1 in the Analysis Guide.

    Post the logs.

    Thanks

  • Hi there,

    Logs for the latest occurrence can be found here: https://www.dropbox.com/s/j2zom91sik0uwde/drop%20packet%20capture%2031-08.txt?dl=0

    The frequency of these issues is increasing and I'm getting a bit panicky about what to do here short of reinstalling the old device. Any help is appreciated

    Tom

  • Hi Tom,

    Thanks for the logs. No need to worry, we are here to help. :)

    Date=2016-08-31 Time=18:39:25 log_id=0102021 log_type=Firewall log_component=Invalid_Traffic log_subtype=Denied log_status=N/A

    The log line above states that the packet was denied as invalid traffic. Please post a screenshot of the firewall rule for LAN_WAN. Also, did you take a packet capture and a tcpdump for this issue? I suggested a packet capture in #1 of the guide I referred to.

    Thanks

  • Hi Sachin,

    This is the rule under which the issue is occurring. It is not the production rule; it is shown only to illustrate that an almost completely open rule gives me the same issues as the stricter production rule that I normally apply:

    www.dropbox.com/.../Policy 99.jpg

    I ran a second drop-packet-capture at the same time as a packet capture. The following is a selected output from the drop-packet-capture:

    2016-09-01 08:49:20 0102021 IP 172.20.99.23.2492 > 134.170.3.183.2492 : proto TCP: P 2246625031:2246625038(7) win 16522 checksum : 1289
    0x0000: 4500 002f 40e3 4000 7f06 2159 ac14 6317 E../@.@...!Y..c.
    0x0010: 86aa 03b7 09bc 09bc 85e8 c707 0dfa 523c ..............R<
    0x0020: 5018 408a 0509 0000 1007 0000 0000 00 P.@............
    Date=2016-09-01 Time=08:49:20 log_id=0102021 log_type=Firewall log_component=Invalid_Traffic log_subtype=Denied log_status=N/A log_priority=Alert duration=N/A in_dev=Port1 out_dev= inzone_id=0 outzone_id=0 source_mac=5c:45:27:db:0d:81 dest_mac=00:1a:8c:51:70:ac l3_protocol=IP source_ip=172.20.99.23 dest_ip=134.170.3.183 l4_protocol=TCP source_port=2492 dest_port=2492 fw_rule_id=0 policytype=0 live_userid=0 userid=0 user_gp=0 ips_id=0 sslvpn_id=0 web_filter_id=0 hotspot_id=0 icap_id=0 app_filter_id=0 app_category_id=0 app_id=0 category_id=0 bandwidth_id=0 up_classid=7133754792471429120 dn_classid=0 source_nat_id=0 cluster_node=0 inmark=0 nfqueue=0 scanflags=0 gateway_offset=0 max_session_bytes=0 drop_fix=0 ctflags=0 connid=0 masterid=0 status=0 state=0 sent_pkts=N/A recv_pkts=N/A sent_bytes=N/A recv_bytes=N/A tran_src_ip=N/A tran_src_port=N/A tran_dst_ip=N/A tran_dst_port=N/A
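
For anyone comparing many of these log lines, here is a minimal Python sketch that splits one into fields. It assumes the space-separated key=value layout shown above; the function names and the shortened sample line are illustrative, not part of any Sophos tool.

```python
def parse_log_line(line):
    """Split a space-separated key=value log line into a dict of strings."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens that carry no '='
            fields[key] = value  # empty values such as 'out_dev=' become ''
    return fields


def is_invalid_traffic_drop(fields):
    """True when the line records a packet denied as invalid traffic."""
    return (fields.get("log_component") == "Invalid_Traffic"
            and fields.get("log_subtype") == "Denied")


# Shortened copy of the log line quoted in this post.
SAMPLE = ("Date=2016-09-01 Time=08:49:20 log_id=0102021 log_type=Firewall "
          "log_component=Invalid_Traffic log_subtype=Denied in_dev=Port1 out_dev= "
          "source_ip=172.20.99.23 dest_ip=134.170.3.183 l4_protocol=TCP "
          "source_port=2492 dest_port=2492 fw_rule_id=0")

fields = parse_log_line(SAMPLE)
print(fields["source_ip"], "->", fields["dest_ip"],
      "rule", fields["fw_rule_id"],
      "invalid drop:", is_invalid_traffic_drop(fields))
```

Filtering a whole drop-packet-capture file through `is_invalid_traffic_drop` would show whether the drops cluster on particular destinations or ports.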

    The following is the associated packet from packet capture:

    Ethernet Header
    Source MAC Address:5c:45:27:db:0d:81
    Destination MAC Address: 00:1a:8c:51:70:ac
    Ethernet Type IPv4 (0x800)
     
    IPv4 Header
    Source IP Address:172.20.99.23
    Destination IP Address:134.170.3.183
    Protocol: TCP
    Header:20 Bytes
    Type of Service: 0
    Total Length: 47 Bytes
    Identification:16611
    Fragment Offset:16384
    Time to Live: 127
    Checksum: 8537
     
    TCP Header:
    Source Port: 2492
    Destination Port: 2492
    Flags: PSH
    Sequence Number: 2246625031
    Acknowledgement Number: 234508860
    Window: 16522
    Checksum: 1289
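
As a cross-check of the two outputs, the sketch below decodes the three hex rows from the drop-packet-capture by hand. It assumes the dump starts at the IPv4 header (the first byte, 0x45, suggests so); the function and dictionary key names are made up for illustration. Two details stand out when decoding: the 16-bit flags/fragment field is 0x4000, which is the Don't Fragment bit with a fragment offset of 0 (16384 is the same field printed as one decimal number), and the TCP flag byte is 0x18, meaning PSH and ACK together.

```python
import socket
import struct

# Hex rows copied from the drop-packet-capture output above.
HEX_DUMP = """
4500 002f 40e3 4000 7f06 2159 ac14 6317
86aa 03b7 09bc 09bc 85e8 c707 0dfa 523c
5018 408a 0509 0000 1007 0000 0000 00
"""

# Bit 0 is FIN, bit 1 is SYN, and so on up to URG.
TCP_FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]


def decode_ipv4_tcp(hex_text):
    """Decode an IPv4 header followed by a TCP header from a hex dump."""
    raw = bytes.fromhex("".join(hex_text.split()))
    (ver_ihl, _tos, total_len, ident, flags_frag, ttl, proto, _ip_csum,
     src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    ihl = (ver_ihl & 0x0F) * 4  # IPv4 header length in bytes
    (sport, dport, seq, ack, off_flags, window, _tcp_csum,
     _urgent) = struct.unpack("!HHIIHHHH", raw[ihl:ihl + 20])
    data_offset = (off_flags >> 12) * 4  # TCP header length in bytes
    flag_bits = off_flags & 0x3F
    return {
        "src_ip": socket.inet_ntoa(src),
        "dst_ip": socket.inet_ntoa(dst),
        "total_length": total_len,
        "identification": ident,
        "dont_fragment": bool(flags_frag & 0x4000),
        "fragment_offset": flags_frag & 0x1FFF,
        "ttl": ttl,
        "protocol": proto,
        "src_port": sport,
        "dst_port": dport,
        "seq": seq,
        "ack": ack,
        "flags": [name for bit, name in enumerate(TCP_FLAG_NAMES)
                  if flag_bits & (1 << bit)],
        "window": window,
        "payload_len": total_len - ihl - data_offset,
    }


decoded = decode_ipv4_tcp(HEX_DUMP)
for key, value in decoded.items():
    print(f"{key}: {value}")
```

The decoded addresses, ports, sequence and acknowledgement numbers match the packet capture listing, so both tools are describing the same packet.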

    It indicates it was governed by rule-id 0. I assume this means it hasn't matched a rule?

    I have experienced this issue with a "permit any" filter enabled on the firewall so I'm not sure what role the policies play in this.

    Further, this only happens using one specific gateway and is only a recent occurrence. It happened roughly the same time as a network change where we put in a router upstream of Sophos between it and the VSAT modem (ISP request).

    • If I connect to the modem directly - No issue
    • If I connect to the modem via the new router without Sophos - No Issue
    • If I connect Sophos directly to the modem (new router removed) - Issue
    • If I connect Sophos to the modem via the router - Issue

    The ISP says it is experiencing no issue with the link. 

    I'm really confused about why a select number of sites, such as Facebook and Google and its derivatives, can still be accessed. I thought it might be an IPv6 issue, but I have tried other sites with IPv6 addresses and had no luck. DNS resolves without issue both from Sophos and from the internal network.

    Thanks,

    Tom

  • Hi Tom,

    In the firewall rule, is there any specific need to use the gateway-specific default NAT? Can you set it to MASQ and disable the gateway-specific NAT? Also, check #4.1 in the guide to verify the DNS settings.

    Hope that helps.

  • Hi Sachin,

    Unfortunately our ISP demarc point is a router that sits between the Sophos XG and the modem. This router NATs our private addresses to our public IP. It is the only one of our WAN links with this configuration, so this gateway has a gateway-specific NAT of NONE.

    DNS settings are all fine. Both Sophos and devices on our local network can resolve DNS queries using nslookup. Sophos gets its DNS information from 8.8.4.4 and 8.8.8.8 and our internal network queries Sophos.

    I switched our internet to our 4G gateway and reapplied our normal policy. I immediately started receiving HTTP 502 errors again. I am now going to troubleshoot this gateway and isolate the problem to either IPS or Web Filtering.

    EDIT: Turning off HTTP scanning removes the 502 errors on our 4G gateway for the time being. As this has been an intermittent issue, I cannot be sure this is a permanent fix. Can someone tell me why HTTP scanning gives 502 errors? I thought this only happened with HTTPS scanning when the Sophos certificate is not installed as a trusted certificate on user devices.
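
Because the errors are intermittent, a repeatable probe run from a LAN client may help confirm whether a change such as disabling HTTP scanning really made a difference. The following is only a sketch: the function names and parameters are hypothetical, it assumes clients reach the web through the firewall transparently (no explicit proxy configured), and the URL list is up to whoever runs it.

```python
import http.client
import time
import urllib.error
import urllib.request
from collections import Counter


def probe(url, timeout=10.0):
    """Return the HTTP status code for url as a string, or an error label."""
    # An empty ProxyHandler ignores proxy environment variables, so the
    # request takes the same path as an ordinary transparent client.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return str(response.status)
    except urllib.error.HTTPError as exc:
        return str(exc.code)  # 4xx/5xx responses, including 502
    except (urllib.error.URLError, http.client.HTTPException, OSError) as exc:
        return f"error: {type(exc).__name__}"


def summarize(urls, rounds=3, pause=1.0, probe_fn=probe):
    """Probe every URL `rounds` times and count the outcomes per URL."""
    results = {url: Counter() for url in urls}
    for _ in range(rounds):
        for url in urls:
            results[url][probe_fn(url)] += 1
        time.sleep(pause)
    return results
```

Running `summarize([...], rounds=20, pause=30)` once with HTTP scanning enabled and once with it disabled, then comparing the share of "502" results, would give a rough before/after picture for each gateway.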

  • I'm concerned that this last post was Sep 01 and there has been no follow-up from Sophos or anyone else. Did you determine whether this is a firmware bug, and if so, which firmware version resolved it?

    I just put in an XG115 for an SMB and they are getting sporadic 502 errors. This was the first morning since installing the device, though; the errors then resolved themselves, and an hour has gone by with no issue.

    I'm wondering: if filtering in the LAN to WAN policy is set to "Allow All", what's the point? Why not set it to "None"? I would guess that fixed the issue.

  • Hi Jeff,

    I didn't receive the notification mail for this post, which is why I missed following up. I can't recall the details of this case, but is your XG on the latest firmware, and is the issue still observed? If it is a common issue, I am sure it has been worked on by now.

    Thanks

  • [moving reply to the new thread that Jeff created]