Load Balancing HTTP Check

Hello forum,

I've bumped into an issue that puzzles me.

I have a client that has had a load balancing rule active for years, distributing traffic across 4 backend servers. It uses a TCP health check.

This ran fine until they upgraded the backend last night; the new backend doesn't seem to like the TCP connects from the UTM. So they've installed lighttpd on the backend servers and hacked together a CGI script that checks the status of the application and returns an HTTP 200 or 503 depending on its operational status.
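
For readers who want to build something similar, here is a minimal sketch of such a CGI health check in Python. It is not the script used on these servers: the marker file path and the "is down" message are assumptions made purely for illustration, and how you determine whether the application is operational will depend on the application.

```python
#!/usr/bin/env python3
"""Hypothetical CGI health check: 200 when the application is up, 503 otherwise."""
import os
import sys

# Assumption: the application keeps this marker file around while it is healthy.
STATUS_FILE = "/var/run/app/operational"  # hypothetical path


def build_response(operational: bool) -> str:
    """Return a complete CGI response: headers, blank line, body."""
    if operational:
        status, text = "200 OK", "Application Cluster Node is operational"
    else:
        status, text = "503 Service Unavailable", "Application Cluster Node is down"
    body = f"<html><body>{text}</body></html>"
    return (
        f"Status: {status}\r\n"       # CGI "Status" header; the web server turns it into the status line
        "Content-Type: text/html\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
        f"{body}"
    )


if __name__ == "__main__":
    sys.stdout.write(build_response(os.path.exists(STATUS_FILE)))
```

The response is built in a separate function so the 200 and 503 cases can be tested without a web server in front of the script.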

Checking the status manually works fine:

[server]$ curl http://127.0.0.1
HTTP/1.1 200 Ok
Content-Type: text/html
Content-Length: 65

<html><body>Application Cluster Node is operational</body></html>

[server]$ 

It also works fine from other servers in the subnet, so no server based firewall rules are in the way.

However, as soon as I change the load balancer rule's check from TCP to HTTP (either leaving the URL field empty or entering "index.php"), all nodes go down.

The server monitor logs:

2022:07:31-16:43:53 firewall-1 service_monitor[29121]: id="4003" severity="error" sys="System" sub="loadbalancing" name="error reading HTTP response: 1/-1"
2022:07:31-16:43:53 firewall-1 service_monitor[29121]: id="4003" severity="error" sys="System" sub="loadbalancing" name="error reading HTTP response: 1/-1"
2022:07:31-16:43:53 firewall-1 service_monitor[29121]: id="4003" severity="error" sys="System" sub="loadbalancing" name="error reading HTTP response: 1/-1"
2022:07:31-16:43:53 firewall-1 service_monitor[29121]: id="4003" severity="error" sys="System" sub="loadbalancing" name="error reading HTTP response: 1/-1"

but there are no requests logged in the lighttpd logs on the 4 backend servers.

I must be missing something obvious here, but I've been staring at it for 2 hours and getting nowhere.

Any tips on where I'm going wrong?



This thread was automatically locked due to age.

Top Replies

  • /etc/service_monitor.conf contains:

    [REF_PacLoaCitriIcaTo 0]
      #REF_NetHosDatabNode1
      service http://172.18.5.11:1494 /"index.php"
      interval 5
      timeout 3
    
      action proc REF_PacLoaCitriIcaTo 0
      action confd_status REF_PacLoaCitriIcaTo REF_NetHosDatabNode1
    
    
    [REF_PacLoaCitriIcaTo 1]
      #REF_NetHosDatabNode2
      service http://172.18.5.12:1494 /"index.php"
      interval 5
      timeout 3
    
      action proc REF_PacLoaCitriIcaTo 1
      action confd_status REF_PacLoaCitriIcaTo REF_NetHosDatabNode2
    
    
    [REF_PacLoaCitriIcaTo 2]
      #REF_NetHosDatabNode3
      service http://172.18.5.13:1494 /"index.php"
      interval 5
      timeout 3
    
      action proc REF_PacLoaCitriIcaTo 2
      action confd_status REF_PacLoaCitriIcaTo REF_NetHosDatabNode3
    
    
    [REF_PacLoaCitriIcaTo 3]
      #REF_NetHosDatabNode4
      service http://172.18.5.14:1494 /"index.php"
      interval 5
      timeout 3
    
      action proc REF_PacLoaCitriIcaTo 3
      action confd_status REF_PacLoaCitriIcaTo REF_NetHosDatabNode4

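    So we can conclude that my assumption was right: the HTTP check sends its HTTP request to the address and port of the service defined in the rule (here port 1494), not to port 80, where lighttpd serves the health check page. That also explains why nothing shows up in the lighttpd logs. The check is therefore only useful when the balanced service is itself a web server, not for any other service that you provide a separate health check page for.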

    Bummer, as that only leaves a ping check, which says precisely zero about the availability of the service...
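
To see the same effect outside the UTM, the following Python sketch parses a `service http://...` line in the format shown in the config above and sends a plain HTTP GET to exactly that host and port. It is only an approximation: the names `Target`, `parse_service_line` and `http_check` are made up for this example, and treating any 2xx status as "up" is an assumption, since the thread does not show how service_monitor itself evaluates the response.

```python
import http.client
import re
from typing import NamedTuple


class Target(NamedTuple):
    host: str
    port: int
    path: str


# Matches lines such as:  service http://172.18.5.11:1494 /"index.php"
SERVICE_RE = re.compile(r"^\s*service\s+http://([^:/\s]+):(\d+)\s+(\S+)\s*$")


def parse_service_line(line: str) -> Target:
    """Extract host, port and path from one 'service http://...' line."""
    match = SERVICE_RE.match(line)
    if match is None:
        raise ValueError(f"not an HTTP service line: {line!r}")
    host, port, path = match.groups()
    return Target(host, int(port), path)


def http_check(target: Target, timeout: float = 3.0) -> bool:
    """Return True only if the target answers the GET with a 2xx status."""
    conn = http.client.HTTPConnection(target.host, target.port, timeout=timeout)
    try:
        conn.request("GET", target.path)
        return 200 <= conn.getresponse().status < 300
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, or a reply that is not valid HTTP.
        return False
    finally:
        conn.close()
```

Run from a machine in the same subnet, `http_check(parse_service_line(line))` for each `service` line should show whether anything on port 1494 answers HTTP at all. A listener that does not speak HTTP on that port makes the check return False, which would be consistent with the "error reading HTTP response" entries in the log.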
