Sophos XG in Azure Backups Failing

We run Sophos XG as a VM in Microsoft Azure. The firmware version is SFOS 16.05.4 MR-4.

Our backups are failing.  The error we get is: 

Backup failed with an internal error.  Please retry the operation in a few minutes.  If the problem persists, please contact Microsoft Support.

When I called MS support, they offered the following two actions to try:

What does the community think about these actions? I am especially unsure about installing a new Linux agent, since I cannot find instructions on how to do this on Sophos.

James



This thread was automatically locked due to age.
Parents
  • Hello James, 

    We are having the same issue with our VM firewall appliance in Azure. Firmware version SFOS 17.0.2 MR-2

     

    I looked into the logs generated by the Azure Backup Agent on the Sophos appliance and found these:

     

    2017/12/27 06:58:59.556669 INFO Event: name=WALA, op=HeartBeat, message=
    2017/12/27 18:50:13.967168 INFO Azure Linux Agent Version:2.1.3
    2017/12/27 18:50:14.195338 INFO OS: sfos 17
    2017/12/27 18:50:14.196828 INFO Python: 3.5.1
    2017/12/27 18:50:14.198584 INFO Run daemon
    2017/12/27 18:50:14.200267 INFO Detect protocol endpoints
    2017/12/27 18:50:14.201177 INFO WireServer endpoint is not found. Rerun dhcp handler
    2017/12/27 18:50:14.206819 INFO Send dhcp request
    2017/12/27 18:50:14.327049 INFO Configure routes
    2017/12/27 18:50:14.328563 INFO Gateway:10.3.0.1
    2017/12/27 18:50:14.329659 INFO Routes:None
    2017/12/27 18:50:14.330629 INFO Request to install route: 0 0 10.3.0.1
    2017/12/27 18:50:14.331552 INFO Wire server endpoint:168.63.129.16
    2017/12/27 18:50:14.619054 INFO Fabric preferred wire protocol version:2015-04-05
    2017/12/27 18:50:14.630500 INFO Wire protocol version:2012-11-30
    2017/12/27 18:50:14.631822 WARNING Server prefered version:2015-04-05
    2017/12/27 18:50:18.549316 WARNING Socket IOError [Errno 101] Network is unreachable, args:(101, 'Network is unreachable')
    2017/12/27 18:50:18.553391 INFO Retry=0, GET 168.63.129.16:80/.../<xxxxxxxxxxxxx>.<VMName>
    2017/12/27 18:50:29.881964 INFO Start env monitor service.
    2017/12/27 18:50:29.884259 INFO Configure routes
    2017/12/27 18:50:29.886142 INFO Gateway:10.3.0.1
    2017/12/27 18:50:29.887008 INFO Routes:None
    2017/12/27 18:50:29.902138 INFO Request to install route: 0 0 10.3.0.1
    2017/12/27 18:50:29.980362 INFO Event: name=WALA, op=HeartBeat, message=
    2017/12/27 18:50:29.992694 INFO Set block dev timeout: sda with timeout: 300
    2017/12/27 18:50:29.996017 INFO Handle new ext handler config
    2017/12/27 18:50:30.005986 INFO Set block dev timeout: sdb with timeout: 300
    2017/12/27 18:50:30.029992 INFO Set block dev timeout: sdc with timeout: 300
    2017/12/27 18:50:30.100319 INFO [Microsoft.Azure.RecoveryServices.VMSnapshotLinux-1.0.9124.0] Expected handler state: enabled
    2017/12/27 18:50:30.154309 INFO [Microsoft.Azure.RecoveryServices.VMSnapshotLinux-1.0.9124.0] Decide which version to use
    2017/12/27 18:50:30.582956 INFO [Microsoft.Azure.RecoveryServices.VMSnapshotLinux-1.0.9124.0] Use version: 1.0.9124.0
    2017/12/27 18:50:30.772901 INFO [Microsoft.Azure.RecoveryServices.VMSnapshotLinux-1.0.9124.0] Current handler state is: NotInstalled
    2017/12/27 18:50:30.776217 INFO [Microsoft.Azure.RecoveryServices.VMSnapshotLinux-1.0.9124.0] Download extension package
    2017/12/27 18:50:42.697933 INFO [Microsoft.Azure.RecoveryServices.VMSnapshotLinux-1.0.9124.0] Unpack extension package
    2017/12/27 18:50:43.048976 ERROR run cmd 'find /var/waagent/Microsoft.Azure.RecoveryServices.VMSnapshotLinux-1.0.9124.0 -type f | xargs chmod u+x' failed
    2017/12/27 18:50:43.050904 ERROR Error Code:127
    2017/12/27 18:50:43.055061 ERROR Result:/bin/sh: xargs: not found
    2017/12/27 18:50:43.073813 INFO Event: name=Microsoft.Azure.RecoveryServices.VMSnapshotLinux, op=Download, message=Download succeeded
    2017/12/27 18:50:43.074498 INFO [Microsoft.Azure.RecoveryServices.VMSnapshotLinux-1.0.9124.0] Initialize extension directory
    2017/12/27 18:50:43.096091 INFO [Microsoft.Azure.RecoveryServices.VMSnapshotLinux-1.0.9124.0] Update settings file: 8.settings
    2017/12/27 18:50:43.098857 INFO [Microsoft.Azure.RecoveryServices.VMSnapshotLinux-1.0.9124.0] Enable extension.
    2017/12/27 18:50:43.121900 INFO [Microsoft.Azure.RecoveryServices.VMSnapshotLinux-1.0.9124.0] Launch command:main/handle.sh enable
    /bin/sh: /var/waagent/Microsoft.Azure.RecoveryServices.VMSnapshotLinux-1.0.9124.0/main/handle.sh: Permission denied
    2017/12/27 18:50:43.282518 ERROR Event: name=Microsoft.Azure.RecoveryServices.VMSnapshotLinux, op=Enable, message=(000003)Non-zero exit code: 126, main/handle.sh enable
    2017/12/27 18:50:43.736475 INFO Successfully reported vm agent status
    2017/12/27 18:52:55.380458 INFO Handle new ext handler config
    2017/12/27 18:52:55.382962 INFO [Microsoft.Azure.RecoveryServices.VMSnapshotLinux-1.0.9124.0] Expected handler state: enabled
    2017/12/27 18:52:55.384919 INFO [Microsoft.Azure.RecoveryService

    From my understanding, it looks like there are a few issues with the Azure Backup agent on the Sophos XG VM:

        - INFO Gateway:10.3.0.1 -> INFO Routes:None -> INFO Request to install route: 0 0 10.3.0.1

        - WARNING Socket IOError [Errno 101] Network is unreachable, args:(101, 'Network is unreachable')

        - INFO [Microsoft.Azure.RecoveryServices.VMSnapshotLinux-1.0.9124.0] Launch command:main/handle.sh enable
    /bin/sh: /var/waagent/Microsoft.Azure.RecoveryServices.VMSnapshotLinux-1.0.9124.0/main/handle.sh: Permission denied
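To pull entries like the ones listed above out of a longer agent log, a small filter can help. This is only a minimal sketch: it assumes each entry follows the "date time LEVEL message" layout seen in the pasted log, and the names `problem_lines` and `SAMPLE` are made up for illustration. Lines without a timestamp (such as the raw "/bin/sh: ... Permission denied" output) are skipped by design.

```python
import re

# Matches waagent-style entries: "YYYY/MM/DD HH:MM:SS.ffffff LEVEL message"
LINE_RE = re.compile(
    r"^\s*(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d+) "
    r"(?P<level>[A-Z]+) (?P<msg>.*)$"
)


def problem_lines(log_text, levels=("WARNING", "ERROR")):
    """Return (timestamp, level, message) tuples for entries at the given levels."""
    found = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match and match.group("level") in levels:
            found.append((match.group("ts"), match.group("level"), match.group("msg")))
    return found


# A few entries copied from the log in this thread.
SAMPLE = """\
2017/12/27 18:50:14.329659 INFO Routes:None
2017/12/27 18:50:18.549316 WARNING Socket IOError [Errno 101] Network is unreachable, args:(101, 'Network is unreachable')
2017/12/27 18:50:43.050904 ERROR Error Code:127
2017/12/27 18:50:43.055061 ERROR Result:/bin/sh: xargs: not found
"""

if __name__ == "__main__":
    for ts, level, msg in problem_lines(SAMPLE):
        print(ts, level, msg)
```

Run against the full log, this would surface the network warning, the xargs failure, and the failed enable event without the surrounding INFO noise.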

     

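    It appears that the WAagent from Azure is intended for typical Linux distros, and SFOS 17 does not have the command-line utilities it expects. The log shows "xargs: not found" (error code 127) when the agent runs its "xargs chmod u+x" step on the unpacked extension, so handle.sh is never made executable and the enable step fails with "Permission denied" (exit code 126). The default gateway also seems to create an issue for the agent, since the network is reported as unreachable.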
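For reference, the step that fails in the log is "find <extension dir> -type f | xargs chmod u+x". The sketch below shows what that step does, written without xargs. It is not the agent's own code, and `make_files_user_executable` is a hypothetical helper name. Whether SFOS allows running something like this, or whether the agent would simply unpack the extension again and overwrite the change, is unknown.

```python
import os
import stat


def make_files_user_executable(root):
    """Add the owner-execute bit to every regular file under root.

    Same effect as: find <root> -type f | xargs chmod u+x
    Returns the list of paths whose mode was changed.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # "find -type f" matches regular files only, not symlinks.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            mode = os.stat(path).st_mode
            if not mode & stat.S_IXUSR:
                os.chmod(path, mode | stat.S_IXUSR)
                changed.append(path)
    return changed
```

With the execute bit missing, the shell refuses to launch main/handle.sh, which matches the "Permission denied" line and exit code 126 in the log.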

     

    I'd be curious to see what the community's thoughts are on this. I understand that the best way to back up the Sophos XG Firewall is to use the backup config, but it seems like a full VM backup should be an option as well. One example that would require full-disk VM backups is using a tool like Terraform. If I want to quickly build out a staging or dev environment that replicates my prod environment, I don't want to have to restore a backup config every time I bring up a staging/dev environment.

Children
  • There are different options to achieve what you described. One option is to automate applying the configuration to your test/dev environment after deployment, using an Azure Automation account, Terraform via the SSH provider, or the XG's web API. This can be implemented easily. Feel free to reach out via private message if you need suggestions on how to implement it. The other concern is licensing: even your dev environment needs a separate license from production, which is why this is a better way of doing this.

  • Hey David, 

    Totally understand the other options and what can be done for getting backups. Unfortunately, that's what we have had to do as a workaround for this missing feature. I think this issue is more important, since this product is supposed to be fully supported on the Azure platform (having customers build their own automated workaround is not supplying a fully supported product). Looking at the logs I posted earlier, we see that the default gateway being pulled by the Azure Backup extension is incorrect:

     

    2017/12/27 18:50:14.206819 INFO Send dhcp request
    2017/12/27 18:50:14.327049 INFO Configure routes
    2017/12/27 18:50:14.328563 INFO Gateway:10.3.0.1
    2017/12/27 18:50:14.329659 INFO Routes:None
    2017/12/27 18:50:14.330629 INFO Request to install route: 0 0 10.3.0.1
    2017/12/27 18:50:14.331552 INFO Wire server endpoint:168.63.129.16
    2017/12/27 18:50:14.619054 INFO Fabric preferred wire protocol version:2015-04-05
    2017/12/27 18:50:14.630500 INFO Wire protocol version:2012-11-30
    2017/12/27 18:50:14.631822 WARNING Server prefered version:2015-04-05
    2017/12/27 18:50:18.549316 WARNING Socket IOError [Errno 101] Network is unreachable, args:(101, 'Network is unreachable')
    2017/12/27 18:50:18.553391 INFO Retry=0, GET 168.63.129.16:80/.../<xxxxxxxxxxxxx>.<VMName>
    2017/12/27 18:50:29.881964 INFO Start env monitor service.

    2017/12/27 18:50:29.884259 INFO Configure routes
    2017/12/27 18:50:29.886142 INFO Gateway:10.3.0.1
    2017/12/27 18:50:29.887008 INFO Routes:None
    2017/12/27 18:50:29.902138 INFO Request to install route: 0 0 10.3.0.1
    2017/12/27 18:50:29.980362 INFO Event: name=WALA, op=HeartBeat, message=
    2017/12/27 18:50:29.992694 INFO Set block dev timeout: sda with timeout: 300
    2017/12/27 18:50:29.996017 INFO Handle new ext handler config
    2017/12/27 18:50:30.005986 INFO Set block dev timeout: sdb with timeout: 300
    2017/12/27 18:50:30.029992 INFO Set block dev timeout: sdc with timeout: 300

     

    Azure Backup on Linux or Windows requires outbound internet access for the IaaS Azure Backup product to work, but the "Network is unreachable" error is what causes it to fail, because of an incorrect default route.
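One way to check which default gateway the system actually has installed is to read the kernel routing table. The sketch below parses text in the /proc/net/route format used by Linux. Whether SFOS exposes that file in the same form is an assumption, the name `default_gateway` is made up for illustration, and the sample table is constructed to match the 10.3.0.1 gateway from the log rather than captured from a real appliance.

```python
import socket
import struct

RTF_GATEWAY = 0x2  # route flag: traffic is sent via a gateway


def default_gateway(route_table_text):
    """Return the default gateway from /proc/net/route-style text, or None."""
    lines = route_table_text.strip().splitlines()
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        destination, gateway, flags = fields[1], fields[2], int(fields[3], 16)
        # Destination 00000000 marks the default route.
        if destination == "00000000" and flags & RTF_GATEWAY:
            # Addresses are hex words in little-endian order on x86 systems.
            return socket.inet_ntoa(struct.pack("<L", int(gateway, 16)))
    return None


# Constructed sample: default route via 10.3.0.1 plus the local 10.3.0.0/24 subnet.
SAMPLE = """\
Iface   Destination Gateway     Flags   RefCnt  Use Metric  Mask        MTU Window  IRTT
eth0    00000000    0100030A    0003    0   0   0   00000000    0   0   0
eth0    0000030A    00000000    0001    0   0   0   00FFFFFF    0   0   0
"""

if __name__ == "__main__":
    print(default_gateway(SAMPLE))
```

If this returns None on the appliance, no default route is installed at all, which would be consistent with the "Network is unreachable" warning. If it returns an address, the next question is whether that address is the right gateway for the subnet.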

    I believe the correct solution is not to have customers build a workaround by creating an automated pipeline (e.g. Jenkins, Terraform); instead, Sophos's product should be fully supported and compatible with Azure. For the last year it has not been fully compatible with Azure. I recommend that the development team at Sophos Cloud work with the Azure Backup product team to correct the code so that it pulls the correct IP for the default gateway. It does not appear to be a difficult fix for this type of feature.