[BUG][8.001] NMAP DOS of Astaro

Let me explain our setup quickly before I get into what is going on. We have Astaro, running on VMware, serving as the gateway for a very small internal network (~8 machines) with almost no network traffic. The few of us outside the network who need access use a very simple PPTP setup where we log in and get an internal IP. All pretty standard.
Anyway, up until yesterday we were running the Astaro v8 beta software, but we decided to upgrade all the way to 8.001. Today I needed to port scan the internal network from outside it, so I fired up the VPN and ran the following nmap command:

> nmap -sS -p1-65535 192.168.0.0/24

To my surprise, about halfway through the scan I got a notice that my connection to the VPN had been dropped. I assumed that either I had triggered some security feature or this had been a coincidence, so I asked a coworker to try to VPN in to the machine; no connection. About this time Nagios (sitting outside the VPN) started freaking out about all of the machines inside the VPN (NRPE, using simple port forwarding). I went to vSphere and opened Astaro's console, only to find that it wouldn't respond to any key presses at all (it was at the Astaro splash screen). I finally had to reboot it from vSphere.
I believed at the time that the problem was just coincidental to my scan. After all, we had done the exact same scan many times in the past. But a second attempt caused the same behavior. On a third attempt, I ran tail -f on all of my log files and, other than a rapidly growing packet filter log, nothing unusual was reported up to the very moment that everything froze.
I used vSphere's Performance tab and noticed that, around the time my nmap scan started, the CPU usage on Astaro jumped to 100% and stayed there even after the machine froze. The server this runs on has two 2 GHz quad-core CPUs, and Astaro has full access to all of those resources as it needs them, so I doubt the box's performance is the issue. Disk usage is not high either (~62% of the allocated space).
I'm not ready to rule out VMware as the culprit here, but I do want to remind anyone reading this that we never had this problem with the v8 beta versions, even when we were running much higher network traffic.
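
For a sense of scale, here is a rough sketch of how many SYN probes the scan above can generate. The figures are an upper bound only: the calculation assumes one probe per address and port, ignores retransmissions, and ignores that nmap skips hosts it considers down. The `live_hosts` value is taken from the "~8 machines" mentioned above and is only an estimate.

```python
import ipaddress

# Target range and port span taken from the nmap command above.
network = ipaddress.ip_network("192.168.0.0/24")
first_port, last_port = 1, 65535

addresses = network.num_addresses            # every address in the /24
ports_per_host = last_port - first_port + 1  # ports 1 through 65535

# Upper bound: one SYN per (address, port) pair, no retransmissions.
max_probes = addresses * ports_per_host

# Assumption: roughly 8 hosts actually answer, per the setup described above.
live_hosts = 8
likely_probes = live_hosts * ports_per_host

print(f"upper bound: {max_probes:,} SYN probes")
print(f"with ~{live_hosts} live hosts: {likely_probes:,} SYN probes")
```

Even the smaller figure is several hundred thousand packets, each of which the gateway has to filter and possibly log.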

Enekk over 15 years ago in reply to RFCat_vk_01

> Hi,
> have a look at your packet filter rule to see if logging is enabled on it.
>
> Ian M

I've looked through all of our rules (very few of them) and have not seen any with logging enabled. I also went to the Advanced tab of the packet filter settings and made sure all of the logging options were disabled there.

As to the "Can I crash things on the internal network" question: not with only one machine, the way I can from the external network. The ulogd process keeps coming up and pulling a ton of CPU, but total usage jumps all over the place (at one point 90%; oddly, that was postgres going nuts), and ulogd keeps releasing its CPU allocation.

The machine outside the firewall is running nmap 5 while the one inside is running 5.21, and they seem to scan a bit differently. Version 5 seems to cast a much wider net (i.e. it looks at a larger IP space at a time), while 5.21 seems more conservative in the number of IPs it scans at once.

I'm at a loss. I'd be willing to bet that doing this from one or two more machines internally (i.e. moving to a DDoS) would bork things, but perhaps it has more to do with the combination of traffic coming over the VPN and routing rules. I really wish we had the money for a support contract so we could look into this more.

Billybob over 15 years ago in reply to RFCat_vk_01

Also forgot to mention the VPN Remote Access Reporting, which supposedly uses more cycles. But that has been available in earlier betas, so it wouldn't affect just this version in particular.

If this thing crashes from the internal LAN, there is definitely something wrong. Do you have huge I/O wait times?

@Ian, I forgot about the per-rule logging option, but I assume he has been running the same setup in earlier betas.

Enekk over 15 years ago in reply to Billybob

Just wanted to add that work hours are rapidly dwindling here and I will have to leave, but I'll be back on this in the morning if anyone has any other tests/ideas.

Billybob over 15 years ago in reply to Enekk

I am at a loss at this point but here is what I would look for:

I/O wait time: if it is large and your hard drive is thrashing, then Astaro is to blame. Otherwise it's not playing nice with the virtual environment.

Best of luck and sorry about the rant earlier, you weren't supposed to see it ;)
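
For anyone wanting to put a number on I/O wait, here is a minimal sketch that derives it from two samples of the aggregate `cpu` line in `/proc/stat` on a Linux system. It assumes shell access to the box and somewhere to run Python, neither of which is confirmed in this thread; the function names and the sample values are made up for illustration. The result is the same quantity `top` reports as `wa`.

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of tick counts."""
    fields = line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in fields[1:]]


def iowait_percent(before, after):
    """Share of CPU time spent waiting on I/O between two samples."""
    old = parse_cpu_line(before)
    new = parse_cpu_line(after)
    # Field order: user nice system idle iowait irq softirq steal [guest ...].
    # Only the first eight are summed; guest time is already counted in user.
    deltas = [b - a for a, b in zip(old[:8], new[:8])]
    total = sum(deltas)
    if total == 0:
        return 0.0
    return 100.0 * deltas[4] / total


# Hypothetical samples, not taken from the affected machine.
sample_before = "cpu  1000 0 500 8000 100 0 50 0 0 0"
sample_after = "cpu  1100 0 600 8200 700 0 50 0 0 0"
print(f"iowait: {iowait_percent(sample_before, sample_after):.1f}%")
```

On a live system the two strings would come from reading the first line of `/proc/stat` a few seconds apart while the scan is running.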

Enekk over 15 years ago in reply to Billybob

> I am at a loss at this point but here is what I would look for:
>
> I/O wait time: if it is large and your hard drive is thrashing, then Astaro is to blame. Otherwise it's not playing nice with the virtual environment.

I did notice spikes in the I/O access speeds that might correlate with the crashes. I'll test more tomorrow, but I'm sure we all know the old maxim about correlation and causation.

> Best of luck and sorry about the rant earlier, you weren't supposed to see it ;)

No biggie, thanks for your help so far.

BarryG over 15 years ago

Hi, I've had problems with the IPS notifications causing a bit of a DoS, but I'm guessing you're not using the IPS if you're on the Essential license.

How is the RAM usage?

What is showing up in the PacketFilter log?

Barry

RFCat_vk_01 over 15 years ago in reply to BarryG

Hi,
try reducing the number of CPUs available to the ASG guest down to 1.

Regards

Ian M

Enekk over 15 years ago in reply to BarryG

> How is the RAM usage?

Very manageable and reasonable.

> What is showing up in the PacketFilter log?

Tons and tons of dropped packets.

> try reducing the number of CPUs available to the ASG guest down to 1.

I'm certainly willing to try this a bit later, but why do you suggest doing this?

Billybob over 15 years ago in reply to Enekk

The Essential Firewall is a recent offering by Astaro, so I haven't seen which tabs it exposes in WebAdmin. The problem is that most of us have SYN protection, with logging either limited or disabled altogether, available with our subscriptions. Even with the IPS system off, you can easily manage the simple SYN attack that you are simulating by limiting logging and dropping packets beyond a certain threshold. Those options are not available in the Essential edition :(
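
To illustrate what "dropping packets beyond a certain threshold" means in general, here is a minimal token-bucket sketch. It is not Astaro's implementation, which this thread does not describe; the class name, the rate and burst values, and the simulated traffic are all hypothetical.

```python
class SynRateLimiter:
    """Token bucket: allow bursts of up to `burst` SYNs, refilled at `rate` per second."""

    def __init__(self, rate, burst):
        self.rate = float(rate)
        self.burst = float(burst)
        self.tokens = float(burst)  # start with a full bucket
        self.last = None            # timestamp of the previous packet

    def allow(self, now):
        """Return True if a SYN arriving at time `now` (seconds) is accepted."""
        if self.last is not None:
            elapsed = now - self.last
            # Refill for the time that has passed, capped at the burst size.
            self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # over the threshold: drop without further work


# Hypothetical scan: 10,000 SYNs spread evenly over one second.
limiter = SynRateLimiter(rate=100, burst=200)
total_syns = 10_000
accepted = sum(limiter.allow(i / total_syns) for i in range(total_syns))
dropped = total_syns - accepted
print(f"accepted {accepted}, dropped {dropped} of {total_syns} SYNs")
```

With these made-up numbers only a few hundred SYNs get through and the rest are discarded cheaply, which is the effect the thresholding is meant to have on a scan like this one.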

Are you using the same version of nmap that you tested against the beta versions? I can't figure out what has changed that is causing a production release to act worse than a beta release.

I will still be interested to see your I/O results along with what you actually observe with your hard drives.

Enekk over 15 years ago in reply to Billybob

> Even with the IPS system off, you can easily manage the simple SYN attack that you are simulating by limiting logging and dropping packets beyond a certain threshold. Those options are not available in the Essential edition :(

I understand that there need to be feature differences, but I really wish customized logging was in all versions.

> Are you using the same version of nmap that you tested against the beta versions? I can't figure out what has changed that is causing a production release to act worse than a beta release.

Same version. The only change was from the beta to 8.001.

> I will still be interested to see your I/O results along with what you actually observe with your hard drives.

Yeah, so the Disk Write Rate jumps from an average of less than 50 kbps to about 1,500 kbps until things freeze up, at which point it drops to zero.

Turns out there was already only one virtual CPU assigned to the machine, so no luck with that suggestion. I wish we had the resources to put Astaro on its own machine instead of on a virtual machine but, as of right now, I'm going to say that the problem has something to do with CPU usage and disk write speeds. I don't really know why I can't replicate the problem on the internal network; perhaps the load of the VPN traffic is enough to make the difference.