What is the biggest attack in Gbps you stopped?

Harvy66 · May 3, 2015, 4:09 PM

What is "KERN.IPC.NMBUF"? I can't find anything about it.

Supermule · May 3, 2015, 4:30 PM

Kernel buffers.

https://www.google.dk/search?q=KERN.IPC.NMBUF&ie=UTF-8

Supermule · May 3, 2015, 4:31 PM

It goes down so fast you don't see the utilization…

@Supermule:

8 cores

http://youtu.be/-xTtzLEQx08

Not as good as hoped, but not running at 100% CPU like all the others. It seems that the response on the WAN graph is related to the ping on WAN.

It seems that the 2-core setup performs best at the beginning, until around 35 seconds into the attack. Then it crashes. The 4- and 8-core setups keep the GUI online.

You may be at 100% CPU, but according to the dashboard, you're running at 311 MHz even when at 100%.

Supermule · May 3, 2015, 5:23 PM

A 4 Mbps attack and 40% packet loss.

netstat -L doesn't show any exhaustion of queues.

Anybody know how to change the backlog to 1024??

Just to see if it matters.

Supermule · May 3, 2015, 5:48 PM

Here is the output of vmstat -z

Does anybody see anything unusual in this?

![pfsense.22tv - Diagnostics_ Execute command_Page_1.png](/public/imported_attachments/1/pfsense.22tv - Diagnostics_ Execute command_Page_1.png)
![pfsense.22tv - Diagnostics_ Execute command_Page_2.png](/public/imported_attachments/1/pfsense.22tv - Diagnostics_ Execute command_Page_2.png)

dennypage · May 3, 2015, 9:36 PM

Guys, you need to be much more rigorous in collecting data. You are trying to diagnose a network packet processing problem. Using the web interface to execute shell commands will not produce a consistent and reliable result. Not only is the web interface heavyweight, it also runs at a lower priority than kernel packet processing. And most importantly, your diagnostic data collection is dependent upon the behavior of the system you are trying to diagnose.

Let's assume you don't want to build a custom kernel…

You need to shed as many variables as possible and get as close to real data as you can. Turn Snort off for crying out loud. And anything else optional that might interfere with metrics. If you want to use command line tools, execute them outside of network processing. This means using the console, not ssh. Create a shell script that collects information on a periodic basis. Elevate the priority of the script to ensure timely execution. And save the output for every run.

Here is a sample script:

#!/bin/sh
ps -axuwww
while true
do
/bin/date
/usr/bin/netstat -m
sleep 2
done

Here is a sample execution:

/usr/bin/nice -n -19 myscript
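
To also cover the "save the output for every run" part, the sample above could be extended along these lines. This is only a sketch: `LOG_DIR`, `INTERVAL`, `SAMPLES` and the file name `collect.sh` are made-up names, and adding `vmstat -z` and `netstat -L` next to `netstat -m` is an assumption based on the commands already mentioned in this thread.

```shell
#!/bin/sh
# Sketch: write every sample to its own file so nothing scrolls away.
# LOG_DIR, INTERVAL and SAMPLES are hypothetical knobs, not from the thread.
LOG_DIR="${LOG_DIR:-/tmp/dos-metrics}"
INTERVAL="${INTERVAL:-2}"      # seconds between samples
SAMPLES="${SAMPLES:-0}"        # 0 = keep going until interrupted

# Write one timestamped sample to the file named in $1.
collect_sample() {
    out="$1"
    {
        date
        for cmd in "netstat -m" "vmstat -z" "netstat -L"; do
            echo "=== $cmd ==="
            # Keep going even if a command is missing or fails.
            $cmd 2>&1 || echo "(command failed or unavailable)"
        done
    } > "$out"
}

# Take SAMPLES samples (or loop forever when SAMPLES is 0).
run_collector() {
    mkdir -p "$LOG_DIR" || return 1
    n=0
    while [ "$SAMPLES" -eq 0 ] || [ "$n" -lt "$SAMPLES" ]; do
        n=$((n + 1))
        collect_sample "$LOG_DIR/sample-$(printf '%05d' "$n").txt"
        sleep "$INTERVAL"
    done
}

# Only start collecting when asked to, e.g. "sh collect.sh run".
if [ "${1:-}" = "run" ]; then
    run_collector
fi
```

Started from the console as `/usr/bin/nice -n -19 /bin/sh collect.sh run`, this should leave one numbered file per sample in `LOG_DIR`, which makes it easier to compare idle samples against samples taken during the attack.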

Supermule · May 4, 2015, 3:34 AM

I did it at the console, and no useful output was generated for people to see.

I stopped Snort and here is the output from the DoS.

The first two are idle and the next two are under DoS.

Supermule · May 4, 2015, 7:05 AM

Done some more testing this morning.

2-3 Mbps is all it takes. I have scaled down the mbufs and state max a little.

http://youtu.be/NPtDnM8ixXs

dennypage, thanks for the info. If you want to help diagnose this, contact me by PM.

tim.mcmanus · May 6, 2015, 5:09 PM

This link is probably important to note the differences between versions: https://doc.pfsense.org/index.php/Does_pfSense_support_SMP_(multi-processor_and/or_core)_systems

2.1 was single-threaded and 2.2 is multi-threaded. That's why you're seeing an impact/performance difference between the two; it's not hard to extrapolate how and why.

I think what you're trying to determine, and this is based on my review of the thread, is which part of pf is choking. In order to determine this you need to debug each component in the chain from the NIC to the CPU and back out as well as the code. I'm not entirely sure you know programmatically where and which networking event triggers the issue inside pf, only that a large volume of data of a specific type starts the event.
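
As a rough illustration of walking that chain layer by layer, something like the following could take one snapshot per layer. It is a sketch under assumptions: the layer names and the choice of commands are mine, not something established in this thread, and it only collects counters, it does not find the faulty code path.

```shell
#!/bin/sh
# Sketch: one snapshot per layer, NIC -> interrupts -> netisr -> mbufs -> pf.
# The layer labels and the command selection are assumptions for illustration.

# Print "label|command" pairs, one per line.
layer_commands() {
    cat <<'EOF'
nic|netstat -i
interrupts|vmstat -i
netisr|netstat -Q
mbufs|netstat -m
pf-counters|pfctl -si
pf-limits|pfctl -sm
EOF
}

# Run each command in order and label its output.
snapshot_layers() {
    layer_commands | while IFS='|' read -r layer cmd; do
        echo "=== $layer: $cmd ==="
        # stdin is redirected so a command cannot swallow the remaining list.
        $cmd 2>&1 </dev/null || echo "(failed or not available on this system)"
    done
}
```

Running `snapshot_layers` once while idle and once under load, then diffing the two outputs, would at least show which layer's counters move first.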

You've moved beyond evaluating pf from a networking perspective and more into evaluating the codebase. This requires a different kind of data collection and troubleshooting. It also takes an excruciatingly long time to identify and resolve these kinds of issues. It's a lot more than just tweaking a setting in some cases.

Best of luck in determining the root cause and solution to this issue.

Supermule · May 6, 2015, 5:12 PM

Thanks Tim.

You are 100% correct.

kejianshi · May 6, 2015, 10:10 PM

Silly question - Is it possible to set a max cpu % that may be used by the packet filter? Keep some in reserve for other processes?

Harvy66 · May 7, 2015, 1:45 PM

Someone needs to use DTrace and make a flame-graph of what methods are being called in the kernel.

Supermule · May 7, 2015, 1:59 PM

Shall we test again Harvy??

Harvy66 · May 7, 2015, 2:09 PM

I don't know how to do flame graphs, I've only seen them in Netflix presentations talking about optimizing FreeBSD.

Supermule · May 13, 2015, 6:07 AM

I tested the FortiGate virtual appliance, and after enabling Flood Protection it ran perfectly during all tests.

1 CPU and 1 GB RAM in a VMware VM.

I had an email conversation with a guy named Dave Huffman, and he was able to replicate the scenario, but only using DoS and not DDoS. Not that important, but it seems pfSense is not able to separate legitimate traffic from offending IPs.

It chokes somewhere in the stack.

Nullity · May 13, 2015, 9:41 AM

Do you need help figuring out how to enable profiling or some other debugging software?

Supermule · May 13, 2015, 9:43 AM

Yes because I need to get to the bottom of this.

tim.mcmanus · May 13, 2015, 12:43 PM

I would start here. http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html

You can do remote kernel debugging as an option. It's not for the faint of heart. Debugging never is.

firewalluser · May 13, 2015, 4:43 PM

@Supermule:

Yes because I need to get to the bottom of this.

Try a USB NIC on the WAN and see whether the data is handled differently.

You'll find these useful as well.
https://software.intel.com/sites/default/files/profiling_debugging_freebsd_kernel_321772.pdf
https://2008.asiabsdcon.org/papers/P8B-paper.pdf

Mutexes can catch some people out, but they are just locks that keep code on different cores from modifying the same data at the same time.

IMO profiling is better than retrospectively debugging crash dumps, because crash dumps can be misleading in some situations and mask the real root cause of the problem. I've found bugs in programming languages that had existed for over 15 years; that's how difficult some of these bugs are to find. Even though it took me less than a week to find them, there's a lot of exposed software out there.

The main difficulty when debugging is multiple cores: you could be looking at the code running on one core while a bug in code running on another core creates the problem that crashes the code on the core you are looking at. You can mask some problems by running on a single core, but you will still have the problem, just less often, as these things are just inglorious clockwork turkmachines.

Supermule · May 13, 2015, 4:57 PM

Yes, but it doesn't crash as in crash…

It just comes to a standstill and is unresponsive. You don't see anything on the console or in the logs besides excessive traffic.

I don't see any queueing on the NICs either, so it's pretty odd.

firewalluser · May 13, 2015, 5:30 PM

Maybe something is getting out of order. Perhaps a queue/buffer is being processed LIFO when it should be FIFO, which would generally only show up under load considering the speed of today's CPUs.

Supermule · May 13, 2015, 5:42 PM

Could be.

The fact is that it cannot sort traffic into legitimate and non-legitimate traffic.

It's pretty weird that the many spare resources are not used by the web server to at least keep the GUI alive.

When the flood control setting is activated in the Fortinet VM appliance, it can handle everything we throw at it.

It can sort traffic at wire speed, but pfSense cannot.

It sort of feels like pfSense is responding to packets that shouldn't get a response and holds up everything else…

I don't know where to start.

A guy called Dave Huffmann contacted me and has written a script that can replicate most of what we see. I don't know if he has worked out yet what's causing the slowdown.

I have yet to receive news from him.

Harvy66 · May 13, 2015, 7:14 PM

Lookup flame graphs and Netflix. They're very useful for looking for offending code paths.

firewalluser · May 14, 2015, 2:55 AM

http://techblog.netflix.com/2014/11/nodejs-in-flames.html

Supermule · May 14, 2015, 9:19 AM

Looks really interesting and is a good read.

How do I implement it in pfSense??

Can this be implemented somehow in a package?

http://www.brendangregg.com/blog/2015-03-10/freebsd-flame-graphs.html

There is code in there…
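
Going by the FreeBSD flame graph post linked above, the basic recipe looks roughly like this. Treat it as a sketch with assumptions: it needs a kernel with DTrace support (which a stock pfSense kernel may not have), and `FG_DIR`, `dtrace_program` and `kernel_flamegraph` are names I made up for wherever the FlameGraph scripts `stackcollapse.pl` and `flamegraph.pl` are checked out and for the wrapper functions.

```shell
#!/bin/sh
# Sketch only: assumes DTrace is available in the running kernel and that the
# FlameGraph scripts live in $FG_DIR (hypothetical location).
FG_DIR="${FG_DIR:-./FlameGraph}"

# Print a D program that samples kernel stacks at 197 Hz for $1 seconds.
dtrace_program() {
    printf 'profile-197 /arg0/ { @[stack()] = count(); } tick-%ss { exit(0); }' "$1"
}

# Sample for $1 seconds into file $2, then fold the stacks and render SVG $3.
kernel_flamegraph() {
    secs="$1"; stacks="$2"; svg="$3"
    dtrace -x stackframes=100 -n "$(dtrace_program "$secs")" -o "$stacks" || return 1
    "$FG_DIR/stackcollapse.pl" "$stacks" | "$FG_DIR/flamegraph.pl" > "$svg"
}
```

Something like `kernel_flamegraph 60 out.stacks out.svg`, run as root from the console while the flood is in progress, should then produce an SVG showing where the kernel spends its time.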

Supermule · May 14, 2015, 12:31 PM

This is what my logs look like…
May 14 14:27:51 php-fpm[1097]: /rc.filter_configure_sync: Could not find IPv6 gateway for interface (wan).
May 14 14:27:41 php-fpm[1097]: /rc.filter_configure_sync: Not installing NAT reflection rules for a port range > 500
May 14 14:27:29 php-fpm[71742]: /rc.filter_configure_sync: Could not find IPv6 gateway for interface (wan).
May 14 14:27:19 php-fpm[71742]: /rc.filter_configure_sync: Not installing NAT reflection rules for a port range > 500
May 14 14:27:07 php-fpm[71510]: /rc.filter_configure_sync: Could not find IPv6 gateway for interface (wan).
May 14 14:26:56 php-fpm[71510]: /rc.filter_configure_sync: Not installing NAT reflection rules for a port range > 500
May 14 14:26:45 php-fpm[53920]: /rc.filter_configure_sync: Could not find IPv6 gateway for interface (wan).
May 14 14:26:35 php-fpm[53920]: /rc.filter_configure_sync: Not installing NAT reflection rules for a port range > 500
May 14 14:26:23 php-fpm[1097]: /rc.filter_configure_sync: Could not find IPv6 gateway for interface (wan).
May 14 14:26:13 php-fpm[1097]: /rc.filter_configure_sync: Not installing NAT reflection rules for a port range > 500
May 14 14:26:01 php-fpm[71742]: /rc.filter_configure_sync: Could not find IPv6 gateway for interface (wan).
May 14 14:25:51 php-fpm[71742]: /rc.filter_configure_sync: Not installing NAT reflection rules for a port range > 500
May 14 14:25:39 php-fpm[71510]: /rc.filter_configure_sync: Could not find IPv6 gateway for interface (wan).
May 14 14:25:29 php-fpm[71510]: /rc.filter_configure_sync: Not installing NAT reflection rules for a port range > 500
May 14 14:25:27 check_reload_status: Reloading filter
May 14 14:25:27 check_reload_status: Restarting OpenVPN tunnels/interfaces
May 14 14:25:27 check_reload_status: Restarting ipsec tunnels
May 14 14:25:27 check_reload_status: updating dyndns Yousee
May 14 14:25:23 check_reload_status: Reloading filter
May 14 14:25:23 check_reload_status: Restarting OpenVPN tunnels/interfaces
May 14 14:25:23 check_reload_status: Restarting ipsec tunnels
May 14 14:25:23 check_reload_status: updating dyndns Yousee
May 14 14:25:17 php-fpm[53920]: /rc.filter_configure_sync: Could not find IPv6 gateway for interface (wan).
May 14 14:25:13 check_reload_status: Reloading filter
May 14 14:25:13 check_reload_status: Restarting OpenVPN tunnels/interfaces
May 14 14:25:13 check_reload_status: Restarting ipsec tunnels
May 14 14:25:13 check_reload_status: updating dyndns Yousee
May 14 14:25:11 check_reload_status: Reloading filter
May 14 14:25:11 check_reload_status: Restarting OpenVPN tunnels/interfaces
May 14 14:25:11 check_reload_status: Restarting ipsec tunnels
May 14 14:25:11 check_reload_status: updating dyndns Yousee
May 14 14:25:03 php-fpm[53920]: /rc.filter_configure_sync: Not installing NAT reflection rules for a port range > 500
May 14 14:24:55 check_reload_status: Reloading filter
May 14 14:24:55 check_reload_status: Restarting OpenVPN tunnels/interfaces
May 14 14:24:55 check_reload_status: Restarting ipsec tunnels
May 14 14:24:55 check_reload_status: updating dyndns Yousee

tim.mcmanus · May 14, 2015, 3:34 PM

@Supermule:

Looks really interesting and is a good read.

How do I implement it in pfSense??

Can this be implemented somehow in a package?

http://www.brendangregg.com/blog/2015-03-10/freebsd-flame-graphs.html

There is code in there…

IMHO, I wouldn't put this into a package or integrate it into pfSense. It's a development debugging tool that ideally should be installed on dev or test machines.

Supermule · May 14, 2015, 3:56 PM

Hmmm.

EDIT:

I was thinking it would work as a great tool for debugging pfSense issues and would give IT admins better insight into what's going on in their environments.

lowprofile · May 14, 2015, 9:49 PM

Hi guys

Just want to share some info.

I am moving away from pfSense. pfSense is a great firewall, but all these buggy versions have cost me $$$$. 2.2* has been so unstable that I've lost trust in pfSense.

Regarding DDoS, pfSense cannot handle a simple SYN flood. In 2.2* it got worse. You can spend many weeks tuning and tweaking, but you will still end up with a system that is unreliable. Too many core changes.

I managed to get it about 80% resistant to SYN floods in 2.1.5, but that had its side effects: I then experienced general instability.

I've tried the FortiGate VM appliance with 1 GB RAM and 1 core (trial). I was surprised how stable it was on the same (virtual) hardware. It has a dedicated option to block SYN/ICMP/FIN etc. floods. Very simple option. See screenshot.

It took me 10 minutes to install it and a further 10 minutes to set it up. I activated the DDoS policy and bingo, I had a stable setup. I know FortiGate costs much more, but most of these appliances are built on Linux or FreeBSD. I have concluded that pfSense lacks this crucial "feature" and protection.

No more packet drops, even when the attack was a 100+ Mbit SYN flood. Very stable, not a single dropped ping.
Sad to say, but this has shown that the source of the issue is pfSense.

I am now investing in a proper firewall. VM or box, it doesn't matter, it just won't be pfSense. I liked pfSense until I ran into these issues and some serious stability problems in newer versions. Time to move on. ;)

Nullity · May 14, 2015, 11:22 PM

@lowprofile:

I love you too, bro.

Thanks for trying to spread some negativity and get a pfSense vs Fortigate fight going on your way out. You will be missed.

NOYB · May 14, 2015, 11:43 PM

@lowprofile:

If I were using pfSense in a business environment I'd be right behind you.

That a disgruntled employee, dissatisfied customer, or unscrupulous competitor could take a business behind pfSense offline with such a small amount of traffic would be a really scary and unacceptable risk. And that the pfSense team doesn't seem to be very engaged in figuring it out so it can be fixed doesn't instill any confidence that it will be fixed anytime soon. In fact, it suggests that they either have no idea what's causing the issue, or that they do know and know there is no timely fix on the horizon, so they downplay it. That just multiplies the sentiment.

Harvy66 · May 15, 2015, 12:16 AM

You may want to re-evaluate pfSense in the future. Big performance changes are coming in 3.0; hopefully this stuff will get fixed and be done with.

Nullity · May 15, 2015, 1:05 AM

@NOYB:

If I were using pfSense in a business environment I'd be right behind you.

That a disgruntled employee, dissatisfied customer, or unscrupulous competitor could take a business behind pfSense offline with such a small amount of traffic would be a really scary and unacceptable risk. And that the pfSense team doesn't seem to be very engaged in figuring it out so it can be fixed doesn't instill any confidence that it will be fixed anytime soon. In fact, it suggests that they either have no idea what's causing the issue, or that they do know and know there is no timely fix on the horizon, so they downplay it. That just multiplies the sentiment.

I have been assuming that this is not actually a problem, and that FreeBSD/pfSense is fully capable of withstanding this attack if configured properly.

Honestly, when I first found pfSense I bought into the "omg, SuperNetAdmin must use this!", but after a few months, the GUI's limitations were obvious even to a networking newbie like me.

I like this community though. I hope it doesn't crumble…

NOYB · May 15, 2015, 1:47 AM

I wouldn't run a business network on assumptions.

Harvy66 · May 15, 2015, 2:07 AM

FreeBSD is a good platform. Even if pfSense moves forward slowly, that's fine as long as it keeps moving forward. It works well enough for me. If I ever stopped using pfSense, I'd probably just switch to FreeBSD/PCBSD and learn how to configure things directly or use packages if they exist.

tim.mcmanus · May 15, 2015, 3:33 AM

I run pfSense on multiple WANs and LANs on the same box for my business. I'm also hosting multiple web servers and mail servers. Never once taken down by any attacks.

What are you doing to subject yourself to these kinds of attacks, and why hasn't your ISP done anything to mitigate them?

I used to work for an MSP that resold and supported FortiGates and thought they sucked. I tore most of them out and replaced them with pfSense.

I just received some additional equipment to build a Security Onion appliance that I'm going to mirror two WAN ports into. I could easily integrate Snort on pfSense to use the barnyard database on the SO appliance to most likely mitigate the whole SYN attack. After I'm done building it, it would be interesting to test. Plus I'd be able to capture and inspect every packet coming in. I'm also interested to see if it takes out the MikroTik switch in front of pfSense first.

NOYB · May 15, 2015, 4:59 AM

@tim.mcmanus:

What are you doing to subject yourself to these kinds of attacks, and why hasn't your ISP done anything to mitigate them?

Wow! Yeah, it must be the victim's fault. After all, certainly no one would do such a thing without provocation.

Certainly if a business has an employee who becomes disgruntled, it is without a doubt the business's fault.
Certainly if a business has a customer who becomes dissatisfied, it is without a doubt the business's fault.
Certainly if a business has an unscrupulous competitor, it is without a doubt the business's fault.

Really? You expect a business to rely on an ISP to protect against a low-bandwidth attack such as this? A business could be down for days before being able to get an ISP to take meaningful action. I sure hope it is not pfSense's position that an ISP should protect a business from such a low-bandwidth attack so their product doesn't have to.

kejianshi · May 15, 2015, 5:15 AM

I'm really seeing the logic in the point that others talked about, several times actually…

The end user's firewall really isn't the place to stop or mitigate a DDoS.

Supermule · May 15, 2015, 6:16 AM

CMB accused me of being the guy behind the forum downtime yesterday.

That saddened me…

Just because I keep saying that there is a major flaw in the way pf handles packets, I must be the guy taking the forum down. :(

He asked for pcaps and I told him they were available for download in this thread. I didn't hear from him again.

I need to get somebody with the right debugging tools involved in this.

kejianshi · May 15, 2015, 6:54 AM

Hard to know who is doing it now isn't it?