What is the biggest attack, in Gbps, that you have stopped?
-
I ran 8M states and it performed well during the tests, when it stayed up.
I don't see interrupt storms on the console. I have had a kernel taskq on one core rise to 100% usage, followed by packet loss.
-
This is one of the links I sent out last night. http://bsdrp.net/documentation/technical_docs/performance
Use this more for the monitoring tools and the data they capture. It should help immensely. You may have to modify pfSense as some of the tools/modules may not be loaded by default in the distribution. You can grab them from FreeBSD as noted in the article.
-
I've been thinking about this. Considering it was the RRD graphs that took out my fw straight away, I don't think the drivers are really the issue; it's more likely the SMP support in FreeBSD 10.1, as Harvy66 mentions in the FreeBSD 11 thread and in post 14 of this thread.
One way I can test this theory is to fire up an old single-core AMD laptop with the same USB NICs on the WAN and see how it performs.
Admittedly the laptop spec is not identical to the Intel D847, i.e. only 2 or 4GB of RAM and a spinning disk instead of mSATA, but the NICs will be the same, as will pfSense.
Failing that, if I can get pfSense running on the Raspberry Pis I have here, I can test on an RPi1 B (single core) as well as the new RPi2, which is quad core but the same 1GHz, with the same RAM and the same USB NICs, although the ARM CPU instruction sets are different.
Worth trying the single core laptop next instead?
-
Just let me know :)
-
Current IP address is 92.28.255.154
I've created an allow ICMP (ping) rule through to the WAN, so can you give it some pings first? Then, when I say, we can test the EMx Intel NIC whilst I dust off this laptop at the same time.
I'm not going to attempt accessing the RRD graphs again for now, but everything else will be as before, i.e. dashboard running, System Activity window, dynamic fw logs running, along with me navigating around the fw changing rules etc. to check responsiveness is still there.
Edit. Can you let us know when you have done the pings, as I have not seen anything yet.
-
It says no answer to ping here…
-
You can stop it now. I've changed IP address and I hadn't swapped the cables over.
Let me check the rules, as I can see one ICMP came in which was allowed, but I wasn't sure if that was you or anyone else reading the thread pinging me.
I also found a crash report from lunchtime which I'll be submitting, but it looks like the fw ran out of memory around the time I tried loading the RRD graphs.
Edit. Give me a few minutes, as Unbound (the default DNS server setup in 2.2) doesn't like running with Firewall Optimisation (state timeouts) set to Aggressive, i.e. it times out the states quickly, before the various name servers get a chance to respond, when the fw reboots. I've swapped it over to High Latency for now and will leave it on that for the duration of the tests, but if Unbound is running when the fw reboots and the state timeout is too short, that will screw you over as well.
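To make the timing argument above concrete, here is a minimal Python sketch. The function name and all the numbers are made up for illustration; they are not pf's actual timeout values for the Aggressive or High Latency profiles, and the model ignores everything except the race between the reply and the state expiry.

```python
def reply_accepted(state_timeout_s: float, reply_delay_s: float) -> bool:
    """Return True if a DNS reply arrives while the outbound state still exists.

    Simplified model: the outgoing query creates a state that expires after
    state_timeout_s seconds; a reply arriving later no longer matches any state.
    """
    return reply_delay_s < state_timeout_s


# Hypothetical values only: a slow upstream name server on a swamped link.
slow_reply_s = 8.0
short_timeout_s = 5.0   # stands in for a short (aggressive-style) timeout
long_timeout_s = 30.0   # stands in for a longer (high-latency-style) timeout

print("short timeout accepts reply:", reply_accepted(short_timeout_s, slow_reply_s))
print("long timeout accepts reply: ", reply_accepted(long_timeout_s, slow_reply_s))
```

With a short timeout the late reply is dropped and the resolver has to ask again, which would fit the slow recovery after a reboot described above.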
-
Feel free to post findings :)
-
Are you sure the ICMP isn't working? It was around 15:00 BST (14:00 UTC) that I saw the single ICMP packet come in.
What time did you do an ICMP test? Try it now and I'll keep an eye on the logs, as there's no traffic going out except when I refresh the forum webpages.
Edit: I can see one ping came in just a moment ago from an IP address ending .94. Is that you, SM? If so, I'll swap the cables over to test the EMx Intel NIC on the fw.
-
Mine is 80.197.148.74 right now :)
I am pinging 92.28.255.154 and the requests are timing out
-
92.28.252.160
-
Reply :)
Should I hit that one?
-
Hold off for a moment, as something screwy is going on with the fw rules (PF) at the moment. I got a brief DDoS, if that was you, but no ICMP showed up in the fw logs from your IP address.
I need to check out what's going on with the fw rules, as something isn't right at this end, so I'm going to reboot; surprisingly, resetting all states didn't reset the rules.
Edit.
The logs run about 2 mins behind on this fw, so any change I make and reset of states takes a few minutes longer before I can confirm it's OK in the log.
-
Ready?
-
I am now, but I had to reboot as the fw went slow when I reassigned the NICs. I could still navigate around but it was sooo slow, so I did a reboot. After a reboot, Unbound takes about 10 mins (and increasing all the time) before I can get back out, and all the while the CPU is maxed out.
Anyway new ip is 92.24.143.49
DDOS away….
Edit. Ping rule is still in effect on the wan interface so you should be able to ping away on the ip address as well.
Edit2. This is the EMx Intel NIC we are testing now as well; I'm still dusting off the old single-core AMD laptop atm.
Edit3. CPU is currently ticking over at around 38% with just me accessing this site, and I've got the console visible so I can see if there are any interrupt storms showing up with this NIC.
-
Well I sat through that one no problem as it finished just a moment ago.
I didn't touch the RRD graphs though. Snort was slower, but generally everything was OK and responsive. What I noticed with the EMx Intel NIC is that I did edge up a bit on bandwidth: whereas the USB NICs saw a consistent 2.42Mbps this morning, the Intel NIC varied between 2.42 and 2.45Mbps. That's not a lot more, but as it has a faster connection to the MB and CPU, unlike my USB NICs sharing bandwidth over USB hubs, that's to be expected.
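For a sense of scale, a quick Python calculation using only the figures quoted above puts the best-case gain of the Intel NIC over the USB NICs at roughly 1.2%. The helper name is just for illustration.

```python
def percent_gain(baseline_mbps: float, value_mbps: float) -> float:
    """Relative increase of value over baseline, in percent."""
    return (value_mbps - baseline_mbps) / baseline_mbps * 100.0


usb_mbps = 2.42          # consistent rate seen on the USB NICs
intel_high_mbps = 2.45   # top of the range seen on the Intel NIC

print(f"best-case gain: {percent_gain(usb_mbps, intel_high_mbps):.2f}%")
```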
The other thing I noticed and wonder about, in System Activity, is that the Wired & Buf memory (MB) increased over time (100x above avg), and this was with High Latency set throughout for the state timeouts (System: Advanced: Firewall & NAT, Firewall Optimisation Options). This is with no tuning, i.e. it's pretty much out-of-the-box pfSense, so the states might be an issue if someone has a faster net connection than I do and doesn't tweak this.
Unbound continued to send more requests out over time (to be expected), but as the inbound was swamped, the Unbound states timed out because genuine comms couldn't get through; again, it's pot luck what does get through.
When I was fiddling with Snort I did notice zombie processes pop up. Most of the time it was 1 or 2, but I did once see 8.
I will be increasing the browser timeouts so they keep trying to connect before the pf states time out, as this might also make it more likely I can stay connected to the web throughout such an event.
It's certainly an interesting test.
I think next time I will have the RRD graphs running, as this took the fw down on the first test this morning, but I don't know why yet.
Thanks SM, it's been educational! :D
-
You're welcome, mate :)
-
RRD graphs use a process called php-fpm. You'll see it in top.
Also, when you run top, use the following arguments: top -HSP
It'll look like the attachment, which gives you some good resolution into what's running on the system.
![Screen Shot 2015-06-01 at 10.49.39 PM.png](/public/imported_attachments/1/Screen Shot 2015-06-01 at 10.49.39 PM.png)
-
How do you get a list that long, Tim??
Mine has maybe 10 processes listed on the page
I am beginning to see this in top
0 root -92 0 0K 176K CPU4 4 1:17 98.97% kernel{em0 taskq}
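If it helps to watch for this automatically, below is a rough Python sketch that picks busy threads out of `top -HSP` style rows like the one above. It assumes the column order PID, USERNAME, PRI, NICE, SIZE, RES, STATE, C, TIME, WCPU, COMMAND, as in the pasted line; the names `parse_top_row` and `hot_threads` and the 90% threshold are invented for the example.

```python
from typing import List, NamedTuple, Optional


class TopRow(NamedTuple):
    pid: int
    user: str
    state: str
    wcpu: float
    command: str


def parse_top_row(line: str) -> Optional[TopRow]:
    """Parse one process row; return None for headers or malformed lines."""
    # Split into at most 11 fields so a command with spaces stays in one piece.
    parts = line.split(None, 10)
    if len(parts) < 11 or not parts[9].endswith("%"):
        return None
    try:
        return TopRow(
            pid=int(parts[0]),
            user=parts[1],
            state=parts[6],
            wcpu=float(parts[9].rstrip("%")),
            command=parts[10].strip(),
        )
    except ValueError:
        return None


def hot_threads(lines: List[str], threshold: float = 90.0) -> List[TopRow]:
    """Return rows whose WCPU is at or above the (assumed) threshold."""
    rows = (parse_top_row(line) for line in lines)
    return [row for row in rows if row is not None and row.wcpu >= threshold]


sample = "0 root -92 0 0K 176K CPU4 4 1:17 98.97% kernel{em0 taskq}"
for row in hot_threads([sample]):
    print(f"{row.command} at {row.wcpu}% (state {row.state})")
```

Feeding it saved batch-mode output from top would be one way to log when the `em0 taskq` thread pegs a core, though the exact columns may differ between versions.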
-
It looks like the same output as seen on the Diagnostics: System Activity webpage. On mine, even throughout the DDoS, I had no more than 20 processes running, unlike your screenshot, which has a few more running.
I wonder if loading up my fw with more services, to increase the number of processes running, might increase the system exhaustion.
I'm still undecided whether this is an SMP issue or an exhaustion of some system resources. The reason I say the latter is that I'm reminded of MS Small Business Server, which I'd maintained since 2000. As Windows has grown, so have the hw requirements, to the point that in 2008R2 and later you needed MS SQL Server running on its own box, as the software components have grown in size and functionality. My favourite was SBS2003, as I had those running better than MS could, i.e. they could stay up for months if it wasn't for the need to reboot after some sort of Windows update. I did have to reset the odd counter, but they ran sweet. However, under load you could get those machines to bog down, eventually grinding to a halt and needing a reboot, so I wonder if something similar is happening.
I'm gonna try and find something other than syslog to report back the state of the machine, perhaps Security Onion, which I still need to check out. I'm keen to find something that can control parts of the fw when certain situations occur, i.e. if I get a DDoS, I can resave the WAN interface and get assigned a new IP address, as one measure I'd like to implement. I could knock something up, but it's also a case of getting data out of pfSense over and above the syslog data, which would make it even more useful.
Still lots to learn.
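As a starting point for the "react to a DDoS by getting a new WAN address" idea above, here is a minimal Python sketch of just the decision logic. Everything in it is an assumption for illustration: the `Sample` fields, the thresholds, and `renew_wan_address`, which is a placeholder and not a pfSense API. How the samples are collected and how the WAN interface is actually re-saved are left open.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    cpu_percent: float          # e.g. busiest core at sample time
    packet_loss_percent: float  # e.g. loss measured against a test host


def should_renew_wan(
    samples: List[Sample],
    cpu_limit: float = 95.0,     # hypothetical threshold
    loss_limit: float = 20.0,    # hypothetical threshold
    min_consecutive: int = 3,    # require a sustained problem, not a blip
) -> bool:
    """True when the most recent min_consecutive samples all breach both limits."""
    if len(samples) < min_consecutive:
        return False
    recent = samples[-min_consecutive:]
    return all(
        s.cpu_percent >= cpu_limit and s.packet_loss_percent >= loss_limit
        for s in recent
    )


def renew_wan_address() -> None:
    # Placeholder only: the real action (re-saving the WAN interface) is not shown.
    print("would request a new WAN address here")


history = [Sample(40.0, 0.0), Sample(99.0, 35.0), Sample(100.0, 50.0), Sample(98.0, 42.0)]
if should_renew_wan(history):
    renew_wan_address()
```

Requiring several consecutive bad samples is a deliberate choice here, so a single busy moment (say, loading the RRD graphs) does not trigger an address change.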