What is the biggest attack, in Gbps, that you stopped?
-
RRD graphs use a process called php-fpm. You'll see it in top.
Also, when you run top, use the following arguments: top -HSP
It'll look like the attachment, which gives you some good resolution into what's running on the system.
![Screen Shot 2015-06-01 at 10.49.39 PM.png](/public/imported_attachments/1/Screen Shot 2015-06-01 at 10.49.39 PM.png)
-
How do you get a list that long, Tim?
Mine has maybe 10 processes listed on the page.
I am beginning to see this in top:
0 root -92 0 0K 176K CPU4 4 1:17 98.97% kernel{em0 taskq}
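A rough Python sketch of how a line like the one above could be flagged automatically, e.g. from captured `top -HSP` output. It assumes the column order seen in this thread (PID, USERNAME, PRI, NICE, SIZE, RES, STATE, C, TIME, WCPU, COMMAND); the names `parse_top_line` and `busy_threads` and the 90% threshold are made up for illustration, not anything pfSense ships.

```python
from typing import Iterable, List, NamedTuple, Optional


class TopThread(NamedTuple):
    pid: int
    user: str
    state: str
    wcpu: float      # WCPU column as a percentage
    command: str


def parse_top_line(line: str) -> Optional[TopThread]:
    """Parse one thread line of `top -HSP` output; return None for anything else."""
    # Assumed layout: PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
    parts = line.split(None, 10)  # COMMAND may contain spaces, so cap the split
    if len(parts) < 11 or not parts[9].endswith("%"):
        return None  # header lines, blank lines, summary lines
    try:
        return TopThread(
            pid=int(parts[0]),
            user=parts[1],
            state=parts[6],
            wcpu=float(parts[9].rstrip("%")),
            command=parts[10].strip(),
        )
    except ValueError:
        return None


def busy_threads(lines: Iterable[str], threshold: float = 90.0) -> List[TopThread]:
    """Return the threads whose WCPU is at or above the (hypothetical) threshold."""
    parsed = (parse_top_line(line) for line in lines)
    return [t for t in parsed if t is not None and t.wcpu >= threshold]


if __name__ == "__main__":
    sample = "0 root -92 0 0K 176K CPU4 4 1:17 98.97% kernel{em0 taskq}"
    for thread in busy_threads([sample]):
        print(thread.command, thread.wcpu)
```

Fed the line from this post, it reports the `kernel{em0 taskq}` thread as pinned near 100% of one core.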
-
It looks like the same output seen on the Diagnostics: System Activity webpage. On mine, even throughout the DDoS, I had no more than 20 processes running, unlike your screenshot, which has a few more.
I wonder whether loading up my fw with more services, to increase the number of running processes, might increase the system exhaustion.
I'm still undecided whether this is an SMP issue or an exhaustion of some system resource. The reason I say the latter is that I'm reminded of MS Small Business Server, which I've maintained since 2000. As Windows has grown, so have the hardware requirements, to the point that in 2008 R2 and later you needed MS SQL Server running on its own box because the software components had grown in size and functionality. My favourite was SBS 2003, as I had those running better than MS could, i.e. they could stay up for months if it weren't for the reboots required by Windows updates. I did have to reset the odd counter, but they ran sweet. However, under load you could get those machines to bog down, eventually grinding to a halt and needing a reboot, so I wonder if something similar is happening.
I'm gonna try and find something other than syslog to report back the state of the machine, perhaps Security Onion, which I still need to check out. I'm keen to find something that can control parts of the fw when certain situations occur, e.g. if I get a DDoS, I could resave the WAN interface and get assigned a new IP address, as one measure I'd like to implement. I could knock something up, but it's also a case of getting data out of pfSense over and above the syslog data, which would make it even more useful.
Still lots to learn.
-
I can ssh into the box from another source, and I have a 27" screen, so that gives me better screen output on the CLI.
-
Didn't think of that. :D
![top -HSP.PNG](/public/imported_attachments/1/top -HSP.PNG)
-
Whilst I remember, here are the crash reports.
Crash report begins. Anonymous machine information:
amd64
10.1-RELEASE-p9
FreeBSD 10.1-RELEASE-p9 #0 57b23e7(releng/10.1)-dirty: Mon Apr 13 20:30:25 CDT 2015 root@pfs22-amd64-builder:/usr/obj.amd64/usr/pfSensesrc/src/sys/pfSense_SMP.10
Crash report details:
PHP Errors:
[02-Jun-2015 12:13:04 Europe/London] PHP Fatal error: Allowed memory size of 268435456 bytes exhausted (tried to allocate 265551872 bytes) in /usr/local/www/diag_packet_capture.php on line 456
Filename: /var/crash/minfree
2048
and
Crash report begins. Anonymous machine information:
amd64
10.1-RELEASE-p9
FreeBSD 10.1-RELEASE-p9 #0 57b23e7(releng/10.1)-dirty: Mon Apr 13 20:30:25 CDT 2015 root@pfs22-amd64-builder:/usr/obj.amd64/usr/pfSensesrc/src/sys/pfSense_SMP.10
Crash report details:
PHP Errors:
[02-Jun-2015 17:32:09 UTC] PHP Fatal error: Allowed memory size of 268435456 bytes exhausted (tried to allocate 265551872 bytes) in /usr/local/www/diag_packet_capture.php on line 456
Looks like pfSense doesn't like packet capturing a DDoS ;D. I haven't checked that line out yet, and I couldn't find anything in /var/crash/. Having the packet capture shut down nicely would be ideal.
It might be useful if the crash reports weren't deleted after a reboot. I don't know whether this got sent through to ESF or not. Only one file is mentioned in the first report, and there is no additional file in the second crash report.
I also don't think the increases in Wired and Buf MB were anything to do with the packet capture, as it looks like the packet capture crashed well before the end of the test.
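To put the numbers in the PHP fatal errors above into readable units, here is a small Python sketch that pulls the two byte counts out of such a log line and converts them to MiB. The function name `oom_summary` is hypothetical; only the message format is taken from the crash reports in this post.

```python
import re

# Matches the PHP out-of-memory message as it appears in the crash reports.
PHP_OOM = re.compile(
    r"Allowed memory size of (\d+) bytes exhausted "
    r"\(tried to allocate (\d+) bytes\)"
)

MIB = 1024 * 1024


def oom_summary(log_line):
    """Return the PHP memory limit and the failed request in MiB, or None."""
    match = PHP_OOM.search(log_line)
    if match is None:
        return None
    limit = int(match.group(1))
    request = int(match.group(2))
    return {
        "limit_mib": limit / MIB,
        "request_mib": request / MIB,
        "request_fraction": request / limit,  # share of the limit asked for at once
    }


if __name__ == "__main__":
    line = (
        "[02-Jun-2015 17:32:09 UTC] PHP Fatal error: Allowed memory size of "
        "268435456 bytes exhausted (tried to allocate 265551872 bytes) in "
        "/usr/local/www/diag_packet_capture.php on line 456"
    )
    print(oom_summary(line))
```

So the limit is 256 MiB and the page asked for about 253 MiB in a single allocation. That would be consistent with the capture page trying to hold a very large capture in memory at once, though that is only a guess from the numbers.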
-
After fiddling with some settings under System -> Tunables, I got it to distribute the load better, so to speak.
No CPU core sits at 100%, and then it works fine.
![top -HSP_1.PNG](/public/imported_attachments/1/top -HSP_1.PNG)
-
I'm gonna try and find something other than syslog to report back the state of the machine, perhaps Security Onion, which I still need to check out. I'm keen to find something that can control parts of the fw when certain situations occur, e.g. if I get a DDoS, I could resave the WAN interface and get assigned a new IP address, as one measure I'd like to implement. I could knock something up, but it's also a case of getting data out of pfSense over and above the syslog data, which would make it even more useful.
Still lots to learn.
For quick, simple stats, see man vmstat and man procstat.
-
PHP Errors:
[02-Jun-2015 12:13:04 Europe/London] PHP Fatal error: Allowed memory size of 268435456 bytes exhausted (tried to allocate 265551872 bytes) in /usr/local/www/diag_packet_capture.php on line 456
PHP Errors:
[02-Jun-2015 17:32:09 UTC] PHP Fatal error: Allowed memory size of 268435456 bytes exhausted (tried to allocate 265551872 bytes) in /usr/local/www/diag_packet_capture.php on line 456
Those are PHP crashes. Don't run the webUI and see if it works. Odd that PHP would crash the machine. I would have expected it to die as a process and then respawn.
-
After fiddling with some settings under System -> Tunables, I got it to distribute the load better, so to speak.
No CPU core sits at 100%, and then it works fine.
snort and the two kernel threads, kernel{em0 taskq} and kernel{em1 taskq}, are still nearly the same between the two screenshots. Interesting that one of those kernel threads isn't sitting on a CPU in the second screenshot.
-
PHP Errors:
[02-Jun-2015 12:13:04 Europe/London] PHP Fatal error: Allowed memory size of 268435456 bytes exhausted (tried to allocate 265551872 bytes) in /usr/local/www/diag_packet_capture.php on line 456
PHP Errors:
[02-Jun-2015 17:32:09 UTC] PHP Fatal error: Allowed memory size of 268435456 bytes exhausted (tried to allocate 265551872 bytes) in /usr/local/www/diag_packet_capture.php on line 456
Those are PHP crashes. Don't run the webUI and see if it works. Odd that PHP would crash the machine. I would have expected it to die as a process and then respawn.
I've got a permanent packet capture going so I have records for posterity, as the police won't do anything in some circumstances if you get hacked over here. The more data the better, as it makes it possible to track down those behind hack attempts, whilst also making it easier to incriminate myself as well, I'm sure.
-
After fiddling with some settings under System -> Tunables, I got it to distribute the load better, so to speak.
No CPU core sits at 100%, and then it works fine.
So what settings did you add or change?
@mer:
I'm gonna try and find something other than syslog to report back the state of the machine, perhaps Security Onion, which I still need to check out. I'm keen to find something that can control parts of the fw when certain situations occur, e.g. if I get a DDoS, I could resave the WAN interface and get assigned a new IP address, as one measure I'd like to implement. I could knock something up, but it's also a case of getting data out of pfSense over and above the syslog data, which would make it even more useful.
Still lots to learn.
For quick, simple stats, see man vmstat and man procstat.
Thanks. However, what's the best way to get all, or as much data as possible, including rule matches, out of pfSense? I've got to check out Security Onion, but if there are other ways I'm all ears. I saw Nagios earlier, which might be interesting for seeing what data it gets out of pfSense and other platforms and how, but there are things it doesn't do which I'd like.
-
I have Security Onion set up, and for these attacks the best tool, IMHO, is Wireshark. It'll do more for you in the near term. SO is a good suite of tools, but it's not for the faint of heart. It requires work to set it up properly and to ensure rules are updated. Also, you'll need a decent-sized hard drive because it captures and logs every packet. The more storage, the more history. SO also won't get anything out of pfSense; in fact, it won't talk to it unless you integrate snort and barnyard from one box to the other.
Nagios is also not for the faint of heart. I went with OpenNMS because it's easy to set up, and I already know SNMP. If you want any more granularity into the box, you'll have to go with an agent-based monitoring tool or something like vRealize Hyperic for your VMs, and that ain't cheap.
dtrace is the best place to start if you want to get granular information out of pfSense. It's a FreeBSD debugging tool, and that's exactly what needs to be done to determine the root cause. There is a somewhat steep learning curve if you've never written or debugged your own compiled code. Not impossible, but it will take time.
-
This is what I have added so far. It's not perfect, but way better than out of the box.
Changing the queue maxlen helped distribute the load, but it seems it doesn't survive a reboot.
-
Those are most likely sysctls; worst case, you can put them in /etc/sysctl.conf so they survive a reboot.
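As a rough illustration of the sysctl.conf approach, here is a Python sketch that adds or updates a `name=value` line in sysctl.conf-style text without duplicating it. The helper `set_sysctl` is hypothetical, and the tunable names and values in the example are placeholders only; the thread does not say which settings were actually changed. Whether pfSense keeps a hand-edited /etc/sysctl.conf across upgrades or config reloads is a separate question this sketch does not answer.

```python
def set_sysctl(conf_text: str, name: str, value: str) -> str:
    """Return conf_text with exactly one `name=value` line for the given sysctl."""
    new_line = f"{name}={value}"
    out = []
    replaced = False
    for line in conf_text.splitlines():
        stripped = line.strip()
        is_setting = bool(stripped) and not stripped.startswith("#") and "=" in stripped
        if is_setting and stripped.split("=", 1)[0].strip() == name:
            if not replaced:
                out.append(new_line)  # update the first occurrence in place
                replaced = True
            # any later duplicates of the same name are dropped
        else:
            out.append(line)  # comments, blanks and other settings pass through
    if not replaced:
        out.append(new_line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Placeholder names and values, for illustration only.
    text = "# local tunables\nkern.ipc.somaxconn=1024\n"
    text = set_sysctl(text, "net.inet.ip.intr_queue_maxlen", "3000")
    print(text, end="")
```

Working on the text rather than the file keeps the edit easy to check by hand before anything is written back to the firewall.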
-
@mer:
Those are most likely sysctls; worst case, you can put them in /etc/sysctl.conf so they survive a reboot.
If sysctl.conf does need to be edited, then this thread https://forum.pfsense.org/index.php?topic=81174.0 or this thread https://forum.pfsense.org/index.php?topic=94511.0 can help those changes survive a reboot. Even though the threads discuss the syslog.conf and system.inc files, the principle is the same, i.e. the filename is not so important, but the method of keeping the changes is.
-
For you to enjoy
top -HSP
top -asSHz1
Are you running pfSense in a hypervisor? If so, you're dealing with an additional abstraction layer. What does the hypervisor kernel look like? You should, I think, be able to run top by sshing into ESXi. However, the point is that running it on a hypervisor may mask issues, or present them differently, compared with running on bare metal.
-
With esxtop running
http://youtu.be/_dimZ1_DO_o
-
Your wait time is insane on the hypervisor.
The way ESXi works is that it needs all 8 cores free before it will allow the CPU in your VM to process data. So if you've over-subscribed CPUs, your wait time goes through the roof while the VM waits for all 8 cores to become free.
You can impair a VM considerably because of wait time and CPU availability. It's one of the reasons why Oracle wants you to test their products on iron and will not support you if you're not using their hypervisor. If you aren't aware of how wait times work at the hypervisor level, it'll bite you in the ass.
In this case perhaps the hypervisor kernel is using multiple cores on the CPU to accept data from the NIC. As you hammer that NIC, CPU wait times go up because the hypervisor kernel has a higher priority than your VM, so your VM is left waiting.
When we tested FreeBSD and CentOS, my hypervisor was crushed. It wasn't readily apparent if the issue was with the hypervisor or the VM, but I believe it was a combination of the two.
If you can, remove the hypervisor layer so all you have to deal with is the hardware and BIOS.