Netgate Discussion Forum

What is the biggest attack in GBPS you stopped

General pfSense Questions
  • M
    mer
    last edited by Jun 2, 2015, 8:03 PM

    @firewalluser:

    I'm gonna try and find something other than syslog to report back the state of the machine, perhaps security onion which I need to check out still, but I'm keen to find something that can control parts of the fw depending on some situations occurring, ie if I get a ddos, I can resave the WAN interface and get assigned a new ip address as one measure I'd like to implement. I could knock something up but its also a case of getting the data out of pfsense over and above syslog data that would make it even more useful.

    Still lots to learn.

    For quick, simple stats, see man vmstat and man procstat.

    • T
      tim.mcmanus
      last edited by Jun 2, 2015, 8:24 PM

      @firewalluser:

      PHP Errors:
      [02-Jun-2015 12:13:04 Europe/London] PHP Fatal error:  Allowed memory size of 268435456 bytes exhausted (tried to allocate 265551872 bytes) in /usr/local/www/diag_packet_capture.php on line 456

      PHP Errors:
      [02-Jun-2015 17:32:09 UTC] PHP Fatal error:  Allowed memory size of 268435456 bytes exhausted (tried to allocate 265551872 bytes) in /usr/local/www/diag_packet_capture.php on line 456

      Those are PHP crashes.  Leave the webUI closed and see if it works.  Odd that PHP would crash the machine; I would have expected it to die as a process and then respawn.
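
      A quick check of the figures in that error, as a minimal sketch: 268435456 bytes is a 256 MiB limit, and the single failed allocation was about 253 MiB on its own. The variable and function names below are illustrative; only the two byte counts come from the log.

```python
# Figures copied from the PHP fatal error quoted above.
MEMORY_LIMIT_BYTES = 268_435_456  # "Allowed memory size"
REQUESTED_BYTES = 265_551_872     # "tried to allocate"

MIB = 1024 * 1024


def to_mib(n_bytes: int) -> float:
    """Convert a byte count to mebibytes."""
    return n_bytes / MIB


limit_mib = to_mib(MEMORY_LIMIT_BYTES)
requested_mib = to_mib(REQUESTED_BYTES)
# Space that would be left if this were the only allocation in the process.
headroom_mib = limit_mib - requested_mib

print(f"limit:     {limit_mib:.2f} MiB")
print(f"requested: {requested_mib:.2f} MiB")
print(f"headroom:  {headroom_mib:.2f} MiB")
```

      So one request asked for nearly the entire limit, which would be consistent with the page trying to hold a large capture in memory at once, though the log alone does not prove that.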

      • T
        tim.mcmanus
        last edited by Jun 2, 2015, 8:29 PM

        @Supermule:

        After fiddling with some settings under system ->tunables i got it to distribute the load better so to speak.

        No 100% cpu core and then it works fine.

        snort and the two kernel processes {em0, task}, {em1, task} are still nearly the same between the two screen shots.  Interesting that one of those kernel processes isn't sitting on a CPU in the second screen shot.

        • F
          firewalluser
          last edited by Jun 2, 2015, 10:03 PM

          @tim.mcmanus:

          @firewalluser:

          PHP Errors:
          [02-Jun-2015 12:13:04 Europe/London] PHP Fatal error:  Allowed memory size of 268435456 bytes exhausted (tried to allocate 265551872 bytes) in /usr/local/www/diag_packet_capture.php on line 456

          PHP Errors:
          [02-Jun-2015 17:32:09 UTC] PHP Fatal error:  Allowed memory size of 268435456 bytes exhausted (tried to allocate 265551872 bytes) in /usr/local/www/diag_packet_capture.php on line 456

          Those are PHP crashes.  Leave the webUI closed and see if it works.  Odd that PHP would crash the machine; I would have expected it to die as a process and then respawn.

          I've got a permanent packet capture running so I have records for posterity, as the police won't do anything in some circumstances if you get hacked over here. The more data the better: it makes it easier to track down those behind hack attempts, whilst also making it easier to incriminate myself, I'm sure.

          Capitalism, currently The World's best Entertainment Control System and YOU cant buy it! But you can buy this, or some of this or some of these

          Asch Conformity, mainly the blind leading the blind.

          • F
            firewalluser
            last edited by Jun 2, 2015, 10:10 PM

            @Supermule:

            After fiddling with some settings under system ->tunables i got it to distribute the load better so to speak.

            No 100% cpu core and then it works fine.

            So what settings did you add or change?

            @mer:

            @firewalluser:

            I'm gonna try and find something other than syslog to report back the state of the machine, perhaps security onion which I need to check out still, but I'm keen to find something that can control parts of the fw depending on some situations occurring, ie if I get a ddos, I can resave the WAN interface and get assigned a new ip address as one measure I'd like to implement. I could knock something up but its also a case of getting the data out of pfsense over and above syslog data that would make it even more useful.

            Still lots to learn.

            For quick simple stats, man vmstat  man procstat

            Thanks. However, what's the best way to get as much data as possible, including rule matches, out of pfSense? I've still got to check out Security Onion, but if there are other ways I'm all ears. I saw Nagios earlier, and it might be interesting to see what data it gets out of pfSense and other platforms and how it does so, but there are things it doesn't do which I'd like.


            • T
              tim.mcmanus
              last edited by Jun 2, 2015, 10:27 PM

              @firewalluser:

              @Supermule:

              After fiddling with some settings under system ->tunables i got it to distribute the load better so to speak.

              No 100% cpu core and then it works fine.

              So what settings did you add or change?

              @mer:

              @firewalluser:

              I'm gonna try and find something other than syslog to report back the state of the machine, perhaps security onion which I need to check out still, but I'm keen to find something that can control parts of the fw depending on some situations occurring, ie if I get a ddos, I can resave the WAN interface and get assigned a new ip address as one measure I'd like to implement. I could knock something up but its also a case of getting the data out of pfsense over and above syslog data that would make it even more useful.

              Still lots to learn.

              For quick simple stats, man vmstat  man procstat

              Thanks, however whats the best way to get all/as much data as possible including rule matches out of pfsense, I've got to check out security onion but if there are other ways I'm all ears. I saw nagios earlier which might be interesting to see what and how it gets data out of pfsense and other platforms, but theres things it doesnt do which I'd like.

              I have Security Onion set up, and for these attacks the best tool IMHO is Wireshark.  It'll do more for you in the near term.  SO is a good suite of tools, but it's not for the faint of heart.  It requires work to set it up properly and to ensure rules are updated.  Also, you'll need a decent-sized hard drive because it captures and logs every packet.  The more storage, the more history.  SO also won't get anything out of pfSense; in fact it won't talk to it unless you integrate snort and barnyard from one box to the other.

              nagios is also not for the faint of heart.  I went with OpenNMS because it's easy to set up, and I already know SNMP.  If you want any more granularity into the box, you'll have to go with an agent-based monitoring tool or something like vRealize Hyperic for your VMs, and that ain't cheap.

              dtrace is the best place to start if you want to get granular information out of pfSense.  It's a FreeBSD debugging tool, and debugging is exactly what needs to be done to determine the root cause.  There is a somewhat steep learning curve if you've never written or debugged your own compiled code.  Not impossible, but it will take time.

              • S
                Supermule Banned
                last edited by Jun 3, 2015, 6:38 AM

                This is what I have added so far. It's not perfect, but it's way better than out of the box.

                Changing the maxlen queue helped distribute the load, but it seems the change doesn't survive a reboot.

                [Attachment: system_tunables.PNG]

                • M
                  mer
                  last edited by Jun 3, 2015, 7:48 AM

                  Those are most likely sysctls; worst case, you can put them in /etc/sysctl.conf so they survive a reboot.

                  • F
                    firewalluser
                    last edited by Jun 3, 2015, 8:21 AM

                    @mer:

                    Those are most likely sysctls, worst case is you can put them in /etc/sysctl.conf to survive a reboot.

                    If sysctl.conf does need to be edited, this thread https://forum.pfsense.org/index.php?topic=81174.0 or this thread https://forum.pfsense.org/index.php?topic=94511.0 can help those changes survive a reboot. Even though the threads discuss the syslog.conf and system.inc files, the principle is the same: the filename is not so important, but the method for keeping the changes is.


                    • S
                      Supermule Banned
                      last edited by Jun 3, 2015, 3:16 PM Jun 3, 2015, 2:56 PM

                      For you to enjoy

                      top -HSP

                      Youtube Video

                      top -asSHz1

                      Youtube Video

                      • T
                        tim.mcmanus
                        last edited by Jun 3, 2015, 3:22 PM

                        @Supermule:

                        For you to enjoy

                        top -HSP

                        Youtube Video

                        top -asSHz1

                        Youtube Video

                        Are you running pfSense in a hypervisor?  If so, you're dealing with an additional abstraction layer.  What does the hypervisor kernel look like?  You should, I think, be able to run top by sshing into ESXi.  However, the point is that running it on a hypervisor may mask issues or present them differently than running on bare metal.

                        • S
                          Supermule Banned
                          last edited by Jun 3, 2015, 3:35 PM

                          With esxtop running

                          http://youtu.be/_dimZ1_DO_o

                          • T
                            tim.mcmanus
                            last edited by Jun 3, 2015, 4:08 PM

                            @Supermule:

                            With esxtop running

                            http://youtu.be/_dimZ1_DO_o

                            Your wait time is insane on the hypervisor.

                            The way ESXi works is that it needs all 8 cores free before it will allow the CPU in your VM to process data.  So if you've over-subscribed CPUs, your wait time goes through the roof while the VM waits for all 8 cores to become free.

                            You can impair a VM considerably because of wait time and CPU availability.  It's one of the reasons why Oracle wants you to test their products on iron and will not support you if you're not using their hypervisor.  If you aren't aware of how wait times work at the hypervisor level, it'll bite you in the ass.

                            In this case perhaps the hypervisor kernel is using multiple cores on the CPU to accept data from the NIC.  As you hammer that NIC, CPU wait times go up because the hypervisor kernel has a higher priority than your VM, so your VM is left waiting.

                            When we tested FreeBSD and CentOS, my hypervisor was crushed.  It wasn't readily apparent if the issue was with the hypervisor or the VM, but I believe it was a combination of the two.

                            If you can, remove the hypervisor layer so all you have to deal with is the hardware and BIOS.

                            • S
                              Supermule Banned
                              last edited by Jun 3, 2015, 4:25 PM Jun 3, 2015, 4:19 PM

                              It's accumulated idle time, not the actual VM waiting…

                              Run, %RUN:

                              • This value represents the percentage of absolute time the virtual machine was running on the system.

                              • If the virtual machine is unresponsive, %RUN may indicate that the guest operating system is busy conducting an operation.

                              • When %RUN is near zero and the virtual machine is unresponsive, it means that the virtual machine is idle, blocked on an operation, or is not scheduled due to resource contention. Look at other values (%WAIT, %RDY, and %CSTP) to identify resource contention.

                              • When %RUN is near the value of the number of vCPUs x 100%, it means that all vCPUs in the virtual machine are busy. This is an indicator that the guest operating system may be stuck in an operational loop. To investigate this issue further, you may need to engage the appropriate operating system vendor for assistance in identifying why the guest operating system is using all of the CPU resources.

                              • If you have engaged the guest operating system vendor, and they have determined that the issue is caused by the VMware Tools or the virtual machine hardware, it may be pertinent to suspend the virtual machine to collect additional diagnostic information.

                              Wait, %WAIT:

                              • This value represents the percentage of time the virtual machine was waiting for some VMkernel activity to complete (such as I/O) before it can continue.

                              • If the virtual machine is unresponsive and the %WAIT value is proportionally higher than %RUN, %RDY, and %CSTP, then it can indicate that the world is waiting for a VMkernel operation to complete.

                              • You may observe that the %SYS is proportionally higher than %RUN. %SYS represents the percentage of time spent by system services on behalf of the virtual machine.

                              • A high %WAIT value can be a result of a poorly performing storage device where the virtual machine is residing. If you are experiencing storage latency and timeouts, it may trigger these types of symptoms across multiple virtual machines residing in the same LUN, volume, or array depending on the scale of the storage performance issue.

                              • A high %WAIT value can also be triggered by latency to any device in the virtual machine configuration. This can include but is not limited to serial pass-through devices, parallel pass-through devices, and USB devices. If the device suddenly stops functioning or responding, it can result in these symptoms. A common cause for a high %WAIT value is ISO files that are accidentally left mounted in the virtual machine and are then deleted or moved to an alternate location. For more information, see Deleting a datastore from the Datastore inventory results in the error: device or resource busy (1015791).

                              • If there does not appear to be any backing storage or networking infrastructure issue, it may be pertinent to crash the virtual machine to collect additional diagnostic information.

                              Ready, %RDY:

                              • This value represents the percentage of time that the virtual machine is ready to execute commands, but has not yet been scheduled for CPU time due to contention with other virtual machines.

                              • Compare against the Max-Limited, %MLMTD value. This represents the amount of time that the virtual machine was ready to execute, but has not been scheduled for CPU time because the VMkernel deliberately constrained it. For more information, see the Managing Resource Pools section of the vSphere Monitoring and Performance Guide or Resource Management Guide.

                              • If the virtual machine is unresponsive or very slow and %MLMTD is low, it may indicate that the ESX host has limited CPU time to schedule for this virtual machine.

                              Co-stop, %CSTP:

                              • This value represents the percentage of time that the virtual machine is ready to execute commands but that it is waiting for the availability of multiple CPUs as the virtual machine is configured to use multiple vCPUs.

                              • If the virtual machine is unresponsive and %CSTP is proportionally high compared to %RUN, it may indicate that the ESX host has limited CPU resources to simultaneously co-schedule all vCPUs in this virtual machine.

                              • Review the usage of virtual machines running with multiple vCPUs on this host. For example, a virtual machine with four vCPUs may need to schedule 4 pCPUs to do an operation. If there are multiple virtual machines configured in this way, it may lead to CPU contention and resource starvation.

                              • T
                                tim.mcmanus
                                last edited by Jun 3, 2015, 4:24 PM

                                @Supermule:

                                Its accumulated idle time. Not the actual VM waiting…

                                I disagree.  Try dropping the cores down to two and see if that wait time drops.

                                I actually had a conversation with VMware about this yesterday.  Our SQL farm is getting crushed by wait time due to CPU over-subscription.  We're about to start a VM right-sizing exercise to address the issue.

                                • S
                                  Supermule Banned
                                  last edited by Jun 3, 2015, 4:46 PM Jun 3, 2015, 4:43 PM

                                  %RUN = 290.51 (could be 400%), so not all CPUs are busy
                                  %WAIT = 408.70 (could be my NAS at home where the VM resides, a Synology 1813+)
                                  %RDY = 0.18 (% time a vCPU was ready to be scheduled on a physical processor but couldn’t due to processor contention. Threshold: 10% per vCPU)
                                  %MLMTD = 0.00 (% time the VM was ready to run but wasn’t scheduled because it would violate the CPU Limit set. Threshold: 0%)
                                  %CSTP = 0.00 (% time a vCPU in an SMP virtual machine is “stopped” from executing, so that another vCPU in the same virtual machine could be run to “catch-up” and make sure the skew between the two virtual processors doesn’t grow too large. Threshold: 3%)

                                  It doesn't indicate that the hypervisor subsystem suffers in any way, except perhaps storage-wise.
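
                                  For reference, the thresholds quoted in this post can be applied mechanically. This is a rough sketch, not a VMware tool: the function name, the choice to divide %RDY by the vCPU count, and the use of 4 vCPUs (inferred from "could be 400%") are assumptions, and %WAIT is left out because no threshold for it is given here.

```python
# Thresholds as quoted in the post; how they are applied is an assumption.
RDY_THRESHOLD_PER_VCPU = 10.0  # percent, per vCPU
MLMTD_THRESHOLD = 0.0          # percent
CSTP_THRESHOLD = 3.0           # percent


def flag_counters(metrics: dict[str, float], vcpus: int) -> list[str]:
    """Return the names of counters that exceed the quoted thresholds."""
    flagged = []
    # Assumption: the posted %RDY is a group total, so normalise per vCPU.
    if metrics["%RDY"] / vcpus > RDY_THRESHOLD_PER_VCPU:
        flagged.append("%RDY")
    if metrics["%MLMTD"] > MLMTD_THRESHOLD:
        flagged.append("%MLMTD")
    if metrics["%CSTP"] > CSTP_THRESHOLD:
        flagged.append("%CSTP")
    return flagged


# Values copied from the post above.
posted = {"%RUN": 290.51, "%WAIT": 408.70, "%RDY": 0.18,
          "%MLMTD": 0.00, "%CSTP": 0.00}

print(flag_counters(posted, vcpus=4))  # no counter is over its threshold
```

                                  With the posted numbers nothing is flagged, which matches the reading that CPU contention counters are quiet; it says nothing either way about what is behind the high %WAIT.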

                                  [Attachment: CTSP_capture.PNG]

                                  • T
                                    tim.mcmanus
                                    last edited by Jun 3, 2015, 4:54 PM

                                    Yes, not all CPU will be busy, but all CPUs allocated to a VM must be present for the VM to compute.

                                    For example, if you have 16 cores and 12 are busy, and a VM with 8 cores comes along, it will wait until 8 cores are available to process data.  It cannot use the 4 cores available in the example above, it must find 8 available cores and then compute whether or not it needs all 8 cores.  It's an allocation model.  So CPU contention can create excessive wait times while CPU utilization is very low.
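
                                    The allocation rule as described in this post can be written down as a one-line check. This is only a sketch of the model being argued here, with hypothetical function and parameter names; VMware's own scheduler documentation describes a relaxed form of co-scheduling for ESXi, so the real scheduler is less strict than this.

```python
def can_run_strict(total_cores: int, busy_cores: int, vm_vcpus: int) -> bool:
    """Strict model from the post: a VM runs only if a free physical
    core is available for every one of its vCPUs at the same moment."""
    free_cores = total_cores - busy_cores
    return free_cores >= vm_vcpus


# The example from the post: 16 cores, 12 busy, an 8-vCPU VM has to wait.
print(can_run_strict(total_cores=16, busy_cores=12, vm_vcpus=8))  # False
# Under the same model a 2-vCPU VM would fit into the 4 free cores.
print(can_run_strict(total_cores=16, busy_cores=12, vm_vcpus=2))  # True
```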

                                    • S
                                      Supermule Banned
                                      last edited by Jun 3, 2015, 5:00 PM

                                      Yes, but it didn't :)

                                      I can reissue 8 cores and test again. :)

                                      • S
                                        Supermule Banned
                                        last edited by Jun 3, 2015, 5:07 PM

                                        Even with 8 cores, %CSTP doesn't get past 0.33 :)

                                        • S
                                          Supermule Banned
                                          last edited by Jun 3, 2015, 5:19 PM

                                          While attacking, I rebooted the server behind the firewall that receives the traffic.

                                          Packet loss still occurred, and then I shut it down completely.

                                          In the end you see the traffic settle at around 7.5 Mbit/s and become a straight line.

                                          Then packets begin to flow and CPU4 settles below 100%.

                                          Youtube Video

                                          What do you reckon, Tim?

                                          • S
                                            Supermule Banned
                                            last edited by Jun 3, 2015, 5:34 PM

                                            This is the IOPS I see on the storage attached as NFS during an attack.

                                            [Attachment: iops_SAN.PNG]

                                            • T
                                              tim.mcmanus
                                              last edited by Jun 3, 2015, 5:58 PM

                                              @Supermule:

                                              What you reckon Tim?

                                              I reckon it's going to be very complicated troubleshooting an apparent kernel issue in FreeBSD while it's running on a hypervisor.

                                              • F
                                                firewalluser
                                                last edited by Jun 3, 2015, 6:00 PM

                                                @tim.mcmanus:

                                                Yes, not all CPU will be busy, but all CPUs allocated to a VM must be present for the VM to compute.

                                                For example, if you have 16 cores and 12 are busy, and a VM with 8 cores comes along, it will wait until 8 cores are available to process data.  It cannot use the 4 cores available in the example above, it must find 8 available cores and then compute whether or not it needs all 8 cores.  It's an allocation model.  So CPU contention can create excessive wait times while CPU utilization is very low.

                                                Correct. https://communities.vmware.com/message/2275523

                                                If anything, you'd be better off giving a VM the minimum number of cores it needs rather than the most it can use, as that further reduces the switching or timeslicing overhead at the host (ESXi) level. In the same way, it's sometimes faster to run software single-threaded on a single core than to run a multithreaded app across all cores, as the latter introduces more locks and overheads at the software level as well as the CPU level, not to mention having to share the bus to devices like the hardware, NICs or RAM.

                                                Besides, a lot of OSes tend to run faster as a VM than on bare metal. For example, I can install W7x32 as a VM in just 6 mins 23 seconds, from hitting play on the VM for the first time, through the various reboots, assigning a username and password, and update settings, all the way to being on the desktop for the very first time. I might be able to shave a few more seconds off by not running other VMs at the same time, but I can't match that time installing W7x32 on the same bare-metal machine, which is a testament to VMware.


                                                • T
                                                  tim.mcmanus
                                                  last edited by Jun 4, 2015, 2:12 AM

                                                  @Supermule:

                                                  Wait, %WAIT:

                                                  • This value represents the percentage of time the virtual machine was waiting for some VMkernel activity to complete (such as I/O) before it can continue.

                                                  • If the virtual machine is unresponsive and the %WAIT value is proportionally higher than %RUN, %RDY, and %CSTP, then it can indicate that the world is waiting for a VMkernel operation to complete.

                                                  • You may observe that the %SYS is proportionally higher than %RUN. %SYS represents the percentage of time spent by system services on behalf of the virtual machine.

                                                  • A high %WAIT value can be a result of a poorly performing storage device where the virtual machine is residing. If you are experiencing storage latency and timeouts, it may trigger these types of symptoms across multiple virtual machines residing in the same LUN, volume, or array depending on the scale of the storage performance issue.

                                                  • A high %WAIT value can also be triggered by latency to any device in the virtual machine configuration. This can include but is not limited to serial pass-through devices, parallel pass-through devices, and USB devices. If the device suddenly stops functioning or responding, it can result in these symptoms. A common cause for a high %WAIT value is ISO files that are accidentally left mounted in the virtual machine and are then deleted or moved to an alternate location. For more information, see Deleting a datastore from the Datastore inventory results in the error: device or resource busy (1015791).

                                                  • If there does not appear to be any backing storage or networking infrastructure issue, it may be pertinent to crash the virtual machine to collect additional diagnostic information.

                                                  You had an 800%-900% wait time in the video you posted.  Holy crap, that's insane!  See the part above that talks about latency.  This is a smoking gun!

                                                  I've been very consistent when I've said that testing pfSense on a VM isn't going to give you tangible data because you'll also need to troubleshoot the hypervisor layer at the same time.

                                                  Please, please, please stop wasting your time testing this issue on a hypervisor.  Put pfSense on bare metal and test it there.

                                                  • dennypageD
                                                    dennypage
                                                    last edited by Jun 4, 2015, 4:02 AM

                                                    I couldn't agree more.

                                                    @tim.mcmanus:

                                                    Please, please, please stop wasting your time testing this issue on a hypervisor.  Put pfSense on bare metal and test it there.

                                                    • N
                                                      NOYB
                                                      last edited by Jun 4, 2015, 5:28 AM

                                                      @dennypage:

                                                      I couldn't agree more.

                                                      @tim.mcmanus:

                                                      Please, please, please stop wasting your time testing this issue on a hypervisor.  Put pfSense on bare metal and test it there.

                                                      Ditto

                                                      Too much shot-gunning going on in this effort instead of a methodical systematic approach.

                                                      Begin with minimalist install / config (bare metal, no packages, no services, etc.) and work up to point of failure.

                                                      If the minimalist install / config fails then go back even further to either earlier pfSense versions, or better yet to FreeBSD itself until the issue does not exist.  Then systematically move forward  adding to that config until the issue appears.
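
                                                      The build-up procedure described above amounts to adding one thing at a time and retesting. A minimal sketch, assuming a hypothetical survives() callback that stands in for "run the test and see whether the firewall stays up"; the component names are placeholders, not findings from this thread.

```python
from typing import Callable, Optional, Sequence


def first_breaking_addition(
    additions: Sequence[str],
    survives: Callable[[list[str]], bool],
) -> Optional[str]:
    """Start from an empty (minimal) config, add one item at a time,
    and return the first item whose addition makes the test fail."""
    config: list[str] = []
    if not survives(config):
        # Even the minimal install fails: step back to an earlier base.
        raise RuntimeError("baseline fails; test an earlier version first")
    for item in additions:
        config.append(item)
        if not survives(config):
            return item
    return None  # nothing in the list reproduced the problem


# Placeholder test: pretend the problem appears once "service B" is enabled.
def fake_survives(config: list[str]) -> bool:
    return "service B" not in config


culprit = first_breaking_addition(
    ["package A", "service B", "package C"], fake_survives
)
print(culprit)
```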

                                                      • S
                                                        Supermule Banned
                                                        last edited by Jun 4, 2015, 7:57 AM Jun 4, 2015, 7:39 AM

                                                        Jun 4 09:28:36    check_reload_status: Reloading filter
                                                        Jun 4 09:28:36    check_reload_status: Restarting OpenVPN tunnels/interfaces
                                                        Jun 4 09:28:36    check_reload_status: Restarting ipsec tunnels
                                                        Jun 4 09:28:36    check_reload_status: updating dyndns Yousee
                                                        Jun 4 09:06:36    check_reload_status: Reloading filter
                                                        Jun 4 09:06:33    check_reload_status: Syncing firewall
                                                        Jun 4 09:05:08    check_reload_status: Reloading filter
                                                        Jun 4 09:05:06    check_reload_status: Syncing firewall
                                                        Jun 4 09:01:12    kernel: em0: promiscuous mode enabled

                                                        When this happens the firewall starts losing packets.

                                                        Disabling promiscuous mode on em0 and em1 solves it, and the firewall keeps running.

                                                        Running a cron job every 60 seconds to do that lets the firewall survive a SYN flood.
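
For anyone wanting to try the same workaround, a rough sketch of such a cron job might look like the script below. The thread does not show the actual job, so the function name, the log tag and the overridable `IFCONFIG`/`LOGGER` variables are all hypothetical. It assumes FreeBSD's `ifconfig`, which lists `PROMISC` among an interface's flags while the mode is on, and `logger` for writing a line to syslog each time the job acts.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the thread.
# IFCONFIG and LOGGER default to the usual tools but can be overridden.
IFCONFIG="${IFCONFIG:-/sbin/ifconfig}"
LOGGER="${LOGGER:-logger}"

# Clear promiscuous mode on each named interface that currently has it set,
# and write a syslog line each time so the runs are visible in the logs.
clear_promisc() {
    for ifc in "$@"; do
        if "$IFCONFIG" "$ifc" | grep -q 'PROMISC'; then
            "$IFCONFIG" "$ifc" -promisc
            "$LOGGER" -t clear_promisc "cleared promiscuous mode on $ifc"
        fi
    done
}

# Example call, as root:
#   clear_promisc em0 em1
```

Saved as a script that ends with that call, it could be run once a minute from a system crontab line such as `* * * * * root /root/clear_promisc.sh` (the path is made up). As far as I know, pfSense regenerates /etc/crontab from its configuration, so the entry would normally be added through the Cron package instead of by editing the file.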

                                                        • F
                                                          firewalluser
                                                          last edited by Jun 4, 2015, 8:02 AM

                                                          So what's the number of CPUs and cores available?

                                                          What VMs are you running, and how many CPUs/cores does each need as a minimum?

                                                          Possibly the best way to visualise this is as a game of Tetris, but with horizontal blocks of various widths, each representing the minimum core requirement of a VM's OS.

                                                          If you have a 4-core bare-metal host then your Tetris game is just four blocks wide; if it's 2 CPUs with 8 cores each, then your Tetris game is 16 blocks wide.

                                                          When the blocks reach the base, that's your hypervisor's timeslice to work on the physical CPUs/cores.

                                                          Your task is to fit in as many horizontal blocks, representing the core requirements of your VM OSs, as possible so that all the space is used up. This is why it's best in most cases to give VMs the minimum number of cores/vCPUs.

                                                          E.g. say you have 4 OSs and 1 CPU with 4 cores.
                                                          OS1 needs 2 cores min but can use 10 cores max.
                                                          OS2 needs 4 cores min but can use 8 cores max.
                                                          OS3 needs 1 core min but can use 4 cores max.
                                                          OS4 needs 1 core min but can use 2 cores max.

                                                          What's the most efficient way to set these OSs up as VMs in order to maximise the time slices for each?

                                                          If you went:
                                                          OS1 4cores
                                                          OS2 4cores
                                                          OS3 4cores
                                                          OS4 2cores

                                                          Whenever OS4 had its time slice you would waste 2 physical cores.

                                                          This approach would also make for a clunky setup: if any OS needs to talk to another, no two OSs run in the same time slice to communicate with each other, so they all have to wait 4 time slices before they can get back and process their stuff.

                                                          If you went
                                                          OS1 2cores
                                                          OS2 4cores
                                                          OS3 1cores
                                                          OS4 1cores

                                                          Then OS3 & OS4 can run whenever OS1 runs, so you now have a "block" in which OS1, OS3 and OS4 operate in one timeslice and OS2 operates in another. ESXi only has to swap between two blocks of OSs, (OS1, OS3 & OS4) and (OS2). That makes for a more responsive setup: the OSs in the first block can communicate between themselves in the same timeslice if need be, so the only extra wait time for any communication is the swap from block 1 (OS1, OS3 & OS4) to block 2 (OS2).

                                                          Does that make sense, and is it easier to understand?
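
The block picture above comes down to simple arithmetic: VMs can share a timeslice only while their vCPU counts add up to no more than the physical cores. A small sketch of that check follows, using the post's simplified model (real hypervisor schedulers are more flexible than this) and a hypothetical `fits` helper.

```shell
#!/bin/sh
# fits CAPACITY N1 N2 ...: succeed if the numbers after CAPACITY
# add up to at most CAPACITY.
fits() {
    capacity=$1
    shift
    total=0
    for n in "$@"; do
        total=$((total + n))
    done
    [ "$total" -le "$capacity" ]
}

CORES=4   # 1 CPU with 4 cores, as in the example

# Second layout: OS1=2, OS3=1 and OS4=1 vCPUs share a timeslice; OS2=4 gets its own.
if fits "$CORES" 2 1 1; then echo "OS1+OS3+OS4 fit in one timeslice"; fi
if fits "$CORES" 4; then echo "OS2 fills a timeslice by itself"; fi

# First layout: OS1 and OS2 at 4 vCPUs each cannot run side by side.
if ! fits "$CORES" 4 4; then echo "OS1+OS2 at 4 vCPUs each do not fit together"; fi
```

With the first layout (4, 4, 4 and 2 vCPUs) no pair of VMs fits into 4 cores, which is why every VM ends up waiting for all the others.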

                                                          It's more complicated than that, because next you have RAM requirements to consider as well, but a similar principle applies: if you err towards less RAM, ESXi doesn't have to spend time loading and unloading RAM for each VM running in the timeslice. It's best to have the RAM requirements of the VMs fit within the physical RAM.

                                                          So if you take the 2nd example above.

                                                          OS1 2cores
                                                          OS2 4cores
                                                          OS3 1cores
                                                          OS4 1cores

                                                          Let's say you have 32GB of physical RAM.

                                                          OS1 can use 8GB to 32GB
                                                          OS2 can use 4GB to 16GB
                                                          OS3 can use 2GB to 4GB
                                                          OS4 can use 4GB to 16GB

                                                          If you gave each VM its maximum:

                                                          OS1 2cores/32GB
                                                          OS2 4cores/16GB
                                                          OS3 1cores/4GB
                                                          OS4 1cores/16GB

                                                          Then even though the first block (OS1, OS3 & OS4) can share the physical CPUs, they can't share the physical RAM, as you would need 52GB (32 + 4 + 16) of physical RAM.

                                                          But if you went

                                                          OS1 2cores/16GB
                                                          OS2 4cores/16GB
                                                          OS3 1cores/4GB
                                                          OS4 1cores/8GB

                                                          Then the first block (OS1, OS3 & OS4) can share the physical RAM, as the total is 28GB (16 + 4 + 8), which fits within the 32GB, and the 2nd block (OS2) will use 16GB with 16GB going spare doing nothing.

                                                          If you notice, I gave OS1 16GB; this is because even though OS4 can also use 16GB, it's only got 1 core.
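
The same kind of check works for RAM. This sketch just adds up the allocations listed for the first block (OS1, OS3 and OS4) in the two layouts; `sum_gb` is a hypothetical helper name.

```shell
#!/bin/sh
# sum_gb N1 N2 ...: print the total of the given sizes in GB.
sum_gb() {
    total=0
    for n in "$@"; do
        total=$((total + n))
    done
    echo "$total"
}

PHYSICAL_GB=32

# Block 1 is OS1, OS3 and OS4, which run in the same timeslice.
max_alloc=$(sum_gb 32 4 16)   # every VM at its maximum
trimmed=$(sum_gb 16 4 8)      # OS1 cut to 16GB, OS4 cut to 8GB

echo "maximums: ${max_alloc}GB needed, ${PHYSICAL_GB}GB available"
echo "trimmed:  ${trimmed}GB needed, ${PHYSICAL_GB}GB available"
```

Adding up the figures as listed gives 52GB for the first layout, which is more than the host has, and 28GB for the second, which fits.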

                                                          However you then also need to look at what the tasks are that are going to be running on each VM.

                                                          Databases love RAM: the more RAM you have, the more of the DB you can hold in RAM, already sorted into the views that users use the most.

                                                          MS Exchange is similar, i.e. it's just a big DB, but it offloads a lot of its work to the workstation: Outlook will often have a copy of what is stored in Exchange, so Outlook only has to access the local disk. Exchange will, however, have lots of connections, like keep-alives, open for smartphones so that it can "push" emails and other things to the phones.

                                                          Web servers depend on what they are doing; some may be a front end for DBs running on the same VM, or maybe not. Hopefully that will give you a better overview of what's going on at a lower level.  :)

                                                          Capitalism, currently The World's best Entertainment Control System and YOU cant buy it! But you can buy this, or some of this or some of these

                                                          Asch Conformity, mainly the blind leading the blind.

                                                          • S
                                                            Supermule Banned
                                                            last edited by Jun 4, 2015, 8:27 AM

                                                            I have divided the 2 running VMs here at home across 2 different sockets with 4 cores each.

                                                            Didn't matter at all until I ran the cron job disabling promiscuous mode.

                                                            It's in the reject state on the hypervisor vSwitch already….

                                                            • L
                                                              lowprofile
                                                              last edited by Jun 4, 2015, 8:32 AM

                                                              @dennypage:

                                                              I couldn't agree more.

                                                              @tim.mcmanus:

                                                              Please, please, please stop wasting your time testing this issue on a hypervisor.  Put pfSense on bare metal and test it there.

                                                              IT ISN'T BETTER ON BARE METAL. The problem still exists. I tried several times on my bare-metal Supermicro.
                                                              Read the thread and the other threads again. You will see the history.

                                                              • I am not using pfSense anymore, though, so I can no longer test.
                                                              • F
                                                                firewalluser
                                                                last edited by Jun 4, 2015, 8:43 AM

                                                                @Supermule:

                                                                I have divided the 2 running VMs here at home across 2 different sockets with 4 cores each.

                                                                Didn't matter at all until I ran the cron job disabling promiscuous mode.

                                                                It's in the reject state on the hypervisor vSwitch already….

                                                                You know that promiscuous mode can mess up some NICs? You'll see this note on the packet capture page, amongst other places.

                                                                What are your System:Advanced:Networking, Networking tab, Network checkboxes set to?

                                                                How did you install the VM?
                                                                Did you install from ISO on the ESXi server, or set up pfSense on a different bare-metal/VMware host such as VMware Workstation, clone it, then move it to the ESXi server in question?

                                                                pfSense is one VM; what's the other VM?


                                                                • S
                                                                  Supermule Banned
                                                                  last edited by Jun 4, 2015, 8:48 AM

                                                                  1: Yes.

                                                                  2: Picture attached.

                                                                  3: From ISO directly on to the ESXi

                                                                  4: Homeserver (Windows 2008 R2)

                                                                  pfsense_advanced_networking.PNG

                                                                  • S
                                                                    Supermule Banned
                                                                    last edited by Jun 4, 2015, 9:06 AM

                                                                    Disabling apinger so the interface doesn't get restarted all the time during an attack.

                                                                    In the traffic graph you can see the drop in traffic after cron runs ifconfig em0 -promisc.

                                                                    That makes the firewall come alive and start routing packets again.

                                                                    I have attached screenshots of top -HSP before and after the reload.

                                                                    traffic_drop.PNG
                                                                    top -HSP before reload of -promisc.PNG
                                                                    top -HSP after reload of -promisc.PNG

                                                                    • F
                                                                      firewalluser
                                                                      last edited by Jun 4, 2015, 9:25 AM

                                                                      In your pic, the first ticked checkbox is unticked by default.

                                                                      Have you been toggling these?
                                                                      If so, did you notice any difference?

                                                                      Have you been through the "pfSense 2 on VMware ESXi 5.5" pfSense docs to check the settings?

                                                                      When you run the DDoS, is the home server running as well? I know that's your aim ultimately, but does pfSense perform better without it running?

                                                                      Do you have the SATA driver installed? Check out airvpn.org/topic/11847-pfsense-performance-configs-on-esxi-vmware/
                                                                      that might be a lead?

                                                                      How's your management channel set up?

                                                                      Apologies if you have posted your VM settings; I don't recall seeing them. If you haven't, can you post them? It's a case of checking that the VM has been set up properly and isn't causing the problem that makes pfSense fail under the DDoS. We can't rule the ESXi VM guest settings out just yet, imo.

                                                                      Got to go out for a couple of hours now, but it's definitely worth going back over all the settings.

                                                                      Have you even set up a basic pfSense with minimal settings, no packages and no config changes other than IP address changes for the NICs, to see how that copes with the DDoS?

                                                                      I think this is a back-to-basics moment like others have suggested. I know bare metal isn't an option, but making sure the guest is configured right and then installing a basic pfSense would be my next move. If that handles the DDoS, I'd pull the XML backups and compare the differences, as it's easy to miss something when toggling various settings in situations like this.

                                                                      Good luck!  :)

                                                                      @Supermule:

                                                                      Disabling apinger so the interface doesn't get restarted all the time during an attack.

                                                                      In the traffic graph you can see the drop in traffic after cron runs ifconfig em0 -promisc.

                                                                      That makes the firewall come alive and start routing packets again.

                                                                      I have attached screenshots of top -HSP before and after the reload.

                                                                      Although I could only get 2.42Mbps, apinger was still getting out for me, as it only has to ping some IP addresses a couple of hops away, unlike your DDoS traffic, which is coming from all around the world and thus from further away. The network infrastructure would let IP addresses closer to me get through, as the bottlenecks would be further away.


                                                                      • S
                                                                        Supermule Banned
                                                                        last edited by Jun 4, 2015, 9:55 AM

                                                                        @firewalluser:

                                                                        In your pic, the first ticked checkbox is unticked by default.

                                                                        Have you been toggling these?
                                                                        If so, did you notice any difference?

                                                                        Not seen any difference at all.

                                                                        Have you been through the "pfSense 2 on VMware ESXi 5.5" pfSense docs to check the settings?

                                                                        Yes.

                                                                        When you run the DDoS, is the home server running as well?

                                                                        Yes.

                                                                        I know that's your aim ultimately, but does pfSense perform better without it running?

                                                                        No difference.

                                                                        Do you have the SATA driver installed?

                                                                        No, running on a SCSI controller.

                                                                        How's your management channel set up?

                                                                        Not understood. Using vSphere client, console and PuTTY.

                                                                        Apologies if you have posted your VM settings; I don't recall seeing them. If you haven't, can you post them? It's a case of checking that the VM has been set up properly and isn't causing the problem that makes pfSense fail under the DDoS. We can't rule the ESXi VM guest settings out just yet, imo.

                                                                        YouTube video

                                                                        Got to go out for a couple of hours now, but it's definitely worth going back over all the settings.

                                                                        Enjoy.

                                                                        Have you even set up a basic pfSense with minimal settings, no packages and no config changes other than IP address changes for the NICs, to see how that copes with the DDoS?

                                                                        Yes. It didn't do very well.

                                                                        I think this is a back-to-basics moment like others have suggested. I know bare metal isn't an option, but making sure the guest is configured right and then installing a basic pfSense would be my next move. If that handles the DDoS, I'd pull the XML backups and compare the differences, as it's easy to miss something when toggling various settings in situations like this.

                                                                        I have asked Tim and Almabes if they have any available. Apparently it does make any difference.

                                                                        Good luck!  :)

                                                                        Thanks :)

                                                                        • S
                                                                          Supermule Banned
                                                                          last edited by Jun 4, 2015, 11:10 AM

                                                                          Any way to monitor cron jobs in real time?

                                                                          Tried crontab -l but it says it cannot find any for the user root…

                                                                          Tried changing it to crontab - admin -l but that doesn't work either.

                                                                          I want to have a console running so I can see specifically when cron runs, since it doesn't say anything in the system logs.

                                                                          This is a real bitch to troubleshoot internally...

                                                                          • M
                                                                            mer
                                                                            last edited by Jun 4, 2015, 11:51 AM

                                                                            cat /etc/crontab
                                                                            log for cron is typically /var/log/cron

                                                                            Looks like there are also a few things run by "minicron"; look at /etc/rc:

                                                                            rc:cd /tmp && /usr/sbin/cron -s 2>/dev/null
                                                                            rc:/usr/local/bin/minicron 240 $varrunpath/ping_hosts.pid /usr/local/bin/ping_hosts.sh
                                                                            rc:/usr/local/bin/minicron 3600 $varrunpath/expire_accounts.pid '/usr/local/sbin/fcgicli -f /etc/rc.expireaccounts'
                                                                            rc:/usr/local/bin/minicron 86400 $varrunpath/update_alias_url_data.pid '/usr/local/sbin/fcgicli -f /etc/rc.update_alias_url_data'
                                                                            rc:    /usr/local/bin/minicron 60 /var/run/gmirror_status_check.pid /usr/local/sbin/gmirror_status_check.php
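
To see when cron actually fires, one option is to filter its log for the lines that record a command being started. This is only a sketch: it assumes cron logs to /var/log/cron as suggested above and uses the usual `(user) CMD (command)` entry format, and `cron_runs` is a hypothetical helper name.

```shell
#!/bin/sh
# cron_runs LOGFILE [PATTERN]: print the cron log entries that record a
# command being started, optionally limited to those containing PATTERN.
cron_runs() {
    if [ -n "${2:-}" ]; then
        grep ' CMD (' "$1" | grep -F -- "$2"
    else
        grep ' CMD (' "$1"
    fi
}

# Examples:
#   cron_runs /var/log/cron promisc          # past runs of the promisc job
#   tail -f /var/log/cron | grep ' CMD ('    # follow new runs as they happen
```

System-wide jobs live in /etc/crontab, which is why `crontab -l` can come back empty for root. The per-user crontab of another account is listed with `crontab -u <user> -l`.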

                                                                            • F
                                                                              firewalluser
                                                                              last edited by Jun 4, 2015, 11:56 AM

                                                                              "How's your management channel set up?

                                                                              Not understood. Using vSphere client, console and PuTTY."

                                                                              Either the pfSense docs or the AirVPN link mentions having the management channel set up a particular way; might be worth checking out.

                                                                              Are you still getting the massive waits?

                                                                              www.yellow-bricks.com/2012/07/17/why-is-wait-so-high/

                                                                              Your %wait figure might be a red herring as we say.  :)

                                                                              Basically the link suggests the %wait includes idle time and %vmwait might be a better figure.

                                                                              %wait is like MS including CPU idle time in the CPU processor load.  ::)


                                                                              • T
                                                                                tim.mcmanus
                                                                                last edited by Jun 4, 2015, 12:02 PM

                                                                                @firewalluser:

                                                                                "How's your management channel set up?

                                                                                Not understood. Using vSphere client, console and PuTTY."

                                                                                Either the pfSense docs or the AirVPN link mentions having the management channel set up a particular way; might be worth checking out.

                                                                                Are you still getting the massive waits?

                                                                                www.yellow-bricks.com/2012/07/17/why-is-wait-so-high/

                                                                                Your %wait figure might be a red herring as we say.  :)

                                                                                Basically the link suggests the %wait includes idle time and %vmwait might be a better figure.

                                                                                %wait is like MS including CPU idle time in the CPU processor load.  ::)

                                                                                %wait is very important.

                                                                                Network->NIC->hypervisor Kernel->VM->VM NIC->VM kernel (and then back down the stack to move a packet)

                                                                                If the hypervisor kernel is consuming resources, the VM isn't going to get any.  So your VM wait time will be high, but the VM kernel activity will be low.  You could also be dropping packets prior to getting to the VM because of high wait times.  This is why enterprises regularly go through right-sizing activities.  You'll see low CPU utilization on your VMs but very high wait times, and your performance will mysteriously suck.

                                                                                • F
                                                                                  firewalluser
                                                                                  last edited by Jun 4, 2015, 12:19 PM

                                                                                  Have you tried earlier versions of ESXi and/or pfSense?

                                                                                  Are you still on pfSense 2.1 or did you try 2.2?

                                                                                  It might be a build conflict somewhere, maybe even at the BIOS level. I've seen a BIOS make some hardware drag.

                                                                                  Can you try the same setup on different hardware, maybe with a different provider?

                                                                                  Might be worth spinning something up on the Amazon cloud to test, although that's got its own custom build of pfSense, so maybe have a look at those settings to see what differences there are. You might get some clues from that as to which settings are important.


                                                                                  Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.
