Netgate Discussion Forum

    What is the biggest attack in GBPS you stopped

    General pfSense Questions
    737 Posts 33 Posters 817.1k Views
    • S Offline
      Supermule Banned
      last edited by

      I tested like mad yesterday.

      This is pfsense running stateless:

      Youtube Video

      This is pfsense running SYN proxy

      Youtube Video

      This is pfsense running keep state

      Youtube Video

      This is pfsense running 3 attacks in a row. As you can see, it handles them differently. Look at top: when core no. 4 hits 100%, there is packet loss. When it doesn't, the box survives and still routes traffic to the webserver behind it.

      Youtube Video

      To test some other things, I changed MBUFS to see if it fared better. Sometimes it did, sometimes it didn't.

      Youtube Video

      What I want to know is what runs on specific cores, and I haven't found out yet. I can't find a tool that shows that.

      The attached picture is of the 2nd and 3rd attacks in the 3-attacks video. In the video you see packet loss and the GW going offline because core 4 hits 100%. It doesn't in the 2nd attack in the video.

      I have top -P running and you don't see any interrupt storm, and overall load is not very high. You don't even see the core that runs at 100%, and I have no idea why. That's why I specifically want to target core no. 4, to see which process is running on that core.

      That would be the bottleneck in pfsense and something we can deal with.

      My hypervisor (ESXi 4.1U3) doesn't become unstable or unresponsive in any way. The targeted server handles the traffic fine and doesn't become unstable either.

      I would really like some of the devs to lend a hand with this and get an instance of DTrace going on this box or any other box we can test.

      I know that the box can handle it if the bottleneck is dealt with.

      About the stateful vs. stateless discussion:

      The biggest difference between simple IP filtering and stateful IP filtering is that simple IP filters have no recollection of packets that have already passed through the filter. Every packet is handled on an individual basis. Previously forwarded packets belonging to a connection have no bearing on the filter's decision to forward or drop the packet.

      A stateful firewall (any firewall that performs stateful packet inspection or stateful inspection) is a firewall that keeps track of the state of network connections (such as TCP streams) travelling across it. The firewall is programmed to distinguish legitimate packets for different types of connections. Only packets matching a known connection state will be allowed by the firewall; others will be rejected.

      A stateless firewall is a firewall that treats each network frame (or packet) in isolation. Such a firewall has no way of knowing if any given packet is part of an existing connection, is trying to establish a new connection, or is just a rogue packet.
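
      To make the distinction concrete, here is a minimal Python sketch. It is not how PF implements either mode; the `Packet` fields, the `open_ports` rule set and the class and function names are hypothetical and exist only for illustration.

```python
from typing import NamedTuple, Set


class Packet(NamedTuple):
    src: str
    sport: int
    dst: str
    dport: int
    syn: bool = False  # True only on the first packet of a TCP connection


def stateless_allow(pkt: Packet, open_ports: Set[int]) -> bool:
    """Stateless: decide on this packet alone, with no memory of earlier ones."""
    return pkt.dport in open_ports


class StatefulFilter:
    """Stateful: remember accepted connections and match later packets to them."""

    def __init__(self, open_ports: Set[int]):
        self.open_ports = set(open_ports)
        self.states = set()  # one entry per tracked connection

    def allow(self, pkt: Packet) -> bool:
        key = (pkt.src, pkt.sport, pkt.dst, pkt.dport)
        if key in self.states:
            return True  # belongs to a known connection
        if pkt.syn and pkt.dport in self.open_ports:
            self.states.add(key)  # new connection: create state
            return True
        return False  # no state and not a valid opener: reject


fw = StatefulFilter({80})
syn = Packet("198.51.100.7", 40000, "192.0.2.10", 80, syn=True)
data = Packet("198.51.100.7", 40000, "192.0.2.10", 80)
stray = Packet("203.0.113.9", 50000, "192.0.2.10", 80)
```

      With these sample packets, the stateless check passes `stray` because port 80 is open, while the stateful filter rejects it until a SYN has created a matching state. Every accepted SYN adds one entry to `states`, which is the resource a SYN flood can exhaust.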

      It doesn't matter to pfsense whether it's one or the other (look at the video). Packet loss still occurs and it goes offline.

      Snort, as tested, works in stateless mode as well. If PF itself were the culprit, I believe it should fare better running stateless than with stateful inspection. It didn't.

      ![2nd and 3rd.PNG](/public/imported_attachments/1/2nd and 3rd.PNG)

      • N Offline
        Nullity
        last edited by

        Gonna try to get DTrace running?

        Please correct any obvious misinformation in my posts.
        -Not a professional; an arrogant ignoramous.

        • F Offline
          firewalluser
          last edited by

          I'll submit a little pfsense box running on a 5Mbit ADSL residential (not business) broadband line for testing later today, if there's enough bandwidth to get it to fall over.

          Who should I PM the ip address to?

          Capitalism, currently The World's best Entertainment Control System and YOU cant buy it! But you can buy this, or some of this or some of these

          Asch Conformity, mainly the blind leading the blind.

          • S Offline
            Supermule Banned
            last edited by

            Yes, but we need a developer to get it going, methinks.

            It's above my head to do that.

            So if anybody with deep knowledge of FreeBSD/pfSense wants to chime in, that would be great!

            @Nullity:

            Gonna try to get DTrace running?

            • S Offline
              Supermule Banned
              last edited by

              Me.

              IP and port number is needed.

              My skype is : kontaktnetsupply if you need to chat. :)

              Pls. make sure it answers to ping as well since I can monitor the result from here.

              @firewalluser:

              I'll submit a little pfsense box running on a 5Mbit ADSL residential (not business) broadband line for testing later today, if there's enough bandwidth to get it to fall over.

              Who should I PM the ip address to?

              • S Offline
                Supermule Banned
                last edited by

                But we have come a lot closer to finding out what could be done.

                Youtube Video

                By tweaking system tunables, the box is way tougher than out of the box. I will come back to that later.

                All 4 state types were tested sequentially, and the video shows how the box responds.

                The picture shows the load on the VM.

                In order it is:

                Keep state
                Sloppy State
                Syn Proxy state
                Stateless

                As you can see, the overall load is lowest with the SYN proxy setting, as it should be.

                In addition, I would like to add that the sweet spot for pfsense is 4 cores on 1 socket and 4 GB RAM on this specific system.

                I have 2 quad-core CPUs in 2 sockets (Intel(R) Xeon(R) CPU E5420 @ 2.50GHz), and that could easily be the reason for it.

                Why, I don't know.

                State_settings_testing.PNG

                • N Offline
                  Nullity
                  last edited by

                  @Supermule:

                  Yes, but we need a developer to get it going, methinks.

                  It's above my head to do that.

                  So if anybody with deep knowledge of FreeBSD/pfSense wants to chime in, that would be great!

                  @Nullity:

                  Gonna try to get DTrace running?

                  I had it 90% functioning yesterday on 2.2.2. Maybe try the development build/snapshot?

                  You need like a half-dozen modules from FreeBSD, and "kldload dtraceall" loads without error.


                  • S Offline
                    Supermule Banned
                    last edited by

                    Keep going and let us know when you have it working 100% and can trace the actual CPUs.

                    Then we would be very close to getting this upstream.

                    • M Offline
                      mer
                      last edited by

                      @Supermule:

                      Keep going and let us know when you have it working 100% and can trace the actual CPUs.

                      Then we would be very close to getting this upstream.

                      A couple of posts ago, you had "3 attacks in a row, fine unless core 4 hit 100%", yes?  Was the box rebooted in between?  If not, that becomes interesting: perhaps a process is migrating across cores, and in the bad case it winds up doing a lot of "interprocessor" locking.

                      • S Offline
                        Supermule Banned
                        last edited by

                        No reboot.

                        On 4 cores it moves between cores. I can get it to lock onto 1 core using an 8-core system.

                        That's why debugging that one core could reveal the issue or get us closer to it.

                        • F Offline
                          firewalluser
                          last edited by

                          @Supermule:

                          Me.

                          IP and port number is needed.

                          My skype is : kontaktnetsupply if you need to chat. :)

                          Pls. make sure it answers to ping as well since I can monitor the result from here.

                          Is the ping essential? It's not something I normally allow, and the ping could be part of the problem.

                          I'll set up a Skype account and PM it to you so we can hook up to do the test. That said, maintaining a Skype chat amounts to the same thing as a ping test: if the attack takes the fw out, the chat would fail and you wouldn't get second-by-second (or ten-second) accounts of what's going on here.

                          I can record the firewall using VMware though, so I can have two pages of the GUI open to record.

                          What would be the best two pages to have open in the GUI, given that you said there's 24x7 monitoring being done in the GUI?

                          Edit.

                          One other thought: suppose this affects stateful fw's rather than stateless ones, by virtue of the extra code that has to run to carry out the state checks, so that the attack is in effect overwhelming the hw with the amount of code that needs to run (as seen with the interrupt storm in a screenshot posted earlier in the thread). Would a fw in front of pfsense that limits/throttles the amount of traffic reduce or remove the problem?

                          I say this because if not enough of these packets are received, can the hw ever be overwhelmed?

                          However, is a fw that throttles traffic always a stateful fw, or do stateless fw's exist that can throttle traffic? My knowledge of the different types of fw's is limited.
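
                          On the last question: a rate limiter does not have to track connections. A token bucket, sketched below in Python, keeps a single counter for the whole link. The class name, the packets-per-second rate and the burst size are illustrative assumptions, not settings taken from any particular firewall.

```python
class TokenBucket:
    """Throttle packets using one counter, with no per-connection state."""

    def __init__(self, rate_pps: float, burst: int):
        self.rate = rate_pps          # tokens added per second
        self.capacity = burst         # maximum tokens that can accumulate
        self.tokens = float(burst)    # start with a full bucket
        self.last = 0.0               # timestamp of the previous packet

    def allow(self, now: float) -> bool:
        # Refill according to elapsed time, capped at the bucket capacity.
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0        # spend one token to forward the packet
            return True
        return False                  # bucket empty: drop the packet


bucket = TokenBucket(rate_pps=10, burst=2)
results = [bucket.allow(0.0), bucket.allow(0.0), bucket.allow(0.0)]
```

                          Here the first two packets at time 0 pass and the third is dropped, because the burst allowance is used up. Whether such a limiter in front of pfsense would actually prevent the problem discussed in this thread is untested.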


                          • T Offline
                            tim.mcmanus
                            last edited by

                            @Supermule:

                            Yes, but we need a developer to get it going, methinks.

                            It's above my head to do that.

                            So if anybody with deep knowledge of FreeBSD/pfSense wants to chime in, that would be great!

                            @Nullity:

                            Gonna try to get DTrace running?

                            It's not a pfSense issue, so don't expect the developers to engage.  This is a FreeBSD/PF/network driver issue.  You need to move this conversation over to the FreeBSD forums or else you're just pissing into the wind.

                            • S Offline
                              Supermule Banned
                              last edited by

                              Normally we only run the dashboard to monitor traffic and packet loss.

                              If needed, we run 2 more tabs (System Activity and pfInfo).

                              On the console, go into the shell (option 8) and type "top -P". The capital P is needed.

                              Then you can monitor the CPUs from there if your tabs crash.

                              • N Offline
                                Nullity
                                last edited by

                                @tim.mcmanus:

                                It's not a pfSense issue, so don't expect the developers to engage.  This is a FreeBSD/PF/network driver issue.  You need to move this conversation over to the FreeBSD forums or else you're just pissing into the wind.

                                I personally have not seen enough information to decide wtf is going on.


                                • T Offline
                                  tim.mcmanus
                                  last edited by

                                  @Nullity:

                                  @tim.mcmanus:

                                  It's not a pfSense issue, so don't expect the developers to engage.  This is a FreeBSD/PF/network driver issue.  You need to move this conversation over to the FreeBSD forums or else you're just pissing into the wind.

                                  I personally have not seen enough information to decide wtf is going on.

                                  I have, and it's clear as day.

                                  When folks run their testing, your box will be taken down with default settings.  Increase the state table to 5M.  The interface will go down, and you should see an IRQ interrupt storm.  Interface goes down for obvious reasons.

                                  I've run several tests with SM using different routes, OSes, pfSense settings, and I had a quick conversation with the devs about my findings.  This is very much a FreeBSD/PF/network driver issue.

                                  Also, it's clear you didn't read any of the information I posted or any of the links.  I've probably spent about 100 hours so far working through this trying to help people understand what the issue actually is.  But if people want to go down this road on their own, nothing I can do to stop that.  I just wanted to help you not waste your time.

                                  • T Offline
                                    tim.mcmanus
                                    last edited by

                                    @Supermule:

                                    Normally we only run the dashboard to monitor traffic and packet loss.

                                    If needed, we run 2 more tabs (System Activity and pfInfo).

                                    On the console, go into the shell (option 8) and type "top -P". The capital P is needed.

                                    Then you can monitor the CPUs from there if your tabs crash.

                                    This also removes a very important CPU metric from top.  Just run top without "-P".  You should see a third line on the screen for CPU metrics.  This is a critically important set of metrics.

                                    You should also be observing the interrupts that the interface is seeing because interrupts are equally as important.

                                    • H Offline
                                      Harvy66
                                      last edited by

                                      @tim.mcmanus:

                                      @Harvy66:

                                      Being susceptible to DDOS is not inherent to stateful firewalls, it's about not having a slow path that kills the machine. The fast path is existing states. If the slow path really has to be as bad as it is, like 1000 times slower, then have it give up when it spends too much time. Drop the packets for the non-existent states, don't allow existing states to be punished by blocked states.

                                      nutshell: The slow path is a pathological corner case that can be triggered on demand; make it lower priority so it doesn't blow stuff up. Existing states should not be affected.

                                      edit: a lot of what I do involves Big-O scaling, edge and corner cases, and making sure the worst case allows the system to function in a well defined limp-mode. Rule of thumb, modern computers have way more CPU and memory than internet bandwidth. If your network breaks before running out of bandwidth, something is incorrectly designed.
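
                                      One way to read the "give up when it spends too much time" idea is a per-interval budget on slow-path work. The Python sketch below is only an illustration of that reading, not PF code; `BudgetedFilter`, `slow_path_budget` and the tick-based reset are all hypothetical.

```python
from collections.abc import Hashable


class BudgetedFilter:
    """Existing states take the fast path; new flows share a capped slow path."""

    def __init__(self, slow_path_budget: int):
        self.states = set()
        self.slow_path_budget = slow_path_budget  # slow-path lookups per tick
        self.slow_path_used = 0

    def new_tick(self) -> None:
        """Start a new time slice and restore the slow-path budget."""
        self.slow_path_used = 0

    def handle(self, flow: Hashable) -> str:
        if flow in self.states:
            return "fast"   # known state: never charged against the budget
        if self.slow_path_used >= self.slow_path_budget:
            return "shed"   # budget spent: drop instead of starving fast path
        self.slow_path_used += 1
        self.states.add(flow)  # assume rule evaluation accepted the new flow
        return "slow"


fw = BudgetedFilter(slow_path_budget=2)
trace = [fw.handle("legit"), fw.handle("legit"),
         fw.handle("flood-1"), fw.handle("flood-2"), fw.handle("legit")]
```

                                      In this trace the established "legit" flow keeps getting the fast path even after the flood has used up the budget and "flood-2" has been shed.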

                                      Not entirely true.

                                      The first test, and the screen shots are on this thread, filled the state table.  First capacity limit hit, interface goes down.  Second test, screen shot again posted, created an IRQ interrupt storm.  That is a hardware issue (probably a driver issue, but I'll explain more).  IRQ interrupt capacity hit, interface goes down.

                                      An IRQ interrupt storm can be generated by any piece of hardware.  Google it for some interesting examples.  When SSDs fail in some cases it generates IRQ interrupt storm, and it affects the machine in a similar fashion.

                                      When I increased the state limit in pfSense, I hit a system limitation where the incoming data could not be consumed fast enough by the hardware and software resources.  I could probably tweak this setting, but there will always be an upper limit.  Set high enough and the DDOS would consume all of my bandwidth, essentially achieving the same thing:  encumbering the interface.

                                      Some good reading if you want to tweak the performance of FreeBSD:  https://calomel.org/freebsd_network_tuning.html

                                      https://forums.freebsd.org/threads/igb-interrupt-storm-detected.9271/

                                      http://www.keil.com/forum/21608/

                                      From this link:  http://conferences.sigcomm.org/imc/2010/papers/p206.pdf

                                      "A packet’s journey through the capturing system begins at the network interface card (NIC). Modern cards copy the packets into the operating systems kernel memory using Direct Memory Access (DMA), which reduces the work the driver and thus the CPU has to perform in order to transfer the data into memory. The driver is responsible for allocating and assigning memory pages to the card that can be used for DMA transfer. After the card has copied the captured packets into memory, the driver has to be informed about the new packets through an hardware interrupt. Raising an interrupt for each incoming packet will result in packet loss, as the system gets busy handling the interrupts (also known as an interrupt storm). This well-known issue has lead to the development of techniques like interrupt moderation or device polling, which have been proposed several years ago [7, 10, 11]. However, even today hardware interrupts can be a problem because some drivers are not able to use the hardware features or do not use polling—actually, when we used the igb driver in FreeBSD 8.0, which was released in late 2009, we experienced bad performance due to interrupt storms. Hence, bad capturing performance can be explained by bad drivers; therefore, users should check the number of generated interrupts if high packet loss rates are observed.

                                      The driver’s hardware interrupt handler is called immediately upon the reception of an interrupt, which interrupts the normal operation of the system. An interrupt handler is supposed to fulfill its tasks as fast as possible. It therefore usually doesn’t pass on the captured packets to the operating systems capturing stack by himself, because this operation would take to long. Instead, the packet handling is deferred by the interrupt handler. In order to do this, a kernel thread is scheduled to perform the packet handling in a later point in time. The system scheduler chooses a kernel thread to perform the further processing of the captured packets according to the system scheduling rules. Packet processing is deferred until there is a free thread that can continue the packet handling.

                                      As soon as the chosen kernel thread is running, it passes the received packets into the network stack of the operating system. From there on, packets need to be passed to the monitoring application that wants to perform some kind of analysis. The standard Linux capturing path leads to a subsystem called PF_PACKET; the corresponding system in FreeBSD is called BPF (Berkeley Packet Filter). Improvements for both subsystems have been proposed."
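
                                      To put rough numbers on "an interrupt for each incoming packet", the Python arithmetic below assumes a 1 Gbit/s link carrying minimum-size 64-byte Ethernet frames, each of which occupies 84 bytes on the wire once the preamble and inter-frame gap are counted. The link speed and the batch size of 64 packets per interrupt are assumptions for illustration, not measurements from this thread.

```python
import math

WIRE_OVERHEAD_BYTES = 20  # 8 bytes preamble/SFD + 12 bytes inter-frame gap


def packets_per_second(link_bps: int, frame_bytes: int = 64) -> int:
    """Maximum whole frames per second the link can carry at this frame size."""
    return link_bps // ((frame_bytes + WIRE_OVERHEAD_BYTES) * 8)


def interrupts_per_second(pps: int, packets_per_interrupt: int = 1) -> int:
    """Interrupt rate if the NIC raises one interrupt per batch of packets."""
    return math.ceil(pps / packets_per_interrupt)


pps = packets_per_second(1_000_000_000)       # 1488095 packets per second
per_packet = interrupts_per_second(pps)       # 1488095 interrupts per second
moderated = interrupts_per_second(pps, 64)    # 23252 interrupts per second
```

                                      Under these assumptions, batching 64 packets per interrupt cuts the interrupt rate from about 1.49 million per second to about 23 thousand, which is the effect interrupt moderation aims for.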

                                      "The first test, and the screen shots are on this thread, filled the state table.  First capacity limit hit, interface goes down."

                                      The interface should not go down because there is no more room for states. Can't add a new state, do nothing, packet is ignored. Like I said, bad design.
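
                                      As a sketch of that behaviour (hypothetical names, not PF's state table code): when the table is full, a new flow is simply refused and counted, while flows that already have a state keep passing.

```python
class BoundedStateTable:
    """State table with a hard cap: refuse new flows, never touch old ones."""

    def __init__(self, max_states: int):
        self.max_states = max_states
        self.states = set()
        self.dropped = 0  # packets ignored because the table was full

    def admit(self, flow) -> bool:
        if flow in self.states:
            return True               # existing state is unaffected
        if len(self.states) >= self.max_states:
            self.dropped += 1         # no room: ignore the packet, do nothing
            return False
        self.states.add(flow)
        return True


table = BoundedStateTable(max_states=2)
outcomes = [table.admit("a"), table.admit("b"),
            table.admit("c"), table.admit("a")]
```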

                                      " Second test, screen shot again posted, created an IRQ interrupt storm.  That is a hardware issue (probably a driver issue, but I'll explain more).  IRQ interrupt capacity hit, interface goes down."

                                      Exactly, hardware issue, not a lack of resources, but bad design.

                                      "Set high enough and the DDOS would consume all of my bandwidth, essentially achieving the same thing:  encumbering the interface."

                                      This is the ONLY case where a DDOS should take down a system: you ran out of bandwidth. All other cases are because someone should be kicked in the head. Easy for me to say from my chair with hindsight, but true nonetheless.

                                      In the end, a lot of these types of issues will go away with changes from FreeBSD 11 and later. There are some major changes being talked about that will effectively allow near linear scaling for SMP, which forces them to re-look at a lot of their algorithms. I still think the issue is probably with NAT forwarding, which is really not part of the network stack in the "normal" way. Kind of a hack.

                                      • F Offline
                                        firewalluser
                                        last edited by

                                        @tim.mcmanus:

                                        I have, and it's clear as day.

                                        When folks run their testing, your box will be taken down with default settings.  Increase the state table to 5M.  The interface will go down, and you should see an IRQ interrupt storm.  Interface goes down for obvious reasons.

                                        I've run several tests with SM using different routes, OSes, pfSense settings, and I had a quick conversation with the devs about my findings.  This is very much a FreeBSD/PF/network driver issue.

                                        Also, it's clear you didn't read any of the information I posted or any of the links.  I've probably spent about 100 hours so far working through this trying to help people understand what the issue actually is.  But if people want to go down this road on their own, nothing I can do to stop that.  I just wanted to help you not waste your time.

                                        You've seen (and others have seen) the screenshot showing the interrupt storm. Interrupt storms affect all OSes to some degree, so I'd tend to agree that it's a FreeBSD issue rather than a pfsense issue, although we can tune pfsense, and thus FreeBSD, by altering some of the settings (System Tunables) via pfsense.

                                        Ultimately all OSes are just running in a tight loop; when a driver raises an interrupt, it triggers a whole load of additional code which then needs to be run. How well that code is written determines whether pfsense (or any other firewall running on any OS) gets taken out. That depends on a variety of factors: was the driver code written with multi-core CPUs in mind, and further up the code base, has the code been written to be multi-threaded?

                                        Has anyone tried the freebsd links I've posted which show how to tune freebsd?

                                        DTrace will help isolate the code in FreeBSD that is affected, and it might show us the variables we can tune to reduce the incidence of an interrupt storm. But IMO we need to focus on the interrupt storm as the cause; everything else we have seen is just a symptom of the underlying problem.


                                        • S Offline
                                          Supermule Banned
                                          last edited by

                                          When I run the attacks on myself all the time to harden the damn thing, I have never ever had an interrupt error on the console.

                                          I hardly see any interrupt load (0-2%) when hit, and that's not the issue IMHO.

                                          The issue is something that acts as a bottleneck on the way through NAT.

                                          It's not an issue when there is no NAT and the traffic is simply blocked by the FW. When it does NAT, it crashes (1 core hits 100%) and packet loss is observed.

                                          So the difference between no NAT and NAT is what takes it offline.

                                          • M Offline
                                            mer
                                            last edited by

                                            @Supermule:

                                            When I run the attacks on myself all the time to harden the damn thing, I have never ever had an interrupt error on the console.

                                            I hardly see any interrupt load (0-2%) when hit, and that's not the issue IMHO.

                                            The issue is something that acts as a bottleneck on the way through NAT.

                                            It's not an issue when there is no NAT and the traffic is simply blocked by the FW. When it does NAT, it crashes (1 core hits 100%) and packet loss is observed.

                                            So the difference between no NAT and NAT is what takes it offline.

                                            So NAT takes the inbound packet, rewrites portions of the header, probably redoes checksums, then does it push the mbuf back onto the stack where it gets fed into PF processing again or does it just continue running the PF rules?  If it's redoing checksum, is that being offloaded to hardware or is sw doing that?
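
                                            For reference, the header rewrite and checksum step looks roughly like this when done in software. This is a Python sketch of a destination rewrite on a bare 20-byte IPv4 header using the RFC 1071 checksum; it does not show how PF does it or whether PF offloads it, and the helper names and addresses are made up for the example.

```python
import socket
import struct

IPV4_FMT = "!BBHHHBBH4s4s"  # 20-byte IPv4 header without options


def internet_checksum(data: bytes) -> int:
    """RFC 1071 one's-complement checksum over 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def build_ipv4_header(src: str, dst: str) -> bytes:
    """Build a minimal header (TTL 64, protocol TCP) with a valid checksum."""
    fields = [0x45, 0, 20, 0, 0, 64, socket.IPPROTO_TCP, 0,
              socket.inet_aton(src), socket.inet_aton(dst)]
    fields[7] = internet_checksum(struct.pack(IPV4_FMT, *fields))
    return struct.pack(IPV4_FMT, *fields)


def nat_rewrite_dst(header: bytes, new_dst: str) -> bytes:
    """Replace the destination address, then zero and recompute the checksum."""
    rewritten = (header[:10] + b"\x00\x00" + header[12:16]
                 + socket.inet_aton(new_dst))
    csum = internet_checksum(rewritten)
    return rewritten[:10] + struct.pack("!H", csum) + rewritten[12:]


original = build_ipv4_header("198.51.100.7", "203.0.113.1")
translated = nat_rewrite_dst(original, "192.168.1.10")
```

                                            A header with a correct checksum sums to zero under `internet_checksum`, so that check can be used on both `original` and `translated`. The TCP and UDP checksums also cover the IP addresses through the pseudo-header, so a real NAT has to fix those up as well.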

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.