Netgate Discussion Forum

    What is the biggest attack in GBPS you stopped

    General pfSense Questions
    737 Posts 33 Posters 818.2k Views
    • firewalluser

      @tim.mcmanus:

      I have, and it's clear as day.

      When folks run their testing, your box will be taken down with default settings.  Increase the state table to 5M.  The interface will go down, and you should see an IRQ interrupt storm.  Interface goes down for obvious reasons.

      I've run several tests with SM using different routes, OSes, pfSense settings, and I had a quick conversation with the devs about my findings.  This is very much a FreeBSD/PF/network driver issue.

      Also, it's clear you didn't read any of the information I posted or any of the links.  I've probably spent about 100 hours so far working through this trying to help people understand what the issue actually is.  But if people want to go down this road on their own, nothing I can do to stop that.  I just wanted to help you not waste your time.

      You've seen the screenshot showing the interrupt storm (and so have others). Interrupt storms affect all OSes to some degree, so I tend to agree that it's a FreeBSD issue rather than a pfSense issue, although we can tune pfSense, and thus FreeBSD, by altering some of the settings (System Tunables) via pfSense.

      Ultimately, all OSes are just running in a tight loop. You add something to it, like a driver via an interrupt, and you trigger a whole load of additional code which then needs to run. How well that code is written determines whether pfSense (or any other firewall, on any OS) gets taken out. It depends on factors such as whether the driver code was written with multi-core CPUs in mind and, further up the code base, whether the code was written to be multi-threaded.

      Has anyone tried the FreeBSD links I posted, which show how to tune FreeBSD?

      DTrace will help isolate the FreeBSD code that is affected, and it might show us the variables we can tune to reduce the incidence of interrupt storms. But IMO we need to focus on the interrupt storm as the cause; everything else we have seen is just a symptom of the underlying problem.

      Capitalism, currently The World's best Entertainment Control System and YOU cant buy it! But you can buy this, or some of this or some of these

      Asch Conformity, mainly the blind leading the blind.

      • Supermule Banned

        When I run the attacks on myself all the time to harden the damn thing, I have never ever had an interrupt error on the console.

        I hardly see any interrupt load (0-2%) when hit, and that's not the issue IMHO.

        The issue is something that acts as a bottleneck on the way through NAT.

        It's no issue when NAT is not involved and the traffic is simply blocked at the firewall. When it does NAT, it crashes (1 core hits 100%) and packet loss is observed.

        So the difference between no NAT and NAT is what takes it offline.

        • mer

          @Supermule:

          When I run the attacks on myself all the time to harden the damn thing, I have never ever had an interrupt error on the console.

          I hardly see any interrupt load (0-2%) when hit, and that's not the issue IMHO.

          The issue is something that acts as a bottleneck on the way through NAT.

          It's no issue when NAT is not involved and the traffic is simply blocked at the firewall. When it does NAT, it crashes (1 core hits 100%) and packet loss is observed.

          So the difference between no NAT and NAT is what takes it offline.

          So NAT takes the inbound packet, rewrites portions of the header, and probably redoes the checksums. Does it then push the mbuf back onto the stack, where it gets fed into PF processing again, or does it just continue running the PF rules?  If it's redoing the checksum, is that being offloaded to hardware, or is software doing it?
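To make the checksum question concrete, here is a minimal Python sketch of what "redoing the checksum" after a NAT rewrite can look like in software. It uses the incremental update from RFC 1624 (HC' = ~(~HC + ~m + m')) on a bare 20-byte IPv4 header. This is only an illustration: it is not pf's code, the function names are made up for this example, and it says nothing about whether a given NIC offloads the work. TCP and UDP checksums also cover the IP addresses via the pseudo-header, so a real NAT has to patch those too.

```python
import struct


def ones_complement_sum(data: bytes) -> int:
    """Folded 16-bit one's-complement sum of data (zero-padded to even length)."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def checksum(header: bytes) -> int:
    """Full checksum of a header whose checksum field is currently zero."""
    return ~ones_complement_sum(header) & 0xFFFF


def incremental_update(old_csum: int, old_word: int, new_word: int) -> int:
    """RFC 1624 eqn. 3: patch a checksum after one 16-bit word changed."""
    total = (~old_csum & 0xFFFF) + (~old_word & 0xFFFF) + new_word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def rewrite_dst(header: bytes, new_dst: bytes) -> bytes:
    """Replace the IPv4 destination address (bytes 16-19) and patch the
    header checksum (bytes 10-11) without re-summing the whole header."""
    csum = struct.unpack("!H", header[10:12])[0]
    for off in (0, 2):  # the address is two 16-bit words
        old = struct.unpack("!H", header[16 + off:18 + off])[0]
        new = struct.unpack("!H", new_dst[off:off + 2])[0]
        csum = incremental_update(csum, old, new)
    return header[:10] + struct.pack("!H", csum) + header[12:16] + new_dst + header[20:]


if __name__ == "__main__":
    # Example header with the checksum field zeroed, then filled in.
    hdr = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
    hdr = hdr[:10] + struct.pack("!H", checksum(hdr)) + hdr[12:]
    patched = rewrite_dst(hdr, bytes([10, 0, 0, 5]))
    print(patched.hex())
```

The point of the incremental form is that the cost depends only on the number of words rewritten, not on the packet length, so the checksum arithmetic alone is cheap per packet.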

          • tim.mcmanus

            @firewalluser:

            DTrace will help isolate the FreeBSD code that is affected, and it might show us the variables we can tune to reduce the incidence of interrupt storms. But IMO we need to focus on the interrupt storm as the cause; everything else we have seen is just a symptom of the underlying problem.

            Yes, this is definitely the next step.  The guidance I received from the devs is to get a FreeBSD 10.1 image with DTrace to identify the issue in FreeBSD and subsequently FreeBSD/PF.  However, certain features of DTrace are not enabled by default, so it may require recompiling the kernel to capture those things.  I was informed that "there are no dtrace probes currently in sys/netpfil, (look for SDT_PROVIDER_DEFINE) so you’ll be starting from scratch."

            Again, it needs to be determined that the issue is not in FreeBSD 10.1 before you troubleshoot pfSense.  What's the point in putting any effort into trying to remediate this in pfSense when that may have no bearing on the issue?

            • almabes

              I'll get a bare metal FreeBSD test box up by this evening.  It should be a beefy enough box: a Xeon W3565 with 4GB of RAM and a GigE NIC.  I already have one loaded on my ESXi host.
              We can test tomorrow.

              • Supermule Banned

                This is very interesting!

                Youtube Video

                Restarting services one by one to test. The first restarts don't do anything: 3 attacks with DNS, NTP and apinger restarted change nothing.

                Restarting Snort does it. A Snort restart does something to the firewall that the others don't. It's reproducible, since the last 2 attacks fare in a similar fashion.

                I have asked bmeeks to tell us what Snort does to pfSense when restarted and what it resets.

                Suddenly it's able to handle the traffic again (much nicer graphs).

                Why??

                • tim.mcmanus

                  The "Pasta Method" of troubleshooting (throwing things against the wall to see what sticks) won't provide any value in identifying the root cause and resolving the issue.  IMHO it's a waste of time.

                  Eliminate the most basic things first: FreeBSD, PF, and then pfSense.  If you can recreate the issue in FreeBSD, poof, there's where the issue resides.  Troubleshooting pfSense when the issue could be in FreeBSD is really a waste of time.

                  • Supermule Banned

                    ok

                    • firewalluser

                      @Supermule:

                      When I run the attacks on myself all the time to harden the damn thing, I have never ever had an interrupt error on the console.

                      I hardly see any interrupt load (0-2%) when hit, and that's not the issue IMHO.

                      The issue is something that acts as a bottleneck on the way through NAT.

                      It's no issue when NAT is not involved and the traffic is simply blocked at the firewall. When it does NAT, it crashes (1 core hits 100%) and packet loss is observed.

                      So the difference between no NAT and NAT is what takes it offline.

                      What model of NIC are you using, and are you running pfSense on ESXi and thus going through ESXi's network stack?

                      The reason I ask is that some NICs can handle some of the network functionality that would otherwise be handled by the OS.
                      I wonder what NIC was in use when the photo showing the interrupt storm message was taken. It would be helpful to compare the difference in functionality to see what work is left to the OS and what is being handled by the NIC.

                      Mer is wondering the same in the post quoted below, with the checksum question.

                      @mer:

                      So NAT takes the inbound packet, rewrites portions of the header, and probably redoes the checksums. Does it then push the mbuf back onto the stack, where it gets fed into PF processing again, or does it just continue running the PF rules?  If it's redoing the checksum, is that being offloaded to hardware, or is software doing it?

                      @tim.mcmanus:

                      @firewalluser:

                      DTrace will help isolate the FreeBSD code that is affected, and it might show us the variables we can tune to reduce the incidence of interrupt storms. But IMO we need to focus on the interrupt storm as the cause; everything else we have seen is just a symptom of the underlying problem.

                      Yes, this is definitely the next step.  The guidance I received from the devs is to get a FreeBSD 10.1 image with DTrace to identify the issue in FreeBSD and subsequently FreeBSD/PF.  However, certain features of DTrace are not enabled by default, so it may require recompiling the kernel to capture those things.  I was informed that "there are no dtrace probes currently in sys/netpfil, (look for SDT_PROVIDER_DEFINE) so you’ll be starting from scratch."

                      Again, it needs to be determined that the issue is not in FreeBSD 10.1 before you troubleshoot pfSense.  What's the point in putting any effort into trying to remediate this in pfSense when that may have no bearing on the issue?

                      There's no reason why we couldn't set up FreeBSD with some basic functionality and build up from there. Setting up pf tables, for example, is not difficult, but it is time-consuming, which is why we use pfSense: a lot of the functionality is set up for us.

                      Maybe it would be quicker to compare the XML backups of all those affected to isolate the differences between installations?

                      Microsoft offers a nice free XML Notepad app with a dual-pane view (tree on the left, XML properties on the right) that makes it fairly quick and easy to modify XML files. I know text-compare apps exist that are useful for comparing program code/HTML/text between versions, so maybe that would be a quicker and easier approach to take?

                      The XML backup comparison would be quickest IMO. We could have a few people comparing the XML backups while a few others work on setting up FreeBSD from the ground up, or try a different approach in parallel.
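The comparison itself can be scripted rather than done by eye. Below is a minimal Python sketch, assuming each backup is a well-formed XML file: it flattens every leaf element to a path-to-value map and then splits the paths into those that are identical across all backups and those that differ. The function names and the sample tag names in the usage are made up for illustration; they are not taken from an actual pfSense config.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def flatten(xml_text: str) -> dict:
    """Map 'root/child[n]/...' -> stripped text for every leaf element."""
    root = ET.fromstring(xml_text)
    out = {}

    def walk(elem, path):
        children = list(elem)
        if not children:
            out[path] = (elem.text or "").strip()
            return
        seen = Counter()  # index repeated sibling tags: rule[1], rule[2], ...
        for child in children:
            seen[child.tag] += 1
            walk(child, f"{path}/{child.tag}[{seen[child.tag]}]")

    walk(root, root.tag)
    return out


def compare(configs: dict):
    """configs maps a label to XML text.
    Returns (common, differing): settings identical everywhere, and
    settings whose value differs or is missing in at least one backup."""
    flat = {name: flatten(text) for name, text in configs.items()}
    all_paths = set().union(*flat.values())
    common, differing = {}, {}
    for path in sorted(all_paths):
        values = {name: leaves.get(path) for name, leaves in flat.items()}
        if len(set(values.values())) == 1:
            common[path] = next(iter(values.values()))
        else:
            differing[path] = values
    return common, differing


if __name__ == "__main__":
    # Hypothetical, heavily trimmed backups; real ones would be read from files.
    box_a = "<pfsense><system><hostname>fw1</hostname><optimization>normal</optimization></system></pfsense>"
    box_b = "<pfsense><system><hostname>fw2</hostname><optimization>normal</optimization></system></pfsense>"
    common, differing = compare({"box_a": box_a, "box_b": box_b})
    print("common:", common)
    print("differing:", differing)
```

Matching repeated elements by position is a simplification: two boxes with the same rules in a different order will show up as differing, so the output is a starting point for the table, not a verdict.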

                      The other option is DTrace, but its setup time is unknown, as is the setup time of FreeBSD, which will likely be more than trying to get DTrace running on pfSense. We might also be able to set up the flame graphs mentioned earlier in the thread more quickly than either DTrace or FreeBSD.

                      I personally would like to see DTrace in pfSense, as I think it would be a better low-level tool to have in pfSense going forward if you are Western, or going backwards if you are from the South American continent (different regions of the planet view the future differently, i.e. some see it as a path laid out in front whilst others see it as a path behind their head, as it's unknown what the future holds, but I digress).  ;)

                      So who's in favour of what?

                      Thoughts? Otherwise we will end up going around in circles with nothing achieved, none of us any the wiser as to what's happening, and no solution found (if one can be found in this current version of pfSense; we don't know whether it might be resolved in FreeBSD 11, just to chuck that variable in as well).

                      Lots of variables, but we need to sort out a plan, otherwise I can only insert the old cliché "if we don't plan, then we plan to fail".  ;D

                      • firewalluser

                        @tim.mcmanus:

                        The "Pasta Method" of troubleshooting (throwing things against the wall to see what sticks) won't provide any value in identifying the root cause and resolving the issue.  IMHO it's a waste of time.

                        Eliminate the most basic things first: FreeBSD, PF, and then pfSense.  If you can recreate the issue in FreeBSD, poof, there's where the issue resides.  Troubleshooting pfSense when the issue could be in FreeBSD is really a waste of time.

                        I agree. This is the first I've heard that Snort was also being used in this. What version of Snort is in use? A new version has been released in the last few months.

                        May I propose we all submit XML backups, so I can compare them and find the elements common to all those affected?

                        I think this will be the quickest way to resolve this, or at least to eliminate the odd things and find the common elements which might be affecting things. *

                        * I say "might" because, if the problem is low-level, deep in the FreeBSD OS, different packages or parts of the system may be calling the same parts of the OS, so comparing the XML backups is not 100% foolproof. But it's a start which shouldn't take too much time.

                        Don't worry about the encryption in the XML: it can be broken easily enough, so it's best to blank your passwords and anything else you want to keep private. But don't say I didn't warn you.  ;)
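Blanking the private bits before sending a backup can also be scripted. The sketch below is a minimal Python example that replaces the text of any element whose tag is in a set of sensitive names. The tag names in `SENSITIVE_TAGS` are assumptions for illustration only; check your own backup for the fields that actually hold passwords, hashes and keys, and review the output by hand before sharing it.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag names; adjust to whatever your own backup actually contains.
SENSITIVE_TAGS = {"password", "bcrypt-hash", "pre-shared-key", "prv"}


def redact(xml_text: str, sensitive=SENSITIVE_TAGS, placeholder="REDACTED"):
    """Return (redacted XML text, number of elements blanked)."""
    root = ET.fromstring(xml_text)
    count = 0
    for elem in root.iter():
        # Only touch elements that actually carry a value.
        if elem.tag in sensitive and (elem.text or "").strip():
            elem.text = placeholder
            count += 1
    return ET.tostring(root, encoding="unicode"), count


if __name__ == "__main__":
    sample = (
        "<pfsense><system><user><name>admin</name>"
        "<password>hunter2</password></user></system></pfsense>"
    )
    cleaned, n = redact(sample)
    print(n, "element(s) blanked")
    print(cleaned)
```

A tag list like this only catches what it names. Public IP addresses, hostnames and certificates elsewhere in the file would still need attention.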

                        • Supermule Banned

                          Emulating E1000 on Intel Dual Port server adapter on ESXi 4.1 U3.

                          Intel code is: E1G42ETBLK

                          • Supermule Banned

                            I installed Snort quite late in the process, and it didn't affect the performance or the issue at hand.

                            Until I restarted it...

                            • firewalluser

                              @Supermule:

                              Emulating E1000 on Intel Dual Port server adapter on ESXi 4.1 U3.

                              Intel code is: E1G42ETBLK

                              That's quite old, at least a couple of years, and it has a bug where it can be hacked from the network stack, IIRC?

                              Re testing the firewall: I'm still getting set up at this end, based on what I noticed last night and posted, so I'm still double-checking that the rules and the system are OK before the test. What time are you going home tonight? Don't forget the timezone you're in.

                              • Supermule Banned

                                CET +1 is the timezone.

                                I believe they solved that one in U2 or U3.

                                Going out eating tonight at 7.30PM local time.

                                • firewalluser

                                  Got your XML file. I'll compare it to mine and any others which get PM'ed, and I'll try to build a table showing the differences and the common elements to hopefully make it easier to solve.

                                  I'm still curious to see if my home firewall running pfSense 2.2.2 can be taken out, so if you want to do a quick test, my IP is 2.101.3.83. I haven't had a chance to set up Skype yet, as I've still got to get my mail server up and running, and I don't allow ping, so that doesn't need to be in the test. I've got a VM recording the dashboard, pfinfo and system activity, plus I'm also running a packet capture (full, unlimited, on the WAN) so I can see what's going on. If the system falls down, the ISP will automatically assign a new IP address, so for the moment 2.101.3.83 is mine to play with.

                                  Drop me a PM to say when you're done; I'll PM back to let you know whether I detect any problems here.

                                  Edit.

                                  I had PM'ed the above, so something got screwy with the forum comms for it to appear here, but that also explains my post here https://forum.pfsense.org/index.php?topic=94573.0 which is weird, as I can access the forum via a free VPN with no problem but can no longer access it directly, unless Supermule's XML backup triggered a Snort alert which is now blocking that machine. I'll have to check in a moment.

                                  Anyway, have you run the script against my IP address yet? I'm still on that IP address and nothing appears to have happened if you have, so let us know, Supermule, whether you have run the script or not.

                                  Much obliged.

                                  • firewalluser

                                    Anyway, still nothing has happened at this end, but Supermule did say he was going out to eat tonight, wherever that is, so for now I'm still on that IP address if SM pops his head back in later on tonight. I'll update the IP address when it next changes.

                                    If anyone else affected by this scan wants to PM me their pfSense XML backup file, I can do a comparison to see which elements are common and which are exclusive to you, so hopefully we can start to narrow down what, where and when.

                                    Just remember to blank the bits you want to keep private, as encryption details and the like could be useful to the wrong people. Once I've got them compiled I can build a table, without names, showing the common bits, so we can then test an example with the common bits, see if it falls down, and go from there to narrow it down further in the absence of anything else like DTrace, flame graphs et al.

                                    Edit.

                                    The ISP has forced an IP change, so when Supermule touches base again I'll pass on the latest IP address. The food must be good.  :)

                                    • mer

                                      @firewalluser:

                                      Anyway, still nothing has happened at this end, but Supermule did say he was going out to eat tonight, wherever that is, so for now I'm still on that IP address if SM pops his head back in later on tonight. I'll update the IP address when it next changes.

                                      If anyone else affected by this scan wants to PM me their pfSense XML backup file, I can do a comparison to see which elements are common and which are exclusive to you, so hopefully we can start to narrow down what, where and when.

                                      Just remember to blank the bits you want to keep private, as encryption details and the like could be useful to the wrong people. Once I've got them compiled I can build a table, without names, showing the common bits, so we can then test an example with the common bits, see if it falls down, and go from there to narrow it down further in the absence of anything else like DTrace, flame graphs et al.

                                      Edit.

                                      The ISP has forced an IP change, so when Supermule touches base again I'll pass on the latest IP address. The food must be good.  :)

                                      Maybe it's not the food but the beer or wine?  ;D

                                      • Supermule Banned

                                        Just got home from a nice dinner with friends and it's 3.39AM here  :D

                                        Going to bed and will have a look during the day tomorrow (Saturday).

                                        @firewalluser:

                                        Got your XML file. I'll compare it to mine and any others which get PM'ed, and I'll try to build a table showing the differences and the common elements to hopefully make it easier to solve.

                                        I'm still curious to see if my home firewall running pfSense 2.2.2 can be taken out, so if you want to do a quick test, my IP is 2.101.3.83. I haven't had a chance to set up Skype yet, as I've still got to get my mail server up and running, and I don't allow ping, so that doesn't need to be in the test. I've got a VM recording the dashboard, pfinfo and system activity, plus I'm also running a packet capture (full, unlimited, on the WAN) so I can see what's going on. If the system falls down, the ISP will automatically assign a new IP address, so for the moment 2.101.3.83 is mine to play with.

                                        Drop me a PM to say when you're done; I'll PM back to let you know whether I detect any problems here.

                                        Edit.

                                        I had PM'ed the above, so something got screwy with the forum comms for it to appear here, but that also explains my post here https://forum.pfsense.org/index.php?topic=94573.0 which is weird, as I can access the forum via a free VPN with no problem but can no longer access it directly, unless Supermule's XML backup triggered a Snort alert which is now blocking that machine. I'll have to check in a moment.

                                        Anyway, have you run the script against my IP address yet? I'm still on that IP address and nothing appears to have happened if you have, so let us know, Supermule, whether you have run the script or not.

                                        Much obliged.

                                        • Supermule Banned

                                          Anyone care to explain this??

                                          Youtube Video

                                          The first 2 attacks cause packet loss.

                                          Removing the blocked hosts in Snort makes the damn thing survive the next 3 attacks with no problems.

                                          No packet loss and no CPU hitting 100%.

                                          What exactly does removing the Snort blocked hosts do to pfSense??

                                          I wonder if some setting is resetting something over time, because hours later it's back to losing packets until I remove the blocked hosts again.

                                          • Harvy66

                                            I think Snort is blocking and has to inspect every connection before letting it through, like an extra firewall but much less efficient. Anything that reduces the work Snort has to do will speed things up. It sounds like Snort is single-threaded in some way. Probably a similar issue to why NAT crumbles when you enable port forwarding?

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.