Netgate Discussion Forum

    Device Polling and Interrupts for Intel Pro NIC

    Hardware
    23 Posts 4 Posters 9.9k Views
    • sullrich

      Please try this and see if it helps.

      Edit /tmp/rules.debug and add this to the top:

      set timeout interval 1
      set timeout { tcp.finwait 10, tcp.closed 5 }

      Now run this from a shell: pfctl -f /tmp/rules.debug

      Please test with these settings and let me know if it is better.  We might have to make this a hidden variable.  Or something.

      • PeterZ

        This does not help but it does change things:

        netstat -I em1 -w 1

        input          (em1)          output
          packets  errs      bytes    packets  errs      bytes colls
              9995  249  12152731      6990    0    1242327    0
            10572  227  12847133      7346    0    1259203    0
            10748  359  13196148      7666    0    1305645    0
            11388  310  14000169      8052    0    1397665    0
            11842  259  14595778      8389    0    1464044    0

        So now we have the same packet loss every second instead of once every 10 seconds.

        It looks like purging happens for the whole table at once rather than in pieces.

        Changing it to do purging once per 60 seconds gives this:

            124838    0  156441927      84043    0  12043484    0
            117116  897  146381898      80182    0  11669285    0
            116284    0  144356067      80062    0  11898673    0

        (10-second increments)

        With purging once per 60 seconds, more packets are lost in each spike but fewer on average.
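
The "spikes versus average" observation can be checked with a little arithmetic. This is a minimal sketch, assuming the loss metric is input errs divided by input packets (the thread never defines one) and using only the netstat samples posted above. The names `error_rate`, `worst_interval`, `PURGE_1S`, and `PURGE_60S` are illustrative, not part of any tool.

```python
# (packets, errs) pairs copied from the netstat samples in this post.
# PURGE_1S rows are 1-second intervals; PURGE_60S rows are 10-second intervals.
PURGE_1S = [(9995, 249), (10572, 227), (10748, 359), (11388, 310), (11842, 259)]
PURGE_60S = [(124838, 0), (117116, 897), (116284, 0)]


def error_rate(samples):
    """Return total input errors divided by total input packets.

    Assumption: errs / packets is used as the loss metric; the thread
    itself does not define one.
    """
    packets = sum(p for p, _ in samples)
    errs = sum(e for _, e in samples)
    return errs / packets


def worst_interval(samples):
    """Return the largest error count seen in any single interval."""
    return max(e for _, e in samples)


if __name__ == "__main__":
    for label, samples in (("purge every 1 s", PURGE_1S),
                           ("purge every 60 s", PURGE_60S)):
        print(f"{label}: rate={error_rate(samples):.4%}, "
              f"worst interval={worst_interval(samples)} errs")
```

On these few rows the 60-second setting shows a larger single spike but a lower overall rate. The samples are short and cover different time spans, so the ratio is indicative only.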

        Interesting. Isn't there some kernel buffer one can increase to avoid this problem?
        It seems very strange to me that purging the state table blocks the network device buffer from being processed.

        @sullrich:

        Please try this and see if it helps.

        Edit /tmp/rules.debug and add this to the top:

        set timeout interval 1
        set timeout { tcp.finwait 10, tcp.closed 5 }

        Now run this from a shell: pfctl -f /tmp/rules.debug

        Please test with these settings and let me know if it is better.  We might have to make this a hidden variable.  Or something.

        • sullrich

          You did test this with polling off, right?

          • PeterZ

            @sullrich:

            You did test this with polling off, right?

            Sure. Polling is off, and I'm not even trying to turn it on because things become so much worse…

            • sullrich

              Okay, I will run this by Bill, and I have emailed Max Laier, who might have some tuning advice for us.

              Just for the record, what kind of bandwidth are you pushing?  Can you share RRD bandwidth and packet graphs?

              EDITED: Spelling mistakes.

              • billm

                Can you send us the output of:

                sysctl net.inet.ip.intr_queue_drops

                And maybe increase net.inet.ip.intr_queue_maxlen:

                sysctl net.inet.ip.intr_queue_maxlen=250

                And let me know if that helps.

                Also, send the output of:

                sysctl net.isr

                Thanks

                –Bill

                pfSense core developer
                blog - http://www.ucsecurity.com/
                twitter - billmarquette

                • sullrich

                  Yes, please provide the outputs that Bill is requesting.

                  Once you have posted that output and tried raising the sysctl, if we still have not made any progress, I have received a patch from Max that might help.  If we get to that point I will compile a custom test kernel for you.

                  • PeterZ

                    @sullrich:

                    Yes, please provide the outputs that Bill is requesting.

                    Once you have posted that output and tried raising the sysctl, if we still have not made any progress, I have received a patch from Max that might help.  If we get to that point I will compile a custom test kernel for you.

                    sysctl net.isr

                    net.isr.direct: 0
                    net.isr.count: 444903
                    net.isr.directed: 0
                    net.isr.deferred: 444903
                    net.isr.queued: 321
                    net.isr.drop: 0
                    net.isr.swi_count: 411596

                    sysctl net.inet.ip.intr_queue_drops

                    net.inet.ip.intr_queue_drops: 0
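
For reading the counters above, here is a small sketch that parses `sysctl` output of the `name: value` form into a dictionary. The helper name `parse_sysctl` is hypothetical, and the observation in the comments (every counted packet was deferred, none dropped) is read off these numbers rather than taken from documentation.

```python
# Counter values copied verbatim from the sysctl output in this post.
SYSCTL_OUTPUT = """\
net.isr.direct: 0
net.isr.count: 444903
net.isr.directed: 0
net.isr.deferred: 444903
net.isr.queued: 321
net.isr.drop: 0
net.isr.swi_count: 411596
net.inet.ip.intr_queue_drops: 0
"""


def parse_sysctl(text):
    """Parse 'name: value' lines into a dict of integer counters."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # skip blank lines or anything not in 'name: value' form
        counters[name.strip()] = int(value.strip())
    return counters


if __name__ == "__main__":
    c = parse_sysctl(SYSCTL_OUTPUT)
    # In this sample every counted packet was deferred and none were dropped.
    print("all deferred:", c["net.isr.deferred"] == c["net.isr.count"])
    print("queue drops:", c["net.isr.drop"] + c["net.inet.ip.intr_queue_drops"])
```

Both drop counters are zero in this sample, so these queues do not appear to account for the errs column that netstat reports on em1.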

                    After the queue was increased to 250:

                    netstat -I em1 -w 100

                    input          (em1)          output
                      packets  errs      bytes    packets  errs      bytes colls
                        638808  312  765089702    450895    0  70931893    0
                        668226  951  808850726    458403    0  63521833    0

                    So we're still losing packets. The number is lower, but traffic has also dropped 30 from the day's high, so the problem did not go away.

                    • sullrich

                      From a shell, run this (testing a new kernel):

                      Backup old kernel

                      cp /boot/kernel/kernel.gz ~root/
                      fetch -o /boot/kernel/kernel.gz http://www.pfsense.com/~sullrich/kernel.gz
                      shutdown -r now

                      This will reboot your box, be prepared.

                      • PeterZ

                        Thanks Scott,

                        I tried this kernel and unfortunately the system crashes with it within 2 to 15 minutes of boot.
                        There is still packet loss:

                        netstat -I em1 -w 10

                        input          (em1)          output
                          packets  errs      bytes    packets  errs      bytes colls
                            125527    73  145482529      89756    0    8569116    0
                            125283    0  144453737      90196    0    8797933    0
                            114783    0  128255459      82857    0    8698382    0
                            112750  109  125654284      81726    0    8263655    0
                            112577    0  125250797      81648    0    7740500    0
                            113482    0  124736359      81646    0    7689588    0
                            110745    0  121549353      80587    0    8305343    0
                            111846    39  121826640      81538    0    8449537    0

                        This is with the default 10-second purge interval.

                        By the way, this box is running pfSense 1.0.1. Would you recommend upgrading to 1.2.0RC1? Is that safe enough to do on a remote box?

                        Also, I think it would be very handy to add a feature for entering advanced pf settings somewhere so they are kept in the config.
                        Changing the purge interval and other advanced settings is very inconvenient now, as rules.debug is always recreated from scratch.

                        @sullrich:

                        From a shell, run this (testing a new kernel):

                        Backup old kernel

                        cp /boot/kernel/kernel.gz ~root/
                        fetch -o /boot/kernel/kernel.gz http://www.pfsense.com/~sullrich/kernel.gz
                        shutdown -r now

                        This will reboot your box, be prepared.

                        • PeterZ

                          I thought I would also post data from the old kernel (6.1-RELEASE-p10) for comparison, taken right after rebooting the box:

                          netstat -I em1 -w 10

                          input          (em1)          output
                            packets  errs      bytes    packets  errs      bytes colls
                              119425  227  136364409      85812    0  10439358    0
                              121176  316  137712815      87569    0  10499359    0
                              122182  348  140802834      87724    0  10258430    0
                              124622  468  144679509      88983    0  10354789    0
                              131813  427  152227016      94048    0  10657670    0
                              131419  449  151380609      93904    0  11080772    0
                              129773  457  149226974      91306    0  10959052    0

                          So the new kernel seems to be losing fewer packets, but it still loses some and it crashes.
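
The difference between the two kernels can be put into numbers from the posted samples. This is a rough sketch, assuming input errs divided by input packets as the metric (an assumption, not something the thread defines); `parse_netstat` and `input_error_rate` are hypothetical helper names. It parses the `netstat -I em1 -w 10` text as pasted and skips the header lines.

```python
# Samples copied from the two netstat -I em1 -w 10 listings in this thread.
OLD_KERNEL = """\
packets  errs      bytes    packets  errs      bytes colls
119425  227  136364409      85812    0  10439358    0
121176  316  137712815      87569    0  10499359    0
122182  348  140802834      87724    0  10258430    0
124622  468  144679509      88983    0  10354789    0
131813  427  152227016      94048    0  10657670    0
131419  449  151380609      93904    0  11080772    0
129773  457  149226974      91306    0  10959052    0
"""

NEW_KERNEL = """\
packets  errs      bytes    packets  errs      bytes colls
125527    73  145482529      89756    0    8569116    0
125283    0  144453737      90196    0    8797933    0
114783    0  128255459      82857    0    8698382    0
112750  109  125654284      81726    0    8263655    0
112577    0  125250797      81648    0    7740500    0
113482    0  124736359      81646    0    7689588    0
110745    0  121549353      80587    0    8305343    0
111846    39  121826640      81538    0    8449537    0
"""


def parse_netstat(text):
    """Return (input packets, input errs) for each 7-column numeric row."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 7 or not all(f.isdigit() for f in fields):
            continue  # header or malformed line
        rows.append((int(fields[0]), int(fields[1])))
    return rows


def input_error_rate(rows):
    """Total input errs divided by total input packets (assumed metric)."""
    return sum(e for _, e in rows) / sum(p for p, _ in rows)


if __name__ == "__main__":
    for label, text in (("old kernel", OLD_KERNEL), ("new kernel", NEW_KERNEL)):
        rows = parse_netstat(text)
        print(f"{label}: {len(rows)} intervals, rate={input_error_rate(rows):.4%}")
```

The samples were taken at different times and traffic levels, so the two rates are not a controlled comparison.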

                          • sullrich

                            Yes, please update to 1.2-RC1.

                            http://wiki.pfsense.com/wikka.php?wakka=UsingThePHPpfSenseShell

                            • Perry

                              So how did this story end?

                              /Perry
                              doc.pfsense.org

                              • billm

                                Haven't heard anything else.  But for what it's worth, I'm not seeing this in FreeBSD 6.2 w/ a couple of the pfSense patches added to our kernel.

                                
                                # netstat -w 10
                                            input        (Total)           output
                                   packets  errs      bytes    packets  errs      bytes colls
                                    868544     0  286227325     865170     0  251535256     0 
                                    770774     0  222347441     796016     0  225886191     0 
                                    731287     0  224789308     766395     0  231316740     0 
                                    767101     0  234638730     798607     0  244245061     0 
                                    828549     0  245917253     847273     0  242236942     0 
                                    782814     0  235875581     809549     0  238561715     0 
                                    743229     0  222066030     776047     0  239061961     0 
                                
                                

                                And the PCI IDs of the cards:

                                dual port fiber card
                                dev.em.0.%desc: Intel(R) PRO/1000 Network Connection Version - 6.2.9
                                dev.em.0.%driver: em
                                dev.em.0.%location: slot=1 function=0
                                dev.em.0.%pnpinfo: vendor=0x8086 device=0x1012 subvendor=0x8086 subdevice=0x1012 class=0x020000

                                dual port copper card
                                dev.em.2.%desc: Intel(R) PRO/1000 Network Connection Version - 6.2.9
                                dev.em.2.%driver: em
                                dev.em.2.%location: slot=1 function=0
                                dev.em.2.%pnpinfo: vendor=0x8086 device=0x1079 subvendor=0x8086 subdevice=0x1179 class=0x020000

                                –Bill


                                Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.