Netgate Discussion Forum

    High ping times when Captive Portal is enabled

    General pfSense Questions · 15 Posts · 3 Posters · 2.1k Views

    • michaelcox1

      Hardware information:

      CPU Type: Intel(R) Xeon(R) CPU E3-1280 v3 @ 3.60GHz
      Current: 3600 MHz, Max: 3601 MHz
      8 CPUs: 1 package(s) x 4 core(s) x 2 SMT threads

      16 GB RAM

      Interfaces used: dual-port 10Gbit Broadcom copper NIC

      13 VLANs are configured on the captive portal.

      pfSense version: 2.3.2

      First, I want to say thank you to anyone who can shed any light on this issue.

      The issue I have: I'm running an Intel Xeon with 4 physical cores and 4 SMT threads, and when I have the captive portal enabled, the GUI becomes unresponsive (502 Bad Gateway error), the CPU temps go up almost 20°C, and the ping times to Google go from 9ms to over 200ms.

      The server is connected to a 1Gb fiber circuit. With the captive portal on, the load is around 500Mbps; with it turned off, the total traffic load is around 700Mbps.

      When I look at System Activity, this is what I see with the portal turned on:

      PID USERNAME    PRI NICE  SIZE    RES STATE  C  TIME    WCPU COMMAND
        11 root              155 ki31    0K  128K RUN    4 580:42  99.76% [idle{idle: cpu4}]
        11 root              155 ki31    0K  128K CPU5    5 581:40  99.66% [idle{idle: cpu5}]
        11 root              155 ki31    0K  128K CPU6    6 581:44  94.97% [idle{idle: cpu6}]
        11 root              155 ki31    0K  128K CPU7    7 582:13  92.29% [idle{idle: cpu7}]
        12 root                -92    -    0K  608K RUN    1 319:47  66.70% [intr{irq266: bxe0:fp0}]
        12 root                -92    -    0K  608K RUN    0 322:12  61.18% [intr{irq265: bxe0:fp0}]
        12 root                -92    -    0K  608K RUN    3 323:51  59.47% [intr{irq268: bxe0:fp0}]
        12 root                -92    -    0K  608K RUN    2 319:26  55.18% [intr{irq267: bxe0:fp0}]
        12 root                -92    -    0K  608K CPU2    2 213:22  46.58% [intr{irq272: bxe1:fp0}]
        12 root                -92    -    0K  608K CPU3    3 206:58  39.16% [intr{irq273: bxe1:fp0}]
        12 root                -92    -    0K  608K CPU1    1 204:19  39.16% [intr{irq271: bxe1:fp0}]
        12 root                -92    -    0K  608K CPU0    0 213:37  38.48% [intr{irq270: bxe1:fp0}]
      65361 root              52    0  280M 54080K piperd  5  0:05  9.77% php-fpm: pool nginx (php-fpm)
      34927 root              76    0 12276K  5392K RUN    3  0:00  7.28% /sbin/sysctl -a
      35002 root              29    0 18740K  2292K piperd  7  0:00  6.98% grep temperature
      62347 root              45    0  280M 52044K piperd  7  0:03  6.79% php-fpm: pool nginx (php-fpm)
      34745 root              47    0 17000K  2484K wait    4  0:00  6.79% sh -c /sbin/sysctl -a | grep temperatur
        11 root                155 ki31    0K  128K RUN    0  68:44  3.96% [idle{idle: cpu0}]
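
      (This listing is from Diagnostics > System Activity, which is essentially FreeBSD top with system processes and kernel threads shown. From a shell, roughly the same view should come from something like the following, though I'm not sure of the exact flags pfSense uses:)

          # full command lines (-a), system processes (-S), kernel threads (-H)
          top -aSH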

      From this I can see that four of the cores are not really doing much, so I'm not sure why it would be sluggish or cause the GUI to crash.

      Any help would be much appreciated.

      Thank you.

    • michaelcox1

        last pid: 37334;  load averages:  0.45,  1.85,  4.22  up 1+03:39:06    23:22:33
        209 processes: 9 running, 162 sleeping, 38 waiting

        Mem: 75M Active, 790M Inact, 639M Wired, 3257M Buf, 14G Free
        Swap: 32G Total, 32G Free

        PID USERNAME    PRI NICE  SIZE    RES STATE  C  TIME    WCPU COMMAND
          11 root        155 ki31    0K  128K CPU7    7  26.6H 100.00% [idle{idle: cpu7}]
          11 root        155 ki31    0K  128K CPU6    6  26.5H 100.00% [idle{idle: cpu6}]
          11 root        155 ki31    0K  128K CPU5    5  26.5H 100.00% [idle{idle: cpu5}]
          11 root        155 ki31    0K  128K RUN    4  26.5H  99.37% [idle{idle: cpu4}]
          11 root        155 ki31    0K  128K CPU0    0 647:14  98.29% [idle{idle: cpu0}]
          11 root        155 ki31    0K  128K CPU1    1 675:11  97.36% [idle{idle: cpu1}]
          11 root        155 ki31    0K  128K CPU3    3 680:18  95.90% [idle{idle: cpu3}]
          11 root        155 ki31    0K  128K CPU2    2 661:39  95.75% [idle{idle: cpu2}]
        30617 root        22    0 25132K 12060K select  7  37:34  4.88% /usr/local/sbin/miniupnpd -f /var/etc/m
          12 root        -92    -    0K  608K WAIT    1 599:09  2.49% [intr{irq266: bxe0:fp0}]
          12 root        -92    -    0K  608K WAIT    0 598:25  2.20% [intr{irq265: bxe0:fp0}]
          12 root        -92    -    0K  608K WAIT    2 603:58  1.95% [intr{irq267: bxe0:fp0}]
          12 root        -92    -    0K  608K WAIT    3 591:06  1.56% [intr{irq268: bxe0:fp0}]
          12 root        -92    -    0K  608K WAIT    0 391:23  1.27% [intr{irq270: bxe1:fp0}]
          12 root        -92    -    0K  608K WAIT    2 386:54  1.17% [intr{irq272: bxe1:fp0}]
          12 root        -92    -    0K  608K WAIT    3 381:04  1.17% [intr{irq273: bxe1:fp0}]
          12 root        -92    -    0K  608K WAIT    1 379:19  1.07% [intr{irq271: bxe1:fp0}]
        35499 root        21    0  280M 48420K piperd  4  0:00  0.49% php-fpm: pool nginx (php-fpm)

        This is with the portal off.

    • michaelcox1

      After messing around with the portal settings, it seems to be doing much better with individual bandwidth per user turned off. Has anyone else run into this?

    • michaelcox1

      Never mind, same issue with the portal enabled.

      I have now changed the maximum concurrent connections to 3 and checked "Disable Concurrent user logins".

      Will update if anything changes.

    • heper

      Looks like an interrupt storm to me; not always easy to fix.

      Try the NIC tuning wiki: https://doc.pfsense.org/index.php/Tuning_and_Troubleshooting_Network_Cards

    • michaelcox1

      Thank you very much for replying!

      I am running Broadcom 10Gb NICs, so I implemented this fix:

                Broadcom bce(4) Cards
                Several users have noted issues with certain Broadcom network cards, especially those built into Dell hardware. If the bce cards in the firewall are behaving erratically, dropping packets, or causing system crashes, then the following tweaks may help, especially on amd64.

                In /boot/loader.conf.local - Add the following (or create the file if it does not exist):

                kern.ipc.nmbclusters="131072"
                hw.bce.tso_enable=0
                hw.pci.enable_msix=0

      I did edit the bce entries to bxe, which is what my cards are labeled as.
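
      For reference, the edited file should end up looking like this (assuming the bxe(4) driver accepts the same tso_enable tunable name as bce, which I still need to verify):

          kern.ipc.nmbclusters="131072"
          # assumes bxe exposes a tunable named like bce's; check with: sysctl hw.bxe
          hw.bxe.tso_enable=0
          hw.pci.enable_msix=0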

      I will monitor it and see if we get any improvements. Thank you again!

    • michaelcox1

      Monitored overnight with very little change. The only issue was that I never rebooted the servers after making the change, so I have done that now, and again I will monitor.

    • michaelcox1

      After rebooting I got the 502 error and had to use console menu option 16 to restart PHP-FPM.

    • michaelcox1

      So last night went very well after the reboot: I was getting a constant ping to Google of between 9-10ms, CPU utilization was between 15-20%, and temperatures on the cores were between 69-71°C.

      So far, what I have done for this issue:

      I applied the loader.conf.local fix that heper suggested.
      In the portal I set max concurrent connections to 3.
      I also checked "Disable Concurrent user logins" ("If enabled only the most recent login per username will be active. Subsequent logins will cause machines previously logged in with the same username to be disconnected."), but I think this last one forced my users to keep re-registering, so I turned it back off.

      I will be making the changes to another server that had the same issue; again I will report back, and if it solves it I will mark the post SOLVED!
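
      One quick way I can confirm the loader tunables actually took effect after the reboot (assuming these are exposed as read-only sysctls, which I believe they are):

          # should print 131072 and 0 if /boot/loader.conf.local was applied
          sysctl kern.ipc.nmbclusters hw.pci.enable_msix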

    • michaelcox1

      Still having issues. It seems like anything over 400Mbps and it's struggling, so I'm going to install a second server to drop the VLANs down to 6 per server.

      I keep getting these errors too: nginx: [alert] 75850#100188: send() failed (40: Message too long)

    • Harvy66

      I guess my question would be: what is the captive portal changing that makes the interrupts go crazy? Do you have it set to do any special things per user or something?

    • michaelcox1

      Users authenticate to a RADIUS server, and in the portal I have bandwidth per user configured; other than that, nothing special.

    • Harvy66

                              Try disabling bandwidth shaping just to see if it makes a difference.
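
      The per-user bandwidth limits are implemented as dummynet pipes, so one way to see whether they are actually in play is to list them from a shell. A minimal sketch, assuming the portal's pipes are visible in the default ipfw context (the captive portal keeps its rules in a per-zone context, so they may only show up there):

          # list dummynet pipes; per-user bandwidth limits appear here when active
          ipfw pipe show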

    • michaelcox1

      I did that; I disabled just the bandwidth per user option and it was still the same.

      So far I have split the site in two. One of the servers' temps are running at 60°C; the second server is now at 80°C, so I'm really wondering if I have a device in one of the VLANs that is just hammering the portal.

      Here is the CPU activity on both servers. By the way, both servers are identical; the only difference is the VLAN numbers. The hardware etc. is the exact same, and the config was a clone.

      This server is running well, and temps are at 60°C:

                                PID USERNAME    PRI NICE  SIZE    RES STATE  C  TIME    WCPU COMMAND
                                  11 root        155 ki31    0K  128K RUN    2  75.6H 100.00% [idle{idle: cpu2}]
                                  11 root        155 ki31    0K  128K CPU3    3  75.6H 100.00% [idle{idle: cpu3}]
                                  11 root        155 ki31    0K  128K CPU5    5  75.5H 100.00% [idle{idle: cpu5}]
                                  11 root        155 ki31    0K  128K CPU1    1  81.2H  99.56% [idle{idle: cpu1}]
                                  11 root        155 ki31    0K  128K CPU4    4  75.5H  98.78% [idle{idle: cpu4}]
                                  11 root        155 ki31    0K  128K CPU7    7  76.7H  97.27% [idle{idle: cpu7}]
                                  11 root        155 ki31    0K  128K CPU6    6  75.5H  97.27% [idle{idle: cpu6}]
                                  11 root        155 ki31    0K  128K CPU0    0  41.0H  53.27% [idle{idle: cpu0}]
                                  12 root        -92    -    0K  480K WAIT    0  36.9H  35.25% [intr{irq264: bxe0}]
                                  12 root        -92    -    0K  480K WAIT    0  24.5H  22.75% [intr{irq265: bxe1}]
                                77899 root        28    0  276M 48848K piperd  2  0:00  1.07% php-fpm: pool nginx (php-fpm)
                                35775 root        20    0 29228K 15472K select  6  82:21  0.39% /usr/local/sbin/miniupnpd -f /var/etc/m
                                    0 root        -92    -    0K  272K -      5 148:13  0.00% [kernel{dummynet}]
                                  12 root        -88    -    0K  480K WAIT    7  8:38  0.00% [intr{irq267: ahci0}]
                                6783 root        20    0 14508K  2316K select  4  3:19  0.00% /usr/sbin/syslogd -s -c -c -l /var/dhcp
                                  12 root        -60    -    0K  480K WAIT    3  2:28  0.00% [intr{swi4: clock}]
                                70215 unbound      20    0  239M  152M kqread  7  2:01  0.00% /usr/local/sbin/unbound -c /var/unbound
                                70215 unbound      20    0  239M  152M kqread  5  2:01  0.00% /usr/local/sbin/unbound -c /var/unbound

      This server is running at 80°C:

                                PID USERNAME    PRI NICE  SIZE    RES STATE  C  TIME    WCPU COMMAND
                                  11 root        155 ki31    0K  128K CPU7    7  31.8H 100.00% [idle{idle: cpu7}]
                                  11 root        155 ki31    0K  128K CPU6    6  31.7H 100.00% [idle{idle: cpu6}]
                                  11 root        155 ki31    0K  128K CPU5    5  31.6H 100.00% [idle{idle: cpu5}]
                                  11 root        155 ki31    0K  128K RUN    4  31.6H 100.00% [idle{idle: cpu4}]
                                  11 root        155 ki31    0K  128K CPU1    1  24.1H  71.29% [idle{idle: cpu1}]
                                  11 root        155 ki31    0K  128K CPU0    0  23.8H  70.26% [idle{idle: cpu0}]
                                  11 root        155 ki31    0K  128K CPU2    2  23.6H  70.07% [idle{idle: cpu2}]
                                  11 root        155 ki31    0K  128K CPU3    3  23.1H  65.87% [idle{idle: cpu3}]
                                  12 root        -92    -    0K  608K WAIT    0 297:07  23.88% [intr{irq265: bxe0:fp0}]
                                  12 root        -92    -    0K  608K WAIT    3 361:53  20.65% [intr{irq268: bxe0:fp0}]
                                  12 root        -92    -    0K  608K WAIT    1 318:48  17.58% [intr{irq266: bxe0:fp0}]
                                  12 root        -92    -    0K  608K WAIT    2 334:15  16.36% [intr{irq267: bxe0:fp0}]
                                  12 root        -92    -    0K  608K WAIT    2 195:49  15.58% [intr{irq272: bxe1:fp0}]
                                  12 root        -92    -    0K  608K WAIT    3 197:09  15.09% [intr{irq273: bxe1:fp0}]
                                  12 root        -92    -    0K  608K WAIT    1 182:28  11.67% [intr{irq271: bxe1:fp0}]
                                  12 root        -92    -    0K  608K WAIT    0 197:09  7.86% [intr{irq270: bxe1:fp0}]
                                80217 root        52    0  276M 47392K piperd  5  0:01  3.66% php-fpm: pool nginx (php-fpm)
                                    0 root        -92    -    0K  368K -      0  42:01  0.00% [kernel{dummynet}]

      Not sure what the two unbound entries at the bottom of the first results were.

      So far ping times are way better, so splitting the VLANs up onto two servers definitely helped. Just curious as to why one is at 60°C and the other is at 80°C.

      Also, the server now at 60°C was the original server that I posted about, so that's the first set of results above.
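
      For comparing the two boxes, the per-core temperatures can be read directly with sysctl (this assumes the coretemp(4) driver is providing the readings, which appears to be the case given the "grep temperature" process in the first listing):

          # all temperature sensors (the same thing the dashboard greps for)
          sysctl -a | grep temperature
          # or a single core directly
          sysctl dev.cpu.0.temperature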

    • Harvy66

                                  "Unbound" is a play on "Bind", another DNS server.

                                  I guess I'm with you wondering if something is hammering the server when the portal is enabled. Try a packet dump.
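
      For example, something like this from a shell; the interface name, packet count, and output path are placeholders to adjust for the VLAN you suspect:

          # capture 5000 packets on the LAN-side NIC, no name resolution, save for offline analysis
          tcpdump -ni bxe1 -c 5000 -w /tmp/portal.pcap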
