Netgate Discussion Forum

High ping times when Captive Portal is enabled

General pfSense Questions · 15 Posts · 3 Posters · 2.1k Views

michaelcox1 · Sep 19, 2016, 5:55 AM (last edited Sep 20, 2016, 4:17 PM)

Hardware information:

CPU: Intel(R) Xeon(R) CPU E3-1280 v3 @ 3.60GHz
Current: 3600 MHz, Max: 3601 MHz
8 CPUs: 1 package(s) x 4 core(s) x 2 SMT threads

RAM: 16 GB

Interfaces: dual-port 10 Gbit Broadcom copper NIC

13 VLANs configured on the captive portal

pfSense version: 2.3.2

First, thank you to anyone who can shed some light on this issue.

The problem: I'm running an Intel Xeon with 4 physical cores and 4 SMT threads, and when the captive portal is enabled the GUI becomes unresponsive (502 Bad Gateway error), the CPU temperatures go up almost 20 °C, and ping times to Google go from 9 ms to over 200 ms.

The server is connected to a 1 Gb fiber circuit. With the CP on, the load is around 500 Mbps; with it turned off, the total traffic load is around 700 Mbps.

When I look at the system activity, this is what I see with the portal turned on:

    PID USERNAME    PRI NICE  SIZE    RES STATE  C  TIME    WCPU COMMAND
      11 root              155 ki31    0K  128K RUN    4 580:42  99.76% [idle{idle: cpu4}]
      11 root              155 ki31    0K  128K CPU5    5 581:40  99.66% [idle{idle: cpu5}]
      11 root              155 ki31    0K  128K CPU6    6 581:44  94.97% [idle{idle: cpu6}]
      11 root              155 ki31    0K  128K CPU7    7 582:13  92.29% [idle{idle: cpu7}]
      12 root                -92    -    0K  608K RUN    1 319:47  66.70% [intr{irq266: bxe0:fp0}]
      12 root                -92    -    0K  608K RUN    0 322:12  61.18% [intr{irq265: bxe0:fp0}]
      12 root                -92    -    0K  608K RUN    3 323:51  59.47% [intr{irq268: bxe0:fp0}]
      12 root                -92    -    0K  608K RUN    2 319:26  55.18% [intr{irq267: bxe0:fp0}]
      12 root                -92    -    0K  608K CPU2    2 213:22  46.58% [intr{irq272: bxe1:fp0}]
      12 root                -92    -    0K  608K CPU3    3 206:58  39.16% [intr{irq273: bxe1:fp0}]
      12 root                -92    -    0K  608K CPU1    1 204:19  39.16% [intr{irq271: bxe1:fp0}]
      12 root                -92    -    0K  608K CPU0    0 213:37  38.48% [intr{irq270: bxe1:fp0}]
    65361 root              52    0  280M 54080K piperd  5  0:05  9.77% php-fpm: pool nginx (php-fpm)
    34927 root              76    0 12276K  5392K RUN    3  0:00  7.28% /sbin/sysctl -a
    35002 root              29    0 18740K  2292K piperd  7  0:00  6.98% grep temperature
    62347 root              45    0  280M 52044K piperd  7  0:03  6.79% php-fpm: pool nginx (php-fpm)
    34745 root              47    0 17000K  2484K wait    4  0:00  6.79% sh -c /sbin/sysctl -a | grep temperatur
      11 root                155 ki31    0K  128K RUN    0  68:44  3.96% [idle{idle: cpu0}]

From this I can see that four of the cores aren't really doing much, so I'm not sure why it would be sluggish or why the GUI would crash.

Any help would be much appreciated.

Thank you.

michaelcox1 · Sep 19, 2016, 11:23 PM

      last pid: 37334;  load averages:  0.45,  1.85,  4.22  up 1+03:39:06    23:22:33
      209 processes: 9 running, 162 sleeping, 38 waiting

      Mem: 75M Active, 790M Inact, 639M Wired, 3257M Buf, 14G Free
      Swap: 32G Total, 32G Free

      PID USERNAME    PRI NICE  SIZE    RES STATE  C  TIME    WCPU COMMAND
        11 root        155 ki31    0K  128K CPU7    7  26.6H 100.00% [idle{idle: cpu7}]
        11 root        155 ki31    0K  128K CPU6    6  26.5H 100.00% [idle{idle: cpu6}]
        11 root        155 ki31    0K  128K CPU5    5  26.5H 100.00% [idle{idle: cpu5}]
        11 root        155 ki31    0K  128K RUN    4  26.5H  99.37% [idle{idle: cpu4}]
        11 root        155 ki31    0K  128K CPU0    0 647:14  98.29% [idle{idle: cpu0}]
        11 root        155 ki31    0K  128K CPU1    1 675:11  97.36% [idle{idle: cpu1}]
        11 root        155 ki31    0K  128K CPU3    3 680:18  95.90% [idle{idle: cpu3}]
        11 root        155 ki31    0K  128K CPU2    2 661:39  95.75% [idle{idle: cpu2}]
      30617 root        22    0 25132K 12060K select  7  37:34  4.88% /usr/local/sbin/miniupnpd -f /var/etc/m
        12 root        -92    -    0K  608K WAIT    1 599:09  2.49% [intr{irq266: bxe0:fp0}]
        12 root        -92    -    0K  608K WAIT    0 598:25  2.20% [intr{irq265: bxe0:fp0}]
        12 root        -92    -    0K  608K WAIT    2 603:58  1.95% [intr{irq267: bxe0:fp0}]
        12 root        -92    -    0K  608K WAIT    3 591:06  1.56% [intr{irq268: bxe0:fp0}]
        12 root        -92    -    0K  608K WAIT    0 391:23  1.27% [intr{irq270: bxe1:fp0}]
        12 root        -92    -    0K  608K WAIT    2 386:54  1.17% [intr{irq272: bxe1:fp0}]
        12 root        -92    -    0K  608K WAIT    3 381:04  1.17% [intr{irq273: bxe1:fp0}]
        12 root        -92    -    0K  608K WAIT    1 379:19  1.07% [intr{irq271: bxe1:fp0}]
      35499 root        21    0  280M 48420K piperd  4  0:00  0.49% php-fpm: pool nginx (php-fpm)

      This is with the portal off.

michaelcox1 · Sep 20, 2016, 12:40 AM

After some experimenting with the portal settings, it seems to do much better with the individual bandwidth-per-user limit turned off. Has anyone else run into this?

michaelcox1 · Sep 20, 2016, 2:09 PM

Never mind, same issue with the portal enabled.

I have now changed the maximum concurrent connections to 3 and checked "Disable concurrent user logins".

Will update if anything changes.

heper · Sep 20, 2016, 3:41 PM

Looks like an interrupt storm to me; those are not always easy to fix.

Try the NIC tuning wiki: https://doc.pfsense.org/index.php/Tuning_and_Troubleshooting_Network_Cards
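
One quick way to check for an interrupt storm from a pfSense shell is to look at the per-IRQ interrupt rates; a storming bxe queue will show an abnormally high rate. A minimal check, using standard FreeBSD tools:

    # Show total and per-second interrupt counts for every IRQ since boot
    vmstat -i

    # Live view of the same counters, refreshed every second
    systat -vmstat 1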

michaelcox1 · Sep 20, 2016, 4:09 PM

Thank you very much for replying.

I'm running Broadcom 10 Gb NICs, so I implemented this fix:

              Broadcom bce(4) Cards
              Several users have noted issues with certain Broadcom network cards, especially those built into Dell hardware. If the bce cards in the firewall are behaving erratically, dropping packets, or causing system crashes, then the following tweaks may help, especially on amd64.

              In /boot/loader.conf.local - Add the following (or create the file if it does not exist):

              kern.ipc.nmbclusters="131072"
              hw.bce.tso_enable=0
              hw.pci.enable_msix=0

I edited the bce entries to bxe, which is what my cards are labeled as.

I will monitor it and see if we get any improvements. Thank you again!
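
For reference, a sketch of what the adjusted /boot/loader.conf.local looks like after that substitution; hw.bxe.tso_enable is assumed here to mirror the bce(4) tunable naming, since the wiki snippet above is written for bce(4):

    kern.ipc.nmbclusters="131072"
    hw.bxe.tso_enable=0   # assumed bxe(4) counterpart of hw.bce.tso_enable
    hw.pci.enable_msix=0  # disables MSI-X system-wide, not just for this driver

These are boot-time loader tunables, so they do not take effect until the next reboot.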

michaelcox1 · Sep 21, 2016, 1:22 PM

Monitored overnight with very little change. The only issue was that I never rebooted the servers after making the change, so I have done that now and will monitor again.

michaelcox1 · Sep 21, 2016, 1:29 PM

After rebooting I got the 502 error and had to use console menu option 16 (Restart PHP-FPM) to get the GUI back.

michaelcox1 · Sep 22, 2016, 2:56 PM

Last night went very well after the reboot: I was getting a constant ping to Google of 9-10 ms, CPU utilization was between 15-20%, and core temperatures were between 69-71 °C.

So far, what I have done for this issue:

• Applied the loader.conf.local fix that heper suggested
• Set the portal's maximum concurrent connections to 3
• Checked "Disable concurrent user logins" ("If enabled, only the most recent login per username will be active. Subsequent logins will cause machines previously logged in with the same username to be disconnected.")

But I think that last one forced my users to keep re-registering, so I turned it back off.

I will be making the changes to another server that had the same issue; again I will report back, and if it solves the problem I will mark the post SOLVED!

michaelcox1 · Sep 23, 2016, 1:26 AM

Still having issues. It seems that anything over 400 Mbps makes it struggle, so I'm going to install a second server to drop the VLANs down to 6 per server.

I also keep getting these errors: nginx: [alert] 75850#100188: send() failed (40: Message too long). (Errno 40 on FreeBSD is EMSGSIZE.)

Harvy66 · Sep 23, 2016, 12:01 PM

I guess my question would be: what is the captive portal changing that makes the interrupts go crazy? Do you have it set to do anything special per user?

michaelcox1 · Sep 23, 2016, 9:08 PM

Users authenticate to a RADIUS server, and in the portal I have bandwidth per user configured; other than that, nothing special.

Harvy66 · Sep 24, 2016, 7:38 PM

                            Try disabling bandwidth shaping just to see if it makes a difference.

michaelcox1 · Sep 25, 2016, 1:21 AM

I did that: I disabled just the bandwidth-per-user option and it was still the same.

So far I have split the site into two. One server's temperatures are running at 60 °C; the second server is now at 80 °C, so I'm really wondering if I have a device in one of the VLANs that is just hammering the portal.

Here is the CPU activity on both servers. By the way, both servers are identical; the only difference is the VLAN numbers. The hardware etc. is exactly the same, and the config was a clone.

This server is running well and temps are at 60 °C:

                              PID USERNAME    PRI NICE  SIZE    RES STATE  C  TIME    WCPU COMMAND
                                11 root        155 ki31    0K  128K RUN    2  75.6H 100.00% [idle{idle: cpu2}]
                                11 root        155 ki31    0K  128K CPU3    3  75.6H 100.00% [idle{idle: cpu3}]
                                11 root        155 ki31    0K  128K CPU5    5  75.5H 100.00% [idle{idle: cpu5}]
                                11 root        155 ki31    0K  128K CPU1    1  81.2H  99.56% [idle{idle: cpu1}]
                                11 root        155 ki31    0K  128K CPU4    4  75.5H  98.78% [idle{idle: cpu4}]
                                11 root        155 ki31    0K  128K CPU7    7  76.7H  97.27% [idle{idle: cpu7}]
                                11 root        155 ki31    0K  128K CPU6    6  75.5H  97.27% [idle{idle: cpu6}]
                                11 root        155 ki31    0K  128K CPU0    0  41.0H  53.27% [idle{idle: cpu0}]
                                12 root        -92    -    0K  480K WAIT    0  36.9H  35.25% [intr{irq264: bxe0}]
                                12 root        -92    -    0K  480K WAIT    0  24.5H  22.75% [intr{irq265: bxe1}]
                              77899 root        28    0  276M 48848K piperd  2  0:00  1.07% php-fpm: pool nginx (php-fpm)
                              35775 root        20    0 29228K 15472K select  6  82:21  0.39% /usr/local/sbin/miniupnpd -f /var/etc/m
                                  0 root        -92    -    0K  272K -      5 148:13  0.00% [kernel{dummynet}]
                                12 root        -88    -    0K  480K WAIT    7  8:38  0.00% [intr{irq267: ahci0}]
                              6783 root        20    0 14508K  2316K select  4  3:19  0.00% /usr/sbin/syslogd -s -c -c -l /var/dhcp
                                12 root        -60    -    0K  480K WAIT    3  2:28  0.00% [intr{swi4: clock}]
                              70215 unbound      20    0  239M  152M kqread  7  2:01  0.00% /usr/local/sbin/unbound -c /var/unbound
                              70215 unbound      20    0  239M  152M kqread  5  2:01  0.00% /usr/local/sbin/unbound -c /var/unbound

This server is running at 80 °C:

                              PID USERNAME    PRI NICE  SIZE    RES STATE  C  TIME    WCPU COMMAND
                                11 root        155 ki31    0K  128K CPU7    7  31.8H 100.00% [idle{idle: cpu7}]
                                11 root        155 ki31    0K  128K CPU6    6  31.7H 100.00% [idle{idle: cpu6}]
                                11 root        155 ki31    0K  128K CPU5    5  31.6H 100.00% [idle{idle: cpu5}]
                                11 root        155 ki31    0K  128K RUN    4  31.6H 100.00% [idle{idle: cpu4}]
                                11 root        155 ki31    0K  128K CPU1    1  24.1H  71.29% [idle{idle: cpu1}]
                                11 root        155 ki31    0K  128K CPU0    0  23.8H  70.26% [idle{idle: cpu0}]
                                11 root        155 ki31    0K  128K CPU2    2  23.6H  70.07% [idle{idle: cpu2}]
                                11 root        155 ki31    0K  128K CPU3    3  23.1H  65.87% [idle{idle: cpu3}]
                                12 root        -92    -    0K  608K WAIT    0 297:07  23.88% [intr{irq265: bxe0:fp0}]
                                12 root        -92    -    0K  608K WAIT    3 361:53  20.65% [intr{irq268: bxe0:fp0}]
                                12 root        -92    -    0K  608K WAIT    1 318:48  17.58% [intr{irq266: bxe0:fp0}]
                                12 root        -92    -    0K  608K WAIT    2 334:15  16.36% [intr{irq267: bxe0:fp0}]
                                12 root        -92    -    0K  608K WAIT    2 195:49  15.58% [intr{irq272: bxe1:fp0}]
                                12 root        -92    -    0K  608K WAIT    3 197:09  15.09% [intr{irq273: bxe1:fp0}]
                                12 root        -92    -    0K  608K WAIT    1 182:28  11.67% [intr{irq271: bxe1:fp0}]
                                12 root        -92    -    0K  608K WAIT    0 197:09  7.86% [intr{irq270: bxe1:fp0}]
                              80217 root        52    0  276M 47392K piperd  5  0:01  3.66% php-fpm: pool nginx (php-fpm)
                                  0 root        -92    -    0K  368K -      0  42:01  0.00% [kernel{dummynet}]

I'm not sure what the two unbound entries at the bottom of the first set of results were.

So far, ping times are way better, so splitting the VLANs across two servers definitely helped. I'm just curious as to why one server is at 60 °C and the other is at 80 °C.

Also, the server now at 60 °C is the original server I posted about, so that's the first set of results above.

Harvy66 · Sep 25, 2016, 4:38 AM

                                "Unbound" is a play on "Bind", another DNS server.

                                I guess I'm with you wondering if something is hammering the server when the portal is enabled. Try a packet dump.
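
A minimal capture sketch from a pfSense shell, assuming the suspect traffic arrives on bxe0 (substitute whichever interface or VLAN is struggling):

    # Capture 1000 packets without name resolution and save them
    # to a file that can be opened in Wireshark for analysis
    tcpdump -ni bxe0 -c 1000 -w /tmp/portal.pcap

The same capture can also be run from the GUI under Diagnostics > Packet Capture.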
