State table bottleneck



  • This post is essentially a continuation of another post, https://forum.pfsense.org/index.php/topic,72486.0.html.

    I continue to have issues with what seems to be the rate of state table changes.  I've updated to 2.1-RELEASE.  I'm running an 8-core box (CPU E5-2470 0 @ 2.30GHz) with 6GB of RAM and Intel NICs (igb driver).  The issue seems to relate to the number of new connections per second coming into the WAN port; we get a large volume of HTTP requests.  When the problem is occurring, top -SH tends to show something like the output below (a short sketch of related diagnostic commands follows it):
    last pid:  2937;  load averages:  7.85,  7.66,  5.69
    203 processes: 15 running, 130 sleeping, 55 waiting, 3 lock
    CPU:  0.4% user,  0.5% nice,  2.2% system, 89.9% interrupt,  7.1% idle
    Mem: 898M Active, 36M Inact, 1496M Wired, 128K Cache, 179M Buf, 3443M Free
    Swap: 16G Total, 16G Free

    PID USERNAME PRI NICE  SIZE    RES STATE  C  TIME  WCPU COMMAND
      12 root    -44    -    0K  1008K CPU1    1 392:59 79.79% intr{swi1: netisr 1}
      12 root    -44    -    0K  1008K CPU6    6 397:58 79.30% intr{swi1: netisr 6}
      12 root    -44    -    0K  1008K CPU7    7 387:46 79.05% intr{swi1: netisr 7}
      12 root    -44    -    0K  1008K CPU5    5 396:55 78.96% intr{swi1: netisr 5}
      12 root    -44    -    0K  1008K CPU2    2 392:54 78.96% intr{swi1: netisr 2}
      12 root    -44    -    0K  1008K *pf ta  0  57:58 78.76% intr{swi1: netisr 0}
      12 root    -44    -    0K  1008K *pf ta  4  57:38 78.37% intr{swi1: netisr 4}
      12 root    -44    -    0K  1008K *pf ta  3 393:33 77.88% intr{swi1: netisr 3}
    74645 root    120  20  1092M  770M CPU6    6  0:06 19.87% pfctl
      11 root    171 ki31    0K  128K RUN    5  76.9H 15.97% idle{idle: cpu5}
      11 root    171 ki31    0K  128K RUN    6  77.0H 15.38% idle{idle: cpu6}
      11 root    171 ki31    0K  128K CPU3    3  76.9H 15.38% idle{idle: cpu3}
        8 root    -16    -    0K    16K pftm    1  11:01 14.70% pfpurge
      11 root    171 ki31    0K  128K RUN    4  78.2H 14.60% idle{idle: cpu4}
      11 root    171 ki31    0K  128K CPU2    2  76.7H 14.06% idle{idle: cpu2}
      11 root    171 ki31    0K  128K RUN    0  77.7H 12.89% idle{idle: cpu0}
      11 root    171 ki31    0K  128K RUN    7  77.2H 10.06% idle{idle: cpu7}
      11 root    171 ki31    0K  128K RUN    1  76.4H  9.28% idle{idle: cpu1}
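
    For reference, a rough sketch of the stock FreeBSD commands that break this kind of thing down further (assuming the usual netisr and igb sysctls are present on 2.1; nothing here is specific to my box):

    # per-IRQ interrupt counts and rates for the igb queues
    vmstat -i | grep igb
    # netisr workstream stats: dispatch policy, queued/handled/dropped per CPU
    netstat -Q
    # netisr tunables (dispatch mode, thread count, queue limits)
    sysctl net.isr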

    I can see there that the CPUs are busy handling interrupts.  I had been running 8 igb queues with an interrupt rate of as much as 30000.  While working the problem I've tuned things down to 4 igb queues and an interrupt rate of 3000, so my current igb settings are as follows (a quick note on where these live and how to confirm them comes after the list):
    hw.igb.max_interrupt_rate="3000"
    hw.igb.num_queues="4"
    hw.igb.rxd=4096
    hw.igb.txd=4096
    hw.igb.rx_process_limit=600
    hw.igb.fc_setting=0
    hw.igb.enable_aim=0
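
    These are loader tunables, so on pfSense they go in /boot/loader.conf.local and only take effect after a reboot.  A rough way to confirm they were picked up, assuming this igb driver version exposes the hw.igb tunables via sysctl:

    # check the values the driver actually loaded with
    sysctl hw.igb
    # and watch the real per-queue interrupt rates settle near the new cap
    vmstat -i | grep igb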

    With fewer queues and interrupts per second, when the problem occurs top -SH now shows something like this:
    last pid: 90610;  load averages:  3.26,  2.13,  1.45
    174 processes: 15 running, 117 sleeping, 42 waiting
    CPU:  0.0% user,  6.2% nice,  6.6% system, 49.9% interrupt, 37.2% idle
    Mem: 1609M Active, 35M Inact, 2283M Wired, 140K Cache, 176M Buf, 1946M Free
    Swap: 16G Total, 16G Free

    PID USERNAME PRI NICE  SIZE    RES STATE  C  TIME  WCPU COMMAND
      11 root    171 ki31    0K  128K RUN    3  43:32 100.00% idle{idle: cpu3}
    88156 root    139  20  2072M  1484M CPU4    4  1:09 100.00% pfctl
      11 root    171 ki31    0K  128K CPU1    1  42:47 97.07% idle{idle: cpu1}
      11 root    171 ki31    0K  128K CPU2    2  42:54 96.29% idle{idle: cpu2}
      12 root    -44    -    0K  752K CPU6    6  15:21 60.25% intr{swi1: netisr 6}
      12 root    -44    -    0K  752K CPU5    5  15:21 59.28% intr{swi1: netisr 5}
      12 root    -44    -    0K  752K CPU7    7  15:13 57.76% intr{swi1: netisr 7}
      12 root    -44    -    0K  752K CPU0    0  5:03 54.20% intr{swi1: netisr 0}
      11 root    171 ki31    0K  128K RUN    7  30:07 40.87% idle{idle: cpu7}
      11 root    171 ki31    0K  128K RUN    5  29:48 40.58% idle{idle: cpu5}
      11 root    171 ki31    0K  128K RUN    6  29:52 40.28% idle{idle: cpu6}
      11 root    171 ki31    0K  128K RUN    0  28:56 39.26% idle{idle: cpu0}
        8 root    -16    -    0K    16K pftm    2  0:37  8.89% pfpurge
      12 root    -68    -    0K  752K WAIT    0  6:20  4.79% intr{irq256: igb0:que}
      12 root    -32    -    0K  752K RUN    0  0:04  1.56% intr{swi4: clock}
      12 root    -68    -    0K  752K WAIT    2  0:44  1.37% intr{irq268: igb2:que}
      12 root    -68    -    0K  752K WAIT    3  0:52  1.27% intr{irq259: igb0:que}
      12 root    -68    -    0K  752K WAIT    1  0:53  1.07% intr{irq257: igb0:que}
      12 root    -68    -    0K  752K WAIT    2  0:53  0.98% intr{irq258: igb0:que}
      12 root    -68    -    0K  752K WAIT    0  4:57  0.59% intr{irq266: igb2:que}
      12 root    -68    -    0K  752K WAIT    1  0:44  0.39% intr{irq267: igb2:que}
      12 root    -68    -    0K  752K WAIT    3  0:42  0.10% intr{irq269: igb2:que}
        0 root    -16    0    0K  496K sched  1    ???  0.00% kernel{swapper}
      11 root    171 ki31    0K  128K RUN    4  42:26  0.00% idle{idle: cpu4}
      258 root      64  20  6908K  1396K kqread  4  0:24  0.00% check_reload_status

    I also tried increasing the maximum number of states from 1.4M to 4M, and I added adaptive timeout settings with the start set at 3.8M and the end set at 4.5M (a sketch of how to verify those settings in pf follows the output below).  If I watch pfctl -si as traffic increases, here is what I see (some output truncated for brevity):
    pfctl -si
    Status: Enabled for 0 days 00:09:14          Debug: Urgent
    State Table                          Total            Rate
      current entries                    64457
      searches                        55269830        99765.0/s
      inserts                          735335        1327.3/s
      removals                          670878        1211.0/s

    Status: Enabled for 0 days 00:14:45          Debug: Urgent
    State Table                          Total            Rate
      current entries                    92571
      searches                      101515078      114706.3/s
      inserts                          1356592        1532.9/s
      removals                        1264021        1428.3/s

    Status: Enabled for 0 days 00:16:52          Debug: Urgent
    State Table                          Total            Rate
      current entries                  106661
      searches                      122907764      121450.4/s
      inserts                          1678675        1658.8/s
      removals                        1572014        1553.4/s

    Status: Enabled for 0 days 00:17:52          Debug: Urgent
    State Table                          Total            Rate
      current entries                  115283
      searches                      133190397      124244.8/s
      inserts                          1842674        1718.9/s
      removals                        1727391        1611.4/s

    Status: Enabled for 0 days 00:20:49          Debug: Urgent
    State Table                          Total            Rate
      current entries                    91448
      searches                      163840362      131177.2/s
      inserts                          2324826        1861.3/s
      removals                        2233378        1788.1/s

    Status: Enabled for 0 days 00:22:21          Debug: Urgent
    State Table                          Total            Rate
      current entries                    93690
      searches                      179943446      134186.0/s
      inserts                          2584864        1927.6/s
      removals                        2491174        1857.7/s

    Status: Enabled for 0 days 00:24:03          Debug: Urgent
    State Table                          Total            Rate
      current entries                  109246
      searches                      197771316      137055.7/s
      inserts                          2866301        1986.3/s
      removals                        2757055        1910.6/s

    Status: Enabled for 0 days 00:28:27          Debug: Urgent
    State Table                          Total            Rate
      current entries                  120905
      searches                      245126239      143600.6/s
      inserts                          3656115        2141.8/s
      removals                        3535210        2071.0/s

    Status: Enabled for 0 days 00:38:16          Debug: Urgent
    State Table                          Total            Rate
      current entries                  131888
      searches                      349074148      152035.8/s
      inserts                          5362049        2335.4/s
      removals                        5230161        2277.9/s

    Status: Enabled for 0 days 00:41:47          Debug: Urgent
    State Table                          Total            Rate
      current entries                  1990544
      searches                      388062532      154791.6/s
      inserts                          7984517        3184.9/s
      removals                        5993973        2390.9/s
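
    As an aside, those limit and adaptive values ultimately map to pf's state limit and adaptive timeouts; a quick sketch of confirming what pf is actually running with (the pf.conf lines in the comments just restate the values mentioned above):

    # hard limit on state entries; in pf.conf terms: set limit states 4000000
    pfctl -sm
    # adaptive thresholds; in pf.conf terms:
    #   set timeout adaptive.start 3800000
    #   set timeout adaptive.end 4500000
    pfctl -st | grep adaptive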

    Essentially, the rate of state searches increases with traffic over time, and the number of state table entries grows with it.  Somewhere beyond 140000 searches a second I hit a knee in the curve (so to speak) where pf no longer keeps up: the number of states in the table rockets to the maximum (or near it) and CPU use goes to 100%, at least on the cores handling netisr.  At that point pfSense seems to be spending all its time dealing with states.
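
    To put a number on that knee, a simple way to watch how the state counters move over time is just to poll pfctl on an interval (a rough sketch; the 30 second interval is arbitrary):

    # sample the state table counters periodically and watch the deltas
    while true; do
        date
        pfctl -si | egrep 'current entries|inserts|removals'
        sleep 30
    done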

    I'm not sure what else I can do to tune my way out of this.  Short of faster CPUs, is there something I can do to get pfSense to keep up?

    I've attached a few charts for reference.  The data rates and packet rates don't seem that extraordinary to me; it seems like the hardware should be able to handle this.

    Thanks,
    Steve



  • Netgate Administrator

    So does it ever reach 3.8M states?  If not, then what's the point of having the adaptive range start there?
    Try starting at 1M for a much more gradual increase.
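
    In pf terms that maps to the adaptive.start timeout; a rough sketch (on pfSense these are normally set through the GUI rather than by editing pf.conf directly):

    # begin scaling state timeouts down once the table passes 1M entries
    #   set timeout adaptive.start 1000000
    # confirm what pf is actually running with:
    pfctl -st | grep adaptive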

    As I said in the other thread, though, this is beyond anything I've tried to deal with.  ;)

    Steve



  • I'm not sure I ever let it get to 3.8M states before taking action - this is production traffic.  My experience is that the adaptive settings don't really help.  When pfSense tosses states for an active connection, the sender just tries again; apparently the senders have more capacity to generate new connections/states than pfSense has to keep up, so pfSense loses every time, no matter what the settings have been so far.  My only recourse is to reduce the traffic, though that does not meet our business needs.
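
    For what it's worth, a rough way to see how often pf is actually failing to add states under this kind of pressure is to watch the failure counters in pfctl -si (a sketch; I'm assuming the memory, state-insert and state-limit counters in this pf version roughly count inserts that failed for lack of room, on collision, or on a limit):

    # these counters grow when pf can't insert new states
    pfctl -si | egrep 'memory|state-limit|state-insert'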