pfSense ignores/blocks machines after high traffic



  • Hi,

    I have just installed a clean copy of pfSense 2.0.2 (stable) and we are seeing quite an annoying problem.

    After high traffic, pfSense blocks the local server, and the webConfigurator stops answering from other machines as well.

    I am pulling my hair out here and can't figure out what's going on.

    Could anyone please point me in the right direction?

    Of course, rebooting helps, but I'd rather not reboot the box 100 times a day.

    From /var/log/system.log:
    -----------------

    Apr  4 19:50:16 pfsense check_reload_status: Syncing firewall
    Apr  4 19:50:16 pfsense check_reload_status: Reloading filter
    Apr  4 19:50:28 pfsense apinger: Error while feeding rrdtool: Broken pipe
    Apr  4 19:52:33 pfsense check_reload_status: Syncing firewall
    Apr  4 19:52:33 pfsense check_reload_status: Reloading filter
    Apr  4 19:55:01 pfsense php: : Creating rrd update script
    Apr  4 19:55:28 pfsense apinger: Error while feeding rrdtool: Broken pipe
    Apr  4 20:00:02 pfsense php: : Creating rrd update script
    Apr  4 20:00:28 pfsense apinger: Error while feeding rrdtool: Broken pipe
    Apr  4 20:05:01 pfsense php: : Creating rrd update script
    Apr  4 20:05:28 pfsense apinger: Error while feeding rrdtool: Broken pipe
    Apr  4 20:08:50 pfsense php: : Creating rrd update script
    Apr  4 20:09:28 pfsense apinger: Error while feeding rrdtool: Broken pipe
    Apr  4 20:10:01 pfsense php: : Creating rrd update script
    Apr  4 20:10:28 pfsense apinger: rrdtool respawning too fast, waiting 300s.
    Apr  4 20:15:02 pfsense php: : Creating rrd update script
    Apr  4 20:15:28 pfsense apinger: Error while feeding rrdtool: Broken pipe

    br,
    Leon
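
For anyone triaging an excerpt like the one above, a rough tally of which process logs which message can show at a glance what dominates the log. This is only a sketch: the regular expression assumes the classic syslog layout seen in the post (month, day, time, host, process, message), and `tally`, `LINE_RE`, and `SAMPLE` are illustrative names, not anything pfSense ships.

```python
import re
from collections import Counter

# Assumed layout: "Apr  4 19:50:28 host process: message"
LINE_RE = re.compile(
    r"^(\w{3})\s+(\d{1,2}) (\d{2}:\d{2}:\d{2}) (\S+) ([^:\s]+): (.*)$"
)


def tally(lines):
    """Count (process, message) pairs in an iterable of syslog lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip blank lines and anything that is not syslog-shaped
        counts[(match.group(5), match.group(6))] += 1
    return counts


# Four lines copied from the excerpt in the post above.
SAMPLE = """\
Apr  4 20:09:28 pfsense apinger: Error while feeding rrdtool: Broken pipe
Apr  4 20:10:01 pfsense php: : Creating rrd update script
Apr  4 20:10:28 pfsense apinger: rrdtool respawning too fast, waiting 300s.
Apr  4 20:15:28 pfsense apinger: Error while feeding rrdtool: Broken pipe
"""

for (process, message), n in tally(SAMPLE.splitlines()).most_common():
    print(f"{n:3d}  {process}: {message}")
```

The tally only summarizes what is in the log; it does not by itself say whether the apinger/rrdtool errors are related to the blocked machines.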



  • What kind of hardware are you running on?

    Sounds like state table exhaustion, but I can't know without more info.

    :)



  • An HP 1U server

    2 GB of memory
    Version 2.0.2-RELEASE (amd64)
    built on Fri Dec 7 22:39:16 EST 2012
    FreeBSD 8.1-RELEASE-p13
    Platform pfSense
    CPU Type Intel(R) Pentium(R) D CPU 2.80GHz

    What info can I get for you?



  • On your dashboard, look for "states" like in the picture below.  In my picture I've just rebooted. (still getting calls asking why the internet is down…  ;D)

    The number to the right of the slash is the maximum your box is set for.  The number to the left is the current count.  What are those?




  • Sorry, got confused there :)

    Last config change Thu Apr 4 20:46:09 CEST 2013
    State table size
    Show states
    MBUF Usage 2724/25600



  • The one you show is MBUF. What does the "State Table Size" above it show?

    The first number should increase as traffic increases.  You might try increasing the maximum states under System->Advanced->Firewall/NAT, "Firewall Maximum States".

    Try 1 million and see if your problem ceases.
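
To make the "left of the slash / right of the slash" reading concrete, here is a minimal sketch that turns a dashboard counter of the form current/max into a percentage. The State Table Size values were never posted in this thread, so the example uses the MBUF counter that was. `usage_percent`, `near_limit`, and the 90% threshold are illustrative assumptions, not pfSense settings.

```python
def usage_percent(counter: str) -> float:
    """Return how full a 'current/max' counter is, as a percentage."""
    current_text, max_text = counter.split("/")
    current, maximum = int(current_text), int(max_text)
    if maximum <= 0:
        raise ValueError("maximum must be positive")
    return 100.0 * current / maximum


def near_limit(counter: str, threshold_percent: float = 90.0) -> bool:
    """Return True when the counter is at or above the chosen threshold."""
    return usage_percent(counter) >= threshold_percent


# MBUF Usage as posted earlier in the thread.
print(usage_percent("2724/25600"))
print(near_limit("2724/25600"))
```

A state table that sits near its maximum during the outages would point to state exhaustion; one that stays far below it would not.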



  • Increased it and let it run overnight: same problem. It kills the connection to any server with high traffic for a number of minutes, something like 20.



  • Go to Diagnostics->Tables and check your "virusprot" table to see if the IP addresses of the affected hosts are listed.  Maybe the high-bandwidth sessions are triggering the lockout via a firewall rule.



  • Already checked that, the table is empty.

    I tried upgrading to the latest development version: same issue. I still have to reboot the machine to get the systems back on track.



  • Apr  4 19:50:16 pfsense check_reload_status: Syncing firewall
    Apr  4 19:50:16 pfsense check_reload_status: Reloading filter
    Apr  4 19:50:28 pfsense apinger: Error while feeding rrdtool: Broken pipe
    Apr  4 19:52:33 pfsense check_reload_status: Syncing firewall
    Apr  4 19:52:33 pfsense check_reload_status: Reloading filter
    Apr  4 19:55:01 pfsense php: : Creating rrd update script
    Apr  4 19:55:28 pfsense apinger: Error while feeding rrdtool: Broken pipe
    Apr  4 20:00:02 pfsense php: : Creating rrd update script
    Apr  4 20:00:28 pfsense apinger: Error while feeding rrdtool: Broken pipe
    Apr  4 20:05:01 pfsense php: : Creating rrd update script
    Apr  4 20:05:28 pfsense apinger: Error while feeding rrdtool: Broken pipe
    Apr  4 20:08:50 pfsense php: : Creating rrd update script
    Apr  4 20:09:28 pfsense apinger: Error while feeding rrdtool: Broken pipe
    Apr  4 20:10:01 pfsense php: : Creating rrd update script
    Apr  4 20:10:28 pfsense apinger: rrdtool respawning too fast, waiting 300s.
    Apr  4 20:15:02 pfsense php: : Creating rrd update script
    Apr  4 20:15:28 pfsense apinger: Error while feeding rrdtool: Broken pipe

    Just noticed you added this to your first post after my reply…

    Can you provide the topology of your setup?  It looks like your WAN port goes down.

    What kind of connection?

    Who is the provider?

    What are your provider's advertised speeds?

    Etc.



  • Hi,

    Actually, the WAN stays online, since I don't get disconnected from it.

    It's hosted at Indicate in Sweden on a 1 Gbit fiber connection. Our ClearOS gateway servers are stable, without issue. Ironically, we first tested a virtual pfSense in Hyper-V based on Alex's work, and that one would crash and restart under high load.

    Our provider has 2 x 1 Gbit fiber and we are connected directly to that.

    It's worth noting that I use nginx (also tested with squid) as a reverse proxy. However, it makes no difference whether I use plain NAT or the reverse proxies. It feels like I have tried every solution I can come up with, and the logs don't give any clear indication of why it happens. To me it seems clear that pfSense is blocking the machines, since e.g. server1 can be accessible while 2 and 3 are not. First the servers are cut off, then the webConfigurator becomes inaccessible, then the machines sooner or later come back, only to die again 10-15 minutes later.



  • Is it hitting the states limit? Those are the exact symptoms of maxing out your state table. You can see it in the RRD graphs, System tab, under states.



  • The states limit was set to 1 million earlier.



  • Issue resolved by reinstalling the server with the 32-bit version. It seems to me that the 64-bit version of pfSense has driver issues with some network cards, or something like that.

    Thank you for all of your help.

    One note: the webConfigurator still dies once in a while. A cron job that restarts it every 5 or 10 minutes works around that.

    */5 * * * * /scripts/fixwebconfig.sh >/dev/null
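
The contents of fixwebconfig.sh are not shown in the thread. As a hedged alternative to restarting unconditionally every 5 minutes, the sketch below restarts only when the web GUI stops answering. The probe URL and restart command in the usage comment are assumptions, and the sketch assumes a Python 3 interpreter is available wherever it runs, which the thread does not mention.

```python
import http.client
import ssl
import subprocess
import urllib.error
import urllib.request


def gui_answers(url, timeout=5.0):
    """Return True if the web server at url sends back any HTTP response."""
    # Certificate checks are disabled for this liveness probe only, on the
    # assumption that the GUI uses a self-signed certificate.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=context):
            return True
    except urllib.error.HTTPError:
        return True  # an error status still means the server is alive
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return False  # refused, timed out, or otherwise unreachable


def restart_if_down(url, restart_cmd, probe=gui_answers, run=subprocess.run):
    """Run restart_cmd only when probe(url) fails; return True if it ran."""
    if probe(url):
        return False
    run(restart_cmd, check=False)
    return True


# Hypothetical usage; the address and the restart command are assumptions:
# restart_if_down("https://192.168.1.1/", ["/etc/rc.restart_webgui"])
```

Restarting only on a failed probe avoids dropping healthy GUI sessions every few minutes, and each restart then marks a moment when the GUI actually stopped answering.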

