Two bugs on 1.2.2 and 1.2.3 RC3, multiple snapshots
-
Hello,
I have been experiencing two bugs which affect my pfSense installations quite a lot.
My installations have been running 1.2 for quite a while without any problems. We needed a proper load balancer, and since there is a haproxy package for 1.2.2, I decided to upgrade.
So I went to 1.2.2, and the upgrade went very smoothly. Immediately I noticed quite high CPU usage. I run a high-traffic environment (400 Mbit/s and up). The machines have quad-core Xeon 5500 CPUs with PCIe Intel NICs, yet CPU usage has risen from 10% to 60%. The culprit seems to be the taskq processes. Enabling or disabling polling or offloading makes no difference. The racoon process also seems to be using a lot more CPU.
I'm now hitting 100% CPU with haproxy enabled. Multi-core processing is working.
Someone on IRC told me the CPU usage has been fixed in 1.2.3 RC3, so I decided to upgrade again.
After upgrading to 1.2.3 RC3 the CPU usage dropped to normal levels again, but now I am getting complaints because connections to one of our customers are not working.
Basically, when I make an HTTP request to one particular customer, I never get a response back. The data is reaching the customer, but no data is coming back.
You can try this by going to http://mobile-entry.com; the site will not work as expected from behind pfSense.
LSF on IRC reported that the following sites are also not working: www.yr.no, www.ba.no, www.nrk.no
Does anyone know how to solve either of these bugs?
-
I can connect to the sites from Ubuntu 9.10 through a vanilla pfSense with the latest snapshot.
-
How's your network looking?
I run NAT with multiple VLANs on LAN and more than 50 VIPs.
I'm not the only one reporting issues with these websites, so it's definitely not only me.
-
All 4 sites open up without problems here, and I'm currently testing a pfsense setup so I'm even behind 2 pfsense boxes (both running NAT) currently.
current network setup: internet <-> PPPoE ADSL <-> pfSense 1.2.2 <-> wireless network <-> pfSense 1.2.3-RC3 <-> wired network <-> my laptop
-
Here is my current CPU usage on 1.2.2:
last pid: 44185; load averages: 6.42, 5.82, 5.29 up 7+02:27:05 13:15:55
91 processes: 10 running, 60 sleeping, 19 waiting, 2 lock
CPU states: 3.6% user, 0.0% nice, 86.9% system, 1.7% interrupt, 7.9% idle
Mem: 463M Active, 11M Inact, 236M Wired, 876K Cache, 64M Buf, 2544M Free
Swap: 8192M Total, 8192M Free
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
29 root 1 -68 - 0K 8K - 3 31.6H 54.88% em2 taskq
27 root 1 -68 - 0K 8K - 1 35.7H 54.49% em0 taskq
40583 www 1 105 0 484M 324M RUN 3 1:15 44.48% haproxy
17561 root 1 106 0 5720K 4076K RUN 3 573:31 44.38% racoon
40584 www 1 105 0 485M 324M RUN 1 1:15 43.36% haproxy
40585 www 1 105 0 484M 324M CPU2 2 1:15 43.16% haproxy
40586 www 1 105 0 483M 323M RUN 3 1:15 42.97% haproxy
3080 root 1 62 0 8268K 6884K select 3 35.2H 25.29% bsnmpd
21448 nobody 1 54 0 3132K 1428K *Giant 1 311:47 14.45% dnsmasq
7 root 1 8 - 0K 8K - 2 476:04 11.57% thread taskq
11 root 1 171 ki31 0K 8K RUN 3 120.9H 7.18% idle: cpu3
12 root 1 171 ki31 0K 8K RUN 2 117.0H 6.88% idle: cpu2
14 root 1 171 ki31 0K 8K RUN 0 109.9H 6.30% idle: cpu0
13 root 1 171 ki31 0K 8K RUN 1 114.6H 5.96% idle: cpu1
15 root 1 -44 - 0K 8K WAIT 3 692:14 5.57% swi1: net
30 root 1 -68 - 0K 8K - 0 88:06 1.66% em3 taskq
14614 root 1 4 0 42716K 16960K accept 2 0:11 0.10% php -
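To compare per-process totals between versions, the WCPU column of a `top` listing like the one above can be summed per command. A rough sketch in Python follows. `wcpu_by_command` is a hypothetical helper, not a pfSense or FreeBSD tool, and it assumes the layout shown above, where each process line starts with a numeric PID and the command follows the WCPU percentage.

```python
from collections import defaultdict


def wcpu_by_command(top_lines):
    """Sum the WCPU percentage per command name from top process lines.

    Assumption: process lines start with a numeric PID and the command is
    everything after the first field that ends in '%'. Header and summary
    lines (load averages, CPU states, Mem, Swap) are skipped.
    """
    totals = defaultdict(float)
    for line in top_lines:
        fields = line.split()
        if not fields or not fields[0].isdigit():
            continue  # not a process line
        for i, field in enumerate(fields):
            if field.endswith("%") and i + 1 < len(fields):
                try:
                    pct = float(field[:-1])
                except ValueError:
                    break  # unexpected format, ignore this line
                command = " ".join(fields[i + 1:])
                totals[command] += pct
                break
    return dict(totals)
```

Fed the process lines above, this adds up the four haproxy entries into a single figure (about 174% WCPU), while `em0 taskq` and `em2 taskq` stay separate because their command names differ.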
I can connect to the site from a Windows 7 machine behind pfSense 1.2.3.
-
They all look fine from behind 1.2.3RC2, four vlans, four WANs, twenty VIPs, CARP.
Noticed that a few examples end in 230-231. Are you trying to policy route that out something other than WAN? See the issue here: http://forum.pfsense.org/index.php/topic,19763.0.html
It has been fixed post RC3.
-
All those load for me, how recent was the last snapshot you tried?
-
The latest snapshot I tried was from a week ago. I'm not using any policy-based routing (only one WAN).
Please note that traffic is reaching the customer, but no data is sent back.
What can I do to give more information?
-
Maybe some info on the setup of the clients that can't connect, and whether they are using the same ISP.
Basically, what do they have in common?
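
One way to collect comparable data is to run the same check from each affected client and from a machine that is not behind pfSense. Below is a minimal sketch in Python, assuming plain HTTP on port 80. The function name `probe_http` and its result labels are made up for illustration. It separates "could not connect at all" from "connected, sent the request, but nothing came back", which is the symptom described in this thread.

```python
import socket


def probe_http(host, port=80, timeout=5.0, path="/"):
    """Send one HTTP/1.0 GET and classify what happens.

    Returns one of (labels are illustrative):
      'connect_failed'  TCP connection could not be established
      'no_response'     connected and sent, but no data arrived in time
      'error'           connection broke after it was established
      'ok'              at least one byte of response was received
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "connect_failed"
    with sock:
        sock.settimeout(timeout)
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        try:
            sock.sendall(request.encode("ascii"))
            data = sock.recv(1024)
        except socket.timeout:
            return "no_response"
        except OSError:
            return "error"
    return "ok" if data else "no_response"


# Example use, run from a client behind the firewall:
# for site in ("mobile-entry.com", "www.yr.no", "www.ba.no", "www.nrk.no"):
#     print(site, probe_http(site))
```

Posting these results next to each client's OS, pfSense version and ISP should make it easier to see what the failing setups have in common.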