Packetloss problem

stens

Hi there. First of all I would like to say thank you for the hard work behind this project. I appreciate what the developers and users behind this have created and I'm very impressed with pfSense!

I have installed two servers with pfSense 2.0 RC to share a dual-link internet connection (100 Mbit/100 Mbit and 30 Mbit/30 Mbit) with a local LAN that sits behind a redundant Linux router doing IP forwarding. The first server is a Dell PowerEdge with a P4 2.8GHz, 1GB RAM, a SATA hard disk and a quad-port PCI NIC. The second server is a Dell PowerEdge with a Celeron 2.4GHz, 1GB RAM, a SATA hard disk and a quad-port NIC.

The quad-port NICs are D-Link DFE-580TX. The 100/100 connection is active and the 30/30 connection is currently unused.

My problem is that I'm seeing packet loss on the LAN interface between the Linux router and the active pfSense node.

I'm using MTR and ping to see the packet loss.
If I switch pf1 to Backup and allow pf2 to become Master, the packet loss follows to the new active node, pf2.
I'm seeing the packet loss on the LAN CARP IP as well as on the real IP.
Packet loss is usually 1%. I haven't seen it higher than 2% after 24 hours.
I can verify that the active pf node is the problem by pinging from 1) the Linux router, 2) the slave pf node, 3) a client on the local LAN. The Linux router does not have any packet loss.
I have verified that all NICs and switches are at 100 Mbit full duplex, and replaced the cables on the LAN side.
The amount of WAN traffic passing through the LAN interface doesn't appear to make any difference to the packet loss. I have seen packet loss at busy times and at times with low traffic.

My question is: does pfSense/FreeBSD do any limiting of ICMP? net.inet.icmp.icmplim is set to 0. The only similarities between these two systems are their QoS rules and the PCI quad-port network cards.

Any tips or things to check over would be much appreciated. Thanks for your time.

edit: some examples of the packetloss:

CARP IP:
Hostname           %Loss   Rcv   Snt  Last  Best  Avg  Worst
1. 192.168.245.6      1%  8784  8844     0     0   11    223

PF IP:
64 bytes from 192.168.245.5: icmp_seq=6935 ttl=64 time=0.233 ms
64 bytes from 192.168.245.5: icmp_seq=6936 ttl=64 time=83.6 ms
64 bytes from 192.168.245.5: icmp_seq=6937 ttl=64 time=0.243 ms
64 bytes from 192.168.245.5: icmp_seq=6938 ttl=64 time=0.245 ms
64 bytes from 192.168.245.5: icmp_seq=6939 ttl=64 time=0.252 ms
64 bytes from 192.168.245.5: icmp_seq=6940 ttl=64 time=0.254 ms
64 bytes from 192.168.245.5: icmp_seq=6941 ttl=64 time=0.257 ms
64 bytes from 192.168.245.5: icmp_seq=6942 ttl=64 time=91.3 ms
64 bytes from 192.168.245.5: icmp_seq=6943 ttl=64 time=0.379 ms
64 bytes from 192.168.245.5: icmp_seq=6944 ttl=64 time=0.258 ms
64 bytes from 192.168.245.5: icmp_seq=6945 ttl=64 time=7.13 ms
64 bytes from 192.168.245.5: icmp_seq=6946 ttl=64 time=0.264 ms
64 bytes from 192.168.245.5: icmp_seq=6947 ttl=64 time=4.63 ms
64 bytes from 192.168.245.5: icmp_seq=6948 ttl=64 time=97.9 ms
64 bytes from 192.168.245.5: icmp_seq=6949 ttl=64 time=0.275 ms
64 bytes from 192.168.245.5: icmp_seq=6950 ttl=64 time=14.5 ms
64 bytes from 192.168.245.5: icmp_seq=6951 ttl=64 time=13.7 ms
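
For anyone wanting to quantify output like the above, here is a minimal Python sketch. The function names and the 50 ms spike threshold are illustrative assumptions, not anything provided by pfSense, ping or MTR. It parses Linux-style ping reply lines, lists gaps in the sequence numbers, flags slow replies, and computes the exact loss from the MTR Snt/Rcv counters (60 lost out of 8844 sent is about 0.68%, which MTR displays as 1%).

```python
import re

# Matches the "icmp_seq=N ttl=N time=X ms" part of a Linux ping reply line.
PING_LINE = re.compile(r"icmp_seq=(\d+) ttl=\d+ time=([\d.]+) ms")


def parse_ping(text):
    """Return a list of (sequence number, round-trip time in ms) tuples."""
    return [(int(seq), float(rtt)) for seq, rtt in PING_LINE.findall(text)]


def loss_percent(sent, received):
    """Exact packet loss in percent from sent/received counters."""
    return 100.0 * (sent - received) / sent


def missing_sequences(replies):
    """Sequence numbers with no reply between the first and last one seen."""
    seen = {seq for seq, _ in replies}
    if not seen:
        return []
    return [s for s in range(min(seen), max(seen) + 1) if s not in seen]


def spikes(replies, threshold_ms=50.0):
    """Replies slower than threshold_ms (the threshold is an arbitrary choice)."""
    return [(seq, rtt) for seq, rtt in replies if rtt > threshold_ms]


# Three lines taken from the ping output above.
sample = """\
64 bytes from 192.168.245.5: icmp_seq=6935 ttl=64 time=0.233 ms
64 bytes from 192.168.245.5: icmp_seq=6936 ttl=64 time=83.6 ms
64 bytes from 192.168.245.5: icmp_seq=6937 ttl=64 time=0.243 ms
"""

replies = parse_ping(sample)
print("spikes:", spikes(replies))
print("missing:", missing_sequences(replies))
print("MTR loss: %.2f%%" % loss_percent(8844, 8784))
```

Feeding it a full ping log instead of the three sample lines shows whether the lost replies cluster around the latency spikes.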

photonman

Are you using squid?

I know that in my setup, once I turned off squid, my packet loss was completely eliminated.

eri--

If you are using QoS, I would suspect that as the source of this.

stens

I'm not using squid or any proxy. Have configured pfsense to do QoS and traffic limiting.

Ermal, thanks for the information. Is there a way to do the QoS so that it doesn't cause packet loss? VoIP with 1% packet loss will not be so nice. Or am I asking too much to prioritise VoIP on the same connection that web, email, etc. will be using? I'm not routing VoIP across this yet, but that was the end goal.

On the Linux routers the default route goes out via the pfSense CARP IP, but I'm using source-based routing to send VoIP out over our old connection at the moment. The plan with this pfSense setup was to do QoS and rate limiting to minimise the disruption to VoIP traffic once we switched VoIP over to the 100/100 and 30/30 links.

eri--

Well, it depends on what packet loss you are seeing.
It might be the way you have configured the queues; they may be enforcing the policy by dropping packets.

Without details on how you have configured the QoS, no conclusion can be made.

stens

Update on this. Sorry for the late reply.

So far I have replaced the quad-port NIC with a NIC that uses a different chipset: same problem.
Disabled the shaping rules: problem gone! No more 1% packet loss!

It seems the rules are a bit too strict, so it's back to configuring the shaping. Is there a way to "dump" the rules via the command line, for pasting onto the forum instead of attaching lots of pictures? I'm not sure what is configured wrong.

Thanks again.

eri--

Well, I can say that at the speeds you are after you need to increase the 'Queue Limit' on the queues created.
By default it is 50, which is a bit low for your requirements.
Put it at 300 or more, depending on the testing that you will do.

Be aware that making the queue bigger will increase the delay on queued traffic.
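
To put rough numbers on that trade-off, here is a back-of-the-envelope Python sketch. It assumes the queue limit counts packets, that every packet is a full 1500-byte frame, and that the queue drains at the full link rate; the function name is made up for illustration. A queue that only gets a share of the link will drain more slowly, so the real delay can be higher than these figures.

```python
def max_queue_delay_ms(queue_limit_packets, link_mbit, packet_bytes=1500):
    """Worst-case time for a packet to drain through a full FIFO queue.

    Assumes the limit is in packets, every packet is packet_bytes long,
    and the queue is served at the full link rate (all simplifications).
    """
    bits_queued = queue_limit_packets * packet_bytes * 8
    seconds = bits_queued / (link_mbit * 1_000_000)
    return seconds * 1000


# The two queue limits mentioned above, on the 100 Mbit and 30 Mbit links.
for limit in (50, 300):
    for link in (100, 30):
        delay = max_queue_delay_ms(limit, link)
        print("limit=%3d packets, link=%3d Mbit/s: up to %.0f ms" % (limit, link, delay))
```

Under these assumptions a limit of 50 adds at most about 6 ms on the 100 Mbit link, while 300 adds up to about 36 ms, which is worth keeping in mind for VoIP.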

dreamslacker

@stens:

I'm not using squid or any proxy. Have configured pfsense to do QoS and traffic limiting.

Did you add a rule to queue ICMP traffic into an unrestricted or high priority queue?

If not, it (ICMP) could end up in a queue that drops packets when overloaded, leading you to think that there are link latency issues when it is just the traffic shaper doing its job.

I used to get this, albeit with an actual cable WAN connection, until I added a rule to queue ICMP traffic into qACK (highest priority, non-bandwidth limited).
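
The effect described here can be illustrated with a toy Python model. This is not how the pfSense shaper is implemented; the class, the function names and the packet dictionaries are all invented for the sketch. A burst of bulk traffic fills a tail-drop queue, and a ping arriving afterwards is dropped unless a rule has sent ICMP to a separate queue.

```python
from collections import deque


class TailDropQueue:
    """FIFO queue that drops new arrivals once it holds `limit` packets."""

    def __init__(self, limit):
        self.limit = limit
        self.packets = deque()
        self.dropped = []

    def enqueue(self, packet):
        if len(self.packets) >= self.limit:
            self.dropped.append(packet)
            return False
        self.packets.append(packet)
        return True


def classify(packet, icmp_has_own_queue):
    """Pick a queue name; mimics a rule that matches ICMP (hypothetical)."""
    if icmp_has_own_queue and packet["proto"] == "icmp":
        return "priority"
    return "default"


def ping_dropped(burst_size, queue_limit, icmp_has_own_queue):
    """Send a TCP burst followed by one ping; report whether the ping is lost."""
    queues = {
        "default": TailDropQueue(queue_limit),
        "priority": TailDropQueue(queue_limit),
    }
    arrivals = [{"proto": "tcp", "id": i} for i in range(burst_size)]
    arrivals.append({"proto": "icmp", "id": burst_size})
    for packet in arrivals:
        queues[classify(packet, icmp_has_own_queue)].enqueue(packet)
    dropped = queues["default"].dropped + queues["priority"].dropped
    return any(p["proto"] == "icmp" for p in dropped)


# A 60-packet burst against a limit of 50, with and without an ICMP rule.
print("shared queue, ping dropped:", ping_dropped(60, 50, False))
print("own queue, ping dropped:", ping_dropped(60, 50, True))
```

Note that moving ICMP to its own queue only changes what ping reports; other traffic left in the overloaded queue is still being dropped.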