Inconsistent Ping Times

rklopoto

Hello,

Forgive me if this is covered already, but I spent the last hour or so reading 13 pages of search results to no avail.

I have a pfSense box on my network (a small-to-medium-sized college) with roughly 2500 users behind it. The box has two interfaces: one to a gigabit LAN switch, the other to a gigabit interface facing a 100 Mb/s Internet connection (partial OC3). The OC3 connection peaks at about 55-60 Mb/s.

The NAT rules on this box are slightly complex. I have 12 /16 subnets NATing to 12 /32 public IP addresses. I also have a /24 within each of these subnets 1:1 NATing to another public range. For example, the 10.128.0.0/16 network might NAT to xx.xx.240.16 while 10.128.90.0/24 NATs to xx.xx.241.0/24. This configuration seems to work without issue.
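
The addressing layout described here can be sanity-checked with Python's ipaddress module. This is only a minimal sketch: it assumes the 1:1 range is the /24 at 10.128.90.0 inside 10.128.0.0/16, and because the public addresses are masked in the post, it uses the documentation prefix 203.0.113.0/24 as a hypothetical stand-in. The helper name one_to_one is made up for illustration and is not part of pfSense.

```python
import ipaddress

# Private ranges taken from the example in the post.
outer = ipaddress.ip_network("10.128.0.0/16")   # NATs to a single public /32
inner = ipaddress.ip_network("10.128.90.0/24")  # 1:1 NATs to a public /24

# Hypothetical placeholder: the real public range is masked as xx.xx.241.0/24.
public = ipaddress.ip_network("203.0.113.0/24")


def one_to_one(private_ip):
    """Map a host in the inner /24 to the same host offset in the public /24."""
    ip = ipaddress.ip_address(private_ip)
    if ip not in inner:
        raise ValueError(f"{ip} is not inside {inner}")
    offset = int(ip) - int(inner.network_address)
    return public.network_address + offset


if __name__ == "__main__":
    # The 1:1 range must sit inside the /16 it is carved out of.
    print(inner.subnet_of(outer))
    # A host keeps its last octet when translated 1:1.
    print(one_to_one("10.128.90.37"))
```

A 1:1 mapping only works when both sides have the same prefix length, which is why the private side of the example has to be a /24 to match the public /24.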

Lately, my users have been complaining about inconsistent ping times. I verified today that they are right. From the WAN interface of the machine, I can ping a host and get a decent response:

64 bytes from 66.70.150.19: icmp_seq=0 ttl=50 time=20.554 ms
64 bytes from 66.70.150.19: icmp_seq=1 ttl=50 time=17.849 ms
64 bytes from 66.70.150.19: icmp_seq=2 ttl=50 time=18.575 ms

From a machine behind the NAT, the results are not so great:

64 bytes from broomeman.com (66.70.150.19): icmp_seq=2996 ttl=48 time=371 ms
64 bytes from broomeman.com (66.70.150.19): icmp_seq=2997 ttl=48 time=610 ms
64 bytes from broomeman.com (66.70.150.19): icmp_seq=2998 ttl=48 time=254 ms
64 bytes from broomeman.com (66.70.150.19): icmp_seq=2999 ttl=48 time=256 ms

A reset of the state table drops them back into fairly normal ranges:

64 bytes from broomeman.com (66.70.150.19): icmp_seq=3151 ttl=48 time=53.5 ms
64 bytes from broomeman.com (66.70.150.19): icmp_seq=3152 ttl=48 time=23.6 ms
64 bytes from broomeman.com (66.70.150.19): icmp_seq=3153 ttl=48 time=69.9 ms
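
The three samples above can be compared numerically. The following is a rough sketch, not a diagnostic tool: it pulls the time= values out of the ping lines quoted in this post and prints the mean and maximum for each case. The function name rtts and the sample labels are made up for illustration.

```python
import re
from statistics import mean
from typing import List

# Matches the "time=20.554 ms" field printed by ping.
TIME_RE = re.compile(r"time=([\d.]+) ms")


def rtts(ping_output: str) -> List[float]:
    """Return every round-trip time (in ms) found in ping output."""
    return [float(value) for value in TIME_RE.findall(ping_output)]


# Ping lines copied from the post.
samples = {
    "WAN interface": """
64 bytes from 66.70.150.19: icmp_seq=0 ttl=50 time=20.554 ms
64 bytes from 66.70.150.19: icmp_seq=1 ttl=50 time=17.849 ms
64 bytes from 66.70.150.19: icmp_seq=2 ttl=50 time=18.575 ms
""",
    "behind NAT": """
64 bytes from broomeman.com (66.70.150.19): icmp_seq=2996 ttl=48 time=371 ms
64 bytes from broomeman.com (66.70.150.19): icmp_seq=2997 ttl=48 time=610 ms
64 bytes from broomeman.com (66.70.150.19): icmp_seq=2998 ttl=48 time=254 ms
64 bytes from broomeman.com (66.70.150.19): icmp_seq=2999 ttl=48 time=256 ms
""",
    "after state reset": """
64 bytes from broomeman.com (66.70.150.19): icmp_seq=3151 ttl=48 time=53.5 ms
64 bytes from broomeman.com (66.70.150.19): icmp_seq=3152 ttl=48 time=23.6 ms
64 bytes from broomeman.com (66.70.150.19): icmp_seq=3153 ttl=48 time=69.9 ms
""",
}

if __name__ == "__main__":
    for label, text in samples.items():
        times = rtts(text)
        print(f"{label}: mean {mean(times):.1f} ms, max {max(times):.1f} ms")
```

With so few packets per sample the means are only indicative, but they make the gap between the WAN-side and NAT-side measurements easy to see.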

At first I thought maybe my state table was too small, so I increased it from 10K to 50K. I also set the table expiry to "aggressive". While this helped a little, every 30-45 seconds I will suddenly get 10-15 packets with high ping times of ~500 ms.

Are my rules too complex? Is there anything else I can do to tune this box for better performance?

ttlinna

@rklopoto:

This is a known bug in pfSense. For some reason ICMP packets go to the default queue, and it seems your default queue is saturated.

There have been many earlier threads about this here as well.

BR,

Tommi