pfSense causing bad IP header checksums
-
I have been struggling with bad VoIP audio quality for a while. At first I thought it had to do with traffic shaping (http://forum.pfsense.org/index.php/topic,20589.0.html). But today I discovered bad IP header checksums on traffic passing through my pfSense.
Here is a portion of a packet capture on the LAN side of the pfSense:
Internet Protocol, Src: 64.2.142.19 (64.2.142.19), Dst: 192.168.3.30 (192.168.3.30)
    Version: 4
    Header length: 20 bytes
    Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00)
        0000 00.. = Differentiated Services Codepoint: Default (0x00)
        .... ..0. = ECN-Capable Transport (ECT): 0
        .... ...0 = ECN-CE: 0
    Total Length: 60
    Identification: 0x8589 (34185)
    Flags: 0x02 (Don't Fragment)
        0.. = Reserved bit: Not Set
        .1. = Don't fragment: Set
        ..0 = More fragments: Not Set
    Fragment offset: 0
    Time to live: 53
    Protocol: UDP (0x11)
    Header checksum: 0x0000 [incorrect, should be 0x2e4c]
        [Good: False]
        [Bad : True]
            [Expert Info (Error/Checksum): Bad checksum]
                [Message: Bad checksum]
                [Severity level: Error]
                [Group: Checksum]
    Source: 64.2.142.19 (64.2.142.19)
    Destination: 192.168.3.30 (192.168.3.30)
User Datagram Protocol, Src Port: 10474 (10474), Dst Port: 13750 (13750)
    Source port: 10474 (10474)
    Destination port: 13750 (13750)
    Length: 40
    Checksum: 0x700b [validation disabled]
        [Good Checksum: False]
        [Bad Checksum: False]
Data (32 bytes)
    0000  80 12 36 fc 00 00 19 a0 2f 00 3f 75 70 12 d5 9d   ..6...../.?up...
    0010  2a 42 c8 0e 00 d0 78 76 95 9d 96 2b c9 82 b9 5e   *B....xv...+...^
    Data: 801236FC000019A02F003F757012D59D2A42C80E00D07876...
    [Length: 32]
Notice the:
    Header checksum: 0x0000 [incorrect, should be 0x2e4c]
What is also interesting is that packets FROM my PBX (192.168.3.30) look fine in the LAN-side capture. It's just the packets that pass through the firewall TO my PBX that have bad checksums.
I also did a capture on the WAN interface:
Internet Protocol, Src: 216.70.150.48 (216.70.150.48), Dst: 64.2.142.19 (64.2.142.19)
    Version: 4
    Header length: 20 bytes
    Differentiated Services Field: 0xb8 (DSCP 0x2e: Expedited Forwarding; ECN: 0x00)
        1011 10.. = Differentiated Services Codepoint: Expedited Forwarding (0x2e)
        .... ..0. = ECN-Capable Transport (ECT): 0
        .... ...0 = ECN-CE: 0
    Total Length: 60
    Identification: 0x6b8e (27534)
    Flags: 0x02 (Don't Fragment)
        0.. = Reserved bit: Not Set
        .1. = Don't fragment: Set
        ..0 = More fragments: Not Set
    Fragment offset: 0
    Time to live: 63
    Protocol: UDP (0x11)
    Header checksum: 0x0000 [incorrect, should be 0x92de]
        [Good: False]
        [Bad : True]
            [Expert Info (Error/Checksum): Bad checksum]
                [Message: Bad checksum]
                [Severity level: Error]
                [Group: Checksum]
    Source: 216.70.150.48 (216.70.150.48)
    Destination: 64.2.142.19 (64.2.142.19)
User Datagram Protocol, Src Port: 11904 (11904), Dst Port: 16130 (16130)
    Source port: 11904 (11904)
    Destination port: 16130 (16130)
    Length: 40
    Checksum: 0xa622 [validation disabled]
        [Good Checksum: False]
        [Bad Checksum: False]
Data (32 bytes)
    0000  80 12 88 ef 00 3b 7f a0 6d 99 2c 25 28 64 a8 31   .....;..m.,%(d.1
    0010  3a 73 23 29 c5 32 68 02 68 6d d9 58 f1 e6 fd bb   :s#).2h.hm.X....
    Data: 801288EF003B7FA06D992C252864A8313A732329C5326802...
    [Length: 32]
Notice the:
    Header checksum: 0x0000 [incorrect, should be 0x92de]
Same thing here: packets heading OUT of the firewall on the WAN side to my VoIP provider have bad checksums, but packets coming FROM my VoIP provider to the WAN interface have good checksums.
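For reference, the "should be" values that Wireshark reports can be reproduced by hand. The sketch below rebuilds both 20-byte headers from the dissected fields in the two captures and applies the standard Internet checksum (RFC 1071) with the checksum field left at zero. The helper names ip_checksum and build_header are made up for illustration, and the sketch assumes a header without IP options, which matches the 20-byte header length shown.

```python
import socket
import struct


def ip_checksum(header: bytes) -> int:
    """RFC 1071 Internet checksum over an IPv4 header whose checksum field is zero."""
    total = 0
    for (word,) in struct.iter_unpack("!H", header):
        total += word
    # Fold the carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_header(tos, total_len, ident, ttl, src, dst):
    """Rebuild a 20-byte IPv4 header for UDP with DF set and a zeroed checksum."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45,       # version 4, header length 5 words (20 bytes)
        tos,        # Differentiated Services field
        total_len,
        ident,
        0x4000,     # flags: Don't Fragment, fragment offset 0
        ttl,
        17,         # protocol: UDP
        0,          # checksum field, zero as seen in the captures
        socket.inet_aton(src),
        socket.inet_aton(dst),
    )


lan = build_header(0x00, 60, 0x8589, 53, "64.2.142.19", "192.168.3.30")
wan = build_header(0xB8, 60, 0x6B8E, 63, "216.70.150.48", "64.2.142.19")
print(hex(ip_checksum(lan)))  # 0x2e4c
print(hex(ip_checksum(wan)))  # 0x92de
```

Both results match what Wireshark says the checksum should be, so the rest of each header is self-consistent and only the checksum field itself is left at zero.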
My firewall hardware is an Alix 2c3 with pfSense 1.2.2.
I've tried replacing the Ethernet cables, but could try that again I suppose. The firewall has been replaced multiple times with different equipment, all running pfSense. Although I didn't do packet captures on the other hardware, we still had bad VoIP quality.
I've attached an RRD WAN quality graph.
Does anyone know what could be causing this or how to fix it?

-
Try this:
http://doc.pfsense.org/index.php/Lost_Traffic_/_Packets_Disappear
-
That may be normal. Hardware checksum offloading means the checksum will be gone by the time the traffic gets captured. If you disable checksum offloading you'll see what's really on the wire. That will likely solve the problem if there really are frames on the wire with bad checksums.
-
@cmb:
That may be normal. Hardware checksum offloading means the checksum will be gone by the time the traffic gets captured. If you disable checksum offloading you'll see what's really on the wire. That will likely solve the problem if there really are frames on the wire with bad checksums.
I found this option and checked it, but still get the same corrupt checksums. So I don't think that solves my problem.
I tried to do a packet capture at another client's site with a similar setup, but the packet capture feature seems to be broken for me:
http://forum.pfsense.org/index.php/topic,21381.0.html
-
Can anyone else confirm or deny if they are getting bad checksums when data passes through pfSense 1.2.2 on an Alix 2c3?
I'm not positive the bad checksums are causing our audio problems, but I suspect that is the case. I need to find a solution ASAP. I would LOVE to stick with pfSense, but at this point I am willing to try anything that will work.
-
The only time I had problems with checksums on ALIX was when I had a wireless interface bridged to LAN, and in that case it resulted in no traffic passing at all, not lost packets.
It may still help. It doesn't hurt to disable checksum offloading and try it, to see if it makes things better.
-
Yes, I did disable checksum offloading, with the same corrupt results as before. So that doesn't seem to be the cause.
-
Do you get the bad checksums for TCP traffic too? If so, I would expect to see retransmissions from the other end, due to the TCP segments being discarded.
-
It's interesting that the bad checksums are zero. Is this also true of a larger sample?
With support for hardware checksum offloading, I would expect that on transmission of a routed packet the IP header checksum might be set to 0, with the device driver taking responsibility for calculating it (either offloaded to hardware or calculated in software). If the checksum is calculated by hardware, the packet capture might always show a bad checksum. If the checksum is calculated by software and the packet capture is done before the checksum is updated in the packet, then the capture will also always show a bad checksum. (And this could be driver specific!) This is just speculation because I haven't checked the sources.
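To make that speculation concrete, here is a toy model of the transmit path. It is not how the FreeBSD stack or any driver is actually written; the function names (fill_checksum, forward) and the idea of a single "tap point" are assumptions made purely for illustration. The header bytes are the ones from the LAN capture earlier in the thread.

```python
import struct


def ip_checksum(header: bytes) -> int:
    """Internet checksum over an IPv4 header whose checksum field is zero."""
    total = sum(word for (word,) in struct.iter_unpack("!H", header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def fill_checksum(header: bytes) -> bytes:
    """Return a copy of the header with bytes 10-11 set to the correct checksum."""
    zeroed = header[:10] + b"\x00\x00" + header[12:]
    return zeroed[:10] + struct.pack("!H", ip_checksum(zeroed)) + zeroed[12:]


def forward(header: bytes, hw_offload: bool):
    """Toy transmit path: returns (bytes the capture sees, bytes on the wire)."""
    # A forwarded header has been modified (TTL, NAT), so its checksum has to
    # be regenerated. Model that by zeroing the field first.
    outgoing = header[:10] + b"\x00\x00" + header[12:]
    if not hw_offload:
        outgoing = fill_checksum(outgoing)  # software checksum, before the tap
    captured = outgoing                     # hypothetical tap point for the capture
    on_wire = fill_checksum(outgoing)       # with offload, the NIC fills it in here
    return captured, on_wire


hdr = bytes.fromhex("4500003c8589400035110000" "40028e13c0a8031e")
captured, on_wire = forward(hdr, hw_offload=True)
print(captured[10:12].hex(), on_wire[10:12].hex())  # 0000 2e4c
```

In this model the capture shows 0x0000 while the frame on the wire carries 0x2e4c, which is the pattern a zero checksum in the trace would be consistent with.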
Have you checked for reported checksum errors in any of the end systems involved in this conversation?
Do you see the bad checksums in RECEIVED packets on the WAN interface? (You don't see them on received packets on the LAN interface.)
Have you checked the protocol error counters on the LAN system in the VOIP conversations? On a FreeBSD system the shell commands systat -tcp and systat -ip can be used to watch the change in the protocol counters.
-
good point, wallaby.
-
I seem to have bad checksums on all types of traffic, though not on every packet, just quite a lot of them.
Also, it's ONLY happening when the traffic goes THROUGH the firewall. If I do a capture on the LAN side, all traffic heading to the outside world is fine. If I do a WAN capture, all internet traffic into our LAN is fine. The checksums only become bad once a packet has traversed the firewall.
What is interesting is that in 1.2.2 I disabled the hardware checksum offloading and continued to have the bad checksums. But at another client's site with 1.2.3 RELEASE I disabled hardware checksum offloading, and the bad checksums mostly went away.
Before this conversation, I had NOT checked the end systems for checksum errors.
The PBX is a Linux box, and the phones are mostly Polycom 501s. I have not checked the protocol error counters.
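Since the PBX is Linux rather than FreeBSD, the rough equivalent of the systat counters lives in /proc/net/snmp. Below is a minimal sketch for reading it, assuming the usual layout of paired lines (a line of counter names followed by a line of values, both starting with the same protocol tag). As far as I know, Ip InHdrErrors is the counter that rises when a received IP header fails its checksum; Udp InCsumErrors only exists on newer kernels, so the code treats it as optional. The function names are made up for this example.

```python
from pathlib import Path


def parse_snmp(text: str) -> dict:
    """Parse /proc/net/snmp-style text into {protocol: {counter: value}}."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    counters = {}
    # Lines come in pairs: counter names, then values, with the same "Proto:" tag.
    for names, values in zip(lines[0::2], lines[1::2]):
        if names[0] != values[0]:
            continue  # unexpected layout, skip rather than guess
        counters[names[0].rstrip(":")] = dict(zip(names[1:], map(int, values[1:])))
    return counters


def header_errors(counters: dict) -> dict:
    """Pick out the counters that should rise if bad checksums really arrive."""
    wanted = {"Ip": ["InHdrErrors"], "Udp": ["InErrors", "InCsumErrors"]}
    return {
        f"{proto}.{name}": counters[proto][name]
        for proto, names in wanted.items()
        for name in names
        if name in counters.get(proto, {})
    }


if __name__ == "__main__":
    snmp = Path("/proc/net/snmp")
    if snmp.exists():
        print(header_errors(parse_snmp(snmp.read_text())))
```

Running it before and after a test call and comparing the two results should show whether the PBX is actually discarding packets for header or checksum errors.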
-
Well, this makes sense, since the firewall is regenerating the packets as it forwards them. If a received packet has a bad IP checksum, it should be discarded, so the only traffic to transit through the firewall should be traffic with a good checksum.
-
All the packets it receives have good checksums. pfSense is writing the bad checksums and sending them along. I think pfSense might be the cause of my poor VoIP quality.
I need to update the 1.2.2 box to 1.2.3 and disable the checksum offloading feature in pfSense to see if it makes a difference. I won't be able to do that until after the new year.
-
In an earlier reply I tried to suggest the possibility that packet capture of transmitted packets might not show exactly what is transmitted on the wire. Before concluding that pfSense is corrupting the checksums on packets it forwards (if pfSense was routinely doing this I'm sure there would be a lot more complaints) I think it would be good to get confirming evidence, say from the protocol error counters on the PBX or from a tcpdump trace on the PBX or a protocol analyser.
The other thing I think you should check is that you really did turn off checksum offloading. I believe it is not sufficient to just tick the box; I think the save button below the box needs to be clicked to activate the changed setting.
-
I was able to do a tcpdump on the PBX (tcpdump -s 1500 -nl not port 22 -w wireshark.cap) and look at the file with Wireshark. I did NOT see any checksum errors when doing this.
It seems that the packet capture on pfSense grabs outgoing packets before the checksum is filled in and they are written to the wire, so you're not seeing the truth. I'm just surprised that more people did not mention this was the case.
As for making sure checksum offloading was really turned off: yes, I not only checked the box, but also clicked the save button.
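For anyone who wants to repeat this check without clicking through every packet in Wireshark, here is a rough sketch that scans a capture file and sorts the IPv4 headers into "ok", "zero" (checksum field is 0x0000 and wrong) and "bad" (non-zero and wrong). It assumes a classic pcap file as written by tcpdump -w, an Ethernet link type and no VLAN tags; it does not handle pcapng. The names classify and scan_pcap are invented for this example.

```python
import struct
import sys
from collections import Counter


def classify(header: bytes) -> str:
    """Classify an IPv4 header by its checksum: "ok", "zero" or "bad"."""
    total = sum(word for (word,) in struct.iter_unpack("!H", header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    if total == 0xFFFF:
        return "ok"      # a valid header, checksum included, sums to all ones
    if header[10:12] == b"\x00\x00":
        return "zero"    # field never filled in, as in the pfSense captures
    return "bad"         # non-zero but wrong


def scan_pcap(data: bytes):
    """Yield (packet number, status) for each IPv4 packet in a classic pcap file."""
    magic = data[:4]
    if magic in (b"\xd4\xc3\xb2\xa1", b"\x4d\x3c\xb2\xa1"):
        endian = "<"
    elif magic in (b"\xa1\xb2\xc3\xd4", b"\xa1\xb2\x3c\x4d"):
        endian = ">"
    else:
        raise ValueError("not a classic pcap file")
    (linktype,) = struct.unpack_from(endian + "I", data, 20)
    if linktype != 1:
        raise ValueError("only Ethernet captures are handled in this sketch")
    offset, number = 24, 0
    while offset + 16 <= len(data):
        _, _, incl_len, _ = struct.unpack_from(endian + "IIII", data, offset)
        frame = data[offset + 16 : offset + 16 + incl_len]
        offset += 16 + incl_len
        number += 1
        # Ethernet header is 14 bytes; EtherType 0x0800 means IPv4.
        if len(frame) < 34 or frame[12:14] != b"\x08\x00":
            continue
        ihl = (frame[14] & 0x0F) * 4
        header = frame[14 : 14 + ihl]
        if ihl < 20 or len(header) < ihl:
            continue
        yield number, classify(header)


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as fh:
        print(Counter(status for _, status in scan_pcap(fh.read())))
```

Run against the file from the PBX (for example, python check.py wireshark.cap, where check.py is whatever you name the script), a result with only "ok" entries would agree with what I saw in Wireshark. A pile of "zero" entries in a capture taken on the firewall itself points at the capture location rather than at real corruption.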
-
I had similar checksum errors early on when I first started using VoIP through my pfSense box. In my case it turned out to be what appeared to be a faulty NIC. (Realtek, I know, I know... I see now...) Since replacing the NIC, my VoIP systems have been rock solid. And no more errors.
May not be related, but I figured I'd mention it...