UDP stream is truncated crossing into the LAN - pfSense 1.2.2

praevidium

I still have a pfSense v1.2.2 box under my purview with an odd problem. We're testing a new Apache-based Java web app that sends an XML string in a UDP stream through the pfSense WAN interface to a server in the LAN for processing. Using Wireshark, I see that it leaves the Apache server OK with a total of 1558 bytes, but by the time it gets to the LAN, it's cut off at the end and is missing the last few bytes of the stream. It comes in at only 1514 bytes and the checksum is incorrect. I've turned off hardware checksum offloading just in case and have set the firewall's optimization setting to 'conservative', but to no avail. I'm scratching my head on this one.

Can anyone shed some light on why this might be happening and how to fix it? I'd really appreciate it.

cmb

What the packet capture shows on LAN is what comes in on the wire. 1514 bytes (14 bytes of Ethernet header plus 1500 of payload) is the maximum frame size on standard 1500 MTU Ethernet. That makes it sound like the switch is truncating the frame. I guess it could be the firewall's NIC, but I can't say I've ever seen or heard of that happening. Unless you're using jumbo frames on the web server, it shouldn't be sending frames bigger than that.

Completely unrelated: you should upgrade to a supported release. 1.2.2 is over 4 years old and has lots of known bugs, plus a number of security issues in the now very outdated components it contains (albeit none trivially easy to exploit, and generally none exploitable remotely aside from XSS/CSRF).

praevidium

Thanks for the response. For clarification's sake, the 1558 bytes is just the data stream aside from the headers. The sending server NIC does support jumbo frames. I'm still checking on the switches. I can try putting both servers on the same LAN segment, though, and see what happens. That way, there'd only be a single switch involved. If it works, I guess I'll blame it on the pfSense box LAN-side NIC.

praevidium

I have more information now. I'm working on the hypothesis that the receiving server is discarding the packets and, hence, not responding to the sending server's request.

I turned off the hardware checksum offloading in pfSense. Via 2 packet captures (one at the pfSense LAN port from within pfSense and another at the connecting switch port) I see perfect packets coming into the LAN from pfSense, yet at the switch they have IP checksum errors. Those errors persist right up to the receiving server's NIC.

As I understand it, Wireshark is poking into the NDIS at the LLC layer, so the path in question is:

pfSense LAN port          Switch port
LLC->MAC->PHY    ---->    PHY->MAC->LLC

I also understand that the IP checksum only protects the header. So how can the IP header get changed along this path? There are no other checksum errors with the other traffic, so I'm having a hard time blaming either the pfSense box or the switch. This is very odd.
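To make the "header only" point concrete, here is a minimal Python sketch of the IPv4 header checksum (a 16-bit ones' complement sum over the header words). The function names are my own, and the sample header is a generic 20-byte UDP/IPv4 header with no options, not one taken from these captures. Changing any header field, such as the fragment offset, invalidates the checksum, while the payload never enters the calculation.

```python
import struct


def ones_complement_sum(data: bytes) -> int:
    """Sum 16-bit big-endian words, folding carries back into the low 16 bits."""
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def header_checksum(header: bytes) -> int:
    """Checksum to store in bytes 10-11; those bytes must be zero on input."""
    return ~ones_complement_sum(header) & 0xFFFF


def header_is_valid(header: bytes) -> bool:
    """A header with a correct checksum in place sums to 0xFFFF."""
    return ones_complement_sum(header) == 0xFFFF


# Illustrative header (not from the thread): checksum field zeroed.
zeroed = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
csum = header_checksum(zeroed)
good = zeroed[:10] + struct.pack("!H", csum) + zeroed[12:]

# Flip one bit in the fragment-offset field (byte 7) without fixing the checksum.
bad = bytearray(good)
bad[7] ^= 0x01

print(hex(csum), header_is_valid(good), header_is_valid(bytes(bad)))
```

So a bad IP checksum at the switch port means either a header field or the checksum field itself differs from what the sender computed.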

Does anyone have any wisdom/insight they could share? I'd certainly appreciate it.

praevidium

Well, here's the end of the story:

The actual problem turned out not to be a stream truncation at all. A different Wireshark filter showed it had to do with IP fragmentation. The UDP packet was being fragmented, and somehow the IP headers were altered and the checksums were incorrect by the time the packets hit the LAN. A packet capture at the LAN NIC didn't show any errors, but one at the corresponding switch port did, which made the problem very difficult to track down. I resolved it by upgrading first the switch firmware and then pfSense (to 2.0.2). It was after the pfSense upgrade that the packets in question finally got to the destination server application. I'm relieved.
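For anyone hitting the same symptom, the 1514-byte frame is what the first fragment of such a datagram looks like. Here is a rough Python sketch of the arithmetic, assuming the 1558 bytes are the UDP payload, IPv4 with no options, a 1500-byte MTU, and frame sizes counted the way Wireshark usually shows them (without the FCS). The function name and constants are mine, for illustration only.

```python
ETH_HEADER = 14   # destination MAC + source MAC + EtherType
IP_HEADER = 20    # IPv4 header without options (assumption)
UDP_HEADER = 8


def fragment_frame_sizes(udp_payload: int, mtu: int = 1500) -> list:
    """Return the Ethernet frame size of each IPv4 fragment, FCS excluded."""
    remaining = UDP_HEADER + udp_payload          # bytes carried inside IP
    # Every fragment except the last carries a multiple of 8 data bytes.
    max_data = (mtu - IP_HEADER) // 8 * 8
    sizes = []
    while remaining > 0:
        chunk = min(max_data, remaining)
        sizes.append(ETH_HEADER + IP_HEADER + chunk)
        remaining -= chunk
    return sizes


print(fragment_frame_sizes(1558))
```

Under those assumptions the datagram splits into a full-size 1514-byte frame plus a small second fragment, so a capture filter that only matches UDP ports can make it look as though the stream was cut off at 1514 bytes.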