Looking for a way to deliberately drop UDP fragments (for a LAB exercise)

swinster

Hey All,

I run several classes that create specific network-related issues so that the students can diagnose them. One problem we come across in the real world is that large UDP datagrams are fragmented, but a fragment is then dropped en route to the final destination, making the datagram useless because only part of it turns up. This particularly affects IPsec setup, where the IKE AUTH datagrams are >1500 bytes.

What I would like to do is replicate this in pfSense. I have looked at a few things, but could not manage to achieve this at the moment, although I'm guessing it would be possible.

Any ideas welcome.

JKnott

The only thing that comes to mind is you can check the fragment offset field (I'm assuming you're working with IPv4). So, you'd need some way to examine that field and make a decision based on it.
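To make "examine that field" concrete, here is a minimal Python sketch of the check itself. It is not pfSense configuration, just an illustration of where the bits live in an IPv4 header per RFC 791. The names `fragment_info` and `build_header` are hypothetical, and the sample header values are made up for the demonstration.

```python
import struct

def fragment_info(ipv4_header: bytes) -> dict:
    """Extract the fragmentation fields from a raw IPv4 header (RFC 791)."""
    if len(ipv4_header) < 20:
        raise ValueError("an IPv4 header is at least 20 bytes")
    # Bytes 6-7 hold 3 flag bits followed by the 13-bit fragment offset.
    (flags_and_offset,) = struct.unpack("!H", ipv4_header[6:8])
    dont_fragment = bool(flags_and_offset & 0x4000)
    more_fragments = bool(flags_and_offset & 0x2000)
    offset_bytes = (flags_and_offset & 0x1FFF) * 8  # field counts 8-byte units
    return {
        "df": dont_fragment,
        "mf": more_fragments,
        "offset": offset_bytes,
        # A packet belongs to a fragmented datagram if MF is set or offset > 0.
        "is_fragment": more_fragments or offset_bytes > 0,
    }

def build_header(flags_and_offset: int) -> bytes:
    """Hypothetical helper: a dummy 20-byte IPv4 header (protocol 17 = UDP)."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 1500, 0x1234, flags_and_offset, 64, 17, 0,
        bytes([192, 168, 199, 61]), bytes([192, 168, 199, 254]),
    )

first = fragment_info(build_header(0x2000))  # MF set, offset 0
last = fragment_info(build_header(185))      # MF clear, offset 185 * 8
whole = fragment_info(build_header(0x4000))  # DF set, not a fragment
print(first, last, whole, sep="\n")
```

Whatever does the dropping would need to make its decision on `is_fragment` (or on `offset > 0` alone, if the first fragment should still get through).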

swinster

Interesting (and yes this is IPv4).

In Wireshark, I would use the following filter to highlight both the packets coming in over the relevant UDP port (in this case 500), and the associated fragment.

udp.port==500 or (ip.flags.mf ==1 or ip.frag_offset gt 0)

My advanced pfSense-fu is not great, so I would not quite know where to start here though.

johnpoz

Are you just looking to drop random packets in your data stream? Or the specific say last fragment in the fragmented stream?

Pretty sure dummynet can be used to cause loss, and dummynet can be used in pfSense.

swinster

Hey @johnpoz

I'm looking to drop all UDP fragments, which will create the effect of an IPsec tunnel not being able to be established. Yet if you take a PCAP and filter in Wireshark for just IKE traffic over UDP 500, you will still see what looks to be full two-way traffic. As mentioned, we come across this in numerous real-world deployments, perhaps using devices like Cisco ASAs or Check Point firewalls etc.

I have attached a filtered PCAP taken from my lab showing normal IKE traffic. Note packets 3 and 5 are the fragments to 4 and 6 respectively. So, in this example, packets 3 and 5 would be dropped.

![IKE Packets.png](/public/imported_attachments/1/IKE Packets.png)

JKnott

Your situation raises 2 questions:

Why are the fragments being lost? Each fragment should be able to be passed along the entire path.
Why is IPsec sending packets that large? Are they being fragmented at source, as they should be? Also, why isn't path MTU discovery working? Take a look at the IPsec packets to see if the Don't Fragment flag is set. PMTUD requires it.

swinster

Hi @JKnott,

I think you misunderstand what I want to do. pfSense is fine; we have no problem with pfSense handling this traffic. But for a LAB environment where we are showing students potential real-world issues, we want to replicate this issue with pfSense so that they can see the outcome they might come across. We use pfSense in a multitude of ways like this, inducing problems that we can then diagnose in class in a "real" (fake) environment. pfSense is awesome like this :).

In pretty much all of the cases I have seen that show this issue, there is always a firewall tweak that can be made.

The IKE AUTH packet (StrongSwan) is just big. Yes it is fragmented at source, but not all infrastructure has been set to handle these fragments.

So, I'm looking to create this problem, not actually solve it :)

Does this make sense?

JKnott

I guess I misunderstood this:

One problem we come across in the real world is that large UDP datagrams that have been fragmented, but then the fragment is dropped on route to the final destination, thus making the datagram useless as only the partial packet turns up. This particularity affect IPsec setup where the IKE AUTH datagrams are >1500 bytes.

If it's a real world problem, the question remains: why aren't those fragments making it through the 'net? Also, fragmenting at source is different from fragmenting in routers along the path. The fragment offset I mention would only be in packets that have been fragmented by routers. The source will simply break up the original datagram into bite-size pieces that fit the MTU. The original datagram can be as big as 65K and split up into individual non-fragmented packets.

swinster

The "usual" reason a fragment is dropped is due to a firewall/router misconfiguration. For example https://www.cisco.com/c/en/us/td/docs/ios/12_2/ipaddr/command/reference/fipras_r/1rfip1.html#wp1071180 - implicit deny.

Hopefully, in the image above, you can see that the fragment is broken at the 1500 byte MTU boundary. I believe the version of StrongSwan we use causes fragmentation at the Network layer, not the Transport layer. However, switching to a newer version of StrongSwan (which, I believe, allows for UDP fragmentation rather than IP fragmentation, although this is not my area of expertise) is not a simple thing for our application.

Anyhow I digress.

JKnott

Well, if the packet is fragmented along the path, what do you expect to be able to do about it? You can't recreate the packet when part of it is missing. I haven't used StrongSwan, so I can't comment on it, but generally the solution to this sort of problem is to limit the tunnel packet size.

The question still remains, why are oversize packets being sent? Those appear to be close to 2400 bytes original size, which should never happen if the MTU is set to 1500. If the datagram was properly fragmented before transmission, you wouldn't be seeing those fragments.

swinster

Hi JKnott,

I don't think I'm explaining this very well :(

Fragmentation is at source, at the Network layer of the stack. So StrongSwan creates a 2400-byte (or whatever) datagram, and this is fragmented into two packets: one of 1500 bytes, and one with what is left over. This is what is placed onto the wire by the sending device.

Often (not with pfSense), other manufacturers' firewalls/routers will drop this fragment, so the packet with the UDP header turns up at the far side, but the fragment packet has disappeared. This is often a firewall/router configuration issue.

What I'm looking to replicate in pfSense is to also drop these fragments.
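As a rough illustration of the failure mode described above, here is a small Python simulation. It is only a sketch: it assumes a 2400-byte IP payload (UDP header plus IKE data), a 20-byte IP header with no options, and a hypothetical `drop_fragments` function standing in for the misbehaving firewall. None of this is pfSense configuration.

```python
IP_HEADER = 20  # bytes, assuming no IP options

def fragment(payload: bytes, mtu: int) -> list:
    """Split an IP payload (UDP header + data) into fragments that fit the MTU."""
    step = (mtu - IP_HEADER) // 8 * 8  # fragment data must be a multiple of 8 bytes
    frags = []
    for offset in range(0, len(payload), step):
        frags.append({
            "offset": offset,
            "mf": offset + step < len(payload),  # More Fragments flag
            "data": payload[offset:offset + step],
        })
    return frags

def drop_fragments(frags: list) -> list:
    """Hypothetical firewall: only the piece carrying the UDP header survives."""
    return [f for f in frags if f["offset"] == 0]

def reassemble(frags: list):
    """Return the full payload, or None if any part of it is missing."""
    ordered = sorted(frags, key=lambda f: f["offset"])
    if not ordered or ordered[-1]["mf"]:
        return None  # the final fragment (MF = 0) never arrived
    payload = b""
    for f in ordered:
        if f["offset"] != len(payload):
            return None  # a hole in the middle
        payload += f["data"]
    return payload

ike_auth = bytes(2400)              # stand-in for the large IKE AUTH datagram
on_wire = fragment(ike_auth, 1500)  # what the sender puts on the wire
print([len(f["data"]) for f in on_wire])             # [1480, 920]
print(reassemble(on_wire) == ike_auth)               # True
print(reassemble(drop_fragments(on_wire)) is None)   # True
```

The receiver still sees a packet on UDP 500, which is why a capture filtered on the port alone can look like healthy two-way traffic, yet the datagram can never be handed up to IKE.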

Does this make any sense?

JKnott

Something is clearly wrong here. If it was fragmented at source, you shouldn't be seeing fragments at all, just a series of un-fragmented packets containing the fragments of the original datagram. If you're seeing fragmented packets, as shown in that capture, they're being fragmented elsewhere. Was that capture actually at the source? Or the destination?

swinster

This is correct. This was the source.

As you say, in order for a large datagram to be placed on the wire, we must abide by the MTU that has been set. Assuming this is 1500 bytes (which in this case it is), then (I believe) one of two things could occur. Either we fragment at the Transport layer (update: by the application), meaning that two separate UDP packets are created, or (as in the case here) the job is offloaded to the Network layer, and it becomes the job of IP to fragment, hence what you are seeing. This is still quite normal and perfectly fine. Would fragmentation at the Transport layer be better? Yes, but this is not the point of this exercise. Ultimately, this is normal behaviour.

The task of the exercise was to see if pfSense could be forced to drop these fragments. In its vanilla OOTB format, pfSense WILL handle these packets WITH NO ISSUE. There is NO problem with pfSense whatsoever. The issue is seen with other manufacturers' devices, and in all cases I have come across, there is an extra piece of config that can be added to the router/firewall to account for this.

The question remains, can pfSense be configured to drop these fragments that it receives to "simulate" these other real world scenarios.

swinster

In fact, you could say that some intermediate device has to re-fragment further. For example, if there is some upstream hop that requires a lower MTU (such as an MPLS tunnel), the datagram could be re-fragmented again. Is this bad form? Yes, and you would likely then lower the MTU at source to avoid this, but it is still perfectly normal. Fragments could be seen at a pfSense box, and this is what I want to force a drop of.

Remember, this is for a training exercise.

mikeisfly

I think everyone is missing the point. He wants to know if there is a way to drop frames under a certain byte size; the problem is that you want to drop only UDP packets. This sounds like something you could do with your switch, but the switch wouldn't be layer 4 aware. There may be a package that might help you out, but I couldn't tell you which one. You may want to take a look.

JKnott

No, it's not packets under a certain size, it's fragments. Like other packets, fragments can be any legal size, but they are only part of the original packet and have the fragment offset etc. Fragmented packets are caused by routers connected to links with a smaller MTU than the original network. I also don't understand how a source would fragment a packet when it's supposed to fragment the datagram to fit the MTU. What would happen with this on IPv6, where packet fragmentation isn't allowed?

Harvy66

@JKnott:

Something is clearly wrong here. If it was fragmented at source, you shouldn't be seeing fragments at all, just a series of un-fragmented packets containing the fragments of the original datagram. If you're seeing fragmented packets, as shown in that capture, they're being fragmented elsewhere. Was that capture actually at the source? Or the destination?

just a series of un-fragmented packets containing the fragments of the original datagram

This just made my head explode. A series of unfragmented packets containing fragments. Yes, this is how Ethernet and IPv4 work for all fragmented IPv4 packets that cannot fit in an Ethernet frame or MTU.

If I send a 64KiB ICMP packet, I expect it to be fragmented as it leaves the source because no switch I know supports 64KiB jumbo frames.

JKnott

@Harvy66:

If I send a 64KiB ICMP packet, I expect it to be fragmented as it leaves the source because no switch I know supports 64KiB jumbo frames.

The original datagram, which can be as large as 65K can be fragmented to fit the MTU. Packets at this point are not fragmented. If they have to travel over a link with a smaller MTU, then a router will fragment the packets. At this point there will be a fragment offset etc., to show the packet has been fragmented somewhere along the path, but not at the source.

Fragmented datagrams, entirely normal.
Fragmented packets, occur only when a packet passes through a link with a smaller MTU.

However, fragmenting packets does not happen with IPv6, but fragmenting datagrams still does. Path MTU discovery is mandatory with IPv6, to ensure packet fragmentation is not needed. There is also a trend to this with IPv4 & PMTUD.

swinster

Hi JKnott,

Well, for what appeared on the surface to be a yes/no answer (i.e. can I get pfSense to drop fragments), we have taken a road that I had not anticipated. I must apologise for any confusion here.

I believe what you are looking for is RFC 791 https://tools.ietf.org/html/rfc791, which allows for fragmentation of datagrams by the Internet Protocol (IP). As I understand this, it is a fundamental building block of the entire protocol, and the Internet as we know it today. This has been in existence for many, many years and will continue to be in existence for many, many years to come. It is a current RFC and has not been obsoleted.

As we have said previously, it would be great if the application was more aware of what it was producing and (importantly) what the underlying network is that it sits upon. As such, “the application” can then control how it sends such a large data chunk, and therefore allow multiple individual UDP packets to be sent, which would be reassembled by the corresponding application on the far side. However, this is not always the case (as we see here), and IP absolutely allows for fragmentation. It is not just the preserve of a gateway either, but any device that implements an IP stack.

Remember too that UDP has little concept of the MTU size of Ethernet either. You can quite happily create a UDP datagram of 65k (as has been pointed out above), but it is not the responsibility of UDP to then chop up and fragment this datagram, that lies with IP.

Looking back at the image of the PCAP, are you happy that no third-party gateway is involved here? Both devices exist in the same subnet, there is no gateway involved, correct? I have attached the actual PCAP for you to inspect further. There is no trickery here. It really is what it is.

These are two virtual machines, and the application (in this case StrongSwan, which is leveraged by our own application) is unaware of the MTU of the underlying Ethernet. It produces what it does with little regard to the underlying network. This is then encapsulated in a large UDP datagram, and passed down to IP. The problem is, IP IS aware of the MTU size and cannot send this large datagram as a whole, it MUST fragment it, unless the datagram has the Do Not Fragment bit set.
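The rule in that last sentence boils down to a three-way decision, sketched here in Python. The function name `ip_output_action` and the return strings are hypothetical, and the 2428-byte figure is just an assumed size for the oversize datagram plus headers. On a real router the "drop" case would normally also trigger an ICMP "fragmentation needed" message, which is what PMTUD relies on.

```python
def ip_output_action(packet_len: int, mtu: int, dont_fragment: bool) -> str:
    """Decide what IP does with a packet about to leave an interface."""
    if packet_len <= mtu:
        return "send"      # fits as-is, nothing to do
    if dont_fragment:
        return "drop"      # DF set: IP is not allowed to fragment
    return "fragment"      # too big, DF clear: split to fit the MTU

print(ip_output_action(600, 1500, False))    # small IKE_INIT-sized packet
print(ip_output_action(2428, 1500, False))   # oversize, DF clear
print(ip_output_action(2428, 1500, True))    # oversize, DF set
```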

Now, if this datagram were to pass through a layer three router or gateway (such as pfSense), and that gateway had a lower MTU set on the interface it would forward the datagram out of (say 1400 bytes), then it would need to be re-fragmented to cope with the new MTU, but everything should still be able to be reassembled by IP at the destination.

I will set this up shortly and take some more PCAPs for you, but I’m having some issues with my vCentre at the moment.

[IKE Key Exchange.pcap](/public/imported_attachments/1/IKE Key Exchange.pcap)

swinster

FWIW, here are a couple more PCAPs. Now we have:


192.168.199.61 (MTU 1500) <-----> (MTU 1500) 192.168.199.254|10.168.199.254 (MTU 1400) <-----> (MTU 1500) 10.168.199.61

      StrongSwan <---------------------------------------> pfSense <-----------------------------------------> StrongSwan

I have captured on 192.168.199.61 and 10.168.199.61.

We see the same 4 datagrams on each side. First, we have:


IKE_INIT Initiator Request ------------>
<------- IKE_INIT Responder Response

Again, these are fine, both small datagrams that fit the MTU.

Then we have:


IKE_AUTH Initiator Request ------------>
<------- IKE_AUTH Responder Response

Both of these are large datagrams, approximately 2400 and 2300 bytes, but let’s look specifically at the first PCAP and the IKE_AUTH Initiator Request, as sent by 192.168.199.61. This will not fit into a 1500 byte Ethernet packet, so IP (at the source) fragments the datagram into two packets. The first is the maximum MTU size (1500 bytes - as set on the interface of the source device), and the second is what is left over (approx. 800 bytes).

However, if we now look at the second PCAP and the same IKE_AUTH Initiator Request as RECEIVED by 10.168.199.61, we note that it has been re-fragmented by pfSense. The entire datagram was received by the pfSense interface at 192.168.199.254, but it now needs to be forwarded out on an interface (10.168.199.254) whose MTU is only 1400 bytes. The entire datagram now needs to be re-fragmented to ensure that the fragment packet fits the new MTU. The fragment is now the largest size possible (1400 bytes) and the second packet contains what is left (approx. 900 bytes). We still only see two packets in this case, but the fragment has changed size.
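The arithmetic behind those sizes can be sketched in a few lines of Python. This assumes a 2400-byte IP payload and a 20-byte IP header with no options, so the exact numbers will differ slightly from the captures; `fragment_sizes` and `refragment_each` are hypothetical names. The sketch also contrasts two ways a gateway could handle incoming fragments: reassembling the datagram first and then fragmenting for the new MTU, or re-fragmenting each incoming fragment on its own.

```python
IP_HEADER = 20  # bytes, assuming no IP options

def fragment_sizes(payload_len: int, mtu: int) -> list:
    """On-the-wire IP packet sizes for a payload fragmented to fit the MTU."""
    if payload_len + IP_HEADER <= mtu:
        return [payload_len + IP_HEADER]  # fits, no fragmentation needed
    step = (mtu - IP_HEADER) // 8 * 8     # fragment data must be a multiple of 8
    sizes = []
    remaining = payload_len
    while remaining > 0:
        chunk = min(step, remaining)
        sizes.append(chunk + IP_HEADER)
        remaining -= chunk
    return sizes

def refragment_each(packet_sizes: list, mtu: int) -> list:
    """Gateway that splits each incoming fragment independently (no reassembly)."""
    out = []
    for size in packet_sizes:
        out.extend(fragment_sizes(size - IP_HEADER, mtu))
    return out

at_source = fragment_sizes(2400, 1500)          # sender, MTU 1500
reassembled_first = fragment_sizes(2400, 1400)  # gateway reassembles, then splits
per_fragment = refragment_each(at_source, 1400) # gateway splits each piece

print(at_source)          # [1500, 940]
print(reassembled_first)  # [1396, 1044]
print(per_fragment)       # [1396, 124, 940]
```

Seeing only two packets on the far side, with the first one close to 1400 bytes, fits the reassemble-then-fragment behaviour; splitting each fragment independently would have produced three. Note the first packet comes out at 1396 rather than 1400 bytes in this sketch because of the 8-byte alignment rule.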

Looking at the reverse path (IKE_AUTH Responder Response), I left the MTU alone. The two transmitting interfaces (the source at 10.168.199.61, and pfSense at 192.168.199.254) both have MTUs set to 1500 bytes. In this case, the fragment remains unchanged as it passes through pfSense.

IP fragmentation can happen at gateways (yes, this is likely if the outgoing MTU is lower than the incoming packet size), but can also be applied at source. This is what IP does.

So, now we have cleared this up, I wonder if we can go back to the question:

Can we get pfSense to drop a fragment?

192.168.199.61_IKE_1500.pcap
10.168.199.61_IKE_1500.pcap