Possible firewall bug - Confirmed with testbed

jonnytabpni

It would be appreciated if a mod could let me know whether or not I should put this in a bug report in redmine.

Thanks

jimp

Before you do anything else, upgrade to a recent snapshot. Testing a bug on a month-old snapshot doesn't tell us anything conclusive.

cmb

pf keeps ICMP state by ICMP ID and source and destination IP. If you try enough times, and your OS doesn't generate its ICMP IDs very randomly, you will get a collision that allows traffic through in the opposite direction from what you might expect, for as long as a state matching that ID and those IPs is up.
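
To make the collision idea concrete, here is a toy model in Python. It is only a sketch under stated assumptions, not pf's actual code: the class `ToyIcmpStateTable` and its methods are hypothetical names, and treating the two addresses as an unordered pair is a simplification chosen to make the effect visible. The addresses are documentation-range placeholders, not hosts from this thread.

```python
from typing import Dict, FrozenSet, Tuple

# ASSUMPTION: a state is identified only by the (unordered) pair of IPs
# plus the ICMP ID. Real pf state keys carry more detail than this.
StateKey = Tuple[FrozenSet[str], int]


class ToyIcmpStateTable:
    """Hypothetical, simplified ICMP state table used for illustration only."""

    def __init__(self) -> None:
        self._states: Dict[StateKey, str] = {}

    @staticmethod
    def _key(src: str, dst: str, icmp_id: int) -> StateKey:
        # The same key results regardless of which host is the sender.
        return (frozenset((src, dst)), icmp_id)

    def add_state(self, src: str, dst: str, icmp_id: int) -> None:
        # Record a state created by a packet that the ruleset allowed.
        self._states[self._key(src, dst, icmp_id)] = f"{src} -> {dst}"

    def matches(self, src: str, dst: str, icmp_id: int) -> bool:
        # A packet matching an existing state bypasses rule evaluation.
        return self._key(src, dst, icmp_id) in self._states


table = ToyIcmpStateTable()
# Host A pings host B with ICMP ID 7; the ruleset allows this direction.
table.add_state("192.0.2.10", "192.0.2.20", 7)

# Host B now pings host A and happens to pick the same ICMP ID.
print(table.matches("192.0.2.20", "192.0.2.10", 7))  # True: collides with the state
print(table.matches("192.0.2.20", "192.0.2.10", 8))  # False: falls through to the rules
```

In this model the reverse-direction ping only gets through while the original state exists and the ID happens to match, which is why an OS that picks predictable ICMP IDs makes the effect easier to hit.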

jonnytabpni

No problem, I'll upgrade to the latest snapshot.

As for ICMP state, I don't think it's related to ping at all. I did originally have the same theory as cmb, but I now feel it's an issue with the ARP table: once I flush the ARP cache, all services (I've tested with SSH) work for a while (maybe 5 to 10 minutes). I think that for those few minutes, pfsense thinks that traffic coming from Server Z originates on the WAN interface, and uses the WAN rules instead of correctly using the PUBLIC rules. Remember that WAN and PUBLIC are bridged together, which makes this theory possible.

Also, the O/S being used is CentOS, and I don't think I've ever experienced this problem before with Ping IDs?

What I will do tonight (after upgrading to the latest snapshot) is take ICMP out of the equation and test using just SSH. I will try to SSH into Server X from Server Z. Just to confirm, Server Z will be behind the PUBLIC interface, whose only rule will be "block all" (however, the WAN tab will allow SSH access to X).

Surely pfsense can't be classed as safe if it suddenly opened all ports, just because the arp cache was flushed?

scoop

Maybe I'm stating the obvious, but can you rule out the possibility of a network loop between PUBLIC and WAN? I.e. did you look with tcpdump to double check if traffic indeed is coming in from the interface PUBLIC?

jonnytabpni

@scoop:

Maybe I'm stating the obvious, but can you rule out the possibility of a network loop between PUBLIC and WAN? I.e. did you look with tcpdump to double check if traffic indeed is coming in from the interface PUBLIC?

I can indeed rule this out. This setup was done on a Xen box, however the WAN interface was a physical PCI NIC passed through to pfsense, while the PUBLIC NIC was a virtual NIC, so it would be impossible to have a loop.

Also, I did check tcpdump on the pfsense box to confirm that the packets were indeed entering/leaving the "PUBLIC" interface (this was when I realised that there was a real problem and decided to report it on the forums).

I also made sure that there weren't any MAC address conflicts.

cmb

@jonnytabpni:

Surely pfsense can't be classed as safe if it suddenly opened all ports, just because the arp cache was flushed?

It won't, ever, under any circumstances, do that. ARP has no impact at all on filtering. What you describe indicates you have a loop or some other path where systems can communicate without going through the firewall.

@jonnytabpni:

I can indeed rule this out. This setup was done on a Xen box, however the WAN interface was a physical PCI NIC passed through to pfsense, while the PUBLIC NIC was a virtual NIC, so it would be impossible to have a loop.

Not impossible at all. There are lots of ways to end up in a scenario where that traffic isn't actually going through the firewall, bridging the NICs among others.

jonnytabpni

I'm not suggesting that it's the filtering that's going wrong. I'm suggesting that pfSense thinks that the traffic is coming from WAN instead of PUBLIC, so it uses WAN's rules instead of PUBLIC's.

I can assure you that there is no loop. Tcpdump has confirmed this for me.

Anyway, I will install the latest snapshot on a bare metal box, use a couple of laptops connected directly to the pfsense machine to test, and get back to you.

jonnytabpni

It's also very reproducible. All I have to do is reset the pfsense ARP cache and reset states, then for about 5 minutes, hosts connected to PUBLIC will use WAN's rules.

jonnytabpni

I can confirm that this is indeed a bug, as it happens with my clean test bed.

Here is the test bed:

Latest pfsense 2.0 snapshot. A brand new server with 2 NICs. One is configured as WAN, the other as LAN. I gave the WAN interface an IP, and gave LAN no IP. I then bridged the 2 interfaces together.

The only rule on the WAN tab was "allow all".
The only rule on the LAN tab was "block all".

I connected a PC to the WAN interface, to be used to access the WebGUI. I connected a host to the LAN interface. Once I reset the ARP cache and the state table, the host connected to the LAN interface can ping the WAN IP of pfsense for about 5 minutes. This concurs with my results on my Xen system.

Should I file a bug report now? Is there anything else you would like me to do?

Thanks

jonnytabpni

Should I post this to Redmine?

jimp

The output of the following commands from the shell would also help:

# ifconfig -a

# cat /tmp/rules.debug