Network connection lost
I'm testing a new pfsense setup, and have noticed a strange situation. In the attachment you can see a (simplified) part of our setup.
The two firewalls on the right are the new pfsense boxes (using carp). Currently they only connect to our internet router. The IPs in bold are the VIPs. The firewall/router in the middle is our current router, a linux machine that we set up a few years ago. It connects to the internet as well.
All clients currently have 10.1.5.6 as their gateway. In the pfsense boxes, I added a static route (interface LAN, network 192.168.1.0/24, gateway 10.1.5.33). My pc, connected to the 10.0.0.0 network, has 10.1.5.39 as default gateway to test this new setup.
I can ping the pc (192.168.1.12) behind the left router without timeouts. I can start a vnc connection to it as well. But after a few minutes, the vnc connection closes with the error "connection reset by peer". The other clients, which still have their default gateway set to 10.1.5.6, don't have this problem. I tried setting my gateway to 10.1.5.37 because I thought maybe carp was acting weird, but that doesn't help at all.
Any idea what this could be?
All suggestions are welcome! :-)
Maybe you can elaborate a bit more on your problem. It would help to know what the default gateway is on the left router, and from where to where the VNC connection fails (I presume from your own PC within the 10.x.x.x network?). If info like this doesn't reveal the issue, it is probably best to obtain a network trace from a point that should be able to see all your traffic (i.e. the left router). The difficulty with these setups is that they also involve ICMP redirects to the correct router, which probably won't show up at all in network traces made from the left router.
FWIW, I myself would prefer to keep things simple and only route from one point in your network (pfSense in this case) and avoid using multiple routers if you can. I understand you want to move slowly towards your new network setup, but working like that makes things more complicated and error-prone as well.
The default gateway for 10.1.5.33 is 10.1.5.6, like all clients. The vnc connection is from 10.1.2.54 (my pc) to 192.168.1.12, but I have the same issue when I start a vnc session to a machine behind yet another router.
I have an idea for a possible reason: when 10.1.2.54 wants to talk to 192.168.1.12, it asks its default gateway (10.1.5.39) and then finds out it has to send through 10.1.5.33. When 192.168.1.12 wants to reply, it sends the reply to its gateway, 192.168.1.1, which knows it has to put the packets on the lan, so the reply packets don't pass via 10.1.5.39 again. In pfsense, I had to enter a firewall rule to allow traffic between 10.x.x.x and 192.168.1.x. Is it possible that because of this, pfsense closes this communication because it doesn't see any reply packets?
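To make the asymmetry described above concrete, here is a rough Python sketch of the routing decisions each box would make. It is only an illustration: the /8 mask on the 10.x network, the host labels ("pc", "pfsense", "left", "target") and the simplified routing tables are assumptions, not copied from the real configs.

```python
import ipaddress

LAN = ipaddress.ip_network("10.0.0.0/8")         # assumed mask, not stated in the thread
REMOTE = ipaddress.ip_network("192.168.1.0/24")
DEFAULT = ipaddress.ip_network("0.0.0.0/0")

# label -> (own addresses, routes); a gateway of None means "directly connected"
HOSTS = {
    "pc":      ({"10.1.2.54"}, [(LAN, None), (DEFAULT, "10.1.5.39")]),
    "pfsense": ({"10.1.5.39"}, [(LAN, None), (REMOTE, "10.1.5.33")]),
    "left":    ({"10.1.5.33", "192.168.1.1"},
                [(LAN, None), (REMOTE, None), (DEFAULT, "10.1.5.6")]),
    "target":  ({"192.168.1.12"}, [(REMOTE, None), (DEFAULT, "192.168.1.1")]),
}

def owner(addr):
    """Return the label of the host that owns this address."""
    return next(name for name, (addrs, _) in HOSTS.items() if addr in addrs)

def next_hop(host, dst):
    """Longest-prefix match in the host's routing table."""
    ip = ipaddress.ip_address(dst)
    matches = [(net, gw) for net, gw in HOSTS[host][1] if ip in net]
    _, gw = max(matches, key=lambda m: m[0].prefixlen)
    return gw if gw is not None else dst  # connected route: deliver directly

def trace(src, dst):
    """Follow a packet hop by hop from src to dst."""
    path = [owner(src)]
    while dst not in HOSTS[path[-1]][0]:
        path.append(owner(next_hop(path[-1], dst)))
    return path

forward = trace("10.1.2.54", "192.168.1.12")
reply = trace("192.168.1.12", "10.1.2.54")
print("forward:", " -> ".join(forward))
print("reply:  ", " -> ".join(reply))
```

Under these assumptions the forward path passes through pfsense, while the reply goes from the left router straight back to the pc, so pfsense only ever sees one direction of the conversation.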
I don't think so. I think pfSense will normally just redirect traffic from 10.1.2.54 -> 192.168.1.12 to router 10.1.5.33 via ICMP redirect so they can communicate without intervention of pfSense, unless it performs source NAT. But in that case the traffic would still be returned to pfSense, so that wouldn't explain why it fails or times out. Maybe you should also share how you have configured your LAN firewall rules, as by default it would allow all traffic from LAN.
If you can I'd start with tracing the traffic on the left router (with tcpdump or whatever is available) and check what happens during the session. Bear in mind though that you might miss relevant traffic, because it might have a source or destination address you might not expect.
Thanks for your help! We did some testing now. Tcpdump on 10.1.5.33 was not an option, so we played with port mirroring on the switch that 10.1.5.33 is attached to, with a laptop running wireshark on the mirror port, and wireshark on my pc.
With our old linux router, when I start a vnc session, I see ICMP redirect messages from that router, but all of the next packets from me to the "vnc target" go to the default gateway instead of directly to the target. So the redirects are ignored. Don't know if that's normal? Anyway, when I change my default gateway to the pfsense router, I don't see any ICMP redirects at all. When I check the state table in pfsense (diagnostics -> states) immediately after connecting with vnc, it shows that the tcp connection is closed. I don't know how long such a connection should stay "open", but in my case it closes almost immediately.
I changed the default firewall rules a bit, but for now (for testing) I added a rule on the lan interface that allows any protocol, from any source to any destination (this is the first rule of the chain of course).
I have no theory to back this up, but it sounds to me like all of a sudden traffic seems to come in or back with a different IP address which makes the targeted system respond with a TCP RST packet, effectively closing the TCP stream immediately. It also sounds like pfSense is performing NAT since it's not using ICMP redirects, but I'm totally guessing here.
Maybe you can also add the port of the linux router to the mirror port and look for packets from/to 192.168.1.12 that might give a clue what's happening.
If pfsense is using NAT, can I disable that somewhere? I think it shouldn't do that as long as traffic doesn't go to another interface…
EDIT: Found the solution! I was browsing the pfsense book when I noticed chapter "8: Routing", and more specifically "8.1.2: bypass firewall rules for traffic on same interface"... Enabling that option (under System -> Advanced) did the trick! I still see no icmp redirects, but the vnc connection stays alive now. I'm happy :-)
Good. :) To see for what traffic NAT is being performed (which is not the case now of course after reading your solution) you go to Firewall -> NAT -> Outbound tab and look at the rules generated / created there. But I guess as long as you've changed nothing there it will only have the default outbound NAT configuration there and NAT would indeed never apply to the traffic entering and leaving the LAN interface.
There's only one rule there, created while setting up carp according to the book. Anyway, it's fixed, so no need to punish the brains any further… Thanks for helping!