SSH connections dropping on backup router when on different VLAN
-
@iptvcld that screams asymmetrical routing to me. When pfSense doesn't see the answer come back through its state, the state closes, and your client can no longer send traffic to the SSH box.
-
@johnpoz thanks for the response! I will read up on that and adjust, but why would this happen only on my backup router and not on my main one? The settings are synced via HA CARP. I'm just trying to fully understand before I make changes.
-
@johnpoz I enabled Bypass firewall rules for traffic on the same interface; but the SSH session to pfsense still dropped within 30 to 40 seconds.
This only happens when I am on my desktop on VLAN 100 (192.168.2.4) connecting to my backup pfSense's untagged interface, 10.200.1.81. The web GUI is also slow to navigate.
But if I connect from my desktop 192.168.2.4 to the LAN tagged interface 192.168.2.81, there are no drops and the pfSense web GUI is quick.
On top of all this, if I connect to the master pfSense (10.200.1.80) from my desktop (192.168.2.4), there are no issues.
-
@iptvcld said in SSH connections dropping on backup router when on different VLAN:
enabled Bypass firewall rules for traffic on the same interface
That is by no means the correct way to go about it, and it doesn't fix every case of asymmetrical routing.
Why I said it screams asymmetrical is the default timeout for a state.
When you send traffic to open up your SSH session, you send a SYN to open a state. If the return traffic gets back to your client via some other path that pfSense doesn't see, pfSense never fully establishes the state and closes it after 30 seconds. From then on your client's traffic to the server is not allowed until it sends another SYN to open a state again.
Does this SSH box you're wanting to talk to have a leg in your client's network? Does it maybe send traffic to your backup router as its gateway? Are you running multiple L3 networks on the same L2, and this SSH box has the wrong mask, thinks the client is local, and since they are actually on the same L2 just answers it directly? In any of those cases pfSense never sees the SYN,ACK come back to fully establish the state.
What I would do is sniff on pfSense's interface in your SSH server's network, then start your conversation to the SSH IP. You should see the SYN go to the SSH server's IP; do you actually see the SYN,ACK come back to pfSense?
Example: with a client on 192.168.9.100/24, here is the start of an SSH session to a Pi that sits on 192.168.3.10/24.
This is a sniff on the pfSense 192.168.3.253 interface. You see it send on the SYN, and back on its 192.168.3 interface you see the Pi at 3.10 send back its SYN,ACK, and then you see them sending traffic back and forth, starting the SSH session.
If, when you sniff on the pfSense interface connected to your SSH server's network, you only see the SYN but no SYN,ACK and no exchange of data, then the server is returning the traffic via some other path, and pfSense will never be able to open up the full state.
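To make that check concrete, here is a rough Python sketch of the logic you apply when reading the capture. It is only an illustration: the `Pkt` record and the `diagnose` function are made-up names, the packets are typed in by hand rather than parsed from a real capture, and the addresses are the ones from my example above.

```python
from typing import NamedTuple


class Pkt(NamedTuple):
    """One captured TCP packet, reduced to what matters here (hypothetical record)."""
    src: str
    dst: str
    flags: frozenset


def diagnose(capture, client, server):
    """Classify what the firewall-side capture says about the TCP handshake."""
    saw_syn = any(
        p.src == client and p.dst == server and p.flags == {"SYN"} for p in capture
    )
    saw_synack = any(
        p.src == server and p.dst == client and p.flags == {"SYN", "ACK"}
        for p in capture
    )
    if not saw_syn:
        return "no SYN seen: traffic is not reaching this interface"
    if not saw_synack:
        return "SYN but no SYN,ACK: reply takes another path (asymmetric)"
    return "SYN and SYN,ACK seen: state can fully establish"


CLIENT, SERVER = "192.168.9.100", "192.168.3.10"

# What a healthy capture looks like on the firewall interface.
good = [
    Pkt(CLIENT, SERVER, frozenset({"SYN"})),
    Pkt(SERVER, CLIENT, frozenset({"SYN", "ACK"})),
    Pkt(CLIENT, SERVER, frozenset({"ACK"})),
]

# What you would see if the reply never passes through the firewall.
bad = [
    Pkt(CLIENT, SERVER, frozenset({"SYN"})),
    Pkt(CLIENT, SERVER, frozenset({"ACK"})),
]

print(diagnose(good, CLIENT, SERVER))
print(diagnose(bad, CLIENT, SERVER))
```

The second case is the one to look for: the firewall forwards the SYN, the client carries on as if the handshake completed, but the firewall itself never saw the SYN,ACK.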
-
@johnpoz I will add some more background to this. When I mention SSH, I am SSH'ing into the pfSense router itself.
I have my backup pfsense as a VM via Proxmox and when I ssh into the mgmt IP 10.200.1.81 from vlan 100 192.168.2.4 then the SSH connection drops after 30 to 40 seconds. (just timed it now and i got disconnected after 1m 50sec) But if i ssh to the vlan 100 interface 192.168.2.81 then ssh is fine.
Odd thing is, if I SSH into my master pfSense router from 192.168.2.4 to the mgmt interface 10.200.1.80, it works fine there.
My WAN, LAN and SYNC ports in Proxmox are each their own port, defined for the VM. The switch ports are set to the mgmt VLAN.
-
@johnpoz Just tested something. This issue only occurs on the router that is in BACKUP CARP status. As soon as I cut it over to MASTER, SSH stays up.
So it seems I cannot SSH across VLANs to the pfSense that is marked as BACKUP.
-
@iptvcld said in SSH connections dropping on backup router when on different VLAN:
I have my backup pfsense as a VM via Proxmox and when I ssh into the mgmt IP 10.200.1.81 from vlan 100 192.168.2.4 then the SSH connection drops after 30 to 40 seconds. (just timed it now and i got disconnected after 1m 50sec) But if i ssh to the vlan 100 interface 192.168.2.81 then ssh is fine.
So access it by its VLAN 100 IP.
If you try to access the backup by its LAN IP, the initial request packet goes to the VLAN 100 CARP VIP, which is the gateway and is owned by the master. So it enters the master's VLAN 100 interface and is routed out of the LAN interface, since it's destined for a LAN IP, and goes to the backup node. However, the backup sends the response packet out of its VLAN 100 interface, since the target IP is in the VLAN 100 subnet.
So the master doesn't see the response packet and will block the next packet of the connection. Ergo you run into asymmetric routing, as @johnpoz already suspected. So either use the backup node's nearest interface IP (the one in your client's subnet) to access it, or do a masquerading workaround as described here: Troubleshooting VPN Connectivity to a High Availability Secondary Node.
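The path mismatch can be worked through with a small Python sketch using the standard `ipaddress` module. It is only a model of the connected-route lookup on the backup node, not anything pfSense runs: the interface IPs come from this thread, while the /24 masks, the interface labels and the helper names (`arrival_iface`, `egress_iface`, `is_asymmetric`) are assumptions for illustration.

```python
import ipaddress

# Backup node's interfaces as given in the thread. The /24 masks are assumed.
BACKUP_IFACES = {
    "LAN (mgmt)": ipaddress.ip_interface("10.200.1.81/24"),
    "VLAN100": ipaddress.ip_interface("192.168.2.81/24"),
}


def arrival_iface(target, ifaces):
    """Interface that owns the target IP, i.e. where the request arrives."""
    addr = ipaddress.ip_address(target)
    for name, iface in ifaces.items():
        if iface.ip == addr:
            return name
    return None


def egress_iface(dst, ifaces):
    """Interface whose directly connected subnet contains dst (reply path)."""
    addr = ipaddress.ip_address(dst)
    for name, iface in ifaces.items():
        if addr in iface.network:
            return name
    return None


def is_asymmetric(client, target, ifaces=BACKUP_IFACES):
    """True when the reply leaves on a different interface than the request used."""
    return arrival_iface(target, ifaces) != egress_iface(client, ifaces)


client = "192.168.2.4"
for target in ("10.200.1.81", "192.168.2.81"):
    print(
        target,
        "in:", arrival_iface(target, BACKUP_IFACES),
        "out:", egress_iface(client, BACKUP_IFACES),
        "asymmetric:", is_asymmetric(client, target),
    )
```

For the mgmt IP the request arrives on the LAN side (routed there by the master) while the reply leaves directly on VLAN 100, so the master only ever sees half of the conversation. For the VLAN 100 IP both directions use the same interface and no router is involved.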
-
@viragomann ah, that makes sense, and since the backup pfSense's interfaces are in backup mode I assume it cannot respond from there.
Accessing SSH using the VLAN 100 interface works fine since I am not crossing VLANs, and I could leave it like this. But I decided to have a look at the link you provided, and I have set up NAT from the mgmt interface to an alias of the 2 pfSense mgmt interface IPs. Now the SSH connection is stable and the web GUI response time is back to normal!
Thank you to both of you for your help.
-
@iptvcld
Best practice is to add the management IPs of both nodes to this alias.
That way you can also access the primary when it's in backup state and the traffic is flowing across the secondary.
-
@viragomann that’s right, I added the 2 Mgmt interface IPs into that alias and my thought was also the same where I can access the primary when it’s in backup carp mode and that the webgui response would also be smooth. This is great!