SSH Idle Session Timeout/Dropping Issues
-
I am running pfSense 2.2.5-RELEASE and am having an issue where SSH sessions are dropped after being idle for a minute or two. I have tried all the solutions I have found:
- Set Firewall Optimization Options to Conservative
- Clear invalid DF bits instead of dropping the packets
- Disabled Firewall Scrub
- Bypass firewall rules for traffic on the same interface
- Disabled hardware checksum offload
- Disabled hardware TCP segmentation offload
- Disabled hardware large receive offload
- Disabled State Killing on Gateway Failure
Nothing has fixed the problem, although setting the Firewall Optimization Options to Conservative seemed to allow idle SSH sessions to stay open longer before dropping.
My current setup is a pretty beefy server and I am using a transparent bridge for the network interface that I am filtering.
ISP -> (igb0)Firewall(igb1) -> Core Switch
igb0 and igb1 are in a bridge (BRIDGE0) in pfSense
I am using pfBlockerNG to manage my dynamic rule lists and create firewall rules for them and those are being placed in floating rules.
I see that if I go to System->Advanced->Firewall / NAT, I can change the TCP Timeout options manually (and I assume override what the Firewall Optimization Options sets automatically). I don't think I need to worry as much about my state table filling up since I have an overpowered server (24 cores and 48GB RAM) running the firewall. All I care about is SSH sessions not being dropped when idle since SSH is heavily used on our network. Any suggestions?
Thanks!
-
My SSH sessions stay up until an event takes them down, but I'm not running a transparent proxying bridge. I'm talking days, weeks, months.
I didn't do anything special in pfSense, the ssh server, or the ssh client to make it happen. TCP KeepAlives are pretty much the norm these days. Maybe the ssh logs on the server or client would bear fruit.
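For anyone curious what a TCP keepalive looks like at the socket level, here is a minimal Python sketch. The function name and the timing values are illustrative assumptions of mine, not anything pfSense or an SSH implementation sets by default, and the per-socket tuning options are platform-specific, so the code skips any that are missing.

```python
import socket

def enable_keepalive(sock, idle=60, interval=15, count=4):
    """Turn on TCP keepalive probes for an already-created socket.

    idle, interval and count are illustrative values (assumptions),
    not defaults taken from any particular OS or SSH implementation.
    """
    # Ask the OS to send keepalive probes on this connection when it is idle.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The tuning knobs below are not available on every platform;
    # apply only the ones this Python build exposes.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", count)):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
    return sock

if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print("keepalive on:", s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```

If keepalive probes are already going out and the session still dies, the idle timer is probably not the real cause, which is another reason to check the logs on both ends.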
-
+1. Never seen a session dropped beyond broken clients (like the SFTP plugin for Total Commander).
-
Asymmetric routing? Established TCP connections have a 24 hour timeout. If it's dropping after 90 seconds, it's because the TCP handshake isn't being completed from pfSense's perspective.
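To make that reasoning concrete, here is a toy Python model, not pfSense's actual state-tracking code. The function name is hypothetical, the 24-hour figure is the one quoted above, and the 90-second value is only an illustrative short timeout; the real numbers depend on the firewall's configuration.

```python
ESTABLISHED_TIMEOUT = 24 * 60 * 60  # the 24-hour figure quoted above, in seconds
INCOMPLETE_TIMEOUT = 90             # illustrative short timeout (assumption)

def idle_timeout(saw_syn, saw_syn_ack, saw_ack):
    """Timeout a toy firewall assigns to a TCP state.

    The decision depends only on which handshake packets the firewall
    itself observed, not on what the two endpoints believe.
    """
    handshake_complete = saw_syn and saw_syn_ack and saw_ack
    return ESTABLISHED_TIMEOUT if handshake_complete else INCOMPLETE_TIMEOUT

# Symmetric path: all three handshake packets cross the firewall.
print(idle_timeout(True, True, True))   # 86400
# Asymmetric path: the SYN-ACK returns by another route, so the firewall
# never sees it and keeps treating the connection as not yet established.
print(idle_timeout(True, False, True))  # 90
```

Both endpoints consider the session fully open in the second case, which is why the drop looks like an idle timeout from the user's side.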
-
Yeah it's really weird. I don't believe it is asymmetric routing since I am just using it as a transparent bridge. Right now I only have one of our networks on it, but if I can get this working I am going to put both on it.
ISP (X.X.49.254) -> (igb0)Firewall(igb1) -> Core Switch (X.X.49.247)
ISP (X.X.50.254) -> (igb2)Firewall(igb3) -> Core Switch (X.X.50.247)
In our case, the "ISP" is the college's network that we are using. This is for a single dorm floor.
Everything on our network is using X.X.49.247 and X.X.50.247 as their gateways (depending on which they are on) and then our core switch passes it along to X.X.49.254 or X.X.50.254. Right now, only the X.X.50.0/24 network is going through the firewall. I have disabled Outbound NAT rule generation in case that had anything to do with it.
I have now discovered that X11 forwarding seems to also kill any SSH session after maybe 5-10 seconds (Write failed: Connection reset by peer).
I have attached my rules for WAN, LAN, and Floating rules in case they might help. LAN and WAN are interface groups that include both the 49 and 50 WAN and LAN interfaces.
I have also tried changing the State Type for the LAN and WAN rules to "sloppy state" but that didn't help (I'm not sure if it is something that could have helped but I figured I would try it). If you need any more information that might help with troubleshooting this issue, please let me know. Thanks!
-
Progress! Well kind of. If I completely disable all packet filtering (System->Advanced->Firewall / NAT-> Disable all packet filtering.), it seems to solve the problem. Could it be the "State Type" option in the rules, and I am just not setting it on the correct rules?
-
If I completely disable all packet filtering (System->Advanced->Firewall / NAT-> Disable all packet filtering.), it seems to solve the problem.
Is pfSense actually doing anything useful on your network? I'm having a hard time understanding your setup.
-
Sorry, I meant that it solves the problem but isn't a valid solution since it defeats the purpose of having it on our network. I am using it at the moment with pfBlockerNG to pull in a bunch of blacklists for now and possibly use Snort at some point in the future.
I believe I have fixed the issue though. I added a rule to Floating Rules, WAN Rules, and LAN Rules that passes the TCP protocol and in the advanced settings has the TCP flags set to "Any flags." and the State Type set to "sloppy state". The Floating Rule is being applied in the "out" direction on all interfaces but my management interface. I'm not sure if all three of the rules or what I have set in them are necessary, but the issue seems to be fixed. If it comes back up again, I will post again. Thanks for all your help! I really appreciate it!
-
I've encountered SSH timeouts every 30-40 secs but only on non-VPN connections to local servers.
There are no issues when the same ssh connections are made on the VPN. Isn't this odd? I don't recall making these settings for the VPN connection.
I've applied Conservative in the System/Advanced/Firewall&Nat section but am still trying to figure out what makes the local SSH connections time out. Has anyone else experienced this?
-
So you're not talking about an SSH session to pfSense; you're talking about an SSH session inbound to a server behind pfSense, one that you forward in from the pfSense WAN.
-
I'd fix whatever weirdness is on my network making that necessary but that's probably just me.
I would at least want to understand why.
-
Guess we are just weird like that Derelict ;) hehehe I would do the same thing.
-
Progress! Well kind of. If I completely disable all packet filtering (System->Advanced->Firewall / NAT-> Disable all packet filtering.), it seems to solve the problem. Could it be the "State Type" option in the rules, and I am just not setting it on the correct rules?
Ahh, my young padawan, it does not solve the issue, it solves the symptom. I could tell this is what you meant with "Well kind of", but words can be important as they can influence how you think about a problem, even if you "know what you meant".
-
I'd fix whatever weirdness is on my network making that necessary but that's probably just me.
I would at least want to understand why.
Can you define what you mean by "network" weirdness, if pfSense is the cause of the problem?
Pardon me for jumping onto this thread, but I am experiencing the same problem, accessing a Cisco phone system on its voice VLAN 150, doing automatic routing through pfSense to VLAN 1.
pfSense is set as the gateway on VLAN 150 for the Cisco, and is set as the gateway for the desktop PC on VLAN 1.
Open SSH on VLAN 1 to the Cisco controller, and I can connect, but the connection only stays active 1-3 minutes and suddenly force-closes.
It kills the SSH session, even if the session is NOT idle but streaming data, such as dropping right in the middle of a "show tech-support" text dump.
VLAN 150 firewall config in pfSense is "pass all IPv4, all protocols, from any to any", setting it wide open to remove any possible flow restrictions.
pfSense still drops SSH anyway, even though I can ping the Cisco continuously from VLAN 1, no problem.
If I put a computer directly on VLAN 150, SSH works normally, sessions can stay open indefinitely, no problem.
-
It means "sloppy state" and "disabling pf" should never be necessary. I am not going to try to decipher that text, do the work, and make a diagram for you.
Look at the diagram in my sig to see the information necessary and make one and post it.