Sticky connections not working with dual WAN
-
Do you understand what is in bogon? It is IP addresses that do not route on the internet.. How would a source IP that is listed in bogon ever hit your internal interfaces?
Your rules on vlan1_trusted already all say hey, only vlan1_trusted addresses are allowed.. So any such odd IP, say a downstream network or "bogon", wouldn't be allowed in the first place by any of your rules.. There is zero point to having bogon on your internal networks... And to be honest little point on your wan either ;) These are IPs that are not meant to route on the internet - how/why would you see them on your wan in the first place?
You're lucky pfsense pulls rfc1918 space out of the bogon list, or you wouldn't be able to get anywhere with bogon being blocked on the lan side.
If your IP is changing then you don't have sticky set.. Or your states are expiring or being removed.. Do you have it set up to remove states when there is an issue with the gateway? I think this is the default??? That if there is a wan event, states are cleared?
-
No I wasn't sure what bogon is, so I've removed it now thanks.
I do have sticky set and have tried with the "source tracking timeout for sticky connections" set to the default of "0" and again at "1200", after closing all browser sessions and resetting states.
I'm not sure what you mean when you say "Do you have it set up to remove states when there is an issue with the gateway? I think this is the default??? That if there is a wan event, states are cleared?"
I thought all I have to do to enable sticky connections is to enable it in "System > Advanced > Miscellaneous"? If there is something else to do, please could you be clear, as I'm not an advanced user with pfSense.
Thanks.
-
So if there is an issue with wan... say it's not answering ping.. or the like - I mean it's out there as a possible cause.. It might happen now and then on the off chance..
These are the settings I'm talking about.
Under advanced - networking
And then under misc
These 2 things could flush all your states on you..
One way to check is to look in your state table when you see this problem occur - has it been flushed?
Does it happen all the time, or just now and then?
If it's always happening, like on every connection - that points to sticky not working, or the setting not actually taking effect.. Have you tried toggling the setting off, saving, and then turning it back on and saving again? Have you looked in the actual xml to validate the setting is in there?
-
Thanks,
I've checked those both and they are already disabled, so I enabled both and disabled them.
I've gone to check the XML file but it's not clear exactly what I should be looking for? Do you happen to know, please?
It happens a lot, for example I can login to Santander for my banking, click just a few links and I'm logged out.
I log in to my own server's admin panel and again within a few clicks, it logs me out just like Santander. The logs show the same IPs so it's not changing but specifically show it's down to me using an IP that I didn't login with:
Rejected session for user admin because IP (5.70.xxx.xxx) doesn't match session file (217.45.xxx.xxx)
I am also sure my connection is not dropping that often, there's just no way.
Thanks.
-
Just to confirm also, after turning the aforementioned settings on and off again I tried again with the "source tracking timeout for sticky connections" set to "1200" so it shouldn't change my IP when connected to the website for that amount of time (i.e. log me out).
However, it's still happening:
2020:06:06-02:06:04: '5.70.xxx.xxx' successful login to 'admin' after 1 attempts
2020:06:06-02:11:23: '217.45.xxx.xxx' successful login to 'admin' after 1 attempts
The second login was because my IP changed and I had to login again.
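For anyone wanting to check their own server logs for the same symptom, here is a minimal Python sketch that flags a user whose source IP differs between consecutive successful logins. It assumes log lines in exactly the format quoted above; the names `LOGIN_RE`, `find_ip_changes` and `SAMPLE` are made up for illustration and are not part of pfSense or any server software.

```python
import re

# Assumed line format (taken from the two lines quoted in this post):
# 2020:06:06-02:06:04: '5.70.xxx.xxx' successful login to 'admin' after 1 attempts
LOGIN_RE = re.compile(
    r"^(?P<ts>\d{4}:\d{2}:\d{2}-\d{2}:\d{2}:\d{2}): "
    r"'(?P<ip>[^']+)' successful login to '(?P<user>[^']+)'"
)


def find_ip_changes(lines):
    """Return (user, previous_ip, new_ip, timestamp) for every login whose
    source IP differs from that user's previous successful login."""
    last_ip = {}
    changes = []
    for line in lines:
        m = LOGIN_RE.match(line.strip())
        if not m:
            continue  # ignore lines that are not successful-login entries
        user, ip, ts = m["user"], m["ip"], m["ts"]
        prev = last_ip.get(user)
        if prev is not None and prev != ip:
            changes.append((user, prev, ip, ts))
        last_ip[user] = ip
    return changes


SAMPLE = [
    "2020:06:06-02:06:04: '5.70.xxx.xxx' successful login to 'admin' after 1 attempts",
    "2020:06:06-02:11:23: '217.45.xxx.xxx' successful login to 'admin' after 1 attempts",
]

if __name__ == "__main__":
    for user, old, new, ts in find_ip_changes(SAMPLE):
        print(f"{ts}: {user} moved from {old} to {new}")
```

Each reported change lines up with a point where the outbound connection left through the other WAN.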
I actually submitted this as a bug because I believe it is (I also sought out help in the IRC channel but they couldn't help me) but they referred me to here first:
https://redmine.pfsense.org/issues/10634
Cheers.
-
It could be a bug - but I would think a lot more than just you would be reporting it.. I would think dual wan with sticky would be a common enough sort of setup that there are quite a few out there in the field..
I don't have dual wan, or I would love to try and duplicate it.. That you have a server to test against makes it easy to see exactly what is happening etc..
I would have to simulate a dual wan - which I could do.. But let's see if we get any other traction - maybe someone with dual wan, even if not using it for load balancing, might be willing to try and duplicate the problem.
As temp solution - only thing I could suggest would be to turn off the load balancing and just use 2nd connection as failover.
-
As a temp solution, I've just set a rule that anything going to my servers or santander.co.uk & retail.santander.co.uk will use a specific gateway.
Are we just hoping someone with Dual WAN setup reads this and jumps in to help then?
Thanks.
-
Well we could call in @Derelict but don't think he is around for a few days..
-
Well there's no major rush as I'm not exactly down so I'll just hang on for an update and hopefully, he'll see this soon.
Thanks for your help so far Johnpoz :)
-
If the application doesn't work with load balancing it doesn't work with load balancing.
That's pretty much what I have. Talk to the application side about accepting sessions from multiple IP addresses.
-
I'm using this exact scenario, with dual wan, banking sites and quite a few users accessing them. No issues
I did have issues in the beginning and I had to raise stickiness to 2500. I also have raised the default weight to 2, so no line has a weight of 1.
I recall reading somewhere about an issue with load balancing, and this as a suggested workaround, but I can't recall where. In any case, it doesn't hurt anything to use a default weight of 2 and adjust smaller lines accordingly.
I'm on 2.4.5 and this also worked flawlessly on 2.4.4-p3
-
@Derelict I’m sorry but I don’t understand your reply.
The application does work with load balancing (Google Chrome, Microsoft Edge etc...) but the security of the websites being visited requires that the IP doesn’t change. Isn’t that the exact purpose of sticky connections, to work around this?
Plus if someone else is now reporting the issue surely it warrants being looked into?
Thank you.
-
Look at the states when you are connected. If there are two different IP addresses being connected to, but all connections to the same IP address use the same WAN, then load balancing is doing what it is designed to do and you will need to policy route all traffic for that application out the same WAN or Failover gateway group, not a load balance gateway group.
-
^ great point... But my take on him saying his server was logging 2 different IPs connecting is that he was only connecting to 1 destination IPv4 address..
But your point is very valid for many of these sites that are hosted on cdn where www.whatever.com could end up being 2 different destination ips for the same site..
-
@Derelict I tried as you suggested.... killed all states, went to my own server and logged in via the website (as said the server only has 1 IP). I was almost immediately logged out so logged in again.
Checked the states and noticed it's using both WANs as suspected:
VLAN1_TRUSTED tcp 192.168.1.126:64519 -> 62.3.XXX.XXX:3334 TIME_WAIT:TIME_WAIT 8 / 8 2 KiB / 936 B
WAN1 tcp 217.45.XXX.XXX:8341 (192.168.1.126:64519) -> 62.3.XXX.XXX:3334 TIME_WAIT:TIME_WAIT 8 / 8 2 KiB / 936 B
VLAN1_TRUSTED tcp 192.168.1.126:64522 -> 62.3.XXX.XXX:3334 FIN_WAIT_2:FIN_WAIT_2 8 / 8 2 KiB / 4 KiB
WAN2 tcp 5.70.XXX.XXX:59341 (192.168.1.126:64522) -> 62.3.XXX.XXX:3334 FIN_WAIT_2:FIN_WAIT_2 8 / 8 2 KiB / 4 KiB
Sticky connections are on and the timeout is set to 1200.
Thanks.
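The check suggested earlier in the thread (do all connections to the same destination IP leave through the same WAN?) can be done by eye, but a small Python sketch makes it quicker on a large state table. It assumes state lines copied as text in the layout pasted above, where the NAT'd WAN entries carry the original LAN source in parentheses. `STATE_RE`, `wans_per_destination` and `SAMPLE` are illustrative names only, not a pfSense API.

```python
import re
from collections import defaultdict

# Assumed layout: <iface> <proto> <src> [(<original src>)] -> <dst> <state> ...
STATE_RE = re.compile(
    r"^(?P<iface>\S+)\s+(?P<proto>\S+)\s+"
    r"(?P<src>\S+)\s+(?:\((?P<orig>[^)]+)\)\s+)?->\s+"
    r"(?P<dst>\S+)\s+(?P<state>\S+)"
)


def wans_per_destination(lines):
    """Map (client IP, destination IP) to the set of interfaces that hold
    a NAT'd state for that pair. Lines without a parenthesised original
    source (the LAN-side entries) are skipped."""
    result = defaultdict(set)
    for line in lines:
        m = STATE_RE.match(line.strip())
        if not m or m["orig"] is None:
            continue
        client_ip = m["orig"].rsplit(":", 1)[0]  # strip the source port
        dest_ip = m["dst"].rsplit(":", 1)[0]     # strip the destination port
        result[(client_ip, dest_ip)].add(m["iface"])
    return dict(result)


SAMPLE = [
    "VLAN1_TRUSTED tcp 192.168.1.126:64519 -> 62.3.XXX.XXX:3334 TIME_WAIT:TIME_WAIT 8 / 8 2 KiB / 936 B",
    "WAN1 tcp 217.45.XXX.XXX:8341 (192.168.1.126:64519) -> 62.3.XXX.XXX:3334 TIME_WAIT:TIME_WAIT 8 / 8 2 KiB / 936 B",
    "VLAN1_TRUSTED tcp 192.168.1.126:64522 -> 62.3.XXX.XXX:3334 FIN_WAIT_2:FIN_WAIT_2 8 / 8 2 KiB / 4 KiB",
    "WAN2 tcp 5.70.XXX.XXX:59341 (192.168.1.126:64522) -> 62.3.XXX.XXX:3334 FIN_WAIT_2:FIN_WAIT_2 8 / 8 2 KiB / 4 KiB",
]

if __name__ == "__main__":
    for (client, dest), wans in wans_per_destination(SAMPLE).items():
        flag = "MULTIPLE WANs" if len(wans) > 1 else "ok"
        print(f"{client} -> {dest}: {sorted(wans)} {flag}")
```

Any pair that maps to more than one WAN is a single client reaching a single destination IP over both links, which is the case in the four lines above.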
-
Well your states are showing fin_wait.. and time_wait
Those states are being closed..
I would sniff this traffic - who is sending the fin?
-
I'm not sure what you mean? I'm the only person connecting.
It took me a few minutes to find those details in the states so that's probably why it shows the connections are closing. But I just grabbed any 2 connections in the logs showing it was using more than 1 WAN. There were many other lines of logs showing connections on both WANs.
These connections were made over the timeframe of 1 minute and after killing states so shouldn't there only be 1 WAN IP in the logs regardless?
-
@Daskew78 said in Sticky connections not working with dual WAN:
It took me a few minutes to find those details
You can filter states.. My point was that those states are closed..
This statement "Once the states for that source expire" means what exactly... Does any state count, even closed states that are just waiting to time out.. Or does that state have to actually be active?
This is where I thought maybe @Derelict could help..
Let's look at this scenario... You create a connection to IP X, now that state has been set to be closed.. fin.. and you enter a time_wait state.. Is that state considered expired - so a new session, which is what you show there from a different source port, would that go out the same wan, or would it round robin to the other wan?
You could look at it both ways.. Since the state is just waiting to close, and you have this new session coming from a different source port, maybe I should round robin that connection.. Or you could look at it as hey, there is ANY state from your rfc1918 address to this public IP 62.3 - so always use that wan? I am not exactly sure how it is looked at?
I could see both being valid ways of looking at it.. Hey, this client has an active session to x, any new sessions it creates will go out the same wan.. Or hey, this session is closed or closing... Since this is a new session ("different source port").. maybe it should go out the other wan to load balance.
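To make the two readings concrete, here is a toy Python model. It is only a sketch of the question being asked, not how pf or pfSense actually implements sticky connections, and it ignores the source tracking timeout entirely. `ToyBalancer`, `simulate` and the `count_closing_states` flag are all hypothetical names.

```python
from itertools import cycle


class ToyBalancer:
    """Toy round-robin balancer with a per-source 'sticky' lookup.
    Hypothetical model for discussion only."""

    def __init__(self, wans, count_closing_states):
        self._rr = cycle(wans)
        # True: any existing state for the source pins it to that WAN.
        # False: only states that are not closing pin the source.
        self.count_closing_states = count_closing_states
        self.states = []

    def connect(self, src):
        pinned = None
        for st in self.states:
            if st["src"] != src:
                continue
            if st["closing"] and not self.count_closing_states:
                continue  # closing state does not count under this reading
            pinned = st["wan"]
            break
        wan = pinned if pinned is not None else next(self._rr)
        state = {"src": src, "wan": wan, "closing": False}
        self.states.append(state)
        return state


def simulate(count_closing_states):
    """Open a session, mark it closing (fin seen), open a second session
    from the same client, and return the WAN used by each."""
    lb = ToyBalancer(["WAN1", "WAN2"], count_closing_states)
    first = lb.connect("192.168.1.126")
    first["closing"] = True
    second = lb.connect("192.168.1.126")
    return first["wan"], second["wan"]


if __name__ == "__main__":
    print("closing states count:", simulate(True))
    print("closing states ignored:", simulate(False))
```

Under the first reading both sessions stay on one wan; under the second the new session round robins to the other wan, which is what the pasted states look like. Which reading matches the real behaviour is exactly the open question here.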
-
@johnpoz said in Sticky connections not working with dual WAN:
now that state has been set to be closed.. fin.. and you enter a time_wait state.. Is that state considered expired - so a new session which is what you show there from a different source port would that go out the same wan, or would it round robin to the other wan?
Would you be willing to do a remote session with me and I can show you all the evidence? I really think there's a bug here.