Sticky connections not working with dual WAN
-
Thanks,
I've checked both of those and they were already disabled, so I enabled both and then disabled them again.
I've gone to check the XML file but it's not clear exactly what I should be looking for? Do you happen to know, please?
It happens a lot, for example I can login to Santander for my banking, click just a few links and I'm logged out.
I log in to my own server's admin panel and again, within a few clicks, it logs me out just like Santander. The logs show the same IPs each time, so they're not changing, but they specifically show it's down to me using an IP that I didn't log in with:
Rejected session for user admin because IP (5.70.xxx.xxx) doesn't match session file (217.45.xxx.xxx)
I am also sure my connection is not dropping that often, there's just no way.
Thanks.
-
Just to confirm also, after turning the aforementioned settings on and off again I tried again with the "source tracking timeout for sticky connections" set to "1200" so it shouldn't change my IP when connected to the website for that amount of time (i.e. log me out).
However, it's still happening:
2020:06:06-02:06:04: '5.70.xxx.xxx' successful login to 'admin' after 1 attempts
2020:06:06-02:11:23: '217.45.xxx.xxx' successful login to 'admin' after 1 attempts
The second login was because my IP changed and I had to login again.
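For anyone wanting to run the same check against their own server log, here is a rough Python sketch. It assumes log lines in exactly the format quoted above and uses the 1200 second sticky timeout from this post; the function name `ip_changes` is made up for illustration. It only flags logins for the same user that arrive from a different IP within the timeout window, which is a heuristic and not a statement of how pfSense itself tracks sources.

```python
import re
from datetime import datetime

# The two log lines quoted in this post (addresses are masked as in the thread).
LOG = """\
2020:06:06-02:06:04: '5.70.xxx.xxx' successful login to 'admin' after 1 attempts
2020:06:06-02:11:23: '217.45.xxx.xxx' successful login to 'admin' after 1 attempts
"""

# timestamp: 'ip' successful login to 'user'
LINE_RE = re.compile(r"^(\S+): '([^']+)' successful login to '([^']+)'")


def ip_changes(log_text, sticky_timeout_s):
    """Return (user, old_ip, new_ip, gap_seconds) for each login where the
    source IP changed sooner than sticky_timeout_s after the previous login."""
    last = {}      # user -> (timestamp, ip) of the previous login
    changes = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue
        when = datetime.strptime(m.group(1), "%Y:%m:%d-%H:%M:%S")
        ip, user = m.group(2), m.group(3)
        if user in last and last[user][1] != ip:
            gap = int((when - last[user][0]).total_seconds())
            if gap < sticky_timeout_s:
                changes.append((user, last[user][1], ip, gap))
        last[user] = (when, ip)
    return changes


for change in ip_changes(LOG, sticky_timeout_s=1200):
    print(change)
```

With the two lines above, the gap between logins is a little over five minutes, well inside the 1200 second window.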
I actually submitted this as a bug because I believe it is (I also sought out help in the IRC channel but they couldn't help me) but they referred me to here first:
https://redmine.pfsense.org/issues/10634
Cheers.
-
It could be a bug - but I would think a lot more than just you would be reporting it.. I would think dual wan with sticky would be a common enough sort of setup that there are quite a few out there in the field..
I don't have dual wan, or would love to try and duplicate.. That you have a server to test to makes it easy to see exactly what is happening etc..
I would have to simulate a dual wan - which I could do.. But let's see if we get any other traction - maybe someone with dual wan, even if not using it for load balancing, might be willing to try and duplicate the problem.
As temp solution - only thing I could suggest would be to turn off the load balancing and just use 2nd connection as failover.
-
As a temp solution, I've just set a rule that anything going to my servers or santander.co.uk & retail.santander.co.uk will use a specific gateway.
Are we just hoping someone with Dual WAN setup reads this and jumps in to help then?
Thanks.
-
Well we could call in @Derelict but don't think he is around for a few days..
-
Well there's no major rush as I'm not exactly down so I'll just hang on for an update and hopefully, he'll see this soon.
Thanks for your help so far Johnpoz :)
-
If the application doesn't work with load balancing it doesn't work with load balancing.
That's pretty much what I have. Talk to the application side about accepting sessions from multiple IP addresses.
-
I'm using this exact scenario, with dual wan, banking sites and quite a few users accessing them. No issues
I did have issues in the beginning and I had to raise stickiness to 2500. I have also raised the default weight to 2, so no line has a weight of 1.
I recall reading somewhere about an issue with load balancing, with this as a suggested workaround, but I can't recall where. In any case, it doesn't hurt anything to use a default weight of 2 and adjust smaller lines accordingly.
I'm on 2.4.5 and this also worked flawlessly on 2.4.4-p3.
-
@Derelict I’m sorry but I don’t understand your reply.
The application does work with load balancing (Google Chrome, Microsoft Edge etc...) but the security of the websites being visited requires that the IP doesn’t change. Isn’t working around this the exact purpose of sticky connections?
Plus if someone else is now reporting the issue surely it warrants being looked into?
Thank you.
-
Look at the states when you are connected. If there are two different IP addresses being connected to, but all connections to the same IP address use the same WAN, then load balancing is doing what it is designed to do and you will need to policy route all traffic for that application out the same WAN or Failover gateway group, not a load balance gateway group.
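A rough way to do that check on a copy-pasted state list is sketched below in Python. It assumes each NATed WAN-side state is one line in the form `IFACE PROTO NAT_SRC (LAN_SRC) -> DST STATE`; the sample addresses are documentation-range placeholders and the function names are made up for illustration. It groups states by destination IP and reports any destination that was reached over more than one WAN, which is the case that is not explained by a site simply having several IP addresses.

```python
from collections import defaultdict

# Placeholder sample in the assumed layout; not real addresses.
SAMPLE_STATES = """\
WAN1 tcp 198.51.100.10:8341 (192.168.1.126:64519) -> 203.0.113.5:3334 TIME_WAIT:TIME_WAIT
WAN2 tcp 198.51.100.77:59341 (192.168.1.126:64522) -> 203.0.113.5:3334 FIN_WAIT_2:FIN_WAIT_2
WAN1 tcp 198.51.100.10:8350 (192.168.1.126:64530) -> 203.0.113.9:443 ESTABLISHED:ESTABLISHED
"""


def wans_per_destination(text, lan_ip):
    """Map each destination IP to the set of interfaces used by states from lan_ip."""
    result = defaultdict(set)
    for line in text.splitlines():
        parts = line.split()
        # Only keep NATed WAN-side lines: IFACE PROTO NAT_SRC (LAN_SRC) -> DST STATE
        if len(parts) < 7 or parts[4] != "->" or not parts[3].startswith("("):
            continue
        iface = parts[0]
        src_ip = parts[3].strip("()").rsplit(":", 1)[0]
        dst_ip = parts[5].rsplit(":", 1)[0]
        if src_ip == lan_ip:
            result[dst_ip].add(iface)
    return dict(result)


def split_destinations(text, lan_ip):
    """Destinations that the same LAN client reached over more than one WAN."""
    per_dst = wans_per_destination(text, lan_ip)
    return {dst: sorted(ifs) for dst, ifs in per_dst.items() if len(ifs) > 1}


print(split_destinations(SAMPLE_STATES, "192.168.1.126"))
```

An empty result would mean every destination IP stayed on a single WAN, i.e. load balancing behaving as designed. A non-empty result means one destination IP saw the client from two WAN addresses.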
-
^ great point... But my take on him saying his server was logging 2 different IPs connecting is that he was only connecting to 1 destination IPv4 address..
But your point is very valid for many of these sites that are hosted on cdn where www.whatever.com could end up being 2 different destination ips for the same site..
-
@Derelict I tried as you suggested.... killed all states, went to my own server and logged in via the website (as said the server only has 1 IP). I was almost immediately logged out so logged in again.
Checked the states and noticed it's using both WANs as suspected:
VLAN1_TRUSTED tcp 192.168.1.126:64519 -> 62.3.XXX.XXX:3334 TIME_WAIT:TIME_WAIT 8 / 8 2 KiB / 936 B
WAN1 tcp 217.45.XXX.XXX:8341 (192.168.1.126:64519) -> 62.3.XXX.XXX:3334 TIME_WAIT:TIME_WAIT 8 / 8 2 KiB / 936 B
VLAN1_TRUSTED tcp 192.168.1.126:64522 -> 62.3.XXX.XXX:3334 FIN_WAIT_2:FIN_WAIT_2 8 / 8 2 KiB / 4 KiB
WAN2 tcp 5.70.XXX.XXX:59341 (192.168.1.126:64522) -> 62.3.XXX.XXX:3334 FIN_WAIT_2:FIN_WAIT_2 8 / 8 2 KiB / 4 KiB
Sticky connections are on and the timeout is set to 1200.
Thanks.
-
Well your states are showing fin_wait.. and time_wait
Those states are being closed..
I would sniff this traffic to see who is sending the FIN.
-
I'm not sure what you mean? I'm the only person connecting.
It took me a few minutes to find those details in the states so that's probably why it shows the connections are closing. But I just grabbed any 2 connections in the logs showing it was using more than 1 WAN. There were many other lines of logs showing connections on both WANs.
These connections were made over the timeframe of 1 minute and after killing states so shouldn't there only be 1 WAN IP in the logs regardless?
-
@Daskew78 said in Sticky connections not working with dual WAN:
It took me a few minutes to find those details
You can filter states.. My point was that those states are closed..
This statement "Once the states for that source expire" means what exactly... Does it mean any state, even closed states that are just waiting to time out.. Or does that state have to actually be active?
This is where I thought maybe @Derelict could help..
Let's look at this scenario... You create a connection to IP X, now that state has been set to be closed.. fin.. and you enter a time_wait state.. Is that state considered expired - so would a new session, which is what you show there from a different source port, go out the same wan, or would it round robin to the other wan?
You could look at it both ways.. Since the state is just waiting to close, and you have this new session coming from a different source port, maybe I should round robin that connection.. Or you could look at it as: hey, there is ANY state from your rfc1918 address to this public IP 62.3 - so always use that wan? I am not exactly sure how it is looked at.
I could see both being valid ways of looking at it.. Hey, this client has an active session to x, so any new sessions it creates will go out the same wan.. Or hey, this session is closed or closing... Since this is a new session (different source port).. maybe it should go out the other wan to load balance.
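To make the two readings concrete, here is a toy Python model. It is NOT how pf or pfSense is known to work; the class `StickyBalancer` and the flag `count_closing_states` are invented purely to illustrate the question. With the flag on, a state that has seen a FIN still pins the client to its WAN; with the flag off, only active states count and the next session is round robined.

```python
import itertools


class StickyBalancer:
    """Toy round-robin balancer with per-source stickiness (illustration only)."""

    def __init__(self, wans, count_closing_states):
        self._next_wan = itertools.cycle(wans)
        self.count_closing_states = count_closing_states
        self.states = []  # each state: {"src": str, "wan": str, "closing": bool}

    def _sticky_wan(self, src):
        """Return the WAN of an existing state that still counts for src, if any."""
        for st in self.states:
            if st["src"] != src:
                continue
            if st["closing"] and not self.count_closing_states:
                continue  # reading 2: closing states no longer hold the source
            return st["wan"]
        return None

    def connect(self, src):
        """Open a new session: reuse the sticky WAN or take the next one in turn."""
        wan = self._sticky_wan(src) or next(self._next_wan)
        state = {"src": src, "wan": wan, "closing": False}
        self.states.append(state)
        return state


def run(count_closing_states):
    lb = StickyBalancer(["WAN1", "WAN2"], count_closing_states)
    first = lb.connect("192.168.1.126")
    first["closing"] = True               # a FIN was seen; state waits to time out
    second = lb.connect("192.168.1.126")  # new session from a new source port
    return first["wan"], second["wan"]


print("closing states count:", run(True))
print("only active states count:", run(False))
```

Under the first reading both sessions leave on the same WAN; under the second, the new session moves to the other WAN, which would match the symptom reported in this thread.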
-
@johnpoz said in Sticky connections not working with dual WAN:
now that state has been set to be closed.. fin.. and you enter a time_wait state.. Is that state considered expired - so a new session which is what you show there from a different source port would that go out the same wan, or would it round robin to the other wan?
Would you be willing to do a remote session with me and I can show you all the evidence? I really think there's a bug here.
-
I am not saying it's not a bug or that there isn't a problem - I just don't know which specifics pfsense is using to keep a connection sticky.. I added a bit to my previous post in an edit.
You can look at it both ways, I don't know exactly what "Once the states for that source expire" means.. Maybe once there has been a fin, that state is no longer looked at - I am not sure..
-
Well I'm at a loss as to what to do next.
I think it comes down to @Derelict either advising what further testing I can do or accepting it may be a bug?
I hope he replies!
-
I still think he is out for a bit, my understanding is he won't be back for a few more days... So his check into the thread was a bit unexpected to me..
We can see if @jimp has any advice as well.. This is just a bit out of my comfort level, since I do not use multiple wans in a load balancing setup.. I don't really see the point of it to be honest ;) If you need to load balance, that tells me your connections are undersized ;) hehehe
I have more experience with this sort of thing on fortinet load balancing to servers behind them, and how their sticky connections work.. And even then it's not a day to day sort of thing, I only get called in to consult on issues - normally they give me sniffs to work with and I help them figure out what is going wrong ;)
If you could show state that is clearly active, and then another state being opened - then I would agree that is not how I would understand sticky to work.
You know who might be good as well? @stephenw10