Secondary router in HA setup web GUI unresponsive
-
I have two Netgate 6100s in an HA cluster. We have two WANs, call them WAN1 and WAN2.
We have a /29 public IP subnet on WAN1, so we use the conventional CARP setup there. WAN2 has a single public IP. We run HA on it using an alternative configuration in which each router's WAN2 interface IP is a private address and the CARP IP is the public address we were given.
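As a rough illustration of why the two WANs end up configured differently, here is a minimal Python sketch of the address arithmetic. It assumes a conventional CARP setup needs one address per node, one shared CARP IP, and one address for the upstream gateway. The prefixes are documentation placeholders rather than the real ranges, and treating the single WAN2 address as a /32 is a simplification.

```python
import ipaddress


def addresses_needed(nodes: int = 2) -> int:
    """One interface address per node, plus the CARP IP, plus the upstream gateway."""
    return nodes + 1 + 1


def carp_fits(network: str, nodes: int = 2) -> bool:
    """Return True if the subnet has enough usable hosts for conventional CARP."""
    usable = len(list(ipaddress.ip_network(network).hosts()))
    return usable >= addresses_needed(nodes)


# Placeholder prefixes (assumptions, not the real WAN ranges).
print(carp_fits("203.0.113.0/29"))   # True: 6 usable hosts, 4 needed
print(carp_fits("203.0.113.10/32"))  # False: a single address is not enough
```

When the check fails, the alternative described above (private interface addresses with the public address as the CARP IP) is one way around it.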
It works with one side effect. The web GUI of the router that is the backup is almost totally unresponsive. It specifically affects whichever router is the backup. Ergo if Router 1 is master and Router 2 is backup, Router 2’s web GUI is almost completely unresponsive. If I manually trigger a failover to make Router 2 the master, then Router 1’s web GUI becomes very unresponsive.
I am somewhat sure this is a DNS issue. We are using the DNS Resolver in forwarding mode and we have four public DNS servers specified in System -> General Setup. Two use WAN1’s gateway and two use WAN2’s gateway.
The DNS resolution behavior in General Setup is local DNS first, then public DNS. This may not technically be correct, since the DNS Resolver is set to forward to the servers listed in General Setup. I did try changing the behavior in General Setup to use local DNS and ignore remote, but nothing changed.
I suspect the firewall is trying to resolve a DNS name when opening the web GUI, and because WAN2 on the backup router has only a private IP address, DNS queries that try to use WAN2’s gateway can’t get out.
It is odd that the backup router wouldn’t prefer WAN1’s gateway, especially since WAN2 is marked as down on the backup due to the single-public-IP CARP setup. However, every other time I’ve seen the GUI grind almost to a halt, it was due to DNS resolution difficulties.
I can get additional ip addresses on WAN2. That’s not an issue. My issue is that this shouldn’t be a problem, period. I presume I have something configured wrong. Wanted to ask if anyone has encountered this issue with a similar setup and if there is a fix?
-
@bp81 Is the backup router able to reach the internet?
Looks like a DNS issue. However, switching to the local resolver will still require some internet access, especially if the requests are not authoritative or the cache has expired.
As a test, if you have an internal DNS server you could try using it as the forwarder.
This would always use whatever is active to reach the Internet and also serve DNS queries to all systems.
It's a bit ugly as a solution, but easy to test if your issue is purely DNS.
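One way to check whether it is purely DNS is to time a query against each configured forwarder separately. Below is a minimal sketch in Python (standard library only) that builds a plain A-record query by hand and measures the UDP round trip. The server addresses in the usage note are placeholders, and whether Python 3 is available on the node you test from is an assumption. The script does not pin a source interface, so each query follows the routing table of whatever machine runs it.

```python
import socket
import struct
import time


def build_query(hostname: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS A-record query in RFC 1035 wire format."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=A (1), QCLASS=IN (1).
    return header + qname + struct.pack("!HH", 1, 1)


def time_query(server: str, hostname: str, timeout: float = 2.0):
    """Return the round-trip time in seconds, or None on timeout or error."""
    packet = build_query(hostname)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(packet, (server, 53))
            sock.recvfrom(512)
        except OSError:  # covers timeouts and unreachable networks
            return None
        return time.monotonic() - start


def report(servers, hostname: str = "example.com") -> None:
    """Print one line per server showing whether and how fast it answered."""
    for server in servers:
        elapsed = time_query(server, hostname)
        status = "no answer" if elapsed is None else f"{elapsed * 1000:.0f} ms"
        print(f"{server}: {status}")
```

Calling `report(["192.0.2.53", "198.51.100.53"])` with the real forwarder addresses substituted in shows which ones answer. If the forwarders reached through one WAN never answer from the backup side, that points at DNS. If they all answer quickly, the cause is probably elsewhere.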
-
I did try your idea and it didn't fix the issue, but I also have some additional information to add that may make the whole thing practically not matter.
I did not mention my internal VLAN configuration in my original post because I didn't think it was relevant, however, apparently it is.
I am running three VLANs: Guest, Authenticated Users, and Management. The routers' management interfaces are, obviously, part of the Management VLAN. We have no layer 3 switches, so everything is a 'router on a stick' configuration.
My workstation is in the Authenticated Users vlan. Firewall rules are configured to allow my traffic in particular to access the management vlan. This works fine when I access the current master router on its management interface in the management vlan and, as mentioned before, not so much when accessing the backup router in similar fashion.
I decided to allow web GUI management traffic in the Authenticated Users VLAN from my personal workstation, and I can access BOTH routers' web GUIs within that VLAN without issue. This works with the DNS configuration you suggested, and it also works with the original configuration I started out with.
So it might yet still be a DNS issue, but there is something else at play here related to the VLANs and routing from one vlan to another. I suspect consultation of the routing table and DNS resolution attempts is in order here.
I would still like to solve this problem because I am of a mind that it should work the way I was originally trying to use it, but practically the need is somewhat less now. I would prefer to keep the web gui's management access away from the authenticated users vlan, but I also typically don't need to access the backup router's gui as a norm, and I can make a disabled firewall rule to allow that access and simply turn it on and off on an as-needed basis if I have to.
-
@bp81 said in Secondary router in HA setup web GUI unresponsive:
We are using the DNS Resolver in forwarding mode and we have four public DNS servers specified in System -> General Setup. Two use WAN1’s gateway and two use WAN2’s gateway.
Any reasons for stating gateways?
I suspect the issue is that the firewall is trying to resolve a dns name when opening the web GUI
Do you access the web GUI by host name?
I am running three VLANs: Guest, Authenticated Users, and Management. The routers' management interfaces are, obviously, part of the Management VLAN. We have no layer 3 switches, so everything is a 'router on a stick' configuration.
My workstation is in the Authenticated Users VLAN.
But the switch is VLAN-capable and your workstation is connected to a properly configured port?
Firewall rules are configured to allow my traffic in particular to access the management vlan. This works fine when I access the current master router on its management interface in the management vlan and, as mentioned before, not so much when accessing the backup router in similar fashion.
So you access the web interface by calling the management VLAN IP from user VLAN?
If so, why? That doesn't enhance security in any way.
This will end up in asymmetric routing issues. The client sends its request to the master, which passes the packet on to the backup firewall, but the responses from the backup are sent directly back to the client and don't pass through the master.
Though there is a workaround for this behavior, as I mentioned, accessing the web configurator by an IP from another network makes no sense at all.
Simply allow your computer web GUI access to "This firewall" and use the next interface address.
-
Taking them in order:
"Any reasons for stating gateways?"
Mostly because the documentation recommends it, though I have found it's not necessary. I presume that if a gateway is not specified, the machine consults the routing table to reach a particular DNS server.
"Do you access the web GUI by host name?"
Typically no, though I have been migrating to that. I am finding that this problem occurs whether I use IP addresses or hostnames.
"But the switch is VLAN-capable and your workstation is connected to a properly configured port?"
Yes to both. We are using Ubiquiti switching and port-based VLAN tagging. My workstation receives an IP address within the correct VLAN, and all other connectivity to internal and external hosts works as expected.
"So you access the web interface by calling the management VLAN IP from user VLAN? If so, why? That doesn't enhance security in any way."
Yes, because we haven't completed the migration of our current networking configuration. It's in process as I type. I'm either adding a second NIC to my machine that's part of the management VLAN for this purpose (we already have the wiring in the walls for this to work), or I will devise a way to harden web GUI access from within the authenticated user VLAN to only authorized machines. I am also considering setting up the Azure MFA extensions for NPS and protecting the web GUI login with RADIUS that is itself backed by AD authentication and multifactor authentication via the Authenticator app. That's not my first choice, because an internet outage could lock me out of my web GUI.
"This will end up in asymmetric routing issues. The client will send request to the master where the packet is passed to the backup firewall, but responses from the backup are sent back directly to the client and don't pass the master."
I am a little unclear on why this would be the case. Master router's management lan IP is 192.168.1.2. Backup router's management lan IP address is 192.168.1.3. The CARP IP is 192.168.1.1. Traffic sent to .2 should arrive at .1 first (CARP IP) and then get sent along to .2. Traffic destined for .3 it seems like would get the same treatment. Returning traffic would also seem to follow the same path. Return traffic from either .2 or .3 should go back to .1, to be handed off to the interface serving the authenticated user vlan and back to my workstation. Though it's entirely possible there's something happening under the hood with CARP that I am not aware of that would affect this. That would make a certain degree of sense, since if I am in the same subnet as the IP addresses I'm using to access the backup router's web gui, all seems to work as expected.
The thing that's a little bit of a headscratcher to me, is if what you're saying is correct, you'd assume that it wouldn't work at all. That is not the case. I can, nominally, contact the backup router's web gui. It does, in fact, respond...some. It is unusably slow, but it definitely does talk back at some level, as it will sometimes manage to partially load a page. I would assume with reverse interface mismatch issues that it simply would not work at all.
I think that given this works if my workstation can be in the same subnet, this is a far less urgent/concerning issue, especially considering that once we get transitioned into our final configuration it will work right anyway.
-
@bp81 said in Secondary router in HA setup web GUI unresponsive:
I presume if a gateway is not specified that the machine consults the routing table to attempt to reach a particular DNS server.
This is the desired behavior in most cases, I think. Possibly not for you, which is why I asked for the reason.
Traffic sent to .2 should arrive at .1 first (CARP IP) and then get sent along to .2. Traffic destined for .3 it seems like would get the same treatment. Returning traffic would also seem to follow the same path. Return traffic from either .2 or .3 should go back to .1
Any network device addresses its responses to the source IP of the request packets and sends them out according to its own routing table.
The source IP in the requests is an IP from the users VLAN. So the backup firewall checks its routing table, where it finds that IP's network assigned to its users VLAN interface. Hence it sends the packet out on this interface. There is no reason to send it back to the master at all.
You can sniff the traffic on the involved interfaces to see what's going on there.
If you want to keep using the management IP for whatever reason, you can masquerade packets destined for the other node on this interface.
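The routing decision described above can be sketched with a toy longest-prefix-match lookup in Python. The management addresses are the ones given earlier in the thread. The users VLAN subnet, the workstation address, and the node names are made up for illustration, and the sketch assumes the workstation's default gateway is the users VLAN CARP IP held by the master. It models only route lookups, not pf state tracking.

```python
import ipaddress

MGMT_NET = ipaddress.ip_network("192.168.1.0/24")    # from the thread
USERS_NET = ipaddress.ip_network("192.168.20.0/24")  # assumed for illustration
DEFAULT = ipaddress.ip_network("0.0.0.0/0")

BACKUP_MGMT = ipaddress.ip_address("192.168.1.3")    # from the thread
WORKSTATION = ipaddress.ip_address("192.168.20.50")  # assumed for illustration

# Routing table per node: (network, next hop). "on-link" means the
# destination is reachable directly on an attached interface.
TABLES = {
    "workstation": [(USERS_NET, "on-link"), (DEFAULT, "master")],
    "master": [(MGMT_NET, "on-link"), (USERS_NET, "on-link")],
    "backup": [(MGMT_NET, "on-link"), (USERS_NET, "on-link")],
}
OWNERS = {BACKUP_MGMT: "backup", WORKSTATION: "workstation"}


def lookup(routes, destination):
    """Longest-prefix match: return the next hop for the destination."""
    matches = [(net, hop) for net, hop in routes if destination in net]
    if not matches:
        raise LookupError(f"no route to {destination}")
    return max(matches, key=lambda match: match[0].prefixlen)[1]


def trace(source, destination):
    """List the nodes a packet visits from source to the destination's owner."""
    path = [source]
    node = source
    while node != OWNERS[destination]:
        hop = lookup(TABLES[node], destination)
        node = OWNERS[destination] if hop == "on-link" else hop
        path.append(node)
    return path


print(trace("workstation", BACKUP_MGMT))  # ['workstation', 'master', 'backup']
print(trace("backup", WORKSTATION))       # ['backup', 'workstation']
```

The request crosses the master, while the reply goes straight out of the backup's users VLAN interface, so the master only ever sees one direction of the conversation. Masquerading on the master changes the source address the backup sees to one in the management network, which pulls the reply back through the master.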
-
@bp81 said in Secondary router in HA setup web GUI unresponsive:
or I will devise a way to harden web gui access from within the authenticated user vlan to only authorized machines. I am also considering setting up the Azure MFA extensions for NPS and just protect the web gui login with RADIUS that is itself backed by AD authentication and multifactor authentication via Authenticator app. That's not my first choice because an internet outage could lock me out of my web gui.
You can always disable the anti-lockout rule for the authenticated users LAN and just allow authorised IPs.
A good password on top is probably all you need.
AD authentication opens up another attack surface too.
As for 2FA, it's a very bad idea for the exact reasons you just mentioned.
Now, since IPs can be changed and MACs can be spoofed, how much security is enough security for you?
You could also utilise a jump host, where you could SSH in and port-forward remote ports when needed, or use Windows and RDP to that device first, and then log in to pf.
-
@viragomann said in Secondary router in HA setup web GUI unresponsive:
@bp81 said in Secondary router in HA setup web GUI unresponsive:
I presume if a gateway is not specified that the machine consults the routing table to attempt to reach a particular DNS server.
This is the desired behavior in most cases, I think. Possibly not for you, which is why I asked for the reason.
Traffic sent to .2 should arrive at .1 first (CARP IP) and then get sent along to .2. Traffic destined for .3 it seems like would get the same treatment. Returning traffic would also seem to follow the same path. Return traffic from either .2 or .3 should go back to .1
Any network device addresses its responses to the source IP of the request packets and sends them out according to its own routing table.
The source IP in the requests is an IP from the users VLAN. So the backup firewall checks its routing table, where it finds that IP's network assigned to its users VLAN interface. Hence it sends the packet out on this interface. There is no reason to send it back to the master at all.
You can sniff the traffic on the involved interfaces to see what's going on there.
If you want to keep using the management IP for whatever reason, you can masquerade packets destined for the other node on this interface.
I admit I had to think about this and stare at routing tables altogether too long, but I do think I understand what's happening now and why.
-
@netblues said in Secondary router in HA setup web GUI unresponsive:
@bp81 said in Secondary router in HA setup web GUI unresponsive:
or I will devise a way to harden web gui access from within the authenticated user vlan to only authorized machines. I am also considering setting up the Azure MFA extensions for NPS and just protect the web gui login with RADIUS that is itself backed by AD authentication and multifactor authentication via Authenticator app. That's not my first choice because an internet outage could lock me out of my web gui.
You can always disable the anti-lockout rule for the authenticated users LAN and just allow authorised IPs.
A good password on top is probably all you need.
AD authentication opens up another attack surface too.
As for 2FA, it's a very bad idea for the exact reasons you just mentioned.
Now, since IPs can be changed and MACs can be spoofed, how much security is enough security for you?
You could also utilise a jump host, where you could SSH in and port-forward remote ports when needed, or use Windows and RDP to that device first, and then log in to pf.
This is probably a topic for another thread. We have good Wi-Fi security (RADIUS-backed authentication) and pretty good physical security (i.e., no one is walking in and plugging a laptop into an open network port). We also have the guest VLANs blocked from sending any traffic to the web GUI. So is this good enough? Probably, for the moment.
Over the years our security efforts have been focused towards external threats, but the company is getting large enough now I have to start thinking about internal actors as well. This is a conversation I'd like to have on this particular issue, because I have to start somewhere, but it's probably best to go into its own topic.