Not receiving down emails multi-wan in failover config in 24.03 SG1100
-
@Mission-Ghost said in Not receiving down emails multi-wan in failover config in 24.03 SG1100:
Is my understanding correct?
Yes, it is. The one exception there is that you cannot use a load-balancing gateway group as the default gateway, whereas it can be used for policy routing. But if it's a failover gateway group (gateways with different tiers), that should be fine.
I would check the routing table before and after failing over to make sure the default route changes and the static routes to the monitoring IPs remain.
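If it helps, here is a minimal Python sketch of that before/after comparison. It assumes the six-column row layout the routing table shows when copied as text (destination, gateway, flags, a counter, MTU, interface). The two snapshots and their 203.0.113.1 / 198.51.100.1 gateways are made-up illustration values, not taken from any real system, and `parse_routes` and `check_failover` are hypothetical helper names.

```python
# Hypothetical snapshots of the IPv4 routing table, one route per line.
BEFORE = """
default      203.0.113.1   UGS   7  1500 mvneta0.4090
8.8.8.8      203.0.113.1   UGHS 25  1500 mvneta0.4090
8.8.4.4      198.51.100.1  UGHS 26  1500 mvneta0.4092
"""

AFTER = """
default      198.51.100.1  UGS   8  1500 mvneta0.4092
8.8.8.8      203.0.113.1   UGHS 25  1500 mvneta0.4090
8.8.4.4      198.51.100.1  UGHS 26  1500 mvneta0.4092
"""


def parse_routes(text):
    """Return {destination: (gateway, interface)} for six-column rows."""
    routes = {}
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # skip headers or rows in a different layout
        dest, gateway, _flags, _count, _mtu, iface = fields
        routes[dest] = (gateway, iface)
    return routes


def check_failover(before, after, monitor_ips):
    """Report whether the default route moved and which monitor routes vanished."""
    b, a = parse_routes(before), parse_routes(after)
    return {
        "default_changed": b.get("default") != a.get("default"),
        "monitor_routes_lost": [ip for ip in monitor_ips if ip in b and ip not in a],
    }


result = check_failover(BEFORE, AFTER, ["8.8.8.8", "8.8.4.4"])
print(result)
```

A healthy failover should report that the default route changed and that no monitor routes were lost.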
-
I’ll take a picture of the routing table to compare, should I be able to catch this failure in the act. Often it’s between 3-4am when Starlink reboots for a firmware update. I’m not usually awake then.
Looking, it’s curious that both Starlink gateways have the same IPv4 address of 100.64.0.1, but the two interfaces have different addresses: A=100.117.61.17, B=100.68.183.215.
Is this a factor? How does pfSense figure which gateway to use to send out normal network traffic? (It does, the traffic graphs show it.) Is it the failover Tier that’s somehow steering the traffic to the tier 1 route when it’s up?
The default gateway in the routing table is 100.64.0.1. But which gateway is that?
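As a rough illustration of why the address alone can't answer that, here is a small Python sketch using the addresses quoted above. It only checks two things: that every address sits inside 100.64.0.0/10 (the shared address space reserved for carrier-grade NAT in RFC 6598), and that both WANs report the same gateway IP. The `wans` dictionary and the helper names are hypothetical and just for this example.

```python
import ipaddress

# Shared address space used for carrier-grade NAT (RFC 6598).
CGNAT = ipaddress.ip_network("100.64.0.0/10")

# Interface and gateway addresses as quoted in this thread.
wans = {
    "A": {"address": "100.117.61.17", "gateway": "100.64.0.1"},
    "B": {"address": "100.68.183.215", "gateway": "100.64.0.1"},
}


def in_cgnat(addr):
    """True if addr falls inside the CGNAT shared address space."""
    return ipaddress.ip_address(addr) in CGNAT


def shared_gateways(wans):
    """Return {gateway_ip: [wan names]} for gateway IPs used by more than one WAN."""
    by_gateway = {}
    for name, wan in wans.items():
        by_gateway.setdefault(wan["gateway"], []).append(name)
    return {gw: names for gw, names in by_gateway.items() if len(names) > 1}


all_cgnat = all(in_cgnat(w["address"]) and in_cgnat(w["gateway"]) for w in wans.values())
shared = shared_gateways(wans)
print(all_cgnat, shared)
```

Since 100.64.0.1 maps to both WANs, the gateway IP by itself is ambiguous; the interface column of the routing table is the only thing that tells the two apart.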
-
@Mission-Ghost here is my current routing table; I’m confused how it’s working at all:
IPv4 Routes
default            100.64.0.1  UGS    7   1500   mvneta0.4090
1.0.0.2            100.64.0.1  UGHS   25  1500   mvneta0.4090
1.1.1.2            100.64.0.1  UGHS   25  1500   mvneta0.4090
9.9.9.9            100.64.0.1  UGHS   25  1500   mvneta0.4090
10.10.10.1         link#7      UH     22  16384  lo0
34.120.255.244     link#12     UHS    4   1500   mvneta0.4092
100.64.0.0/10      link#12     U      9   1500   mvneta0.4092
100.68.183.215     link#7      UHS    1   16384  lo0
100.117.61.17      link#7      UHS    3   16384  lo0
127.0.0.1          link#7      UH     2   16384  lo0
149.112.112.112    100.64.0.1  UGHS   25  1500   mvneta0.4090
192.168.2.0/24     link#11     U      5   1500   mvneta0.4091
192.168.2.1        link#7      UHS    12  16384  lo0
192.168.10.0/24    link#13     U      15  1500   mvneta0.10
192.168.10.1       link#7      UHS    11  16384  lo0
192.168.20.0/24    link#14     U      10  1500   mvneta0.20
192.168.20.1       link#7      UHS    13  16384  lo0
192.168.30.0/24    link#15     U      14  1500   mvneta0.30
192.168.30.1       link#7      UHS    17  16384  lo0
192.168.40.0/24    link#16     U      16  1500   mvneta0.40
192.168.40.1       link#7      UHS    19  16384  lo0
192.168.50.0/24    link#17     U      18  1500   mvneta0.50
192.168.50.1       link#7      UHS    21  16384  lo0
192.168.60.0/24    link#18     U      20  1500   mvneta0.60
192.168.60.1       link#7      UHS    23  16384  lo0
192.168.100.0/24   link#12     U      8   1500   mvneta0.4092
192.168.100.1      link#12     UHS    4   1500   mvneta0.4092
192.168.100.2      link#7      UHS    3   16384  lo0
206.214.239.195    link#10     UHS    6   1500   mvneta0.4090
-
Ah, yes, having the same gateway on both WANs is a problem. It can partially work for some things that are routed via an interface specifically, but many things will not work since it routes to the gateway IP.
I don't see the static routes to 8.8.8.8 or 8.8.4.4. Are those normally there? They should be unless you have deliberately checked the option not to add them.
But potentially gateway monitoring might not be using the correct link there.
Commonly one WAN would be put behind the ISP router NATing the connection to work around this.
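For what it's worth, that check for the monitor routes can be sketched in a few lines of Python. The rows below are the gateway routes copied from the routing table posted above. The flag letters follow the usual netstat meaning (G = via a gateway, H = host route, S = static), and `static_host_routes` is a hypothetical helper name.

```python
# Gateway routes copied from the routing table posted earlier in this thread.
ROUTES = """
default 100.64.0.1 UGS 7 1500 mvneta0.4090
1.0.0.2 100.64.0.1 UGHS 25 1500 mvneta0.4090
1.1.1.2 100.64.0.1 UGHS 25 1500 mvneta0.4090
9.9.9.9 100.64.0.1 UGHS 25 1500 mvneta0.4090
149.112.112.112 100.64.0.1 UGHS 25 1500 mvneta0.4090
"""


def static_host_routes(text):
    """Return {destination: interface} for static host routes via a gateway."""
    hosts = {}
    for line in text.strip().splitlines():
        dest, _gateway, flags, _count, _mtu, iface = line.split()
        # Keep only rows carrying all of G (gateway), H (host), S (static).
        if all(flag in flags for flag in "GHS"):
            hosts[dest] = iface
    return hosts


hosts = static_host_routes(ROUTES)
missing = [ip for ip in ("8.8.8.8", "8.8.4.4") if ip not in hosts]
print(sorted(hosts), missing)
```

Run against that table, both 8.8.8.8 and 8.8.4.4 come back as missing, and every host route that is present uses mvneta0.4090.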
-
@stephenw10 said in Not receiving down emails multi-wan in failover config in 24.03 SG1100:
Ah, yes, having the same gateway on both WANs is a problem. It can partially work for some things that are routed via an interface specifically, but many things will not work since it routes to the gateway IP.
I don't see the static routes to 8.8.8.8 or 8.8.4.4. Are those normally there? They should be unless you have deliberately checked the option not to add them.
But potentially gateway monitoring might not be using the correct link there.
Commonly one WAN would be put behind the ISP router NATing the connection to work around this.
I don't know if the static routes to 8.8.8.8 and 8.8.4.4 are normally there. I've never had to know and haven't noted it previously. It just worked. (And it does appear to just work; monitoring seems to be functioning as expected.)
It seems to see when one gateway drops offline without mistaking it for the other.
I have made sure the System/Routing/Gateways/Edit Static Route "Do not" box is UNchecked on both gateways, and I never deliberately checked it, as I saw no specific reason to do so.
Is something not working properly to show no static routes to the monitoring address?
Should I add a static route to the monitoring IPs given maybe it's not doing it as expected? Could this be a bug?
I've never (competently) set up static routes myself and don't know much about how to do it.
Unfortunately, I have no direct configuration control over the address of Starlink's dish, which is the gateway. A quick search hasn't suggested that there is even a sketchy, unsupported way to change it using just Starlink's hardware.
Is there a way to work around it in pfSense?
Both Starlink routers are already in bypass mode. This means their router is completely disabled and the traffic passes straight from the dish to the pfSense SG1100 via Ethernet without their router interfering.
Starlink uses CGNAT in their network (it's actually double-NATed [at least], but that hasn't seemed to be an issue with our LAN use case).
-
@Mission-Ghost said in Not receiving down emails multi-wan in failover config in 24.03 SG1100:
Both Starlink routers are already in bypass mode.
Exactly. If you take one of them out of bypass mode it will become the gateway for that WAN with a different local subnet and thus pfSense will see a different gateway IP.
-
I've tried an experiment with virtual IPs but admit I don't know what I'm doing. This level of networking exceeds my experience, most of which is 30 years old.
I based my efforts on Tech with Shae's video on gaining access to the Starlink stats pages on the dish's own subnet 192.168.100.0/24 (stats pages now apparently removed by Starlink)...but still, maybe it might help pfSense find a consistent way out.
I have virtual IPs for each dish interface (A=192.168.100.2; B=192.168.100.3):
Virtual IP Address
192.168.100.2/32 | STARLINKA | IP Alias | Starlink A dish management interface subnet access.
10.10.10.1/32 | Localhost | IP Alias | pfB DNSBL - DO NOT EDIT
192.168.100.3/32 | STARLINKB | IP Alias | Starlink B dish management interface subnet access.
I've then set up hybrid outbound NAT rules for both my network management .10 subnet and This Firewall (self) based on Shae's technique:
Mappings
Interface | Source | Source Port | Destination | Destination Port | NAT Address | NAT Port | Static Port | Description
STARLINKA | 192.168.10.0/24 | * | 192.168.100.0/24 | * | 192.168.100.2 (Starlink A dish management interface subnet access.) | * | | Starlink A dish management interface subnet access.
STARLINKA | This Firewall (self) | * | 192.168.100.0/24 | * | 192.168.100.2 (Starlink A dish management interface subnet access.) | * | | Starlink A dish management interface subnet access.
STARLINKB | 192.168.10.0/24 | * | 192.168.100.0/24 | * | 192.168.100.3 (Starlink B dish management interface subnet access.) | * | | Starlink B dish management interface subnet access.
STARLINKB | This Firewall (self) | * | 192.168.100.0/24 | * | 192.168.100.3 (Starlink B dish management interface subnet access.) | * | | Starlink B dish management interface subnet access.
STARLINKA | This Firewall (self) | 123 (NTP) | * | 123 (NTP) | STARLINKA address | * | | NAT firewall NTP via Starlink A. Used to ensure NTP works w/o IPv6 errors.
STARLINKB | This Firewall (self) | 123 (NTP) | * | 123 (NTP) | STARLINKB address | * | | NAT firewall NTP via Starlink B. Used to ensure NTP works w/o IPv6 errors.
STARLINKA | Addresses_All_VLANs | Ports__VoIP_WiFi_Calling | * | Ports__VoIP_WiFi_Calling | STARLINKA address | * | | Static mapping WiFi-calling ports on Starlink.
STARLINKA | Addresses_Guest_Games_Network | Ports__Games | * | Ports__Games | STARLINKA address | * | | Static mapping Games' ports on Starlink.
STARLINKB | Addresses_All_VLANs | Ports__VoIP_WiFi_Calling | * | Ports__VoIP_WiFi_Calling | STARLINKB address | * | | Static mapping WiFi-calling ports on T-Mobile.
STARLINKB | Addresses_Guest_Games_Network | Ports__Games | * | Ports__Games | STARLINKB address | * | | Static mapping Games' ports on T-Mobile.
What I've found is I now have a default route for each specific hardware interface for each dish (mvneta0.4092 and 4090) but the other monitoring and DNS routes appear to only be going out the A(lpha) Starlink dish on 4090, even the ones that should go out the B(ravo) dish on 4092:
IPv4 Routes
default            100.64.0.1  UGS    0   1500   mvneta0.4092
default            100.64.0.1  UGS    0   1500   mvneta0.4090
1.0.0.2            100.64.0.1  UGHS   29  1500   mvneta0.4090
1.1.1.2            100.64.0.1  UGHS   29  1500   mvneta0.4090
8.8.4.4            100.64.0.1  UGHS   29  1500   mvneta0.4090
8.8.8.8            100.64.0.1  UGHS   29  1500   mvneta0.4090
9.9.9.9            100.64.0.1  UGHS   29  1500   mvneta0.4090
[...pfBlocker removed for brevity...]
34.120.255.244     link#12     UHS    4   1500   mvneta0.4092
100.64.0.0/10      link#12     U      9   1500   mvneta0.4092
100.68.183.215     link#7      UHS    7   16384  lo0
100.117.61.17      link#7      UHS    3   16384  lo0
127.0.0.1          link#7      UH     2   16384  lo0
149.112.112.112    100.64.0.1  UGHS   29  1500   mvneta0.4090
192.168.2.0/24     link#11     U      5   1500   mvneta0.4091
192.168.2.1        link#7      UHS    12  16384  lo0
[...VLANs removed for brevity...]
192.168.100.0/24   link#12     U      8   1500   mvneta0.4092
192.168.100.1      link#12     UHS    4   1500   mvneta0.4092
192.168.100.2      link#7      UHS    3   16384  lo0
192.168.100.3      link#7      UH     30  16384  lo0
206.214.239.195    link#10     UHS    24  1500   mvneta0.4090
This is confusing, as the monitoring for the Bravo dish seems to be working correctly...latency is different from the Alpha dish and so on.
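To make the pattern easier to see, here is a small Python sketch that groups the 100.64.0.1 routes from the table above by the interface they leave on. It only restates what the table already shows and says nothing about how pfSense picks a link internally; `routes_by_interface` is a hypothetical helper name.

```python
from collections import defaultdict

# Routes via 100.64.0.1, copied from the routing table above.
ROUTES = """
default 100.64.0.1 UGS 0 1500 mvneta0.4092
default 100.64.0.1 UGS 0 1500 mvneta0.4090
1.0.0.2 100.64.0.1 UGHS 29 1500 mvneta0.4090
1.1.1.2 100.64.0.1 UGHS 29 1500 mvneta0.4090
8.8.4.4 100.64.0.1 UGHS 29 1500 mvneta0.4090
8.8.8.8 100.64.0.1 UGHS 29 1500 mvneta0.4090
9.9.9.9 100.64.0.1 UGHS 29 1500 mvneta0.4090
149.112.112.112 100.64.0.1 UGHS 29 1500 mvneta0.4090
"""


def routes_by_interface(text, gateway):
    """Group the destinations that use `gateway` by outgoing interface."""
    grouped = defaultdict(list)
    for line in text.strip().splitlines():
        dest, gw, _flags, _count, _mtu, iface = line.split()
        if gw == gateway:
            grouped[iface].append(dest)
    return dict(grouped)


grouped = routes_by_interface(ROUTES, "100.64.0.1")
for iface, dests in grouped.items():
    print(iface, dests)
```

Grouped this way, mvneta0.4092 carries only a default route, while every monitoring and DNS host route is attached to mvneta0.4090.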
So now I'm way out on a limb I don't understand and I don't know what's going on.
Any advice and insights would be appreciated. Thanks.
-
@stephenw10 said in Not receiving down emails multi-wan in failover config in 24.03 SG1100:
@Mission-Ghost said in Not receiving down emails multi-wan in failover config in 24.03 SG1100:
Both Starlink routers are already in bypass mode.
Exactly. If you take one of them out of bypass mode it will become the gateway for that WAN with a different local subnet and thus pfSense will see a different gateway IP.
Unfortunately, then I would have the Starlink router blasting unneeded and unwanted WiFi all over the premises, and it won't send the Internet traffic into the pfSense router via the Ethernet adapter.
Bypass = Ethernet to pfSense
No-bypass = WiFi and no Ethernet
-
Wow, there's no Ethernet when it's acting as a router? That sucks.
Some other router in between might be your only option then. Two gateways with the same address is a conflict and can never work correctly. The only exception to that is for PPPoE links because they are point-to-point connections. But even then some things will misbehave.
-
@stephenw10 said in Not receiving down emails multi-wan in failover config in 24.03 SG1100:
Wow, there's no Ethernet when it's acting as a router? That sucks.
Some other router in between might be your only option then. Two gateways with the same address is a conflict and can never work correctly. The only exception to that is for PPPoE links because they are point-to-point connections. But even then some things will misbehave.
I think I misspoke; the Ethernet apparently does work when their router is enabled, and from some more reading it appears it uses the SL router's DHCP to serve a different IP address range, in the 192.168.1.0/24 set, to the Ethernet. Ok, yay. In bypass mode, the SL router's DHCP server is disabled and the dish's own 192.168.100.1 address is served to the pfSense from the dish, so pfSense gets the same IP from each dish.
It certainly does suck, though, because I still want the WiFi completely off. I have my own access points and the SL WiFi will pollute the airwaves with traffic that is useless to me. I'll have to keep looking to see if there's a way to shut off WiFi without bypass mode, so I can keep the SL DHCP server delivering different IP addresses than the dish range. So far it doesn't seem so.
Starlink as a company behaves as if it were founded by a control freak. Strange.
While they have built a groundbreaking and well-functioning service in many respects, their terrestrial consumer-facing engineering seems to be where they assign the unpaid summer interns.
I appreciate your help and attention. Best regards.