Starlink - No Internet when "Reject leases from" Configured
-
@stephenw10 said in Starlink - No Internet when "Reject leases from" Configured:
You probably need to check what the IP address of the server is when it's working correctly. It might not necessarily be the gateway address for example. It could be the local modem device still.
What do you see in the leases file in /var/db/dhclient.leases.xxx ?
@stephenw10 FYI, I am using Starlink in Bypass router mode (I'm not using the Starlink Wi-Fi router, the dish is plugged directly into my Netgate SG-5100).
Attached is what I see in the file you referenced with the gateway Online and working. Note: This is without 192.168.100.1 in the Reject leases from field.
I specifically see:
fixed-address 98.97.98.186;
option routers 98.97.96.1; (this is currently my gateway IP address)
-
Hmm, that looks like what I'd expect and should not be a problem rejecting leases from 192.168.100.1.
Do the dhcp logs show the same for dhclient? Is it seeing the dhcp replies from the upstream gateway IP?
-
@stephenw10 I think this is what you are asking for...
Feb 27 09:57:46 dhclient 85970 RENEW
Feb 27 09:57:46 dhclient 86350 Creating resolv.conf
Feb 27 09:57:46 dhclient 74059 bound to 98.97.98.186 -- renewal in 150 seconds.
Feb 27 10:00:16 dhclient 74059 DHCPREQUEST on ix0 to 98.97.96.1 port 67
Feb 27 10:00:16 dhclient 74059 DHCPACK from 98.97.96.1
Feb 27 10:00:16 dhclient 74059 unknown dhcp option value 0x52
I'm not sure how to check if it is seeing the dhcp replies from the upstream gateway IP. Could you tell me how to determine that, please?
-
@midihead7 said in Starlink - No Internet when "Reject leases from" Configured:
DHCPACK from 98.97.96.1
Yup it is coming from the gateway. I would not expect refusing leases from 192.168.100.1 to make any difference in that situation. So perhaps when it fails something else is happening.
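For anyone wanting to check this on their own box: the server that answered is the address after `DHCPACK from` (or `DHCPOFFER from`) in the dhclient log lines, as in the excerpt above. A minimal shell sketch that lists the distinct responding servers follows. The function name `dhcp_servers` is made up for illustration, and the log path in the usage comment is an assumption about where a given install writes its DHCP log.

```shell
#!/bin/sh
# Hypothetical helper: list each server address that dhclient logged a
# DHCPOFFER or DHCPACK from, one per line, de-duplicated.
dhcp_servers() {
    # $1 = path to a log file containing dhclient messages
    grep 'dhclient' "$1" \
        | grep -E 'DHCP(OFFER|ACK) from' \
        | awk '{ print $NF }' \
        | sort -u
}

# Usage (log path is an assumption; adjust to your system):
#   dhcp_servers /var/log/dhcpd.log
```

If 192.168.100.1 ever shows up in that list, the dish itself answered at some point; if only the upstream gateway appears, the reject rule never had anything to reject.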
-
@stephenw10 said in Starlink - No Internet when "Reject leases from" Configured:
Yup it is coming from the gateway. I would not expect refusing leases from 192.168.100.1 to make any difference in that situation. So perhaps when it fails something else is happening.
Good Morning (afternoon where you are?) Stephen,
Glad to see you on this. So far you are 2 for 2 on problems I've found and I just found this one last week. Like "midi" I too run a Netgate appliance (SG-4860) and I had not seen this problem until recently, but I'm having the same issue. I did not use the same setup that he presented for Starlink but came to much the same configuration. I was not having this issue on 22.05 that I was running on a different SG-4860. When I upgraded to 23.01 I decided it was time to convert to ZFS so I built out my backup hardware and moved to 23.01 (after you helped with the ichsmb0 flood).
I periodically have pulled the network connection from the Starlink power brick and put the Starlink router back online. Mainly to keep it somewhat current and to make sure the app on my phone updates accordingly. Before, when I went back to the pfSense connection my system would take about 5 minutes to come online. Now, with the different hardware (same model) and 23.01, it just would not connect without reboots of both, and physically pulling and reconnecting the ethernet connection to WAN. Once it is established, it's rock solid and I can reboot at will with no issues. Trying to pin it down, I wanted to eliminate the new hardware, so I went back to the original box (also updated to ZFS/23.01) and same issue. So I reverted back to the backup hardware.
So it's either something in 23.01 or something at Starlink. But if I remove the 192.168.100.1 it seems to boot much cleaner but still not seamless. That address is the default for the DISH itself; my system is running on 98.97.59.x and the gateway is on 100.64.0.x.
I don't know if switching routers is confusing the Starlink box or if it's in pfSense? I'm going to put in a ticket with them this morning.
There is never an issue when I physically disconnect pfSense and plug in the Starlink native router to the Starlink box/power brick/POE injector/whatever it's called. It's only a problem when going back to pfSense.
Rick
-
@ramosel said in Starlink - No Internet when "Reject leases from" Configured:
Once it is established, it's rock solid and I can reboot at will with no issues.
Is that rebooting pfSense? If you reboot the Starlink device does it fail again?
There has definitely been a change in the timing at boot in 23.01 and it seems to be affecting some devices where the timing is more critical. Mostly we have seen this when both pfSense and an upstream modem are recovering from a power outage. If pfSense requests a dhcp lease before the upstream device has finished booting it can pull a bad lease or kill the dhcp client and fail to pull any lease. In almost every case you can simply add a delay to the pfSense boot to allow anything upstream to complete its boot first.
This sounds like you might be hitting that or something similar.
Steve
-
@stephenw10 said in Starlink - No Internet when "Reject leases from" Configured:
Is that rebooting pfSense? If you reboot the Starlink device does it fail again?
Yes to rebooting pfSense. I have not tested the Starlink reboot. I just made the change to the "Timeout" but haven't been able to test the disconnect/reconnect yet. (not "Select timeout" as Midi posted above.)
There has definitely been a change in the timing at boot in 23.01 and it seems to be affecting some devices where the timing is more critical. Mostly we have seen this when both pfSense and an upstream modem are recovering from a power outage. If pfSense requests a dhcp lease before the upstream device has finished booting it can pull a bad lease or kill the dhcp client and fail to pull any lease. In almost every case you can simply add a delay to the pfSense boot to allow anything upstream to complete its boot first.
Interesting... I run dpinger and I often (not every time) find this service stopped when I am trying to get the re-connection to link. I have to manually restart dpinger.
Is it important to have the "Reject leases from" populated with the Starlink Dish address?
I'll let you know, I should have a window to test tomorrow AM.
Thanks Steve,
Rick
-
If the Starlink dish will hand out an IP itself before it has a link to the Starlink network then that's what you would normally want to prevent by rejecting the leases. Devices usually do that to allow accessing the modem in order to troubleshoot the connection. But doing so means pfSense pulls a lease from that private subnet and will not pull the correct lease until it renews, which is usually ~1hr.
To allow the upstream device to boot fully before pfSense you can create the file /boot/loader.conf.local and add to it:
autoboot_delay="30"
Or whatever the minimum delay required is.
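A minimal sketch of making that change from a shell, for those who prefer not to edit the file by hand. The helper name `set_autoboot_delay` is hypothetical; it replaces any existing `autoboot_delay` line rather than appending a duplicate. It needs to run as root against the real file, and 30 is only a starting point.

```shell
#!/bin/sh
# Hypothetical helper: set autoboot_delay in a loader config file,
# replacing an existing entry instead of adding a second one.
set_autoboot_delay() {
    file="$1"    # e.g. /boot/loader.conf.local
    delay="$2"   # delay in seconds
    touch "$file"
    # Copy every line except an existing autoboot_delay setting.
    # grep exits non-zero when nothing is left to copy, which is fine here.
    grep -v '^autoboot_delay=' "$file" > "$file.tmp" || true
    printf 'autoboot_delay="%s"\n' "$delay" >> "$file.tmp"
    mv "$file.tmp" "$file"
}

# Usage:
#   set_autoboot_delay /boot/loader.conf.local 30
```

Because the function rewrites rather than appends, running it again with a smaller number is a safe way to work down toward the minimum delay that still lets the upstream device finish booting.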
-
@stephenw10 said in Starlink - No Internet when "Reject leases from" Configured:
To allow the upstream device to boot fully before pfSense you can create the file /boot/loader.conf.local and add to it:
autoboot_delay="30"
Or whatever the minimum delay required is.
What is the unit of the value? Seconds, minutes?
Do I set this value AND the Protocol timing/Timeout under Interfaces/WAN?
-
It's in seconds. By default it's set to 3s to allow you to select something from the bootloader menu but can be set much higher.
Where that helps is if the upstream device bounces the link during its boot. If pfSense tries to pull a lease at the point the link is down the dhclient just fails out and no delay setting within the client config will help. Setting a delay before boot prevents hitting that.
Steve
-
@stephenw10 Thanks for the clarification on the boot delay seconds. I probably won't leave it at 30 but I do like having more than 3.
Here is the scenario. Working system with a valid WAN connection via the Starlink Brick. Reboot either device and pfSense gets a valid online condition with a v4/DHCP4 address.
Pull the WAN connection from the Starlink brick and plug in the Starlink router. Let it fully come up; the router's ethernet connection is valid and working. The WiFi connections are valid, everything works. Reboot the Starlink router, everything comes back online. Reboot the Starlink dish, everything comes back online.
Unplug the Starlink router from the brick and plug in the SG-4860 WAN port.
But, even with the Timeout value set and the delay, rebooting together, rebooting separately with one first then the other, it just does not want to connect if the 192.168.100.1 is set in the "Reject Leases from" field. I was monitoring both the console port on the SG-4860 and the GUI via a raspberry pi I had running there. After the Starlink brick is rebooted or power cycled, I would get an "online" condition that lasted about 10 seconds then dropped out. I would even see the v4/DHCP4 address on the console, but as I refreshed the screen, it too would drop. Then it is back to physically unplugging the wan connection at the Starlink brick multiple times and eventually you get an online condition. Once you have a valid connection you can reboot either and the system comes back online with a valid WAN address via DHCP. Which oddly enough is on the 100.91.143.x/10 network.
Just to add another piece to this puzzle. Spoke to someone local who (like Midi) is also running a Netgate 6100 with Starlink as her primary and a 5G wireless carrier as a backup WAN connection set up in failover. If she loses Starlink and pfSense switches to the 5G wireless, it will never switch back to Starlink. Even if she physically unplugs the 5G connection, it will not reattach Starlink... until she plays the reboot and plug/unplug game. Before, she had AT&T Uverse as her primary, same setup, and the failover worked fine. She has not moved to 23.01 yet.
Just getting that initial connection to work is the key. I know it's probably something in the Starlink negotiation... but they say if it comes up OK on their router, it's not their problem.
Let me know if I can do anything or pull data for you.
Thanks,
Rick
-
Hmm, so to be clear if you reject leases from 192.168.100.1 it never pulls a lease? Even with reconnecting the WAN etc?
But if you don't reject those leases pfSense ends up with a local lease from the Starlink brick and doesn't pull the CGN IP that allows it to connect out?
I think we tried this earlier in the thread but it would be good to check the logs to see what the DHCP server IP is that hands out the lease in each case.
It's hard to see why refusing those leases would prevent pulling leases from another server, if that's what's actually happening.
Steve
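One way to get at that, sketched under a couple of assumptions: the dhclient lease file records the answering server as `option dhcp-server-identifier`, so printing that alongside `fixed-address` for each stored lease shows who handed out what. The helper name `lease_servers` is made up here, and the interface suffix on the file name has to be filled in for the WAN in question.

```shell
#!/bin/sh
# Hypothetical helper: for each lease block in a dhclient leases file,
# print "<leased address> from <server>".
lease_servers() {
    # $1 = leases file, e.g. /var/db/dhclient.leases.<interface>
    awk '
        $1 == "fixed-address" {
            gsub(/;/, "", $2); addr = $2
        }
        $1 == "option" && $2 == "dhcp-server-identifier" {
            gsub(/;/, "", $3); srv = $3
        }
        $1 == "}" {
            # End of one lease block: report it and reset for the next.
            print addr " from " (srv == "" ? "unknown" : srv)
            addr = ""; srv = ""
        }
    ' "$1"
}
```

Running it once with the reject entry set and once without would show whether the lease that fails was actually issued by 192.168.100.1 or by the upstream gateway.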
-
@stephenw10
I was watching the boot cycle during re-connect issue today. I kept refreshing the console and it did try to connect to 192.168.100.100. Then it dropped, when it finally connected it was to 100.91.143.xx/10.
The gateway shows 100.64.0.xx
My DDNS cached address is 98.97.61.xx
I posted the real data to a direct chat.
There was another post on Reddit last night with a Starlink user having trouble with pfSense. Is anyone on the Dev team running Starlink?
Rick
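For reference when reading addresses like the ones above: 192.168.100.x is RFC 1918 private space (the dish's own local subnet in this thread), while 100.64.0.0/10 is the shared address space reserved for carrier-grade NAT (RFC 6598). That would make a 100.91.143.x lease with a 100.64.0.x gateway the expected CGNAT result rather than an oddity. A rough shell sketch for sorting addresses into those buckets follows; the function name `classify_ipv4` is hypothetical and it does no input validation.

```shell
#!/bin/sh
# Hypothetical helper: label an IPv4 address as "cgnat" (100.64.0.0/10),
# "private" (RFC 1918) or "other". The body runs in a subshell so the
# IFS change and "set --" do not leak into the caller.
classify_ipv4() (
    set -f          # no filename expansion while splitting
    IFS=.
    set -- $1       # $1..$4 now hold the four octets
    if [ "$1" -eq 100 ] && [ "$2" -ge 64 ] && [ "$2" -le 127 ]; then
        echo "cgnat"
    elif [ "$1" -eq 10 ] ||
         { [ "$1" -eq 172 ] && [ "$2" -ge 16 ] && [ "$2" -le 31 ]; } ||
         { [ "$1" -eq 192 ] && [ "$2" -eq 168 ]; }; then
        echo "private"
    else
        echo "other"
    fi
)

# Usage:
#   classify_ipv4 100.91.143.1
#   classify_ipv4 192.168.100.100
```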
-
@ramosel said in Starlink - No Internet when "Reject leases from" Configured:
Is anyone on the Dev team running Starlink?
Not yet as far as I know. That would make this a lot easier!
-
@stephenw10 said in Starlink - No Internet when "Reject leases from" Configured:
@ramosel said in Starlink - No Internet when "Reject leases from" Configured:
Is anyone on the Dev team running Starlink?
Not yet as far as I know. That would make this a lot easier!
It would, but I'm willing to help if anyone wants to look.
Otherwise, Austin is not that far from Boca Chica.
-
Hmm, there's enough Starlink users this has got to be a known issue.
Just to be clear if you remove the reject leases line from the config it ends up with a lease from that IP that cannot connect externally?
-
@stephenw10 Correct. It gets a lease on 192.168.100.1 for a short time, probably no more than 10 seconds (no subnet delineation) drops it and gets one on 100.91.143.1/10. The final one works.
-
Hmm, maybe I've lost track here then. What's the problem with not rejecting that then?
I thought that was required to prevent it ending up with a bad lease that could not connect.
-
@stephenw10 That's OK, I get lost and I'm in it....
If the reject address is there, it never gets the 192.168.100.100 address... so it just sits. Never connects.
If it's removed, it gets the 192.168.100.100, drops it and then you go into this mode where you need to reboot, unplug, replug... then it finally gets an address on 100.91.143.xx/10 and works. Then it will reboot and reconnect just fine.
But if there is a prolonged outage, or I plug in the Starlink Native router (to keep it up to date), then you start this whole dance over again.
What I'm hearing from the lady running Starlink as a primary WAN and a Verizon 5G Home wireless gateway as a secondary in failover is that if Starlink fails over to Verizon, it never comes back unless she does the same reboot/unplug dance I'm doing on the Starlink-only system.
Hope that clears it up...
When it's working, it works great!
-
@ramosel Just FYI, the outage on Friday that started the latest round of reconnect fun was legit. Starlink had an expired cert on one of their ground stations. They were offline for half an hour or more. That was the cause of the lost connection.