Always Wan-ip but gateway is 100% packet loss
-
@AcidSleeper said in Always Wan-ip but gateway is 100% packet loss:
but if the gateway says "Online" but I still can't browse the Internet, I checked Services -> DNS Resolver and Zone was still not populating. Restarted and now I can surf.
Sounds like 2 issues when you are in this state.
a) So you can get "online" and it stays that way? Or does it just go away on its own?
Then when you are "ONLINE":
b) DNS issue (because you can't browse). The lack of a running DNS when ONLINE just means you can't browse, but your monitor should still ping the IP.
So here I have turned the resolver off,
Status -> services
unbound isn't even listed.
(the gateway and monitor IP are still ONLINE, and the remote IP (in this case I am monitoring 1.1.1.1) is reachable)
but I can't browse; it will time out and tell me it can't get there.
So the gateway being up / and online has nothing to do with ability to browse. Of course if it is OFFLINE, the second problem doesn't matter for external browsing.
If you can't reliably get and keep a gateway, that's one thing (the first problem).
Once you have an online gateway that stays up/online, and you can't browse, that's the second problem.
Step 1: get to the point where your gateway is online and stays that way. Are we there?
Step 2: fix the DNS (if it is broken). Out of the box, the DNS Resolver should just resolve. On the Resolver screen you have shown above, uncheck DNSSEC and check Enable Forwarding Mode. You already have DNS servers listed on your General Settings page, so when online that's who it should forward to, instead of the root servers. See if that changes anything. (Of course your gateway has to be online/stable first.)
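One rough way to tell the two problems apart from a LAN client is to test reachability by IP and name resolution separately. This is only a sketch: the TCP connection to 1.1.1.1 port 53 is a stand-in for the ICMP ping that the gateway monitor does (it is not the same check), and `example.com`, the function names and the timeout are all assumptions for illustration.

```python
import socket


def can_reach_ip(ip: str = "1.1.1.1", port: int = 53, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to ip:port succeeds (no DNS involved)."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def can_resolve(name: str = "example.com") -> bool:
    """Return True if the system resolver can turn a hostname into an address."""
    try:
        socket.getaddrinfo(name, None)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def classify(gateway_reachable: bool, dns_resolves: bool) -> str:
    """Map the two independent checks onto the two problems described above."""
    if not gateway_reachable:
        # Problem 1: nothing else matters until the gateway is up and stable.
        return "gateway down"
    if not dns_resolves:
        # Problem 2: traffic by IP works, but names do not resolve.
        return "dns broken"
    return "ok"
```

Calling `classify(can_reach_ip(), can_resolve())` from a machine behind the firewall should then point at step 1 or step 2, assuming nothing else on the path blocks TCP port 53.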
-
One thing that I've encountered with my ISP is that their gateways tend to drop pings, etc. when they become congested, and after a recent "upgrade" they're basically dropping all ICMP, pings, almost 100% of the time. I've done a few things.
First, in each Gateway (I have 1 IPv4 and 1 IPv6) I set up a different IP in the "Monitor IP" field. For me I used two of Cloudflare's DNS addresses as they're pretty much always up (1.1.1.1 and 2606:4700:4700::1111). This makes the gateway monitoring a little more accurate, as sometimes the ICMP request will make it through the ISP's gateway to the monitor IP and it will return those statistics. Still, as above, the gateways can be so bad that even requests to different addresses never make it there and back in time.
More importantly, I checked off the box next to "Gateway Action" to 'disable gateway monitoring action' so that if their gateway stops responding to echo requests, the gateway is still considered up and pfSense doesn't take any actions such as killing states on gateway failure. You could instead also try changing that option to 'Do not kill states on gateway failure'.
I'm not savvy enough to know a better and more accurate way of interacting with my ISP's gateways (Comcast/Xfinity); this just cuts down on additional downtime when the connection is congested but still up.
-
@jrey said in Always Wan-ip but gateway is 100% packet loss:
@johnpoz
Thanks for taking a look, from what I can see this is the summary
MC - Asus works
MC - pfSense fails
MC - Asus - pfSense works
That is correct! And where MC - pfSense meet there are 3 occasions:
- Online (everything works):
  - Here I'm happy, but worrying if the next restart will be offline.
- Online (ping works on IP, browsing doesn't work):
  - Have figured out that if I stop Unbound and start it again I can browse.
- Offline (100% packet loss, nothing works):
  - Here is the real dilemma. It seems like it's only luck that decides if I get an Online (1 or 2) on the gateway.
I was just going to suggest that the OP make sure the named gateway is selected on System -> Routing -> Gateways, if only to have the system save the config again.
Checked the settings and my gateway is selected as default.
From the DHCP snippets of log, it looks like it is being assigned an IP and gateway, and it varies from connection to connection.
I'm still not convinced it isn't somehow MAC related, but the OP says he tried that. Not sure both devices were power cycled at that point. Fibre connection.
DNSSEC is also turned OFF as of today.
I have now tested MAC spoofing again:
I copied my Asus MAC address into pfSense and saved. I halted pfSense. pfSense not plugged into the MC yet. I let the MC be turned off for a couple of minutes, then started it and let it sit without any device connected to it. Then I plugged in the pfSense and started pfSense.
I got Offline, 100% packet loss. So I did this:
- Status -> Interfaces: Checked Relinquish Lease and pressed Release WAN.
- Diagnostics -> Command Prompt: ifconfig eth0 down. Waited 30 seconds. Wrote ifconfig eth0 up.
- Status -> Interfaces: Looked at it and I was getting an IP, but In/out packets (block) was in numbers equal to In/out packets (pass). Tried to browse the Internet but no browsing possible.
- Status -> Services: stopped Unbound and started it back up after a few seconds. Browsing works.
Even if this works, it's a pretty long way to go just to start pfSense.
Now I have restarted pfSense 2 times and it's up and running on both occasions.
Now I unplugged the WAN cable between pfSense and the MC. Result:
Plugged the WAN cable back in and everything starts back up. Everything works.
Must try another reboot. Nope, Offline with 100% packet loss.
The logs for this:
pfsense-general-log231205.txt
pfsense-gateways-log231205.txt
pfsense-resolver-log231205.txt
pfsense-dhcp-log231205.txt
With the gateway offline I did what I did above:
- Status -> Interfaces: Checked Relinquish Lease and pressed Release WAN.
- Diagnostics -> Command Prompt: ifconfig eth0 down. Waited 30 seconds. Wrote ifconfig eth0 up.
- Status -> Interfaces: Looked at it and I was getting an IP, but In/out packets (block) was in numbers equal to In/out packets (pass). Tried to browse the Internet but no browsing possible.
- Status -> Services: stopped Unbound and started it back up after a few seconds. Browsing works.
Then it works again.
-
@Sorjal said in Always Wan-ip but gateway is 100% packet loss:
First in each Gateway (I have 1 IPv4 and 1 IPv6) I set up a different IP in each in the area "Monitor IP". For me I used two of Cloudflare's DNS addresses as they're pretty much always up (1.1.1.1 and 2606:4700:4700::1111). This makes the gateway monitoring a little more accurate as sometimes say the icmp request will make it through the ISP's gateway to the monitor IP and it will return those statistics. Still, as above, the gateways can be so bad that even requests to different addresses never make it there and back in time.
I only have IPv4 and I have added 8.8.8.8 to gateway monitoring.
More importantly, I checked off the box next to "Gateway Action" to 'disable gateway monitoring action' so that if their gateway stops responding to echo requests, the gateway is still considered up and pfSense doesn't take any actions such as killing states on gateway failure.
I don't have this box checked, but I'll try anything. Will wait to do that until my answer to @jrey is looked at, so I don't mess up anything.
You could instead also try changing that option to 'Do not kill states on gateway failure'.
My settings:
-
An idea from a Swedish forum.
What if there's a problem when the link speed is negotiated between the MC and pfSense?
MC (Inteno) maximum 1GbE and pfSense 2.5GbE
Should I set the Speed and Duplex in Interfaces -> WAN?
-
Possible, what speed is it showing?
Should I set the Speed and Duplex in Interfaces -> WAN?
You can certainly force it to match, if it is not auto-detecting the correct value.
This sequence seems "interesting":
Dec 5 09:03:33 dhclient 77794 igc0 link state up -> down
Dec 5 09:03:34 dhclient 77794 connection closed
Dec 5 09:03:34 dhclient 77794 exiting.
Dec 5 09:06:08 dhclient 71781 PREINIT
Dec 5 09:06:08 dhclient 70811 DHCPREQUEST on igc0 to 255.255.255.255 port 67
Dec 5 09:06:09 dhclient 70811 DHCPACK from 192.121.XXX.2
Dec 5 09:06:09 dhclient 72733 REBOOT
Dec 5 09:06:09 dhclient 74082 Starting add_new_address()
Dec 5 09:06:09 dhclient 74466 ifconfig igc0 inet 192.121.XXX.50 netmask 255.255.255.128 broadcast 192.121.XXX.127
Dec 5 09:06:09 dhclient 75182 New IP Address (igc0): 192.121.XXX.50
Dec 5 09:06:09 dhclient 75852 New Subnet Mask (igc0): 255.255.255.128
Dec 5 09:06:09 dhclient 76620 New Broadcast Address (igc0): 192.121.XXX.127
Dec 5 09:06:09 dhclient 77335 New Routers (igc0): 192.121.XXX.1
Dec 5 09:06:09 dhclient 78369 Adding new routes to interface: igc0
Dec 5 09:06:09 dhclient 79802 /sbin/route add -host 192.121.XXX.1 -iface igc0
Dec 5 09:06:09 dhclient 80974 /sbin/route add default 192.121.XXX.1
Dec 5 09:06:09 dhclient 82065 Creating resolv.conf
Dec 5 09:06:09 dhclient 70811 bound to 192.121.XXX.50 -- renewal in 43170 seconds.
Dec 5 09:09:31 dhclient 2416 dhclient already running, pid: 77427.
Dec 5 09:09:31 dhclient 2416 exiting.
Dec 5 09:09:31 dhclient 2775 dhclient already running, pid: 77427.
Dec 5 09:09:31 dhclient 2775 exiting.
Dec 5 09:09:34 dhclient 59957 PREINIT
Dec 5 09:09:34 dhclient 78067 DHCPREQUEST on igc0 to 255.255.255.255 port 67
Dec 5 09:09:34 dhclient 78067 DHCPACK from 192.121.XXX.2
Dec 5 09:09:34 dhclient 61008 REBOOT
Dec 5 09:09:34 dhclient 62192 Starting add_new_address()
Dec 5 09:09:34 dhclient 63168 ifconfig igc0 inet 192.121.XXX.50 netmask 255.255.255.128 broadcast 192.121.XXX.127
Dec 5 09:09:34 dhclient 64280 New IP Address (igc0): 192.121.XXX.50
Dec 5 09:09:34 dhclient 65065 New Subnet Mask (igc0): 255.255.255.128
Dec 5 09:09:34 dhclient 65381 New Broadcast Address (igc0): 192.121.XXX.127
Dec 5 09:09:34 dhclient 66014 New Routers (igc0): 192.121.XXX.1
Dec 5 09:09:34 dhclient 66921 Adding new routes to interface: igc0
Dec 5 09:09:34 dhclient 68062 Creating resolv.conf
Dec 5 09:09:34 dhclient 78067 bound to 192.121.XXX.50 -- renewal in 43170 seconds.
Still have to piece together the other 3 log files to see the entire sequence of events, specifically
Dec 5 09:06:09 dhclient 77335 New Routers (igc0): 192.121.XXX.1
Dec 5 09:06:09 dhclient 78369 Adding new routes to interface: igc0
Dec 5 09:06:09 dhclient 79802 /sbin/route add -host 192.121.XXX.1 -iface igc0
Dec 5 09:06:09 dhclient 80974 /sbin/route add default 192.121.XXX.1
Dec 5 09:06:09 dhclient 82065 Creating resolv.conf
Dec 5 09:09:30 kernel Uptime: 12m25s
Dec 5 09:09:30 kernel ---<<BOOT>>---
compared to the one 3 minutes later
Dec 5 09:09:34 dhclient 66014 New Routers (igc0): 192.121.XXX.1
Dec 5 09:09:34 dhclient 66921 Adding new routes to interface: igc0
Dec 5 09:09:34 dhclient 68062 Creating resolv.conf
Dec 5 09:09:34 dhclient 78067 bound to 192.121.XXX.50 -- renewal in 43170 seconds.
It would be interesting to know what is in /etc/resolv.conf under both conditions, "Online working" and "when not",
and also the contents of /var/db/dhclient.leases.igc0 under both conditions.
"Gateway Action" to 'disable gateway monitoring action'
this will turn off the monitoring, and you could try it, however you have cases where it is working. You are not pinging the ISP, you are pinging a remote. ISP shouldn't be blocking that. You can find out, when you are online and it is working... do a trace route to 1.1.1.1
you also have a couple of entries that imply the dhclient is not running
DHCP Client not running on wan (igc0)
Start tracking, if you can, when you get the IP assigned from their .2 vs their .3,
i.e. when you get the address assigned from .2 does it work and from .3 not work, or vice versa.
-
Actually after another cup of coffee and more of your log files, it occurs to me that you might want to add one of the DHCP servers responding in
Interfaces -> WAN
put one of the addresses in here (not both) either the 192.121.xxx.2 or .3
It just seems odd that in such a small space (/25) they would have 2 servers handing out addresses, unless (read my IM) the XXX is in a different segment, which would also be a "why?" question for the ISP. From the logs I've seen over the past day or so, you seem to get an IP from .2 most often, so start by rejecting the .3 in this field.
Let's see if that changes anything.
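To put numbers on "most often", the DHCPACK lines can be tallied per responding server. The following is a minimal sketch, assuming the DHCP log has been saved as plain text in the format shown in the snippets above; the function name and the regular expression are mine, not anything pfSense ships.

```python
import re
from collections import Counter

# Matches entries such as:
#   Dec 5 09:06:09 dhclient 70811 DHCPACK from 192.121.XXX.2
# The address is captured as-is, so redacted octets like "XXX" are kept.
ACK_RE = re.compile(r"dhclient\s+\d+\s+DHCPACK from (\S+)")


def count_acks(log_text: str) -> Counter:
    """Count how many DHCPACKs each server address sent in the given log text."""
    return Counter(ACK_RE.findall(log_text))


if __name__ == "__main__":
    sample = (
        "Dec 5 09:06:09 dhclient 70811 DHCPACK from 192.121.XXX.2\n"
        "Dec 5 09:09:34 dhclient 78067 DHCPACK from 192.121.XXX.2\n"
    )
    for server, hits in count_acks(sample).most_common():
        print(server, hits)
```

Noting next to each count whether the gateway came up Online or Offline on that boot would show whether the .2 / .3 split actually lines up with the failures.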
-
@jrey Just to clarify, I was mentioning the option to disable the monitoring actions, not the monitoring itself.
For example, here's a brief bit from my own gateway log...
If the monitoring actions were turned on and, say, killing states, I'd never connect to anything. Leaving monitoring on lets me send logs to my ISP for them to ignore and say everything is fine. :)
-
@Sorjal said in Always Wan-ip but gateway is 100% packet loss:
I was mentioning the option to disable the monitoring actions
right you are, sorry --- I misread the option you suggested "actions" ---
@AcidSleeper
do this after the "Reject leases from" test but not at the same time.
The fact that there are a couple of times in your various logs where the gateway appears to be offered, but doesn't appear to be set, would never let it run in the first place.
-
@jrey said in Always Wan-ip but gateway is 100% packet loss:
Actually after another cup of coffee and more of your log files, it occurs to me that you might want to add one of the DHCP servers responding in
Interfaces -> WAN
put one of the addresses in here (not both) either the 192.121.xxx.2 or .3
It just seems odd that in such a small space (/25) they would have 2 servers handing out addresses, unless (read my IM) the XXX is in a different segment, which would also be a "why?" question for the ISP. From the logs I've seen over the past day or so, you seem to get an IP from .2 most often, so start by rejecting the .3 in this field.
Let's see if that changes anything.
Nothing changed in gateway. It turned offline, so same problem.
Logs:
231206-pfsense-general-log.txt
231206-pfsense-gateway-log.txt
231206-pfsense-resolver-log.txt
231206-pfsense-dhcp-log.txt
-
@Sorjal said in Always Wan-ip but gateway is 100% packet loss:
@jrey Just to clarify, I was mentioning the option to disable the monitoring actions, not the monitoring itself.
If the monitoring actions were turned on and, say, killing states, I'd never connect to anything. Leaving monitoring on lets me send logs to my ISP for them to ignore and say everything is fine. :)
Sorry, but it didn't work. I got this:
-
Maybe, but the DHCP client logging is certainly vastly different from previous samples.
Let's take smaller steps.
Can we set the exclude for the DHCP to the .3? After the change applies (give it a couple of minutes):
shut down pfSense, pause
restart the MC, pause
then restart pfSense, pause
then log in.
once it is up and running (online or not) let's have a look at DHCP log.
for this single test
and also the contents of /etc/resolv.conf
and
/var/db/dhclient.leases.igc0
also
Dec 6 16:13:25 php-fpm 400 /rc.linkup: The command '/sbin/dhclient -c /var/etc/dhclient_wan.conf -p /var/run/dhclient.igc0.pid igc0 > /tmp/igc0_output 2> /tmp/igc0_error_output'
also the file in bold, /tmp/igc0_error_output (if it exists)
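For comparing /etc/resolv.conf between a boot that works and one that doesn't, something like the following would do. It is only a sketch under the assumption that both copies have been saved as text; the function names and the sample contents in the usage note are hypothetical and not taken from the attached logs.

```python
def nameservers(resolv_conf_text: str) -> list[str]:
    """Return the addresses from 'nameserver' lines, in file order."""
    servers = []
    for line in resolv_conf_text.splitlines():
        parts = line.split()
        # Comment lines start with '#' or ';', so their first token never matches.
        if len(parts) >= 2 and parts[0] == "nameserver":
            servers.append(parts[1])
    return servers


def compare(working: str, broken: str) -> dict[str, list[str]]:
    """Show which nameservers appear in only one of the two copies."""
    w, b = nameservers(working), nameservers(broken)
    return {
        "only_when_working": [s for s in w if s not in b],
        "only_when_broken": [s for s in b if s not in w],
    }
```

For example, `compare("nameserver 127.0.0.1\nnameserver 8.8.8.8\n", "nameserver 127.0.0.1\n")` reports 8.8.8.8 as present only in the working copy. Two empty lists mean the file is the same in both states, which would point away from resolv.conf as the cause.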
Thanks
-
@jrey said in Always Wan-ip but gateway is 100% packet loss:
Maybe, but the DHCP client logging is certainly vastly different from previous samples.
Let's take smaller steps.
Can we set the exclude for the DHCP to the .3? After the change applies (give it a couple of minutes):
shut down pfSense, pause
restart the MC, pause
then restart pfSense, pause
then log in.
once it is up and running (online or not) let's have a look at DHCP log.
for this single test
and also the contents of /etc/resolv.conf
and
/var/db/dhclient.leases.igc0
also
Dec 6 16:13:25 php-fpm 400 /rc.linkup: The command '/sbin/dhclient -c /var/etc/dhclient_wan.conf -p /var/run/dhclient.igc0.pid igc0 > /tmp/igc0_output 2> /tmp/igc0_error_output'
also the file in bold, /tmp/igc0_error_output (if it exists)
Did it as per your instructions, result:
Online (working 100%)
LOGS:
231207-pfsense-dhcp-log.txt
231207-pfsense-etc.resolv.conf-log.txt
231207-pfsense-vi-dhclient.lease.txt
Nothing inside /tmp/igc0_error_output
After a restart an hour or so later I'm offline, nothing works.
LOGS:
231207-pfsense-dhcp-log-2.txt
231207-pfsense-vi-dhclient.lease-2.txt
/etc/resolv.conf is unchanged.
Nothing inside /tmp/igc0_error_output
Work continues.
-
If anyone is wondering why there are no new posts, it's because @jrey is helping me directly in chat.
When a solution is found we will post it.
-
After several messages, and dealing with an unhelpful ISP, time away etc, this now seems resolved.
(Thanks to your neighbour, for letting you go next door and try it on a different ISP, although we didn't use the data collected there, the result spoke volumes)
Briefly, to recap, the issue was that pfSense was not obtaining an IP/gateway on the WAN, unless the "MC" was rebooted.
The ISP uses two DHCP relays in a /25 scope, and provides a third DHCP server as "next-server" in response.
We'll call them DHCP.2 and DHCP.3 (the relays, in the same /25 scope) and DHCP.244 (the upstream, in a different segment).
Once pfSense was able to obtain an IP (only after rebooting the "MC" first) it would renew on schedule as expected. So the only issue was rebooting pfSense without rebooting the "MC" first.
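For anyone who wants to double-check the /25 arithmetic, here is a small sketch using Python's standard `ipaddress` module. The third octet is redacted in the logs, so `192.0.2.` (a documentation range) is used as a placeholder prefix; that prefix is an assumption, and only the last octets and the 255.255.255.128 netmask come from the thread.

```python
import ipaddress

# Placeholder for the redacted "192.121.XXX." prefix (assumption, documentation range).
PLACEHOLDER_PREFIX = "192.0.2."


def wan_network(ip: str, netmask: str) -> ipaddress.IPv4Network:
    """Derive the network a WAN address belongs to from its IP and netmask."""
    return ipaddress.ip_interface(f"{ip}/{netmask}").network


# The lease in the logs was .50 with netmask 255.255.255.128.
net = wan_network(PLACEHOLDER_PREFIX + "50", "255.255.255.128")

if __name__ == "__main__":
    print("network:", net, "broadcast:", net.broadcast_address)
    # .1 is the router, .2 and .3 the relays, .244 the upstream server.
    for last_octet in (1, 2, 3, 244):
        addr = ipaddress.ip_address(f"{PLACEHOLDER_PREFIX}{last_octet}")
        print(addr, "inside" if addr in net else "outside", net)
```

With that netmask the scope covers .0 to .127, which matches the .127 broadcast address in the dhclient log. An address ending in .244 falls outside it even if the redacted octet were the same, and "a different segment" may well mean the redacted octet differs too.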
The solution that is now working was to:
Reject leases from (DHCP.3) address
change the Presets (timing) to "FreeBSD default"
add Send options: dhcp-server-identifier (DHCP.2)
The WAN port now obtains a valid IP/gateway combination within the /25 scope.
Works when there is:
a reboot of pfSense only, or;
a reboot of both "MC" and pfSense;
and renews the lease as scheduled (day 2 now, since last reboot?)
Happy New Year.
-
Thanks so much for the help! Couldn't have done it without you! Thanks yet again!