2.3.4->2.4.4 Upgrade 100% Packet Loss on WAN Interface
-
Did you try changing the gateway monitoring to a different IP?
With only one gateway, though, it will always be used even if it shows as offline. I assume you cannot actually connect out at all? It is pulling a DHCP lease, though, so there is a link of some sort.
Check the routing table in Diag > Routes just to be sure there is a default route. Although the gateway is marked default you still have the selection set to automatic.
Steve
-
I have tried another monitoring address but it doesn't make a difference (I used 8.8.8.8). I did try changing the default gateway not to be automatic, explicitly choosing WAN_DHCP, but again to no avail. I have checked Diagnostics->Routes and there is a default route. You are correct though, I cannot connect out at all, nothing I do results in any packets on the WAN interface, but it does manage to lease a DHCP address.
-
Is it giving you a rational address via DHCP? Gateway IP in the same subnet?
Check the DHCP logs match what you are seeing on the interface.
Check the system logs for errors.
I assume it's pulling a DHCP lease directly from your ISP?
Steve
-
I did check for all these things before and they all seemed fine as I recall. I won't get another chance to check again until next weekend, I will report back then.
Thanks
Rob
-
I checked the DHCP logs and the gateway IP is in the same subnet as the leased IP address.
I do see some IPv6 errors, but I am assuming they are benign as my ISP is IPv4.
I looked in the system logs. I am getting some errors relating to the time of day (there is an oddity with FreeBSD not seeming to get the time from Hyper-V correctly). The errors are like this one:
rc.bootup: The command '/usr/bin/nice -n20 /usr/local/bin/rrdtool update /var/db/rrd/ipsec-packets.rrd N:U:U:U:U:U:U:U:U' returned exit code '1', the output was 'ERROR: /var/db/rrd/ipsec-packets.rrd: illegal attempt to update using time 1541839764 when last update time is 1571980920 (minimum one second step)'
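Decoding the two epoch values in that message shows how far out the clock was when rc.bootup ran. This is only a minimal sketch in Python; the two numbers are copied from the error above, and the variable and function names are just illustrative.

```python
from datetime import datetime, timezone

# Epoch values copied from the rrdtool error message above.
attempted = 1541839764    # time rrdtool tried to use (the system clock at that moment)
last_update = 1571980920  # time of the last update already stored in the RRD file


def as_utc(epoch: int) -> datetime:
    """Convert Unix epoch seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch, tz=timezone.utc)


clock_now = as_utc(attempted)
rrd_last = as_utc(last_update)
lag = rrd_last - clock_now  # how far the system clock is behind the RRD file

print("system clock:   ", clock_now.isoformat())
print("last RRD update:", rrd_last.isoformat())
print("clock behind by:", lag)
```

If the clock really is that far behind at boot, rrdtool refusing the update is expected, and it would fit with the Hyper-V time oddity rather than with the WAN problem.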
In the Gateways part of the system log I see lots of errors from dpinger, I assume they are symptoms rather than cause though. Here they are:
send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr a.b.c.1 bind_addr a.b.c.86 identifier "WAN_DHCP "
WAN_DHCP a.b.c.1: Alarm latency 0us stddev 0us loss 100%
-
When DHCP works and nothing else does, that's usually a bad firewall rule. Though here that would be somewhere upstream, unless you have a blocking floating OUT rule.
Try running a packet capture on WAN. Do you see the ping packets leaving? Do you see any packets coming back?
Steve
-
I tried a packet capture before, and I tried it again today. I don't see any packets originating from computers on the LAN going out on the WAN. I do see some DNS lookups that appear to be coming from pfSense itself. I have looked at the firewall rules and there doesn't really seem to be anything that would block traffic. Looking at the firewall logs I see traffic being blocked, but it all appears to be IPv6.
-
Are those DNS lookups actually working? Does Diag > DNS Lookup work?
Can you packet capture the DHCP exchange?
What is the MAC of the gateway? Maybe it's something odd. Though that should affect 2.3.4 just the same.
Steve
-
@rjarratt said in 2.3.4->2.4.4 Upgrade 100% Packet Loss on WAN Interface:
gateway IP is in the same subnet as the leased IP address.
And is this a public IP or a private IP? This is a VM, right? Are we sure the interfaces are not moving about and changing order on upgrade? If your pfSense cannot talk to your gateway, you're going to have a problem.
Your gateway and public IP should be the same on 2.3.4 as they are after you upgrade. I do not see your MAC changing on your VM. So if you can ping your gateway when you're on 2.3.4 and not when on 2.4.4, something really odd is going on.
-
DNS lookup from the configurator does not work. I have just noticed a small difference between the old version and the new version in ifconfig
Here is the output on the old version (edited because my post is being marked as spam):
hn1:
ether 00:c0:df:10:58:09
status: active
And here is the output on the new version:
hn1:
ether 00:c0:df:10:58:09
hwaddr 00:15:5d:00:1f:06
status: active
There is a new line for "hwaddr". I am not sure what the difference is, but the hwaddr value is the default MAC address I have set in Hyper-V, but I have also set the option to allow the Guest OS to change the MAC address. The MAC address reported by the Configurator is always the "ether" one.
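To make the comparison repeatable, here is a rough Python sketch that pulls the "ether" and "hwaddr" values out of ifconfig text and reports whether they differ. The sample text is the trimmed 2.4.4 output quoted above; the function names are made up for illustration, and reading "hwaddr" as the adapter's original address is my assumption, based only on it matching the MAC configured in Hyper-V.

```python
import re

# Trimmed ifconfig output for hn1 on 2.4.4, as quoted above.
IFCONFIG_NEW = """\
hn1:
    ether 00:c0:df:10:58:09
    hwaddr 00:15:5d:00:1f:06
    status: active
"""

# Six colon-separated hex octets.
MAC = r"((?:[0-9a-f]{2}:){5}[0-9a-f]{2})"


def mac_fields(text: str) -> dict:
    """Return the 'ether' and 'hwaddr' values found in ifconfig-style text."""
    found = {}
    for key in ("ether", "hwaddr"):
        match = re.search(rf"\b{key}\s+{MAC}", text, re.IGNORECASE)
        if match:
            found[key] = match.group(1).lower()
    return found


def mac_overridden(text: str) -> bool:
    """True when both fields are present and their values differ."""
    fields = mac_fields(text)
    return (
        "ether" in fields
        and "hwaddr" in fields
        and fields["ether"] != fields["hwaddr"]
    )


fields = mac_fields(IFCONFIG_NEW)
print(fields)
print("MAC differs from hwaddr:", mac_overridden(IFCONFIG_NEW))
```

A difference on its own only confirms that the guest is using a MAC other than the Hyper-V default; it does not show that frames sent with that MAC are actually leaving the virtual switch.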
Note that I can ping the gateway (192.168.0.1) in 2.4.4 from the LAN. The IP address I get in the ifconfig outputs above is the same in both cases.
I also noticed this error, is it relevant?
arpresolve: can't allocate llinfo for 82.28.4.1 on hn1
-
The arpresolve errors on WAN imply it is trying to ARP for that IP and failing. I assume 82.28.4.1 is the WAN gateway IP?
That does look like what you see if you hit the DHCP issue I mentioned in my first reply.
https://redmine.pfsense.org/issues/8507
I expect to see a bad MTU if you are hitting that but it's worth adding the workaround line to your DHCP options anyway:
In the "Lease Requirements and Requests" section for WAN DHCP in the field "Option modifiers" add the text without quotes: "supersede interface-mtu 0"
Or try a 2.4.5 snapshot, which has that fixed already.
Steve
-
I had a closer look at the issue and tried it out with no luck. Before I started I did this:
: netstat -4rnW
Routing tables

Internet:
Destination        Gateway            Flags     Use    Mtu  Netif Expire
default            82.28.4.1          UGS       234   1500    hn1
82.28.4.0/22       link#6             U         186   1500    hn1
82.28.4.86         link#6             UHS         0  16384    lo0
127.0.0.1          link#2             UH         31  16384    lo0
192.168.0.0/24     link#5             U         124   1500    hn0
192.168.0.1        link#5             UHS         0  16384    lo0
I then set the advanced option anyway, released and renewed the lease and I even rebooted after changing the option, without success. The netstat looked the same as above after setting the option and rebooting etc.
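As a cross-check of that routing table, the following Python sketch confirms that the default gateway and the leased address both sit inside the on-link WAN network. The three values are copied from the netstat output above; the helper name is hypothetical and the check is nothing more than a subnet membership test.

```python
import ipaddress

# Values copied from the netstat output above.
WAN_NETWORK = ipaddress.ip_network("82.28.4.0/22")      # on-link route via hn1
WAN_ADDRESS = ipaddress.ip_address("82.28.4.86")        # leased WAN address
DEFAULT_GATEWAY = ipaddress.ip_address("82.28.4.1")     # default route target


def on_link(addr, network) -> bool:
    """True when addr falls inside network, i.e. it should be reachable via ARP."""
    return addr in network


gateway_ok = on_link(DEFAULT_GATEWAY, WAN_NETWORK)
address_ok = on_link(WAN_ADDRESS, WAN_NETWORK)

print("gateway on link:", gateway_ok)
print("address on link:", address_ok)
```

Both being on link means the routing table itself looks sane, so the failure is more likely at layer 2 (ARP to the gateway) than in the routes.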
-
I saw the same issue on my APU2 running pfSense: after upgrading to the latest release, restoring the configuration and rebooting, dpinger failed through PPPoE with monitor IP 8.8.4.4.
So I guess something in pfSense's internals got messed up.
My solution was simply: delete all gateways, reboot, reassign the interfaces from the CLI/SSH, and reboot again.
Yes, the pfSense concept is beautiful, but we know it's not perfect.
-
If you were hitting that DHCP issue you would not see a change in the routing table, only in the interface MTU.
The ARP resolve errors imply the gateway is not responding to ARP. Try running a packet capture on WAN to be sure it is actually sending the ARP requests and that the gateway is really not responding.
Steve