"Unable to check for updates" after upgrade from 23.05.1 to 23.09
-
@stephenw10 So after more testing, the secondary node appears to be functioning normally. I then switched CARP to maintenance mode on the primary node and proceeded with the upgrade of the primary node. The upgrade seemed to go well; I was even informed that my system is on the latest version. Next I proceeded to check available packages. Unfortunately the list was empty. Trying to execute pkg-static -d update resulted in the page not refreshing; the command seemed to hang.
I checked that DNS is set up correctly, and it is: I'm able to resolve names to IP addresses. Surprisingly, I can't ping google.ca. I checked that System > Routing > Default gateway is set to "WAMGW", and it is. I also tried rebooting the firewall; nothing changed.
-
Does it have a default route present and correct in Diag > Routing?
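If you have shell access, the same check can be done from the command line. This is only a minimal sketch: it assumes a FreeBSD-style `netstat -rn` routing table on stdin, and the helper name `default_gw` is made up for illustration. The root shell on pfSense is tcsh, so run it under `sh`.

```shell
#!/bin/sh
# default_gw: hypothetical helper that prints the gateway of the first
# "default" route found in `netstat -rn -f inet` output read on stdin.
# Prints nothing and returns 1 if no default route is present.
default_gw() {
    awk '$1 == "default" { print $2; found = 1; exit }
         END { exit (found ? 0 : 1) }'
}

# On the firewall itself (not run here):
#   netstat -rn -f inet | default_gw
```

An empty result would mean there is no IPv4 default route at all, which is worth ruling out before looking at anything else.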
It's better to run pkg-static -d update at the actual command line if you can. That way you can see the partial output and any errors while it's running.
-
The gateway IP is our ISP provided gateway. The same as on the secondary firewall.
[23.09-RELEASE][admin@pfsense1.lan.optiwave.com]/root: pkg-static -d update
DBG(1)[43703]> pkg initialized
Updating pfSense-core repository catalogue...
DBG(1)[43703]> PkgRepo: verifying update for pfSense-core
DBG(1)[43703]> PkgRepo: need forced update of pfSense-core
DBG(1)[43703]> Pkgrepo, begin update of '/var/db/pkg/repo-pfSense-core.sqlite'
DBG(1)[43703]> Request to fetch pkg+https://pfsense-plus-pkg.netgate.com/pfSense_plus-v23_09_amd64-core/meta.conf
DBG(1)[43703]> curl_open
DBG(1)[43703]> Fetch: fetcher used: pkg+https
DBG(1)[43703]> curl> fetching https://pfsense-plus-pkg.netgate.com/pfSense_plus-v23_09_amd64-core/meta.conf
DBG(1)[43703]> CURL> attempting to fetch from , left retry 3
* Couldn't find host pfsense-plus-pkg00.atx.netgate.com in the .netrc file; using defaults
* Trying 208.123.73.207:443...
* Trying [2610:160:11:18::207]:443...
* Immediate connect fail for 2610:160:11:18::207: No route to host
* ipv4 connect timeout after 21175ms, move on!
* Failed to connect to pfsense-plus-pkg00.atx.netgate.com port 443 after 30025 ms: Timeout was reached
* Closing connection
DBG(1)[43703]> CURL> attempting to fetch from , left retry 2
-
Can it ping pfsense-plus-pkg00.atx.netgate.com? Or 208.123.73.207?
-
@stephenw10 I cannot ping either one; both commands just hang there until Ctrl-C is pressed.
[23.09-RELEASE][admin@pfsense1.lan.optiwave.com]/root: ping pfsense-plus-pkg00.atx.netgate.com
PING pfsense-plus-pkg00.atx.netgate.com (208.123.73.207): 56 data bytes
^C
--- pfsense-plus-pkg00.atx.netgate.com ping statistics ---
52 packets transmitted, 0 packets received, 100.0% packet loss
[23.09-RELEASE][admin@pfsense1.lan.optiwave.com]/root: ping 208.123.73.207
PING 208.123.73.207 (208.123.73.207): 56 data bytes
^C
--- 208.123.73.207 ping statistics ---
79 packets transmitted, 0 packets received, 100.0% packet loss
[23.09-RELEASE][admin@pfsense1.lan.optiwave.com]/root:
-
Hmm, so is this with it still in maintenance mode? Running as backup?
Can it connect to anything? I assume it can ping internal hosts?
-
@stephenw10 Correct, it's running in maintenance mode as backup. I can ping internal hosts but I'm unable to ping anything external.
-
Check the outbound NAT settings. Is it NATing its own traffic to the CARP VIP? That will break WAN connectivity.
-
@stephenw10 For the CARP stuff, I followed a tutorial.
-
Hmm, should be fine.
As a next step, I would start a ping from pfSense to something external, then check the state table to see what states are opened for it and on which interface.
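From the shell, the state table can be dumped with `pfctl -ss` and filtered for the ping target. A rough sketch, assuming the state lines arrive on stdin; `states_for` is a made-up helper name, and the first column at the command line should be the real interface name rather than the friendly "WAN" label the GUI shows.

```shell
#!/bin/sh
# states_for: hypothetical helper that keeps only the state lines that
# mention the given IPv4 address. Reads `pfctl -ss` style output on stdin.
states_for() {
    # Escape the dots so they match literally, and require a non-address
    # character on both sides so 10.0.0.1 does not also match 10.0.0.10.
    ip_re=$(printf '%s' "$1" | sed 's/\./\\./g')
    grep -E "(^|[^0-9.])${ip_re}([^0-9.]|\$)"
}

# On the firewall, while the ping is running (not run here):
#   pfctl -ss | states_for 172.217.13.195
```

The interface column and the source address of each matching line are the two things to look at.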
-
I tried a simple look in https://firmware.netgate.com/pkg/
No versions higher than 23.01/2.4.4 are there.
-
Because that only includes versions from the old static repo system.
-
@stephenw10 I executed the following at the console and got the results below:
[23.09-RELEASE][admin@pfsense1.lan.optiwave.com]/root: nslookup google.ca
;; communications error to 127.0.0.1#53: timed out
;; communications error to 127.0.0.1#53: timed out
;; Got SERVFAIL reply from 127.0.0.1, trying next server
Server: 172.22.1.1
Address: 172.22.1.1#53
Non-authoritative answer:
Name: google.ca
Address: 172.217.13.195
;; Got SERVFAIL reply from 127.0.0.1, trying next server
Name: google.ca
Address: 2607:f8b0:4020:807::2003
[23.09-RELEASE][admin@pfsense1.lan.optiwave.com]/root:
[23.09-RELEASE][admin@pfsense1.lan.optiwave.com]/root: ping 172.217.13.195
PING 172.217.13.195 (172.217.13.195): 56 data bytes
Searching the state table for 172.217.13.195 yields a single entry:
WAN icmp 99.209.83.93:26986 -> 172.217.13.195:26986 0:0 64 / 0 5 KiB / 0 B
-
Ok that looks correct. I would make sure the other node can ping that IP in case it just doesn't respond to ping.
Assuming it does, run a packet capture for that IP on the WAN on the node that's failing. Make sure it's actually sending from the WAN. Make sure the MAC addresses are correct in the pcap.
If those are all accurate, I'd check the gateway device. Perhaps you have a conflict somewhere?
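If you'd rather capture from the shell than the GUI, something along these lines should work. It's only a sketch: `igb0` is an assumed WAN interface name (substitute your own), and `icmp_summary` is a made-up helper that counts echo requests and replies in tcpdump's text output so you can see at a glance whether anything comes back.

```shell
#!/bin/sh
# icmp_summary: hypothetical helper that counts ICMP echo requests and
# replies in tcpdump text output read on stdin.
icmp_summary() {
    awk '/ICMP echo request/ { req++ }
         /ICMP echo reply/   { rep++ }
         END { printf "requests=%d replies=%d\n", req, rep }'
}

# On the failing node (not run here). igb0 is an assumed interface name;
# -n skips name resolution, -e prints the MAC addresses, -c 10 stops
# after ten packets:
#   tcpdump -ni igb0 -e -c 10 icmp and host 172.217.13.195 | icmp_summary
```

Requests with zero replies would mean the traffic leaves the WAN but nothing returns to this node.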
-
@stephenw10 Thanks again for sticking with me. Yes, I can ping from the other node. In fact I did a capture from both nodes:
Problematic Node:
22:23:11.437391 IP 99.xxx.xxx.xxx > 172.217.13.195: ICMP echo request, id 52159, seq 0, length 64
22:23:12.453848 IP 99.xxx.xxx.xxx > 172.217.13.195: ICMP echo request, id 52159, seq 1, length 64
and the pattern repeats.
Working node:
22:34:54.366632 IP 99.xxx.xxx.xxx > 172.217.13.195: ICMP echo request, id 37721, seq 0, length 64
22:34:54.370407 IP 172.217.13.195 > 99.xxx.xxx.xxx: ICMP echo reply, id 37721, seq 0, length 64
22:34:55.397837 IP 99.xxx.xxx.xxx > 172.217.13.195: ICMP echo request, id 37721, seq 1, length 64
22:34:55.401639 IP 172.217.13.195 > 99.xxx.xxx.xxx: ICMP echo reply, id 37721, seq 1, length 64
and the pattern repeats.
Sorry, I couldn't figure out how to show the MAC addresses.
-
Either set the view to 'full' on the packet capture GUI page, or download the pcap and open it in Wireshark or similar. Both will show the MAC addresses.
The gateway should be the same MAC from both nodes.
Is it using the correct address in the 99.x.x.x subnet?
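For comparing the two nodes, it can help to boil a capture down to its unique source/destination MAC pairs. A minimal sketch, assuming text output from `tcpdump -e` on stdin; `mac_pairs` is a made-up helper name.

```shell
#!/bin/sh
# mac_pairs: hypothetical helper that prints each unique
# "source MAC -> destination MAC" pair found in `tcpdump -e` text output.
mac_pairs() {
    awk '$3 == ">" && length($2) == 17 && $2 ~ /^[0-9a-f:]+$/ {
             dst = $4
             sub(/,$/, "", dst)   # drop the trailing comma tcpdump prints
             print $2, "->", dst
         }' | sort -u
}

# On each node (not run here); igb0 is an assumed WAN interface name:
#   tcpdump -ni igb0 -e -c 10 icmp | mac_pairs
```

Run it on both nodes and compare the gateway's MAC in the output. Note that a MAC starting with 00:00:5e:00:01 is a CARP/VRRP virtual MAC rather than a physical NIC.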
-
The problematic firewall:
15:01:58.126525 90:ec:77:36:09:4e > 78:03:4f:ea:98:32, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 53891, offset 0, flags [none], proto ICMP (1), length 84, bad cksum 0 (->367b)!) 99.xxx.xxx.93 > 172.217.13.163: ICMP echo request, id 42036, seq 0, length 64
15:01:59.134481 90:ec:77:36:09:4e > 78:03:4f:ea:98:32, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 3010, offset 0, flags [none], proto ICMP (1), length 84, bad cksum 0 (->fd3c)!) 99.xxx.xxx.93 > 172.217.13.163: ICMP echo request, id 42036, seq 1, length 64
The functioning firewall:
15:01:58.126525 90:ec:77:36:09:4e > 78:03:4f:ea:98:32, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 53891, offset 0, flags [none], proto ICMP (1), length 84, bad cksum 0 (->367b)!) 99.xxx.xxx.93 > 172.217.13.163: ICMP echo request, id 42036, seq 0, length 64
15:02:53.122116 78:03:4f:ea:98:32 > 00:00:5e:00:01:01, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 117, id 0, offset 0, flags [none], proto ICMP (1), length 84) 172.217.13.163 > 99.xxx.xxx.93: ICMP echo reply, id 56675, seq 0, length 64
99.xxx.xxx.93 = our external (CARP) IP address.
For some additional troubleshooting, I swapped the ports on which each firewall is connected to our ISP equipment. The problem stayed with the firewall, not the port.
-
Ok so that's a problem!
Those pings should be from the WAN IP not the CARP VIP.
Your outbound NAT rules don't look like they could catch that. Check the state table to be sure though, the pings should not show a NAT state.
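If you want to check that from the shell: as far as I know, pf prints a translated state with a second address in parentheses after the first one, so a state with no parenthesised address has not been NATed. A rough sketch on that assumption; `nat_states` is a made-up helper that keeps only the lines that look translated.

```shell
#!/bin/sh
# nat_states: hypothetical helper that keeps only state lines carrying a
# parenthesised address, which is assumed here to be how pf marks a
# translated (NAT) state in `pfctl -ss` output.
nat_states() {
    grep -E '\([^ )]+\)'
}

# On the firewall, while the ping is running (not run here):
#   pfctl -ss | grep 172.217.13.163 | nat_states
```

No output for the ping target would mean none of its states are being translated.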
-
@stephenw10 Here's what the state table entry looks like:
WAN icmp 99.209.83.93:12056 -> 172.217.13.163:12056 0:0 10 / 0 840 B / 0 B
I think I come across as more knowledgeable than I really am. I don't know what to look at in the state table to determine NAT state.
My Outgoing rules look like this:
Should I change the NAT address to the non-VIP WAN address?
-
That's the list of outbound NAT rules. And those look correct, neither of those rules should catch the traffic from the public WAN IPs.
Check the state table in Diag > States. Pings should show as being from the WAN IP directly with no NAT like: