"Unable to check for updates" after upgrade from 23.05.1 to 23.09

Kajetan321

The gateway IP is our ISP provided gateway. The same as on the secondary firewall.

[23.09-RELEASE][admin@pfsense1.lan.optiwave.com]/root: pkg-static -d update
DBG(1)[43703]> pkg initialized
Updating pfSense-core repository catalogue...
DBG(1)[43703]> PkgRepo: verifying update for pfSense-core
DBG(1)[43703]> PkgRepo: need forced update of pfSense-core
DBG(1)[43703]> Pkgrepo, begin update of '/var/db/pkg/repo-pfSense-core.sqlite'
DBG(1)[43703]> Request to fetch pkg+https://pfsense-plus-pkg.netgate.com/pfSense_plus-v23_09_amd64-core/meta.conf
DBG(1)[43703]> curl_open
DBG(1)[43703]> Fetch: fetcher used: pkg+https
DBG(1)[43703]> curl> fetching https://pfsense-plus-pkg.netgate.com/pfSense_plus-v23_09_amd64-core/meta.conf

DBG(1)[43703]> CURL> attempting to fetch from , left retry 3

* Couldn't find host pfsense-plus-pkg00.atx.netgate.com in the .netrc file; using defaults
*   Trying 208.123.73.207:443...
*   Trying [2610:160:11:18::207]:443...
* Immediate connect fail for 2610:160:11:18::207: No route to host
* ipv4 connect timeout after 21175ms, move on!
* Failed to connect to pfsense-plus-pkg00.atx.netgate.com port 443 after 30025 ms: Timeout was reached
* Closing connection
DBG(1)[43703]> CURL> attempting to fetch from , left retry 2

stephenw10

Can it ping pfsense-plus-pkg00.atx.netgate.com ? Or 208.123.73.207 ?
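Since the package fetch fails on port 443 rather than on ICMP, it can also help to test a plain TCP connection, because some hosts simply do not answer ping. A minimal sketch, assuming Python 3 is available on the machine you test from; the helper name `can_connect` is made up for illustration:

```python
import socket


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        # create_connection resolves the name and tries each returned address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, timeouts and "no route to host".
        return False


# Example usage (targets taken from the debug output above):
#   can_connect("pfsense-plus-pkg00.atx.netgate.com")
#   can_connect("208.123.73.207")
```

If the IP works but the name does not, the problem is DNS; if both fail, it is routing or something upstream.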

Kajetan321

@stephenw10 I cannot ping; both commands just hang until Ctrl-C is pressed.

[23.09-RELEASE][admin@pfsense1.lan.optiwave.com]/root: ping pfsense-plus-pkg00.atx.netgate.com
PING pfsense-plus-pkg00.atx.netgate.com (208.123.73.207): 56 data bytes

^C
--- pfsense-plus-pkg00.atx.netgate.com ping statistics ---
52 packets transmitted, 0 packets received, 100.0% packet loss
[23.09-RELEASE][admin@pfsense1.lan.optiwave.com]/root: ping 208.123.73.207
PING 208.123.73.207 (208.123.73.207): 56 data bytes
^C
--- 208.123.73.207 ping statistics ---
79 packets transmitted, 0 packets received, 100.0% packet loss
[23.09-RELEASE][admin@pfsense1.lan.optiwave.com]/root:

stephenw10

Hmm, so is this with it still in maintenance mode? Running as backup?

Can it connect to anything? I assume it can ping internal hosts?

Kajetan321

@stephenw10 Correct, it's running in maintenance mode as backup. I can ping internal hosts but I'm unable to ping anything external.

stephenw10

Check the outbound NAT settings. Is it NATing its own traffic to the CARP VIP? That will break WAN connectivity.

Kajetan321

@stephenw10 For the CARP stuff, I followed a tutorial.

stephenw10

Hmm, should be fine.

As a next step, I would start a ping from pfSense to something external, then check the state table to see which states are opened for it and on which interface.

cole

I tried a simple look in https://firmware.netgate.com/pkg/

No versions higher than 23.01/2.4.4 are there.

stephenw10

Because that only includes versions from the old static repo system.

Kajetan321

@stephenw10 I executed the following at the console and got the results below:

[23.09-RELEASE][admin@pfsense1.lan.optiwave.com]/root: nslookup google.ca
;; communications error to 127.0.0.1#53: timed out
;; communications error to 127.0.0.1#53: timed out
;; Got SERVFAIL reply from 127.0.0.1, trying next server
Server:         172.22.1.1
Address:        172.22.1.1#53

Non-authoritative answer:
Name:   google.ca
Address: 172.217.13.195
;; Got SERVFAIL reply from 127.0.0.1, trying next server
Name:   google.ca
Address: 2607:f8b0:4020:807::2003

[23.09-RELEASE][admin@pfsense1.lan.optiwave.com]/root:
[23.09-RELEASE][admin@pfsense1.lan.optiwave.com]/root: ping 172.217.13.195
PING 172.217.13.195 (172.217.13.195): 56 data bytes

Searching the table for 172.217.13.195 yields one single entry:

WAN icmp 99.209.83.93:26986 -> 172.217.13.195:26986 0:0 64 / 0 5 KiB / 0 B
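An entry in this format can be read field by field: interface, protocol, source -> destination, state, then a pair of packet counters and a pair of byte counters. The sketch below parses such a line and flags the case where packets have only ever flowed in one direction. It assumes the un-NATed layout shown above; the names `StateEntry`, `parse_state` and `is_one_way` are hypothetical, the sample uses a documentation address (203.0.113.0/24) instead of the real one, and it deliberately does not assume which counter belongs to which direction.

```python
import re
from typing import NamedTuple, Optional, Tuple


class StateEntry(NamedTuple):
    interface: str
    protocol: str
    source: str
    destination: str
    state: str
    packets: Tuple[int, int]  # one counter per direction


# Assumed layout: IFACE PROTO SRC -> DST STATE PKTS / PKTS BYTES / BYTES
_STATE_RE = re.compile(
    r"^(\S+)\s+(\S+)\s+(\S+)\s+->\s+(\S+)\s+(\S+)\s+([\d,]+)\s*/\s*([\d,]+)(?:\s|$)"
)


def parse_state(line: str) -> Optional[StateEntry]:
    """Parse one state line; return None if it does not match the assumed layout."""
    m = _STATE_RE.match(line.strip())
    if m is None:
        return None
    iface, proto, src, dst, state, p1, p2 = m.groups()
    counts = (int(p1.replace(",", "")), int(p2.replace(",", "")))
    return StateEntry(iface, proto, src, dst, state, counts)


def is_one_way(entry: StateEntry) -> bool:
    """True when exactly one of the two packet counters is zero."""
    first, second = entry.packets
    return (first == 0) != (second == 0)


# Example usage:
#   e = parse_state("WAN icmp 203.0.113.93:26986 -> 172.217.13.195:26986 0:0 64 / 0 5 KiB / 0 B")
#   is_one_way(e)  # requests leave, nothing comes back
```

A one-way state for a ping means the requests are leaving on that interface but no replies are being matched to it.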

stephenw10

Ok that looks correct. I would make sure the other node can ping that IP in case it just doesn't respond to ping.

Assuming it does, run a packet capture for that IP on the WAN of the node that's failing. Make sure it's actually sending from the WAN, and that the MAC addresses in the pcap are correct.

If those are all accurate I'd check the gateway device. Perhaps you have a conflict somewhere?

Kajetan321

@stephenw10 Thanks again for sticking with me. Yes, I can ping from the other node. In fact I did a capture from both nodes:

Problematic Node:
22:23:11.437391 IP 99.xxx.xxx.xxx > 172.217.13.195: ICMP echo request, id 52159, seq 0, length 64
22:23:12.453848 IP 99.xxx.xxx.xxx > 172.217.13.195: ICMP echo request, id 52159, seq 1, length 64
and the pattern repeats.

Working node:
22:34:54.366632 IP 99.xxx.xxx.xxx > 172.217.13.195: ICMP echo request, id 37721, seq 0, length 64
22:34:54.370407 IP 172.217.13.195 > 99.xxx.xxx.xxx: ICMP echo reply, id 37721, seq 0, length 64
22:34:55.397837 IP 99.xxx.xxx.xxx > 172.217.13.195: ICMP echo request, id 37721, seq 1, length 64
22:34:55.401639 IP 172.217.13.195 > 99.xxx.xxx.xxx: ICMP echo reply, id 37721, seq 1, length 64
and the pattern repeats.

Sorry, I couldn't figure out how to show the MAC addresses.

stephenw10

Either set the view to 'full' on the packet capture GUI page, or download the pcap and open it in Wireshark or similar. Both will show the MAC addresses.

The gateway should be the same MAC from both nodes.

Is it using the correct address in the 99.x.x.x subnet?

Kajetan321

@stephenw10

The problematic firewall:

15:01:58.126525 90:ec:77:36:09:4e > 78:03:4f:ea:98:32, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 53891, offset 0, flags [none], proto ICMP (1), length 84, bad cksum 0 (->367b)!)
    99.xxx.xxx.93 > 172.217.13.163: ICMP echo request, id 42036, seq 0, length 64
15:01:59.134481 90:ec:77:36:09:4e > 78:03:4f:ea:98:32, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 3010, offset 0, flags [none], proto ICMP (1), length 84, bad cksum 0 (->fd3c)!)
    99.xxx.xxx.93 > 172.217.13.163: ICMP echo request, id 42036, seq 1, length 64

The functioning firewall:

15:01:58.126525 90:ec:77:36:09:4e > 78:03:4f:ea:98:32, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 53891, offset 0, flags [none], proto ICMP (1), length 84, bad cksum 0 (->367b)!)
    99.xxx.xxx.93 > 172.217.13.163: ICMP echo request, id 42036, seq 0, length 64
15:02:53.122116 78:03:4f:ea:98:32 > 00:00:5e:00:01:01, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 117, id 0, offset 0, flags [none], proto ICMP (1), length 84)
    172.217.13.163 > 99.xxx.xxx.93: ICMP echo reply, id 56675, seq 0, length 64

99.xxx.xxx.93 = our external (CARP) IP address.

For some additional troubleshooting I swapped the ports on which each firewall is connected to our ISP equipment. The problem stayed with the firewall, not the port.

stephenw10

Ok so that's a problem!

Those pings should be from the WAN IP not the CARP VIP.

Your outbound NAT rules don't look like they could catch that. Check the state table to be sure though, the pings should not show a NAT state.
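For reference, the reply in the capture above is addressed to 00:00:5e:00:01:01. Addresses starting with 00:00:5e:00:01 are the virtual MACs used by CARP, with the last byte being the VHID, so replies to the VIP go to whichever node currently holds it. A rough sketch for pulling MACs and IPs out of a full-detail text capture like the one posted; the function names are made up, and the two-line layout (frame header, then an indented ICMP line) is assumed from the output above:

```python
import re
from typing import Dict, List, Optional

CARP_MAC_PREFIX = "00:00:5e:00:01:"

_FRAME_RE = re.compile(
    r"^\S+ (?P<src_mac>[0-9a-f:]{17}) > (?P<dst_mac>[0-9a-f:]{17}),", re.I
)
_ICMP_RE = re.compile(
    r"^\s+(?P<src_ip>\S+) > (?P<dst_ip>[^\s:]+): ICMP echo (?P<kind>request|reply)"
)


def carp_vhid(mac: str) -> Optional[int]:
    """Return the VHID if mac is a CARP virtual MAC, otherwise None."""
    mac = mac.lower()
    if len(mac) == 17 and mac.startswith(CARP_MAC_PREFIX):
        return int(mac[-2:], 16)
    return None


def parse_capture(text: str) -> List[Dict[str, str]]:
    """Pair each frame header line with the ICMP (IPv4) line that follows it."""
    packets = []
    frame = None
    for line in text.splitlines():
        m = _FRAME_RE.match(line)
        if m:
            frame = m.groupdict()
            continue
        m = _ICMP_RE.match(line)
        if m and frame is not None:
            packets.append({**frame, **m.groupdict()})
            frame = None
    return packets


# Example usage:
#   for p in parse_capture(open("capture.txt").read()):
#       print(p["kind"], p["src_ip"], p["src_mac"], "->", p["dst_mac"], carp_vhid(p["dst_mac"]))
```

Run against the captures from both nodes, this makes it easy to compare the gateway MAC and to see which source IP each node is actually pinging from.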

Kajetan321

@stephenw10 Here's what the state table entry looks like:

WAN icmp 99.209.83.93:12056 -> 172.217.13.163:12056 0:0 10 / 0 840 B / 0 B

I think I come across as more knowledgeable than I really am. I don't know what to look at in the state table to determine NAT state.

My Outgoing rules look like this:

Should I change the NAT address to the non VIP wan address?

stephenw10

That's the list of outbound NAT rules. And those look correct, neither of those rules should catch the traffic from the public WAN IPs.

Check the state table in Diag > States. Pings should show as being from the WAN IP directly with no NAT like:

Screenshot from 2023-12-15 17-53-55.png
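As a rough aid for reading such an entry, the sketch below checks the two things described here: that the source is the node's own WAN IP, and that no translated address appears. It rests on the assumption that a NATed state shows a second address in parentheses next to the source; the function name `check_ping_state` is hypothetical and the addresses are documentation placeholders (203.0.113.0/24), not the real ones.

```python
from typing import List


def check_ping_state(line: str, wan_ip: str, carp_vip: str) -> List[str]:
    """Return a list of problems found in one state-table line (empty if none)."""
    left, arrow, _right = line.partition("->")
    if not arrow:
        raise ValueError("not a state line: no '->' found")

    problems = []
    # Assumption: a translated (NAT) address is displayed in parentheses.
    if "(" in line:
        problems.append("NAT: entry shows a translated address")

    fields = left.split()
    source = fields[2] if len(fields) >= 3 else ""
    source_ip = source.rsplit(":", 1)[0]  # drop the :port / :id suffix (IPv4 only)
    if source_ip == carp_vip:
        problems.append("source is the CARP VIP, not the node's own WAN IP")
    elif source_ip != wan_ip:
        problems.append("source is neither the WAN IP nor the CARP VIP: " + source_ip)
    return problems


# Example usage with placeholder addresses:
#   check_ping_state(
#       "WAN icmp 203.0.113.93:12056 -> 172.217.13.163:12056 0:0 10 / 0 840 B / 0 B",
#       wan_ip="203.0.113.91", carp_vip="203.0.113.93")
```

An empty list means the entry looks the way a ping from the firewall itself should: sourced from the WAN IP with no translation.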

Kajetan321

It looks like the problem has been fixed, by ACCIDENT!

When moving cables around I accidentally unplugged the ISP equipment from power. After the equipment rebooted, my pings started to work and my package manager was able to populate. It looks like the pfSense software update confused the ISP hardware? It all seems good now.

Thank you for all your help.