Ping not working on CLI or with "default" interface on GUI
-
If you resave the WAN config does it stop?
Was the WAN ever configured as DHCP? Is there a dhclient process running? Anything in the dhcp logs?
That's the only time seeing 0.0.0.0 might be expected.
Steve
-
I have re-saved WAN interface configuration, and applied it. Problem still persists, unfortunately.
The DHCP logs show one entry from yesterday:
2023-10-25 18:18:20.595596+00:00 dhclient 5829 Cannot open or create pidfile: No such file or directory
There are also a few entries from the date the server was deployed, 20 October.
There seems to be a dhclient process running, which keeps changing its PID (second column) incrementally. Maybe it is restarting all the time?
$ ps aux | grep dhclient
root 40010 0.0 0.0 12768 2424 0 S+ 17:50 0:00.00 grep dhclient
$ ps aux | grep dhclient
root 40711 0.0 0.0 12768 2432 0 S+ 17:50 0:00.00 grep dhclient
$ ps aux | grep dhclient
root 41154 0.0 0.0 12768 2432 0 S+ 17:50 0:00.00 grep dhclient
$ ps aux | grep dhclient
root 41588 0.0 0.0 12768 2432 0 S+ 17:50 0:00.00 grep dhclient
Thanks a lot!
-
Hmm, that's not right! There shouldn't be any dhclient there if it's all static IPs.
If you run
ps auxwwd | grep dhclient
you should be able to see which interface it's running on.
I would first try just killing that process.
-
Oh sorry, my brain stopped working after chasing this for too long today. The command above shows the grep dhclient process which I am spinning up when running ps aux | grep dhclient, not the dhclient process itself. That is why the PID kept incrementing, because I was invoking it all the time. Facepalm.
So yes, there is no dhclient process running, and therefore we cannot kill it.
My apologies. I will continue the research tomorrow and try to figure out why WAN is using 0.0.0.0 as SRC when no address is specified, as it is clear brain fog is kicking in.
-
Ah, yes, I should have spotted that!
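For anyone hitting the same trap: a plain ps | grep pipeline lists the grep process itself. Below is a small sketch of two common ways around it, using only standard ps, grep and pgrep options. The fallback messages are just for illustration.

```shell
#!/bin/sh
# The pattern [d]hclient still matches the text "dhclient", but the grep
# process's own command line now contains "[d]hclient", which the pattern
# does not match, so grep no longer lists itself.
ps auxww | grep '[d]hclient' || echo "no dhclient process found (ps/grep)"

# pgrep never reports itself; -l prints the process name next to the PID.
pgrep -l dhclient || echo "no dhclient process found (pgrep)"
```

If both commands report nothing, no dhclient is running.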
I guess I'd try disabling WireGuard as a test. There's not much else it could be...
-
Hmm, do you have a WireGuard gateway defined?
Is your default IPv4 gateway still set to automatic in System > Routing > Gateways?
If it is, try setting it to the WAN gateway.
-
Hello again @stephenw10. New day, new time to chase this mystery.
I have stopped the WireGuard service, but the issue still persists.
Regarding the IPv4 gateway, it is not automatic; it is statically defined. It is the only gateway in this system (no WireGuard gateways or anything extra), and it is also set as default. Puzzling.
I have tried adding an outbound NAT rule at the top of the list, selecting "any" as source on the WAN interface and translating it to the "interface address", and ping now works when no interface is specified. But this is a dirty patch, as I assume WAN is still sending the packets with 0.0.0.0 as source and we are just translating that address locally. On top of that, nslookup still does not work even with that workaround, i.e.:
$ nslookup dns.google
;; UDP setup with 1.1.1.2#53(1.1.1.2) for dns.google failed: host unreachable.
;; UDP setup with 1.1.1.2#53(1.1.1.2) for dns.google failed: host unreachable.
;; UDP setup with 1.1.1.2#53(1.1.1.2) for dns.google failed: host unreachable.
;; UDP setup with 9.9.9.9#53(9.9.9.9) for dns.google failed: host unreachable.
pfSense updates also don't work, so I am reverting that outbound NAT rule.
I will continue to chase it for a few more hours and, if I have no luck, I will try reinstalling the system and observe the behaviour of a fresh install with no config.xml restored.
-
Here is some additional info, in case it helps:
$ ifconfig -a
igb0: flags=8822<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=4e527bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
    ether 3c:ec:ef:a7:22:e8
    media: Ethernet autoselect
    status: no carrier
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
igb1: flags=8822<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=4e527bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
    ether 3c:ec:ef:a7:22:e9
    media: Ethernet autoselect
    status: no carrier
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
ix0: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    description: WAN
    options=48138b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,WOL_UCAST,WOL_MCAST,WOL_MAGIC,VLAN_HWFILTER,NOMAP>
    ether 9c:69:b4:64:96:7a
    inet6 fe80::9e69:b4ff:fe64:967a%ix0 prefixlen 64 scopeid 0x3
    inet 123.123.123.123(redacted) netmask 0xffffff80 broadcast 123.123.123.1(redacted)
    media: Ethernet autoselect (10Gbase-SR <full-duplex,rxpause,txpause>)
    status: active
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
ix1: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    description: OPT1
    options=48138b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,WOL_UCAST,WOL_MCAST,WOL_MAGIC,VLAN_HWFILTER,NOMAP>
    ether 9c:69:b4:64:96:7b
    inet6 fe80::9e69:b4ff:fe64:967b%ix1 prefixlen 64 scopeid 0x4
    media: Ethernet autoselect
    status: no carrier
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
enc0: flags=0<> metric 0 mtu 1536
    groups: enc
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
    options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
    inet6 ::1 prefixlen 128
    inet6 fe80::1%lo0 prefixlen 64 scopeid 0x6
    inet 127.0.0.1 netmask 0xff000000
    groups: lo
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
pflog0: flags=100<PROMISC> metric 0 mtu 33152
    groups: pflog
pfsync0: flags=0<> metric 0 mtu 1500
    maxupd: 128 defer: off
    syncok: 1
    groups: pfsync
tun_wg0: flags=80c1<UP,RUNNING,NOARP,MULTICAST> metric 0 mtu 1420
    description: WG_EG
    options=80000<LINKSTATE>
    inet 80.80.80.1(redacted) netmask 0xffffff00
    inet 90.90.90.1(redacted) netmask 0xffffff00
    inet 10.95.32.99 netmask 0xffffff00
    groups: wg WireGuard
    nd6 options=101<PERFORMNUD,NO_DAD>
I also verified that in the uploaded .pcap file, the MAC address of the source matches the one in ix0 interface.
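For readers who want to repeat this kind of check, a rough sketch follows. It assumes the WAN interface is ix0, as in the ifconfig output. The destination 1.1.1.1 and the packet count are arbitrary choices, and the function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: interface, destination and packet count are assumptions.

# capture_icmp [IFACE]: show ICMP packets with their link-level headers.
capture_icmp() {
    iface=${1:-ix0}
    # -n: no name resolution, -e: print MAC addresses,
    # -c 10: stop after 10 packets
    tcpdump -n -e -i "$iface" -c 10 icmp
}

# show_route [DEST]: print the gateway and interface the kernel
# would pick for a destination (FreeBSD route syntax).
show_route() {
    route -n get "${1:-1.1.1.1}"
}

# Example calls, to be run as root on the firewall:
# show_route 1.1.1.1
# capture_icmp ix0
```

Running the capture in one session while pinging from another shows the source IP and source MAC of each echo request side by side.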
-
@llavecita said in Ping not working on CLI or with "default" interface on GUI:
inet 123.123.123.123(redacted) netmask 0xffffff80 broadcast 123.123.123.1(redacted)
Is that the other way around? I expect the broadcast address to be .127 on a /25.
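The /25 arithmetic can be checked directly. The sketch below uses a made-up address from the documentation range (192.0.2.126), since the real one is redacted, together with the netmask 0xffffff80 from the ifconfig output. The function name is only illustrative.

```shell
#!/bin/sh
# broadcast_of IP HEXMASK: print the IPv4 broadcast address.
# Illustrative helper, not part of pfSense or FreeBSD.
broadcast_of() {
    # Split the dotted quad into four octets.
    o1=${1%%.*}; rest=${1#*.}
    o2=${rest%%.*}; rest=${rest#*.}
    o3=${rest%%.*}; o4=${rest#*.}

    ip=$(( (o1 << 24) | (o2 << 16) | (o3 << 8) | o4 ))
    mask=$(( $2 ))                        # hex netmask as printed by ifconfig
    host_bits=$(( ~mask & 0xffffffff ))   # 0x7f for a /25
    bc=$(( ip | host_bits ))              # set all host bits to 1

    echo "$(( (bc >> 24) & 255 )).$(( (bc >> 16) & 255 )).$(( (bc >> 8) & 255 )).$(( bc & 255 ))"
}

broadcast_of 192.0.2.126 0xffffff80   # prints 192.0.2.127
```

With a /25 there are 7 host bits, so the last octet of the broadcast address is either .127 or .255, depending on which half of the /24 the address sits in.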
-
@stephenw10 I redacted the real values, but you are indeed correct: the broadcast address shown by ifconfig for interface ix0 is a .127 (and the gateway is a .126).
-
Hmm, OK. We discussed this internally last night and there's nothing that immediately presents like that.
Can I assume this behaviour survives a reboot?
-
@stephenw10 Correct, the server has been rebooted several times to no avail. I will find some time today to reinstall it from scratch, without dragging any previous config.xml, to see if behaviour persists on a fresh install. I will report back later today.
Thanks again for your time and dedication!
-
Yeah, I'd really like to know what's causing that. It must be something very obscure, as I see almost zero references to it.
-
@stephenw10 I have just reinstalled the system, and zero-to-ping works well after disabling igb0/igb1, enabling ix0/ix1, and setting static IP and gateway for the relevant WAN interface. This means the problem lies in my restored config.xml, which seems to be breaking things although it comes from another functional system which is in production. The other system is virtualised, and not a dedicated environment, so I only changed "vtnet0" for "ix0" in the ported config.xml, but perhaps additional changes are needed.
I will try to restore a couple of config.xml's to see if results differ, and if it still breaks I will manually configure this system. It would be great to know the exact cause of the issue though!
Thanks again for your assistance and dedication. I will post again if I have additional findings which are relevant.
-
@stephenw10 More wizardry. So, before reinstalling the system, I downloaded the config.xml. Now, after reinstalling it I restored exactly the same config.xml (no changes at all!), and now everything works: ping and nslookup, updates, etc. So I cannot confirm my config.xml was at fault!
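One way to back up the "no changes at all" observation is a byte-for-byte comparison of the downloaded and restored files. This is a minimal sketch, assuming two saved copies. The file names and the function name are placeholders, not pfSense paths.

```shell
#!/bin/sh
# same_config A B: report whether two config backups are byte-identical.
# Hypothetical helper for illustration.
same_config() {
    if [ ! -r "$1" ] || [ ! -r "$2" ]; then
        echo "unreadable"
        return 2
    fi
    # cmp -s is silent and only sets the exit status.
    if cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "different"
        return 1
    fi
}

# Example call with placeholder file names:
# same_config config-before-reinstall.xml config-after-restore.xml
```

An "identical" result would suggest that whatever differed between the broken and working installs was stored somewhere other than config.xml.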
The good thing is that everything works now, and no manual configuration is required (phew...). The bad side is that we do not know what was (seriously) wrong. But maybe instead of spending 3 days troubleshooting, next time I could try to reinstall first, and if the same result happens, then troubleshoot. A learning experience :D.
Thanks again!
-
Hmm, so something dynamic that survives a reboot but is not in the config.