WAN doesn't work after reboot, how to debug?
I have a brand new Netgate SG-4860. I plugged the LAN into my computer and the WAN into my office wall plug. I could access the web configurator with no problem, and I used the setup wizard to choose DHCP for the WAN. I got an IP address and could reach the Internet. I updated the device from 2.3.2 (I think) to 2.3.2p1. That worked.
The update of course rebooted the device and since then the WAN doesn't work. It shows as "up" (green arrow icon) but the IP is 0.0.0.0.
I've tried turning IPv6 off, turned 'block private networks' off, tried static IP instead of DHCP, tried rebooting, but just can't reach the Internet anymore.
How do I debug this further? Thanks.
Have you asked your ISP? What kind of service is it? WAN has to be configured to match how the line is provisioned.
For the moment, I'm not trying to use the pfsense as the actual firewall device between my network and my ISP (but it will eventually be replacing a sonicwall NSA 2400 device in that role).
For now, I'm just getting to know the device and trying to set up a little test network. My sonicwall does DHCP and like I said it gave my pfsense an IP the first time I turned it on. Then I just updated the OS, rebooted, and now get 0.0.0.0. I didn't touch any settings after the initial setup wizard because I figured it best to apply the OS update straight away.
DHCP WAN just works. Anything in the DHCP logs on either pfSense or the sonicwall? System logs?
Since you control the network you should be able to set a static IP address outside the DHCP range.
If you get really stumped run a packet capture on WAN and see what's there. Should see at least DHCPDISCOVERs being sent in DHCP mode and should see at least ARP requests if static.
On the pfsense DHCP logs, lots of this repeated:
Nov 13 03:49:30 dhclient 89642 DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 12
Nov 13 03:49:42 dhclient 89642 No DHCPOFFERS received.
Nov 13 03:49:42 dhclient 89642 No working leases in persistent database - sleeping.
Nov 13 03:49:42 dhclient FAIL
In System > General:
Nov 12 23:53:18 php-fpm 6086 /index.php: Successful login for user 'admin' from: 192.168.1.100
Nov 13 00:07:13 php-fpm 84814 /status_interfaces.php: Shutting down Router Advertisment daemon cleanly
Nov 13 00:08:00 php-fpm 99518 /status_interfaces.php: The command '/sbin/dhclient -c /var/etc/dhclient_wan.conf igb1 > /tmp/igb1_output 2> /tmp/igb1_error_output' returned exit code '15', the output was ''
I tried packet capture with full verbosity on the WAN:
04:18:44.784450 00:08:a2:12:34:56 > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 342: (tos 0x10, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 328)
0.0.0.0.68 > 255.255.255.255.67: [udp sum ok] BOOTP/DHCP, Request from 00:08:a2:12:34:56, length 300, xid 0x656f86e2, secs 39, Flags [none] (0x0000)
Magic Cookie 0x63825363
DHCP-Message Option 53, length 1: Discover
Client-ID Option 61, length 7: ether 00:08:a2:12:34:56
Hostname Option 12, length 7: "pfsense"
Parameter-Request Option 55, length 9:
Subnet-Mask, BR, Time-Zone, Classless-Static-Route
Default-Gateway, Domain-Name, Domain-Name-Server, Hostname
I can’t find anything in the sonicwall logs. Over many years, we’ve never had issues with the sonicwall giving out DHCP; it “just works”. I’ve tried adding a DHCP reservation and I’ve tried adding a static ARP entry. The only thing left to try is a magic reboot of the sonicwall, but that would be disruptive.
Do the above log snippets suggest anything to you? Thanks.
Static ARP is almost certainly not necessary. If you are seeing DHCPDISCOVERs going out on WAN and nothing being returned, you probably want to figure out what's going on there or likely nothing will work, DHCP or static. (Hint: Look upstream of pfSense WAN.)
What I find odd is that it worked the first time. So on a hunch I tried to repro on my home setup, and I can.
My home setup is: pfsense box (SG-2220) connected to my ISP-provided SmartRG SR505N (not a sonicwall). The pfsense uses DHCP to get its IP from the SmartRG. It's been up and running fine for days.
So I simply did a clean reboot of the pfsense, and after the reboot it didn't get an IP on the WAN. (Though instead of “0.0.0.0” it shows “n/a”.)
Then I rebooted the SmartRG, and presto the pfsense gets an IP. I've done this 3 times now. Every reboot of the pfsense requires a reboot of the SmartRG.
If you are sending DHCPDISCOVERs and not getting DHCPOFFERs in reply, then you need to get with whatever the DHCP server is and figure out why. All the WAN port can do is send the discover. If the server doesn't respond you either have a problem with layer 2 or the server.
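As a rough way to check which side of that exchange is missing, the dhclient lines from the DHCP log can be tallied by message type. This is only a minimal sketch in Python, not anything shipped with pfSense: the function names `summarize` and `diagnose` are made up for illustration, the sample lines are the ones posted earlier in the thread, and it assumes that an offer, if one ever arrived, would be logged as a line whose message starts with DHCPOFFER.

```python
import re
from collections import Counter

# Sample lines copied from the dhclient log posted earlier in this thread.
SAMPLE_LOG = """\
Nov 13 03:49:30 dhclient 89642 DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 12
Nov 13 03:49:42 dhclient 89642 No DHCPOFFERS received.
Nov 13 03:49:42 dhclient 89642 No working leases in persistent database - sleeping.
Nov 13 03:49:42 dhclient FAIL
"""

# Matches a DHCP message name that begins the message text,
# e.g. "dhclient 89642 DHCPDISCOVER on igb1 ...".
# "No DHCPOFFERS received." does not match because the message starts with "No".
MESSAGE_RE = re.compile(r"dhclient(?: \d+)? (DHCP[A-Z]+)\b")


def summarize(log_text):
    """Count dhclient log lines per DHCP message type."""
    counts = Counter()
    for line in log_text.splitlines():
        match = MESSAGE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def diagnose(counts):
    """Turn the counts into a one-line hint (hypothetical wording)."""
    if counts["DHCPDISCOVER"] and not counts["DHCPOFFER"]:
        return "discovers sent, no offers logged: look at layer 2 or the DHCP server"
    if not counts["DHCPDISCOVER"]:
        return "no discovers logged: check that the DHCP client is running on WAN"
    return "offers were logged: look at the later steps of the exchange"


if __name__ == "__main__":
    print(diagnose(summarize(SAMPLE_LOG)))
```

Pasting a longer stretch of the log into `SAMPLE_LOG` gives the same kind of summary. It only restates what the log already says, so it will not find the cause on its own.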
I tried something a bit different: power down the pfsense, then boot it up. Once booted, in Status>Interfaces>WAN it shows Gateway IPv4: 192.168.2.1, which is right.
On my SmartRG though, its ARP screen shows:
IP addr - Flags - HW Address - Device
192.168.2.2 - Incomplete - 00:00:00:00:00:00 - br0
192.168.2.2 is the DHCP reservation for the pfsense, based on its MAC.
In the pfsense boot logs I see:
Nov 15 00:26:38 php-cgi rc.bootup: ROUTING: setting default route to 192.168.2.1
Nov 15 00:26:38 php-cgi rc.bootup: The command '/sbin/route change -inet default '192.168.2.1'' returned exit code '1', the output was 'route: writing to routing socket: No such process route: writing to routing socket: Network is unreachable change net default: gateway 192.168.2.1 fib 0: Network is unreachable'
Alas, even after setting my SmartRG logs to 'debug', I can't tease much out of them. I have tried toggling all the various DHCP options: IGMP proxy on/off, IGMP snooping on/off, DHCPv6 on/off. But always the same. Quite confusing all in all. :(
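A note on the "Network is unreachable" line in the boot log above: route typically reports that when the gateway does not fall inside any network the box is directly attached to, which is what you would expect if WAN had no address yet when the default route was set. The check can be sketched in Python with the standard `ipaddress` module. The helper name `gateway_on_link` is hypothetical, and the /24 mask on the 192.168.2.2 reservation is an assumption, since the thread does not state the mask.

```python
import ipaddress


def gateway_on_link(gateway, interface_addrs):
    """Return True if the gateway lies in a network of one of the given
    interface addresses (each written as "address/prefix")."""
    gw = ipaddress.ip_address(gateway)
    return any(
        gw in ipaddress.ip_interface(addr).network
        for addr in interface_addrs
    )


if __name__ == "__main__":
    # WAN without a lease: no addresses, so the gateway is not reachable.
    print(gateway_on_link("192.168.2.1", []))
    # WAN with the reserved address (the /24 mask is assumed).
    print(gateway_on_link("192.168.2.1", ["192.168.2.2/24"]))
```

If that reading is right, the route error is a symptom of the missing lease rather than a separate problem.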
It sounds like you have gotten kind of clicky-clicky trying to fix this. Static ARP is almost certainly not necessary.
If the pfSense firewall is asking for DHCP and receiving no response, the problem is either at layer 2 or in the DHCP server. The fact that logs there leave something to be desired is not pfSense's fault.
There is nothing special in the IPv4 DHCP client on pfSense. There are thousands and thousands and thousands of installations doing just that. Any problems are pretty much invariably issues with cable modems needing to be restarted due to the nature of those particular beasts.
You have two out of two that are not working. Sounds like something systemic there.