pfSense machine stops responding (temporarily) when restarting DHCPD
-
We just set up a new office with a pfSense VM (ESXi, 2 cores, 1GB RAM). Every time we change settings in the DNS Forwarder, apply settings in the DHCP Server, or apply settings in some other places, all interfaces stop responding to pings and stop routing packets for around 110 seconds. During this time, the VM can ping other hosts from its console and get responses fine; other machines just don't get responses from it. It's super annoying because a simple DNS change takes down the internet for everyone. After some testing, it only happens when DHCPD is restarted, which pfSense does when you change almost anything.
Some of the troubleshooting steps I have tried:
I set up a new VM on a different host.
ESXi 6.0, 2 cores, 1GB RAM, 30GB HDD.
e1000 vNICs each connected to a physical NIC in
em0 = LAN (192.168.20.0/23)
em1 = WAN (public /29 subnet)
Booted VM with new pfSense 2.3.2 installer ISO attached
Installed using all default/quick settings, no customizations
Rebooted and detached ISO
Assigned interfaces as listed above by matching the MACs it showed to the MACs VMware assigned for each interface
Set the IP addresses for LAN and WAN with DHCP disabled (so as not to disturb people using the main router). IPs were main router +1, so instead of 192.168.20.1 (main router) this test router has 192.168.20.2
Changed my default gateway locally to 192.168.20.2 and set two terminal windows pinging 192.168.20.2 and 4.2.2.2 to monitor if it is working. Pings are returning fine.
Logged into the web interface and went through the setup. I didn't change anything on those screens except for the host name ("router2"), the domain (an internal one we use), the DNS server of 4.2.2.2, and the password. Saved those changes and restarted.
-- So far, no issues with it not responding --
Went to Services > DHCP Server > GUEST, added a range of 192.168.20.100-192.168.20.199 and clicked "Save".
As soon as I clicked Save, the browser went to "Waiting" (for response) and the pings stopped coming back from 192.168.20.2 and 4.2.2.2 for about 105 seconds. After that time the page finally loaded and (at the same time) the pings resumed.
I wanted to test the IO that another user mentioned, so I went to System > Advanced > Admin Access, checked "Enable Secure Shell" and saved. Even that made it stop responding to pings for about 30 seconds.
I was able to copy a 10MB file to the hard drive in under a second.
I disabled SSH and it went offline for about another 30 seconds as it saved that setting.
To see if the issue is DHCP itself, I went in and turned DHCP for the GUEST interface back off by unchecking "Enable DHCP server on GUEST interface" and hitting Save. That saved immediately, with no unresponsive time. I then went back in to re-enable SSH and that saved immediately as well. It seems as if the issue stems from DHCP.
Just stopping the DHCPD service in Status > Services doesn't cause issues, but as soon as I start it back up the system stops responding again for a little while.
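For anyone who wants to reproduce the timings above (about 105 seconds after saving DHCP settings, about 30 seconds after toggling SSH) without watching two terminal windows, here is a minimal Python sketch that probes a host once a second and reports each outage window. It is not what I used for the tests above; the function names (`probe`, `find_outages`, `monitor`) are made up for illustration, and it assumes a Unix-like `ping` that accepts `-c 1`.

```python
import subprocess
import time


def probe(host, timeout=2.0):
    """Send one ICMP echo with the system ping; True if a reply came back.

    Assumption: a Unix-like `ping` that accepts `-c 1`
    (Linux, macOS, FreeBSD). Windows would need `-n 1` instead.
    """
    try:
        result = subprocess.run(
            ["ping", "-c", "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


def find_outages(samples):
    """Turn (timestamp, ok) samples into (start, end, seconds) windows.

    An outage starts at the first failed probe and ends at the next
    successful one. An outage that never recovers within the samples
    is reported with end=None and seconds=None.
    """
    outages = []
    start = None
    for ts, ok in samples:
        if not ok and start is None:
            start = ts
        elif ok and start is not None:
            outages.append((start, ts, ts - start))
            start = None
    if start is not None:
        outages.append((start, None, None))
    return outages


def monitor(host, duration=300.0, interval=1.0):
    """Probe `host` roughly once per `interval` for `duration` seconds."""
    samples = []
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        samples.append((time.monotonic(), probe(host)))
        time.sleep(interval)
    return find_outages(samples)
```

Calling `monitor("192.168.20.2")` in one session and `monitor("4.2.2.2")` in another, then clicking Save in the web interface, should give the start and length of each unresponsive period. The timing is only as precise as the probe interval plus the probe timeout.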