[Solved] Upgrade from 2.2.x to 2.3.3, android can't get a dhcp lease anymore



  • Hello everyone,

    after the last update (I was on 2.2.something) I decided it was time to update to the latest version. All went well, and I didn't even notice this until a couple of days later I realized I had been using lots of my phone's data plan. The phone (Motorola G4, with stock android N) often gets stuck on "Obtaining ip.." and when occasionally it finally manages to get one, and as soon as it's time to renew it I'm as good as disconnected.

    The simptoms can be seen in the dhcpd logs, a metric ton of offers, but for some reason the DHCPACK takes forever to come through.

    
    Mar 26 23:25:30 	dhcpd 		DHCPACK on 192.168.70.3 to f4:f5:24:76:f3:d6 via re2
    Mar 26 23:25:30 	dhcpd 		DHCPREQUEST for 192.168.70.3 (192.168.70.1) from f4:f5:24:76:f3:d6 via re2
    Mar 26 23:25:30 	dhcpd 		DHCPOFFER on 192.168.70.3 to f4:f5:24:76:f3:d6 via re2
    Mar 26 23:25:30 	dhcpd 		DHCPDISCOVER from f4:f5:24:76:f3:d6 via re2
    Mar 26 23:24:35 	dhcpd 		DHCPOFFER on 192.168.70.3 to f4:f5:24:76:f3:d6 via re2
    Mar 26 23:24:35 	dhcpd 		DHCPDISCOVER from f4:f5:24:76:f3:d6 via re2
    Mar 26 23:24:17 	dhcpd 		DHCPOFFER on 192.168.70.3 to f4:f5:24:76:f3:d6 via re2
    Mar 26 23:24:17 	dhcpd 		DHCPDISCOVER from f4:f5:24:76:f3:d6 via re2
    Mar 26 23:24:09 	dhcpd 		DHCPOFFER on 192.168.70.3 to f4:f5:24:76:f3:d6 via re2
    Mar 26 23:24:09 	dhcpd 		DHCPDISCOVER from f4:f5:24:76:f3:d6 via re2
    Mar 26 23:24:05 	dhcpd 		DHCPOFFER on 192.168.70.3 to f4:f5:24:76:f3:d6 via re2
    Mar 26 23:24:05 	dhcpd 		DHCPDISCOVER from f4:f5:24:76:f3:d6 via re2
    Mar 26 23:24:03 	dhcpd 		DHCPOFFER on 192.168.70.3 to f4:f5:24:76:f3:d6 via re2
    Mar 26 23:24:03 	dhcpd 		DHCPDISCOVER from f4:f5:24:76:f3:d6 via re2
    Mar 26 23:23:52 	dhcpd 		DHCPOFFER on 192.168.70.3 to f4:f5:24:76:f3:d6 via re2
    Mar 26 23:23:52 	dhcpd 		DHCPDISCOVER from f4:f5:24:76:f3:d6 via re2
    Mar 26 23:23:38 	dhcpd 		DHCPOFFER on 192.168.70.3 to f4:f5:24:76:f3:d6 via re2
    Mar 26 23:23:38 	dhcpd 		DHCPDISCOVER from f4:f5:24:76:f3:d6 via re2
    Mar 26 23:23:30 	dhcpd 		DHCPOFFER on 192.168.70.3 to f4:f5:24:76:f3:d6 via re2
    Mar 26 23:23:30 	dhcpd 		DHCPDISCOVER from f4:f5:24:76:f3:d6 via re2
    Mar 26 23:23:26 	dhcpd 		DHCPOFFER on 192.168.70.3 to f4:f5:24:76:f3:d6 via re2
    Mar 26 23:23:26 	dhcpd 		DHCPDISCOVER from f4:f5:24:76:f3:d6 via re2
    Mar 26 23:23:24 	dhcpd 		DHCPOFFER on 192.168.70.3 to f4:f5:24:76:f3:d6 via re2
    Mar 26 23:23:24 	dhcpd 		DHCPDISCOVER from f4:f5:24:76:f3:d6 via re2
    Mar 26 23:23:16 	dhcpd 		DHCPOFFER on 192.168.70.3 to f4:f5:24:76:f3:d6 via re2
    Mar 26 23:23:16 	dhcpd 		DHCPDISCOVER from f4:f5:24:76:f3:d6 via re2
    Mar 26 23:22:59 	dhcpd 		DHCPOFFER on 192.168.70.3 to f4:f5:24:76:f3:d6 via re2
    Mar 26 23:22:59 	dhcpd 		DHCPDISCOVER from f4:f5:24:76:f3:d6 via re2
    Mar 26 23:22:51 	dhcpd 		DHCPOFFER on 192.168.70.3 to f4:f5:24:76:f3:d6 via re2
    Mar 26 23:22:51 	dhcpd 		DHCPDISCOVER from f4:f5:24:76:f3:d6 via re2
    Mar 26 23:22:46 	dhcpd 		DHCPOFFER on 192.168.70.3 to f4:f5:24:76:f3:d6 via re2
    Mar 26 23:22:46 	dhcpd 		DHCPDISCOVER from f4:f5:24:76:f3:d6 via re2
    Mar 26 23:22:44 	dhcpd 		DHCPOFFER on 192.168.70.3 to f4:f5:24:76:f3:d6 via re2
    Mar 26 23:22:44 	dhcpd 		DHCPDISCOVER from f4:f5:24:76:f3:d6 via re2
    Mar 26 23:22:35 	dhcpd 		DHCPOFFER on 192.168.70.3 to f4:f5:24:76:f3:d6 via re2
    Mar 26 23:22:35 	dhcpd 		DHCPDISCOVER from f4:f5:24:76:f3:d6 via re2
    Mar 26 23:22:20 	dhcpd 		DHCPOFFER on 192.168.70.3 to f4:f5:24:76:f3:d6 via re2
    Mar 26 23:22:20 	dhcpd 		DHCPDISCOVER from f4:f5:24:76:f3:d6 via re2
    Mar 26 23:22:12 	dhcpd 		DHCPOFFER on 192.168.70.3 to f4:f5:24:76:f3:d6 via re2
    Mar 26 23:22:12 	dhcpd 		DHCPDISCOVER from f4:f5:24:76:f3:d6 via re2
    Mar 26 23:22:08 	dhcpd 		DHCPOFFER on 192.168.70.3 to f4:f5:24:76:f3:d6 via re2
    Mar 26 23:22:08 	dhcpd 		DHCPDISCOVER from f4:f5:24:76:f3:d6 via re2
    Mar 26 23:22:05 	dhcpd 		DHCPOFFER on 192.168.70.3 to f4:f5:24:76:f3:d6 via re2
    Mar 26 23:22:05 	dhcpd 		DHCPDISCOVER from f4:f5:24:76:f3:d6 via re2
    Mar 26 23:21:59 	dhcpleases 		Sending HUP signal to dns daemon(86103)
    Mar 26 23:21:59 	dhcpleases 		Sending HUP signal to dns daemon(86103)
    Mar 26 23:21:59 	dhcpd 		Server starting service.
    Mar 26 23:21:59 	dhcpd 		Sending on Socket/fallback/fallback-net
    Mar 26 23:21:59 	dhcpd 		Sending on BPF/re1/00:0d:b9:42:c3:39/192.168.69.0/24
    Mar 26 23:21:59 	dhcpd 		Listening on BPF/re1/00:0d:b9:42:c3:39/192.168.69.0/24 
    
    

    Just before that, on the system general logs:

    
    Mar 26 23:21:58 	dhcpleases 		kqueue error: unkown
    Mar 26 23:21:57 	dhcpleases 		Could not deliver signal HUP to process because its pidfile (/var/run/unbound.pid) does not exist, No such process.
    Mar 26 23:21:57 	dhcpleases 		/var/etc/hosts changed size from original!
    Mar 26 23:21:56 	check_reload_status 		Syncing firewall
    Mar 26 23:21:37 	dhcpleases 		kqueue error: unkown
    Mar 26 23:21:37 	dhcpleases 		Could not deliver signal HUP to process because its pidfile (/var/run/unbound.pid) does not exist, No such process.
    Mar 26 23:21:37 	dhcpleases 		/var/etc/hosts changed size from original!
    Mar 26 23:21:36 	check_reload_status 		Syncing firewall
    Mar 26 23:19:49 	dhcpleases 		kqueue error: unkown
    Mar 26 23:19:47 	dhcpleases 		kqueue error: unkown
    Mar 26 23:19:46 	dhcpleases 		Could not deliver signal HUP to process because its pidfile (/var/run/unbound.pid) does not exist, No such process.
    Mar 26 23:19:46 	dhcpleases 		/var/etc/hosts changed size from original!
    Mar 26 23:19:42 	check_reload_status 		Syncing firewall
    Mar 26 23:16:19 	dhcpleases 		kqueue error: unkown
    Mar 26 23:16:16 	dhcpleases 		kqueue error: unkown
    Mar 26 23:16:16 	dhcpleases 		Could not deliver signal HUP to process because its pidfile (/var/run/unbound.pid) does not exist, No such process.
    Mar 26 23:16:13 	check_reload_status 		Syncing firewall 
    
    

    I have 2 DHCP servers configured, one for each subnet: LAN (192.168.69.0) and WIFI (192.168.70.0). The lan dhcp mostly rolls a default config, with 5 or 6 static mappings. The wifi has 5 or 6 mappings itself, but on top of that the option to deny unknown clients is checked.

    The DNS resolver has pretty much basic configuration with the following changes:

    Register DHCP leases in the DNS Resolver -> ON
    Register DHCP static mappings in the DNS Resolver -> ON
    System Domain Local Zone Type -> Refuse (Default is Transparent)

    Meanwhile, on the same WIFI a Windows 10 Tablet has no problems whatsoever getting an IP and holding on to it. Same hardware with PfSense 2.2 didn't have any issues.

    I tried every possible sane combination of enabling and disabling all the afore mentioned options (both for the DNS resolver and DHCP), and came up with no relief whatsoever. Does anyone have a clue what might be happening?



  • Well, I figured out the issue: it might have been just an unfortunate coincidence that threw me off the debugging path.

    The missing part from the report above is that on the 70.x subnet, the wifi AP is a ubiquity, which relays DHCP requests to pfsense (it sits comfortably on the 70.2 ip, 70.1 is the OPT1 if in pfsense). 70.3+ and until 70.100 are all reserved for static mappings, 70.101+ are the DHCP pool.

    At some point I realized that when a DHCPACK came through, it was not completely random. By inspecting the phone's logcat I started to notice a pattern in which the failures to receive the DHCPOFFERS coincided with the phone starting the conversation over the 2.4ghz channel, and then being moved over to the 5ghz channel by the band steering protocols on the ubiquity. Hilarity ensued, since for over a year I had no problem with the same hardware and a unified SSID for both frequencies.

    What probably threw me off was the fact that the ubiquity received a firmware upgrade (which likely introduced this issue) right around the same time I manually upgraded pfsense  >:(

    Solution was: separate the two wifi networks into their own dedicated SSIDs and disable band steering. Once that was done the DHCP protocol did not have any issues completing any handshake, over 2.4ghz or 5ghz.

    I'm leaving this here for posterity, might be useful to someone one day.

    Cheers