Pfsense 2.3.4 Slow DHCP connect on 1G Ethernet LAN.
-
Hi, I'm not the best at network troubleshooting and need some help. I have a Win7_32 PC talking to my pfsense box which captures an IP address pretty fast. In fact I can disable/enable the connection from Windows and it comes back within 4 seconds. At boot time the LAN connection also establishes pretty fast whilst other apps are still booting.
I have a Win7_64 PC on the same 1Gb switch that takes ages to establish the LAN and internet connection. If I disable and re-enable the interface from within Windows I can be waiting around 45-60 seconds for the connection to re-establish. The two machines have different Ethernet NIC hardware, but both have the latest drivers. I've run arp -a at the start of making the DHCP connection and nothing shows up on the slow connecting PC for ages. I've copied the DHCP log for the slow connection below (newest entries first), so I wonder if somebody could explain what's going on and why some stages are taking a long time?
In desperation I've ordered a PCIe LAN card for the slow PC to see if it's some hardware issue between the pfsense box NIC and the slow PC host NIC. What takes so long from 15:04:00 to 15:07:05? Any ideas? Since the LAN connection is the first and most important step, I want to get this right and understand why the slow PC takes so long. I notice the LAN activity LED shows only occasional blinks until the arp table fills, as though pfsense is too busy doing something else.
Jun 13 15:07:05 dhcpd DHCPACK to 192.168.1.3 (08:60:6e:7b:a2:e5) via re1
Jun 13 15:07:05 dhcpd DHCPINFORM from 192.168.1.3 via re1
Jun 13 15:06:56 dhcpd DHCPACK on 192.168.1.3 to 08:60:6e:7b:a2:e5 (Derek1) via re1
Jun 13 15:06:56 dhcpd DHCPREQUEST for 192.168.1.3 from 08:60:6e:7b:a2:e5 (Derek1) via re1
Jun 13 15:06:56 dhcpd reuse_lease: lease age 310 (secs) under 25% threshold, reply with unaltered, existing lease for 192.168.1.3
Jun 13 15:06:56 dhcpd DHCPACK on 192.168.1.3 to 08:60:6e:7b:a2:e5 (Derek1) via re1
Jun 13 15:06:56 dhcpd DHCPREQUEST for 192.168.1.3 from 08:60:6e:7b:a2:e5 (Derek1) via re1
Jun 13 15:06:56 dhcpd reuse_lease: lease age 310 (secs) under 25% threshold, reply with unaltered, existing lease for 192.168.1.3
Jun 13 15:04:00 dhcpd Server starting service.
Jun 13 15:04:00 dhcpd Sending on Socket/fallback/fallback-net
Jun 13 15:04:00 dhcpd Sending on BPF/re1/00:0e:c4:d0:51:64/192.168.1.0/24
Jun 13 15:04:00 dhcpd Listening on BPF/re1/00:0e:c4:d0:51:64/192.168.1.0/24
Jun 13 15:04:00 dhcpd Wrote 2 leases to leases file.
Jun 13 15:04:00 dhcpd For info, please visit https://www.isc.org/software/dhcp/
Jun 13 15:04:00 dhcpd All rights reserved.
Jun 13 15:04:00 dhcpd Copyright 2004-2016 Internet Systems Consortium.
Jun 13 15:04:00 dhcpd Internet Systems Consortium DHCP Server 4.3.5
Jun 13 15:04:00 dhcpd PID file: /var/run/dhcpd.pid
Jun 13 15:04:00 dhcpd Database file: /var/db/dhcpd.leases
Jun 13 15:04:00 dhcpd Config file: /etc/dhcpd.conf
Jun 13 15:04:00 dhcpd For info, please visit https://www.isc.org/software/dhcp/
Jun 13 15:04:00 dhcpd All rights reserved.
Jun 13 15:04:00 dhcpd Copyright 2004-2016 Internet Systems Consortium.
Jun 13 15:04:00 dhcpd Internet Systems Consortium DHCP Server 4.3.5
-
Try a different cable on the slow machine.
-
Thanks, I tried three cables with the same results. Are there any Windows tools that might help me capture a more detailed, time-stamped trace of the LAN connection handshaking?
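Until a proper trace is available, the gaps in the dhcpd log above can at least be measured. Below is a minimal Python sketch, not a pfsense or Windows tool: it takes a handful of lines copied from that log, puts them oldest first, and reports any silence longer than a few seconds. The year is not in the log, so 2017 is an assumption (it does not affect the gaps), and `parse_line` and `gaps` are names made up for this example.

```python
from datetime import datetime

# A few lines copied from the log above (newest first, as posted).
LOG = """\
Jun 13 15:07:05 dhcpd DHCPACK to 192.168.1.3 (08:60:6e:7b:a2:e5) via re1
Jun 13 15:07:05 dhcpd DHCPINFORM from 192.168.1.3 via re1
Jun 13 15:06:56 dhcpd DHCPACK on 192.168.1.3 to 08:60:6e:7b:a2:e5 (Derek1) via re1
Jun 13 15:06:56 dhcpd DHCPREQUEST for 192.168.1.3 from 08:60:6e:7b:a2:e5 (Derek1) via re1
Jun 13 15:04:00 dhcpd Server starting service.
"""


def parse_line(line, year=2017):
    """Return (timestamp, message) for one syslog-style line.

    The year is an assumption: lines like these do not carry one.
    """
    month, day, clock, _daemon, message = line.split(None, 4)
    stamp = datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")
    return stamp, message


def gaps(log_text, min_seconds=5):
    """List (seconds, earlier_message, later_message) for each long silence."""
    lines = [l for l in log_text.splitlines() if l.strip()]
    # The log is newest first: reverse it, then do a stable sort by timestamp.
    events = sorted((parse_line(l) for l in reversed(lines)), key=lambda e: e[0])
    found = []
    for (t0, m0), (t1, m1) in zip(events, events[1:]):
        delta = int((t1 - t0).total_seconds())
        if delta >= min_seconds:
            found.append((delta, m0, m1))
    return found


for seconds, before, after in gaps(LOG):
    print(f"{seconds:4d} s between '{before}' and '{after}'")
```

On these lines it reports a 176 second gap before the first DHCPREQUEST and a 9 second gap before the DHCPINFORM. The DHCPREQUEST and its DHCPACK share the same second, which suggests dhcpd answers as soon as it is asked and the wait happens before the request ever reaches pfsense.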
-
I'm assuming here that you've got a consumer-level dumb switch, in which case the most likely fixable explanation would be that you've got a bad link and the NIC is taking a long time to negotiate down. If you've changed the cable then there's not much else you can do; some NICs just have a slow handshake.
If you've got a managed switch, then check the settings for spanning tree, VLAN trunking, etc.: some of those can take a while to negotiate if the host isn't actually using them.
If it's really a crisis, try the NIC model that's in the fast machine. I'd personally not worry about it much, as a few seconds of network latency during startup isn't generally all that noticeable.
-
Thanks, the PC NICs are embedded in the mobos, but as I said I'm waiting for a PCIe NIC to try in the slow machine. I'm used to watching the Windows network orb on the taskbar during startup. If the machines were on 24/7 there wouldn't be a problem.
When I watch the orb on the old 32-bit PC, the one that connects quickly, it rotates around blue for about 4 seconds and then the icon changes to network connected. On the slow connecting 64-bit PC the orb runs around blue, then abruptly stops with the Windows red exclamation mark. During this fairly long time there is nothing in the arp table. Then the orb changes to a yellow exclamation mark and I have local webgui access but no internet (WAN port flashing green). After another wait I get the normal connected icon. It's the sudden 'everything stops' at the beginning that's puzzling me. I've now discovered the Windows System event logs and am having a good look at those. I've repeated the boot in Windows safe mode with the networking option and get the same long wait for the LAN to establish a connection.
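To put timestamps on those stages (nothing, then local webgui, then internet), a small polling script could be started right after login. This is a rough Python sketch under assumptions: 192.168.1.1 port 443 stands in for the pfsense webgui and 8.8.8.8 port 53 for "the internet"; neither address comes from this thread, and `reachable`, `seconds_until_reachable` and `main` are hypothetical names. It only checks whether a TCP connection opens, so it says nothing about ARP or DHCP directly.

```python
import socket
import time


def reachable(host, port, timeout=1.0):
    """True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def seconds_until_reachable(host, port, limit=120.0, interval=1.0):
    """Poll until host:port answers; return elapsed seconds, or None on giving up."""
    start = time.monotonic()
    while time.monotonic() - start < limit:
        if reachable(host, port):
            return time.monotonic() - start
        time.sleep(interval)
    return None


def main():
    # Both targets are assumptions; substitute your own firewall and test host.
    targets = [("pfsense webgui", "192.168.1.1", 443), ("internet", "8.8.8.8", 53)]
    begin = time.monotonic()
    for label, host, port in targets:
        waited = seconds_until_reachable(host, port)
        total = time.monotonic() - begin
        if waited is None:
            print(f"{label}: gave up after {total:.0f} s")
        else:
            print(f"{label}: reachable {total:.0f} s after start")
```

Calling `main()` as soon as the desktop appears prints how long after the script started each stage became reachable, which can then be lined up against the orb changes and the event log.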
I've got pfsense openVPN running with squid web cache and DNS caching, but both PCs are seeing the same pfsense setup.
-
I'm guessing the slow PC uses a Broadcom NIC?
Grtz
DeLorean
-
No, it's an Intel 82579V embedded in an Asus P9X79pro mobo. I've read about some past issues with this combo, but can't prove anything until I try a different PCIe NIC. Then there is the complexity of Windoze layered on top. I went back and checked the fast connecting PC, and if I disable and re-enable the LAN connection, the connection is almost instant, which is what I would expect from a LAN interface. The pfsense box DHCP is only set to serve 10 IP addresses on the local subnet, so it shouldn't be working that hard to hand out a lease.
-
OK, so I fitted a second NIC (Realtek PCIe GbE): same issue. Swapped the pfsense box for the original home router: same problem. Checked on the older Win32 PC: still good and fast.
I had a look at the Windows System event log and could see what was happening. During boot to the login GUI there was a lot going on in the background, but the network processes all seemed to be getting shut down. After password login the network orb went to disconnected and the event log confirmed a PCIe disconnect from the Windows process controller. This was then followed by several network processes being shut down and restarted. The first to come up, after 40 seconds, was the LAN with no internet from the WAN. The internet then came up about 90 seconds from the start of boot, which is too long.
I think I proved the issue was in Windoze OS authentication/handshake and not hardware or cable related. Having gone around the houses updating NIC drivers and several windows tweaks to no avail, I finally bust the OS network stack.
Re-loaded a backup image from 3 weeks ago and like magic everything came good. Problem gone, but I should have switched to Linux!
The last problem was with a cheap Chinese LB-Link router with AP bridge mode. It all seemed to work with portable devices connected wi-fi to the pfsense box with a DHCP allocated IP address, but no browsing or data transfer. Finally in desperation I managed to find a firmware update which it swallowed and then after a factory setting reset, all came good.
Case Closed - Thanks
-
Although I closed the topic and fixed the problem with a saved Windows backup image, I can only now say what the problem was, and it's a big hole that others may fall into! The problem shows up as no LAN connection (red cross on the network icon) or a slow to connect LAN. Sometimes the LAN will connect while Windows shows a yellow exclamation mark. The pfsense webgui works but there is no internet connection. The WAN connection on the dashboard can be in one of two states: WAN down (red), or WAN up (green) but with no remote IP address. There is no problem with pfsense! Changing the pfsense configuration will drive you mad going around in circles!!
During the pfsense learning and experimenting period many things will get tried and tested: DHCP, fixed IP, DHCP server IP pool range, local or remote DNS server changes. During these changes Windows keeps remembering everything in its DNS cache, and the cache isn't emptied after Windows reboots. My backup recovery of Windows solved my problem because the previous DNS cache wasn't full of junk.
The answer is to use the Windows console to flush the local DNS cache. There are plenty of how-tos, but this link has a nifty basic executable you can plant on your desktop to flush frequently whilst setting up pfsense:
https://www.eightforums.com/tutorials/30136-dns-resolver-cache-flush-reset-windows.html
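For anyone who would rather not download an executable, the flush itself is one built-in Windows command, `ipconfig /flushdns`. A minimal Python wrapper, as a sketch only: `flush_dns_command` and `flush_dns` are names invented here, and the script has to be run on the Windows PC itself.

```python
import platform
import subprocess


def flush_dns_command():
    """The built-in Windows command that empties the DNS resolver cache."""
    return ["ipconfig", "/flushdns"]


def flush_dns():
    """Run the flush and return whatever ipconfig printed."""
    if platform.system() != "Windows":
        raise RuntimeError("ipconfig /flushdns only exists on Windows")
    result = subprocess.run(
        flush_dns_command(), capture_output=True, text=True, check=True
    )
    return result.stdout.strip()
```

Calling `print(flush_dns())` after each round of pfsense DNS or DHCP changes does the same job as typing the command into a console window.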