Losing the WAN (DHCP) IP address for the past few days
-
Yes, I'm running a nanoBSD variant. Currently:
2.1-BETA1 (i386)
built on Thu Feb 28 04:26:51 EST 2013 (but over the last several days I've been trying 2.0.1, 2.0.2, 2.0.3-SNAPSHOT, and a few 2.1 snapshots).
This is running on: Intel(R) Atom(TM) CPU 330 @ 1.60GHz
It definitely appears to be a dhclient thing at this point. If I manually kill the running dhclient process and re-start it, things come right back up.
[2.1-BETA1][root@gw.tubas.net]/sbin(86): /sbin/dhclient -c /var/etc/dhclient_wan.conf ue0
dhclient already running, pid: 72467.
exiting.
[2.1-BETA1][root@gw.tubas.net]/sbin(87): kill -KILL 72467
[2.1-BETA1][root@gw.tubas.net]/sbin(88): /sbin/dhclient -c /var/etc/dhclient_wan.conf ue0
dhclient: PREINIT
ifconfig: ioctl (SIOCAIFADDR): File exists
dhclient: Starting delete_old_states()
OLD_IP: not found
dhclient: Comparing IPs: Old: New:
dhclient: Comparing Routers: Old: New:
DHCPREQUEST on ue0 to 255.255.255.255 port 67
DHCPACK from 73.129.106.1
bound to 68.48.155.52 -- renewal in 158108 seconds.
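For anyone who wants to script the manual recovery above, here is a minimal sketch. It is not a pfSense tool: the names `restart_dhclient`, `run` and `DRY_RUN` are made up for illustration, it assumes `pgrep` is available on the box, and it reuses the config path from the session above. By default it only prints what it would do.

```shell
#!/bin/sh
# Sketch: kill any dhclient attached to an interface, then start a new one.
# restart_dhclient, run and DRY_RUN are hypothetical names, not part of pfSense.
# DRY_RUN defaults to 1, so commands are only printed unless DRY_RUN=0 is set.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

restart_dhclient() {
    iface="$1"
    conf="${2:-/var/etc/dhclient_wan.conf}"

    # Look for processes whose command line starts with (or contains a path
    # to) dhclient and also mentions the interface name.
    pids=$(pgrep -f "(^|/)dhclient.*${iface}" 2>/dev/null || true)
    for pid in $pids; do
        run kill -KILL "$pid"
    done

    # Start a fresh client with the same arguments used manually above.
    run /sbin/dhclient -c "$conf" "$iface"
}
```

With the function loaded, `restart_dhclient ue0` shows the commands, and `DRY_RUN=0 restart_dhclient ue0` actually runs them. This only automates the workaround; it does nothing about whatever leaves dhclient stuck in the first place.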
-
Beginning to believe I have two problems at play. :-\
#1 is that the USB Ethernet adaptor doesn't seem to be very stable. This is the 2nd adaptor I've tried but same make/model as the first. I swapped the former outside (ue0) and inside (re0) interfaces around so now ue0 is the LAN. After a little while, I saw it hiccup again, but since it's static it survived:
Mar 1 04:01:38 gw check_reload_status: Linkup starting ue0
Mar 1 04:01:38 gw kernel: ue0: link state changed to DOWN
Mar 1 04:01:38 gw kernel: ue0: link state changed to UP
Mar 1 04:01:38 gw check_reload_status: Linkup starting ue0
Mar 1 04:01:40 gw php: : Hotplug event detected for LAN(lan) but ignoring since interface is configured with static IP (192.168.100.1 )
Mar 1 04:01:40 gw php: : Hotplug event detected for LAN(lan) but ignoring since interface is configured with static IP (192.168.100.1 )
Mar 1 04:01:41 gw check_reload_status: rc.newwanip starting ue0
Mar 1 04:01:43 gw php: : rc.newwanip: Informational is starting ue0.
Mar 1 04:01:43 gw php: : rc.newwanip: on (IP address: 192.168.100.1) (interface: lan) (real interface: ue0).
Mar 1 04:01:44 gw check_reload_status: Reloading filter
Mar 1 04:01:49 gw php: : Resyncing OpenVPN instances for interface LAN.
Mar 1 04:01:49 gw php: : Creating rrd update script
Mar 1 04:01:51 gw php: : pfSense package system has detected an ip change 0.0.0.0 -> 192.168.100.1 ... Restarting packages.
Mar 1 04:01:51 gw check_reload_status: Starting packages
Mar 1 04:01:54 gw php: : Restarting/Starting all packages.
The second problem seems to be that the quick down/up apparently plays havoc with the dhclient (what we're seeing in the earlier posts). Related, but different issues from what I see.
I don't consider this fixed; it's just a band-aid. What other info can I provide to help here?
-
I can confirm that disabling OpenVPN doesn't solve the problem. :-[
-
Yeah, I left it out of my clean re-install and no change.
Since I moved the USB interface to a statically addressed interface, I'm still seeing it flap occasionally, but it doesn't get hung up with DHCP like it did on the WAN side. I think that eliminates my cable modem as the cause of the flapping problem (it was connected directly to the cable modem; now ue0 connects to a Cisco switch).
ue(4) driver problem? USB issue?
When it flaps, it happens quickly enough that I don't even lose any pings when running a continuous test from another host.
-
Sure looks like this would be the bug in question, but seems like the fix may not be working.
http://redmine.pfsense.org/issues/2792
Edit: Multiple restarts at once?
Mar 2 20:02:54 gw check_reload_status: Linkup starting ue0
Mar 2 20:02:54 gw kernel: ue0: link state changed to DOWN
Mar 2 20:02:54 gw kernel: ue0: link state changed to UP
Mar 2 20:02:54 gw check_reload_status: Linkup starting ue0
Mar 2 20:02:54 gw kernel: ue0: link state changed to DOWN
Mar 2 20:02:54 gw check_reload_status: Linkup starting ue0
Mar 2 20:02:54 gw kernel: ue0: link state changed to UP
Mar 2 20:02:54 gw kernel: ue0: link state changed to DOWN
Mar 2 20:02:54 gw kernel: ue0: link state changed to UP
Mar 2 20:02:54 gw kernel: ue0: link state changed to DOWN
Mar 2 20:02:54 gw check_reload_status: Linkup starting ue0
Mar 2 20:02:54 gw kernel: ue0: link state changed to UP
Mar 2 20:02:55 gw check_reload_status: Linkup starting ue0
Mar 2 20:02:55 gw check_reload_status: Linkup starting ue0
Mar 2 20:02:55 gw check_reload_status: Linkup starting ue0
Mar 2 20:02:55 gw check_reload_status: Linkup starting ue0
Mar 2 20:02:57 gw php: : Hotplug event detected for LAN(lan) but ignoring since interface is configured with static IP (192.168.100.1 )
Mar 2 20:02:57 gw php: : Hotplug event detected for LAN(lan) but ignoring since interface is configured with static IP (192.168.100.1 )
Mar 2 20:02:57 gw php: : Hotplug event detected for LAN(lan) but ignoring since interface is configured with static IP (192.168.100.1 )
Mar 2 20:02:57 gw check_reload_status: rc.newwanip starting ue0
Mar 2 20:02:58 gw php: : Hotplug event detected for LAN(lan) but ignoring since interface is configured with static IP (192.168.100.1 )
Mar 2 20:02:58 gw php: : Hotplug event detected for LAN(lan) but ignoring since interface is configured with static IP (192.168.100.1 )
Mar 2 20:02:58 gw php: : Hotplug event detected for LAN(lan) but ignoring since interface is configured with static IP (192.168.100.1 )
Mar 2 20:02:58 gw check_reload_status: rc.newwanip starting ue0
Mar 2 20:02:58 gw php: : Hotplug event detected for LAN(lan) but ignoring since interface is configured with static IP (192.168.100.1 )
Mar 2 20:02:58 gw php: : Hotplug event detected for LAN(lan) but ignoring since interface is configured with static IP (192.168.100.1 )
Mar 2 20:02:58 gw check_reload_status: rc.newwanip starting ue0
Mar 2 20:02:58 gw check_reload_status: rc.newwanip starting ue0
Mar 2 20:02:59 gw php: : rc.newwanip: Informational is starting ue0.
Mar 2 20:02:59 gw php: : rc.newwanip: on (IP address: 192.168.100.1) (interface: lan) (real interface: ue0).
Mar 2 20:03:00 gw php: : rc.newwanip: Informational is starting ue0.
Mar 2 20:03:00 gw php: : rc.newwanip: on (IP address: 192.168.100.1) (interface: lan) (real interface: ue0).
Mar 2 20:03:00 gw php: : rc.newwanip: Informational is starting ue0.
Mar 2 20:03:00 gw php: : rc.newwanip: on (IP address: 192.168.100.1) (interface: lan) (real interface: ue0).
Mar 2 20:03:00 gw php: : rc.newwanip: Informational is starting ue0.
Mar 2 20:03:00 gw php: : rc.newwanip: on (IP address: 192.168.100.1) (interface: lan) (real interface: ue0).
Mar 2 20:03:01 gw check_reload_status: Reloading filter
Mar 2 20:03:01 gw dhcpleases: kqueue error: unkown
Mar 2 20:03:06 gw php: : Resyncing OpenVPN instances for interface LAN.
Mar 2 20:03:06 gw php: : Resyncing OpenVPN instances for interface LAN.
Mar 2 20:03:07 gw php: : Resyncing OpenVPN instances for interface LAN.
Mar 2 20:03:07 gw php: : Resyncing OpenVPN instances for interface LAN.
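To make bursts like this easier to spot in a long log, something along these lines could tally the DOWN transitions per timestamp. It is only a sketch: `count_flaps` is a made-up helper name, and it assumes classic syslog lines (month, day, time in the first three fields) arriving on stdin.

```shell
#!/bin/sh
# Sketch: count "link state changed to DOWN" events per timestamp for one
# interface. count_flaps is a hypothetical helper; it reads syslog-style
# lines on stdin and prints "<count> <month> <day> <time>" per timestamp.

count_flaps() {
    iface="$1"
    awk -v pat="kernel: ${iface}: link state changed to DOWN" '
        index($0, pat) {
            # Fields 1-3 are month, day and HH:MM:SS in classic syslog format.
            key = $1 " " $2 " " $3
            if (!(key in count)) order[++n] = key
            count[key]++
        }
        END {
            # Print in the order the timestamps first appeared.
            for (i = 1; i <= n; i++) print count[order[i]], order[i]
        }
    '
}
```

On pfSense 2.x the system log is a circular log, so the input would presumably come from something like `clog /var/log/system.log | count_flaps ue0` (the `clog` invocation is an assumption about the install, not something shown in this thread). Any timestamp with a count above 1 means several link drops within the same second.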
-
I gave up and went back to 2.0.1; hopefully I'll be problem-free.
-
I tried that, didn't seem to help, but curious what you find.
-
Since the "bad" interface now connects into my switch, I can look at the other end of the link. No sign of it flapping/bouncing on the switchport side. No errors of any sort, and this is through a couple of flapping events on the pfSense side:
GigabitEthernet0/8 is up, line protocol is up (connected)
  Hardware is Gigabit Ethernet, address is 000f.f752.6308 (bia 000f.f752.6308)
  MTU 1500 bytes, BW 100000 Kbit, DLY 100 usec,
     reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation ARPA, loopback not set
  Keepalive set (10 sec)
  Full-duplex, 100Mb/s
  input flow-control is off, output flow-control is off
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input never, output 00:00:01, output hang never
  Last clearing of "show interface" counters 00:33:36
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: fifo
  Output queue: 0/40 (size/max)
  5 minute input rate 637000 bits/sec, 37 packets/sec
  5 minute output rate 23000 bits/sec, 24 packets/sec
     172453 packets input, 245147952 bytes, 0 no buffer
     Received 30 broadcasts (0 multicast)
     0 runts, 0 giants, 0 throttles
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
     0 watchdog, 0 multicast, 0 pause input
     0 input packets with dribble condition detected
     110118 packets output, 10263992 bytes, 0 underruns
     0 output errors, 0 collisions, 0 interface resets
     0 babbles, 0 late collision, 0 deferred
     0 lost carrier, 0 no carrier, 0 PAUSE output
     0 output buffer failures, 0 output buffers swapped out
-
2.0.1 was rock solid for me before; even the early betas of 2.1 didn't show the problem (but I was updating on a weekly basis, so no beta install had a run time of more than 7 days). If this doesn't work, the issue is probably a hardware problem on my end. Nothing has changed hardware-wise, so fingers crossed! :P
-
Interested to hear how it goes. At some point along the way I tried going back to 2.0.1 as well, but it didn't seem to help. I've swapped out 100% of the hardware by now, at one time or another. I have two of the USB Ethernet adaptors. I'm trying the "other" one in a different host (an Ubuntu box), looking for signs that it's flaky, and if it seems stable there I'll put it back on the pfSense box again. It's connected to the exact same switch that the one on pfSense connects to now, so that should eliminate switch problems.
I'll provide one of these USB adaptors to one of the developers if it'll help diagnose things!
-
To add a little more information to this issue: since the moment I went back to the 2.0 series, with an identical setup (and using the 2.1 config on the 2.0 installation), I have not suffered from any network drops or from losing the WAN IP address.
And for the first period I was running 2.1 betas there was no problem either. It first occurred somewhere in the second half of February, around the time I started this topic.
So unless there was a kernel update in that period, I think that (at least in my case) it is not a driver issue, but something that changed in the behaviour of the DHCP client or its handling of signals. I guess it is a by-product of another fix, as everything was smooth before.
-
My situation is exactly the same: the 2.1 betas began having issues in mid-February, and 2.0.1 and 2.0.2 have been problem-free for me. Currently on 2.0.2.
-
Slight update on this. After a couple months away from using pfSense I decided to give it another try. Same hardware as before - this time the USB Interface is my "inside" NIC.
Same behavior - ue0 starts flapping for no apparent reason, and it never recovers. I have to reboot to bring it back. I even locked the speed/duplex down on the Cisco switch that the inside USB NIC plugs into.
Back to the Airport Extreme for me, sadly. I love the extra capabilities that pfSense brings.
-
Good info. I was thinking of trying it again, but I guess not. I tried 2.0.3 when it came out and had a similar issue. No issue with 2.0.2 though; I've been running it (2.0.2) fine for the last few months.
-
I still suspect that it's something to do with the USB Ethernet driver, since the flapping problem follows it. I still think there are two problems, the second being that it doesn't recover correctly when the link returns. The first time it lost its (static) LAN address, re-plugging the USB Ethernet adaptor restored the IP but not the other services like routing or the GUI. (Had to reboot.)
-
I'm not using a USB adaptor, but have almost identical symptoms. Maybe it's a FreeBSD driver issue rather than an issue with pfSense? If I had a spare NIC, I'd give it a try, but no extras lying around right now…
-
I am running the 2.1RC0 Jun12 build and am also seeing ue0 flapping, but the connection to the internet does not seem to be affected.
Running inside VirtualBox (latest, 4.2.16) with the LAN interface bridged to the host's LAN, and the WAN using TRENDnet's USB adapter. Just updated to the latest snapshot from Friday and the problem is still present.
Again, strangely enough, PPPoE running on the same physical interface is not affected and keeps running despite this flapping.
-
I was experiencing dhclient deaths when pfSense booted up and the modem wasn't online: http://forum.pfsense.org/index.php/topic,64296.msg348994.html#msg348994
This morning, running the "Sun Jul 21 21:47:35 EDT 2013" build, I tested this situation again (involuntarily again). This time, however, everything went fine. While I still got the message that dhclient died upon bootup, pfSense happily acquired a DHCP registration when the modem became available.
So…nice that it appears to work for me now - but can someone else check if this issue is really solved?
-
This morning, I upgraded to the Jul 26 15:34:51 EDT 2013 version. Later on, I had to cut power to the modem and pfSense, and this time, pfSense again failed to get a DHCP address via WAN. Same situation as described earlier: modem and pfSense boot up, pfSense starts dhclient (apparently before the modem has established the physical link), dhclient dies and doesn't come up again (this time, I waited for a few minutes only, not hours).
Either the issue has reappeared in one of the later builds or it happens somewhat randomly…
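If the race really is dhclient starting before the modem has brought the link up, one way to test that theory by hand would be to wait for carrier before launching it. A rough sketch, with made-up helper names (`link_is_active`, `wait_for_link`) and an `IFCONFIG` override that exists only so the logic can be exercised without real hardware; it relies on FreeBSD's `ifconfig` printing `status: active` once the link is up:

```shell
#!/bin/sh
# Sketch: wait until an interface reports carrier before starting dhclient.
# link_is_active and wait_for_link are hypothetical helper names.
# IFCONFIG can be overridden (e.g. for testing); it defaults to ifconfig.

link_is_active() {
    # FreeBSD ifconfig prints "status: active" once the link is up.
    ${IFCONFIG:-ifconfig} "$1" 2>/dev/null | grep -q 'status: active'
}

wait_for_link() {
    iface="$1"
    tries="${2:-30}"      # number of attempts before giving up
    delay="${3:-2}"       # seconds to sleep between attempts
    i=0
    while [ "$i" -lt "$tries" ]; do
        if link_is_active "$iface"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

Usage would be something like `wait_for_link ue0 && /sbin/dhclient -c /var/etc/dhclient_wan.conf ue0`, substituting the actual WAN interface. Whether this avoids the dhclient death described here is untested; it is meant as a diagnostic aid, not a fix.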
-
Please provide the relevant sections of the system log and DHCP log. (It seems dhclient logging has recently moved from the system log to the DHCP log, which probably makes it a bit harder to determine the ordering of link-up/link-down, apinger, and dhclient actions.)
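One possible way to recover that ordering is to dump both logs to text and interleave them by timestamp. A minimal sketch, assuming classic syslog timestamps at the start of each line; `merge_logs` is a made-up helper name and the filter keywords are just a guess at what is relevant here:

```shell
#!/bin/sh
# Sketch: interleave syslog-style text dumps in time order and keep only
# lines about link state, dhclient, apinger or rc.newwanip.
# merge_logs is a hypothetical helper; its arguments are plain-text files.

merge_logs() {
    cat "$@" |
        grep -E 'link state|dhclient|apinger|rc\.newwanip' |
        # Sort by month name, day of month, then HH:MM:SS. -s keeps the
        # original order of lines that share the same timestamp.
        LC_ALL=C sort -s -b -k1,1M -k2,2n -k3,3
}
```

The text dumps would presumably come from something like `clog /var/log/system.log > system.txt` and `clog /var/log/dhcpd.log > dhcp.txt` (both the `clog` usage and the log paths are assumptions about a 2.1 install), followed by `merge_logs system.txt dhcp.txt`. Syslog timestamps carry no year, so the ordering breaks across a New Year boundary, and events within the same second cannot be ordered reliably across the two files.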