WAN interface keeps dropping after upgrade to 1.2.1
-
So everything seems normal.
Except that the lease extension requests are apparently ignored.
Are the lease extension requests getting out on the wire? (May be hard to test without specialised equipment.)
If so, why aren't they provoking a response? (Your ISP will probably have to help with that one.)
Are there any link status change reports on rl1 at these times? (Does pulling out the rl1 cable generate a link status change on rl1 report?)
-
So everything seems normal.
Except that the lease extension requests are apparently ignored.
Yes.
Are the lease extension requests getting out on the wire? (May be hard to test without specialised equipment.)
I don't know. I will try to get an answer.
If so, why aren't they provoking a response? (Your ISP will probably have to help with that one.)
We own the modem (Zyxel P660R-D1 in half bridge mode), so the ISP won't help. I will send the question to Zyxel.
Are there any link status change reports on rl1 at these times?
I don't know. I'm waiting for the next connection drop.
(Does pulling out the rl1 cable generate a link status change on rl1 report?)
Yes.
-
Are there any link status change reports on rl1 at these times?
I don't know. I'm waiting for the next connection drop.
Ok. I have just had another connection drop. This is interesting:
Here are the last four DHCP leases. There is something "bizarre" about the last one: the renew time seems to be wrong. Judging by the previous leases, it should come around 52.5 minutes (the rebinding time) after the previous one, but instead it is exactly one hour later (the lease time):
# cat /var/db/dhclient.leases.rl1
[...]
lease {
  interface "rl1";
  fixed-address rl1_EXTERNAL_IP;
  option subnet-mask 255.255.255.0;
  option routers rl1_EXTERNAL_IP;
  option domain-name-servers 80.10.246.130,81.253.149.10;
  option host-name "dhcppc0";
  option dhcp-lease-time 3600;
  option dhcp-message-type 5;
  option dhcp-server-identifier 192.168.1.1;
  option dhcp-renewal-time 1800;
  option dhcp-rebinding-time 3150;
  renew 3 2009/1/21 11:58:06;
  rebind 3 2009/1/21 12:20:36;
  expire 3 2009/1/21 12:28:06;
}
lease {
  interface "rl1";
  fixed-address rl1_EXTERNAL_IP;
  option subnet-mask 255.255.255.0;
  option routers rl1_EXTERNAL_IP;
  option domain-name-servers 80.10.246.130,81.253.149.10;
  option host-name "dhcppc0";
  option dhcp-lease-time 3600;
  option dhcp-message-type 5;
  option dhcp-server-identifier 192.168.1.1;
  option dhcp-renewal-time 1800;
  option dhcp-rebinding-time 3150;
  renew 3 2009/1/21 12:52:31;
  rebind 3 2009/1/21 13:15:01;
  expire 3 2009/1/21 13:22:31;
}
lease {
  interface "rl1";
  fixed-address rl1_EXTERNAL_IP;
  option subnet-mask 255.255.255.0;
  option routers rl1_EXTERNAL_IP;
  option domain-name-servers 80.10.246.130,81.253.149.10;
  option host-name "dhcppc0";
  option dhcp-lease-time 3600;
  option dhcp-message-type 5;
  option dhcp-server-identifier 192.168.1.1;
  option dhcp-renewal-time 1800;
  option dhcp-rebinding-time 3150;
  renew 3 2009/1/21 13:47:14;
  rebind 3 2009/1/21 14:09:44;
  expire 3 2009/1/21 14:17:14;
}
lease {
  interface "rl1";
  fixed-address rl1_EXTERNAL_IP;
  option subnet-mask 255.255.255.0;
  option routers rl1_EXTERNAL_IP;
  option domain-name-servers 80.10.246.130,81.253.149.10;
  option host-name "dhcppc0";
  option dhcp-lease-time 3600;
  option dhcp-message-type 5;
  option dhcp-server-identifier 192.168.1.1;
  option dhcp-renewal-time 1800;
  option dhcp-rebinding-time 3150;
  renew 3 2009/1/21 14:47:23;
  rebind 3 2009/1/21 15:09:53;
  expire 3 2009/1/21 15:17:23;
}
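The timing pattern in these leases can be checked with a few lines of Python. This is only a sketch: it hard-codes the four `renew` stamps and the option values from the leases shown here, and it assumes each lease was obtained `dhcp-renewal-time` (1800 s) before its `renew` stamp. The names `RENEW_STAMPS`, `obtained` and `phase` are made up for the example.

```python
from datetime import datetime, timedelta

# Option values copied from the leases (seconds).
RENEWAL, REBINDING, LEASE = 1800, 3150, 3600

# "renew" stamps of the four leases, oldest first.
RENEW_STAMPS = [
    "2009/1/21 11:58:06",
    "2009/1/21 12:52:31",
    "2009/1/21 13:47:14",
    "2009/1/21 14:47:23",
]


def obtained(stamp):
    """Assumed time the lease was granted: renew stamp minus the renewal time."""
    renew = datetime.strptime(stamp, "%Y/%m/%d %H:%M:%S")
    return renew - timedelta(seconds=RENEWAL)


def phase(gap):
    """Which part of the previous lease's lifetime the new lease arrived in."""
    if gap >= LEASE:
        return "after expiry"
    if gap >= REBINDING:
        return "rebinding"
    if gap >= RENEWAL:
        return "renewing"
    return "early"


times = [obtained(s) for s in RENEW_STAMPS]
gaps = [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]
phases = [phase(g) for g in gaps]

for gap, label in zip(gaps, phases):
    print(f"{gap} s ({gap / 60:.1f} min) after the previous lease: {label}")
```

On these numbers, the second and third leases arrived roughly 54.5 minutes after the one before, shortly after the rebinding time, while the last one arrived only after the full hour had run out.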
And this is what the system log shows (arplookup problem at 15:17:15, the end of the last DHCP lease):
$ cat /var/log/system.log
[...]
Jan 21 14:17:14 pfsense dhclient[308]: DHCPREQUEST on rl1 to 255.255.255.255 port 67
Jan 21 14:17:14 pfsense dhclient[308]: DHCPACK from 192.168.1.1
Jan 21 14:17:14 pfsense dhclient[308]: bound to rl1_EXTERNAL_IP -- renewal in 1800 seconds.
Jan 21 14:47:14 pfsense dhclient[308]: DHCPREQUEST on rl1 to 192.168.1.1 port 67
Jan 21 14:47:14 pfsense dhclient[308]: SENDING DIRECT
Jan 21 14:47:15 pfsense dhclient[308]: DHCPREQUEST on rl1 to 192.168.1.1 port 67
Jan 21 14:47:15 pfsense dhclient[308]: SENDING DIRECT
[...]
Jan 21 15:09:28 pfsense dhclient[308]: DHCPREQUEST on rl1 to 192.168.1.1 port 67
Jan 21 15:09:28 pfsense dhclient[308]: SENDING DIRECT
Jan 21 15:14:40 pfsense dnsmasq[834]: reading /var/dhcpd/var/db/dhcpd.leases
Jan 21 15:17:15 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
Jan 21 15:17:15 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
Jan 21 15:17:15 pfsense dhclient[308]: DHCPDISCOVER on rl1 to 255.255.255.255 port 67 interval 2
Jan 21 15:17:17 pfsense dhclient[308]: DHCPDISCOVER on rl1 to 255.255.255.255 port 67 interval 4
Jan 21 15:17:19 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
Jan 21 15:17:19 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
Jan 21 15:17:20 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
Jan 21 15:17:20 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
Jan 21 15:17:21 pfsense dhclient[308]: DHCPDISCOVER on rl1 to 255.255.255.255 port 67 interval 5
Jan 21 15:17:21 pfsense dhclient[308]: DHCPOFFER from 192.168.1.1
Jan 21 15:17:21 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
Jan 21 15:17:21 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
Jan 21 15:17:21 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
Jan 21 15:17:21 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
Jan 21 15:17:23 pfsense dhclient[308]: DHCPREQUEST on rl1 to 255.255.255.255 port 67
Jan 21 15:17:23 pfsense dhclient[308]: DHCPACK from 192.168.1.1
Jan 21 15:17:23 pfsense dhclient[308]: bound to rl1_EXTERNAL_IP -- renewal in 1800 seconds.
Jan 21 15:17:24 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
Jan 21 15:17:24 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
Jan 21 15:17:24 pfsense check_reload_status: rc.newwanip starting
Jan 21 15:17:25 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
Jan 21 15:17:25 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
Jan 21 15:17:25 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
Jan 21 15:17:25 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
Jan 21 15:17:25 pfsense php: : Informational: rc.newwanip is starting .
Jan 21 15:17:25 pfsense php: : rc.newwanip working with (IP address: HIDDEN_IP) (interface: wan) (interface real: rl0).
Jan 21 15:17:27 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
Jan 21 15:17:27 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
Jan 21 15:17:28 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
Jan 21 15:17:28 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
Jan 21 15:17:29 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
Jan 21 15:17:29 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
Jan 21 15:17:31 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
Jan 21 15:17:31 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
Jan 21 15:17:33 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
Jan 21 15:17:33 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
[...]
As usual, the connection on rl1 has been dropped and a reboot is needed to bring everything back to normal!
So I think that definitely, the problem has something to do with DHCP.
There's another interesting fact: once they have begun, the arplookup/arpresolve problems continue even after rl1 gets a new lease from the DHCP server.
…And today this happened on the 30th address renewal since the last reboot.
Any ideas?
-
The arplookup/arpresolve messages suggest to me that you have lost communication with rl1_EXTERNAL_IP, which should be responding to the ARPs. Once that happens you won't get a DHCP response until (possibly) the DHCP broadcast, which does not need a mapping from IP address to MAC address. It's strange, though, that you then apparently get a DHCP response but no ARP response.
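The unicast-versus-broadcast difference can be seen by tallying the dhclient lines from the system log posted earlier. The following is a rough sketch, not a general log parser: the sample lines are copied from that log, a reply is simply attributed to the destination of the most recent request, and the names `SAMPLE` and `tally` are invented for the illustration.

```python
import re

# dhclient lines copied from the posted system log (kernel lines left out).
SAMPLE = """\
Jan 21 14:47:14 pfsense dhclient[308]: DHCPREQUEST on rl1 to 192.168.1.1 port 67
Jan 21 14:47:15 pfsense dhclient[308]: DHCPREQUEST on rl1 to 192.168.1.1 port 67
Jan 21 15:09:28 pfsense dhclient[308]: DHCPREQUEST on rl1 to 192.168.1.1 port 67
Jan 21 15:17:15 pfsense dhclient[308]: DHCPDISCOVER on rl1 to 255.255.255.255 port 67 interval 2
Jan 21 15:17:17 pfsense dhclient[308]: DHCPDISCOVER on rl1 to 255.255.255.255 port 67 interval 4
Jan 21 15:17:21 pfsense dhclient[308]: DHCPDISCOVER on rl1 to 255.255.255.255 port 67 interval 5
Jan 21 15:17:21 pfsense dhclient[308]: DHCPOFFER from 192.168.1.1
Jan 21 15:17:23 pfsense dhclient[308]: DHCPREQUEST on rl1 to 255.255.255.255 port 67
Jan 21 15:17:23 pfsense dhclient[308]: DHCPACK from 192.168.1.1
""".splitlines()

SENT = re.compile(r"DHCP(?:REQUEST|DISCOVER) on \S+ to (\S+) port 67")
REPLY = re.compile(r"DHCP(?:ACK|OFFER) from \S+")


def tally(lines):
    """Return {destination: [messages sent, replies seen]}."""
    counts = {}
    last_dest = None
    for line in lines:
        sent = SENT.search(line)
        if sent:
            last_dest = sent.group(1)
            counts.setdefault(last_dest, [0, 0])[0] += 1
        elif REPLY.search(line) and last_dest is not None:
            # Simplification: credit the reply to the latest destination used.
            counts[last_dest][1] += 1
    return counts


result = tally(SAMPLE)
for dest, (sent_count, replies) in result.items():
    print(f"{dest}: {sent_count} sent, {replies} answered")
```

In this excerpt the requests sent directly to 192.168.1.1 get no reply at all, while the broadcasts to 255.255.255.255 are answered.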
When this happens can you ssh into the pfSense box? Can you login on the console? (Perhaps there is a kernel memory leak and when DHCP fails in this way it's because there is a little bit of free memory but not enough to always get the DHCP/ARP responses.) If you can login on the console or ssh in there is at least a chance of testing this theory. If you can't login then some cunning will be needed.
If you can login while this is happening, type command
netstat -i -I rl1
a few times and watch what happens to the rl1 error counters.
-
So everything seems normal.
Except that the lease extension requests are apparently ignored.
Are the lease extension requests getting out on the wire? (May be hard to test without specialised equipment.)
The lease extension requests are not getting out: I have connected the pfsense box, the Zyxel modem and a laptop to a small hub. Using Wireshark on the laptop, I have done some monitoring and I don't see the DHCPREQUEST packets from pfsense to the zyxel modem.
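For anyone repeating this capture without Wireshark's dissector, a DHCPREQUEST can be recognised from the raw UDP payload (ports 67/68). The sketch below relies on the standard BOOTP/DHCP layout (a 236-byte fixed header, the 4-byte magic cookie, then options, with option 53 carrying the message type). The function names and the hand-built sample packet are illustrative only and are not taken from the capture.

```python
MAGIC_COOKIE = bytes([99, 130, 83, 99])  # marks the start of the DHCP options
MESSAGE_TYPES = {
    1: "DHCPDISCOVER",
    2: "DHCPOFFER",
    3: "DHCPREQUEST",
    5: "DHCPACK",
    6: "DHCPNAK",
}


def dhcp_message_type(payload):
    """Return the DHCP message type name of a UDP payload, or None."""
    if len(payload) < 240 or payload[236:240] != MAGIC_COOKIE:
        return None
    i = 240
    while i < len(payload):
        code = payload[i]
        if code == 255:      # end option
            break
        if code == 0:        # pad option has no length byte
            i += 1
            continue
        if i + 1 >= len(payload):
            break            # truncated option
        length = payload[i + 1]
        value = payload[i + 2:i + 2 + length]
        if code == 53 and len(value) == 1:
            return MESSAGE_TYPES.get(value[0], f"unknown({value[0]})")
        i += 2 + length
    return None


def build_sample(msg_type):
    """Hypothetical minimal payload: zeroed header, cookie, option 53, end."""
    header = bytes([1, 1, 6, 0]) + bytes(232)  # 236 bytes in total
    return header + MAGIC_COOKIE + bytes([53, 1, msg_type, 255])


print(dhcp_message_type(build_sample(3)))
```

Message type 5 is the DHCPACK that appears as `option dhcp-message-type 5` in the lease file.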
-
I don't see the DHCPREQUEST packets from pfsense to the zyxel modem.
That would explain why you don't see responses.
This sounds like a hardware problem or a driver bug. Do you get the same behaviour on other ports?
-
Do you have a free RTL8100C port on your machine?
I'd suggest using that one (at least temporarily) to see if it's a hardware issue. Sounds pretty much like it.
Using another CAT cable might help as well.
-
I don't see the DHCPREQUEST packets from pfsense to the zyxel modem.
This sounds like a hardware problem or a driver bug. Do you get the same behaviour on other ports?
I think the problem could come from the fact that our two WAN "modems" (on rl0 and rl1) have the same "internal" IP address (192.168.1.1).
I cannot change it on our primary connection (the modem, a WiMAX interface with PPPoE, is owned by the ISP and they won't do it), and for the moment I have not been able to change it on the Zyxel P660R either (half-bridge PPPoA). After changing the internal IP of the P660R, it reverts to the default 192.168.1.1 when I turn half-bridge mode on, and if I turn on half-bridge first, it does the address change when I power cycle the modem!
What puzzles me is that everything has worked fine with the 1.2 version of pfsense for almost a year :-?
Could this be the origin of the problem?
-
Yesterday my rl0 interface got into a state where it either wasn't sending or wasn't receiving. My WAN interface to a Zyxel ADSL modem/router uses rl0.
This is suspiciously like your configuration.
When it was in this state the activity light would flash regularly, seemingly in time with the ping, but there was no ping response reported. I could login on the console and I could ssh in over the LAN interface (vr0). I was unable to reactivate rl0 by an ifconfig down/up sequence, nor by reassigning interfaces. There were no errors reported on rl0 EXCEPT over 2000 collisions, which is an unexpected event on a full duplex link to a modem. I couldn't work out any other way of recovering than rebooting. The system has now been up over 21 hours with no errors and no collisions reported on rl0.
Perhaps there is an rl driver bug or hardware error in dealing with an "unusual" situation.
Are you sure your 10/100 interfaces are RTL8100C? I think the 8100C should be controlled by the re driver. I suspect that you have a number of 8139 ports.
-
Yesterday my rl0 interface got into a state where it either wasn't sending or wasn't receiving. My WAN interface to a Zyxel ADSL modem/router uses rl0.
This is suspiciously like your configuration.
Which model? Are you also using half-bridge?
When it was in this state the activity light would flash regularly, seemingly in time with the ping, but there was no ping response reported. I could login on the console and I could ssh in over the LAN interface (vr0). I was unable to reactivate rl0 by an ifconfig down/up sequence, nor by reassigning interfaces. There were no errors reported on rl0 EXCEPT over 2000 collisions, which is an unexpected event on a full duplex link to a modem. I couldn't work out any other way of recovering than rebooting. The system has now been up over 21 hours with no errors and no collisions reported on rl0.
I left mine on over the weekend. As expected, the problem appeared again about a day after the previous reboot. After several hours of:
Jan 25 05:54:18 pfsense kernel: arplookup EXTERNAL_IP_ADDRESS failed: host is not on local network
Jan 25 05:54:18 pfsense kernel: arpresolve: can't allocate route for EXTERNAL_IP_ADDRESS
The connection got reestablished:
Jan 25 05:53:56 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
Jan 25 05:53:56 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
Jan 25 05:53:56 pfsense dhclient[52491]: DHCPDISCOVER on rl1 to 255.255.255.255 port 67 interval 1
Jan 25 05:53:56 pfsense dhclient[52491]: DHCPDISCOVER on rl1 to 255.255.255.255 port 67 interval 2
Jan 25 05:53:56 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
Jan 25 05:53:56 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
Jan 25 05:53:58 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
Jan 25 05:53:58 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
Jan 25 05:53:58 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
Jan 25 05:53:58 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
Jan 25 05:53:58 pfsense dhclient[52491]: DHCPDISCOVER on rl1 to 255.255.255.255 port 67 interval 4
Jan 25 05:54:00 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
Jan 25 05:54:00 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
Jan 25 05:54:00 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
Jan 25 05:54:00 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
Jan 25 05:54:00 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
Jan 25 05:54:00 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
Jan 25 05:54:00 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
Jan 25 05:54:00 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
Jan 25 05:54:00 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
Jan 25 05:54:00 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
Jan 25 05:54:01 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
Jan 25 05:54:01 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
Jan 25 05:54:02 pfsense dhclient[52491]: DHCPDISCOVER on rl1 to 255.255.255.255 port 67 interval 5
Jan 25 05:54:02 pfsense dhclient[52491]: DHCPOFFER from 192.168.1.1
Jan 25 05:54:03 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
Jan 25 05:54:03 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
Jan 25 05:54:04 pfsense dhclient[52491]: DHCPREQUEST on rl1 to 255.255.255.255 port 67
Jan 25 05:54:04 pfsense dhclient[52491]: DHCPACK from 192.168.1.1
Jan 25 05:54:04 pfsense slbd[5935]: ICMP poll failed for rl1_EXTERNAL_IP, marking service DOWN
Jan 25 05:54:04 pfsense slbd[5935]: Service WAN2FailsToWAN1 changed status, reloading filter policy
Jan 25 05:54:04 pfsense slbd[5935]: ICMP poll failed for rl1_EXTERNAL_IP, marking service DOWN
Jan 25 05:54:04 pfsense slbd[5935]: Service LoadBalance changed status, reloading filter policy
Jan 25 05:54:04 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
Jan 25 05:54:04 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
Jan 25 05:54:04 pfsense dhclient[52491]: bound to rl1_EXTERNAL_IP -- renewal in 1800 seconds.
Jan 25 05:54:06 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
Jan 25 05:54:06 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
Jan 25 05:54:07 pfsense check_reload_status: rc.newwanip starting
Jan 25 05:54:08 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
Jan 25 05:54:08 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
Jan 25 05:54:09 pfsense php: : Informational: rc.newwanip is starting .
Jan 25 05:54:09 pfsense php: : rc.newwanip working with (IP address: rl0_EXTERNAL_IP) (interface: wan) (interface real: rl0).
Jan 25 05:54:09 pfsense slbd[5935]: ICMP poll succeeded for rl1_EXTERNAL_IP, marking service UP
Jan 25 05:54:09 pfsense slbd[5935]: ICMP poll succeeded for rl1_EXTERNAL_IP, marking service UP
Jan 25 05:54:09 pfsense slbd[5935]: Service WAN2FailsToWAN1 changed status, reloading filter policy
Jan 25 05:54:09 pfsense slbd[5935]: Service LoadBalance changed status, reloading filter policy
Jan 25 05:54:10 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
Jan 25 05:54:10 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
Jan 25 05:54:10 pfsense slbd[5935]: ICMP poll failed for rl1_EXTERNAL_IP, marking service DOWN
Jan 25 05:54:11 pfsense slbd[5935]: Service WAN1FailsToWAN2 changed status, reloading filter policy
Jan 25 05:54:12 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
Jan 25 05:54:12 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
Jan 25 05:54:14 pfsense php: : Informational: DHClient spawned /etc/rc.newwanip and the new ip is wan - rl0_EXTERNAL_IP.
Jan 25 05:54:14 pfsense php: : Creating rrd update script
Jan 25 05:54:14 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
Jan 25 05:54:14 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
Jan 25 05:54:16 pfsense slbd[5935]: ICMP poll succeeded for rl1_EXTERNAL_IP, marking service UP
Jan 25 05:54:16 pfsense slbd[5935]: Service WAN1FailsToWAN2 changed status, reloading filter policy
Jan 25 05:54:16 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
Jan 25 05:54:16 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
Jan 25 05:54:17 pfsense php: : Resyncing configuration for all packages.
Jan 25 05:54:18 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
Jan 25 05:54:18 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
Jan 25 05:54:18 pfsense php: : Resyncing configuration for all packages.
Jan 25 05:54:18 pfsense php: : pfSense package system has detected an ip change rl1_EXTERNAL_IP -> rl0_EXTERNAL_IP ... Restarting packages.
Jan 25 05:54:21 pfsense php: : Configuring slbd
Jan 25 05:54:21 pfsense check_reload_status: reloading filter
Jan 25 05:54:21 pfsense slbd[50991]: ICMP poll succeeded for rl0_GATEWAY_IP, marking service UP
Jan 25 05:54:21 pfsense slbd[50991]: ICMP poll succeeded for rl1_EXTERNAL_IP, marking service UP
Jan 25 05:54:21 pfsense slbd[50991]: ICMP poll succeeded for rl1_EXTERNAL_IP, marking service UP
Jan 25 05:54:22 pfsense slbd[50991]: ICMP poll succeeded for rl0_GATEWAY_IP, marking service UP
Jan 25 05:54:22 pfsense slbd[50991]: ICMP poll succeeded for rl0_GATEWAY_IP, marking service UP
Jan 25 05:54:22 pfsense slbd[50991]: ICMP poll succeeded for rl1_EXTERNAL_IP, marking service UP
Jan 25 05:54:26 pfsense check_reload_status: updating dyndns
And then everything began once again…
Are you sure your 10/100 interfaces are RTL8100C? I think the 8100C should be controlled by the re driver. I suspect that you have a number of 8139 ports.
No. That's what page V, "Chapter 4 Software Installation - LAN Utility & Driver (RTL8100C & RTL8110S)", of the manual says <http://download.fabiatech.com.tw/manual/m5620.pdf>. pfSense apparently detects them as RTL8139:
pfsense kernel: rl1: <RealTek 8139 10/100BaseTX> port 0xe800-0xe8ff mem 0xdffffe00-0xdffffeff irq 11 at device 13.0 on pci0
Why do you think that the RTL8100C should be controlled by the re driver?
I don't see any mention of the RTL8100C on the FreeBSD hardware pages <http://www.freebsd.org/releases/6.3R/hardware-i386.html> or <http://www.freebsd.org/relnotes/CURRENT/hardware/i386/support.html#ethernet>, but the other RTL8100C port that I have in use works perfectly (rl0, also detected as an 8139).
-
I'm using a Zyxel 660H-61 as a router, not half bridge.
My pfSense has a Jetway mini-ITX board and has a daughter board with a RTL8139 NIC on it. The same model daughter board is now promoted as having a RTL8100C NIC on it. I suspect the 8100C is promoted as a "plug in" replacement for the 8139 and the newer variants of the daughterboard are the same as the older variants except the 8100C has replaced the 8139. MAYBE a similar thing has happened with your box and you have an older box and have been looking at newer documentation.
I think the 8100C should be controlled by the rl driver because I've been looking at the re and rl driver sources recently. The rl driver appears to include recognition of an 8100 device. If you like we can go into the details, but I just wanted to warn you that your NICs may not be quite what you think they are.
-
I'm using a Zyxel 660H-61 as a router, not half bridge.
Ok. So it's pointless to try router mode here, something I had been considering. Thank you.
MAYBE a similar thing has happened with your box and you have an older box and have been looking at newer documentation.
Yes, that's possible. That's why I said that I was not sure of my network cards ;-)
I have sent the question to Fabiatech. They say on their site that they will reply in 24 hours…
I think the 8100C should be controlled by the rl driver because I've been looking at the re and rl driver sources recently. The rl driver appears to include recognition of an 8100 device.
Even if system.log says that our box has RTL8139 ports, they are controlled by the rl driver (rl0 & rl1). That seems to indicate that they are 8100C.
-
This sounds like a hardware problem or a driver bug. Do you get the same behaviour on other ports?
Yes, at least on the 10/100 ports (RTL8100C or RTL8139).
-
When this happens can you ssh into the pfSense box?
Yes.
If you can login while this is happening, type command
netstat -i -I rl1
a few times and watch what happens to the rl1 error counters.
Not very much, I think:
Name  Mtu   Network        Address              Ipkts  Ierrs  Opkts  Oerrs  Coll
rl1   1500  <Link#2>       00:04:a7:__:__:__    54513      0  27948      0    94
rl1   1500  fe80:2::___:_  fe80:2::___:____:        0      -      1      -     -
rl1   1500  80.___.___.0   LRouen-___-___-9-2     410      -      0      -     -
[... after about 5 minutes]
Name  Mtu   Network        Address              Ipkts  Ierrs  Opkts  Oerrs  Coll
rl1   1500  <Link#2>       00:04:a7:__:__:__    54520      0  27954      0    94
rl1   1500  fe80:2::___:_  fe80:2::___:____:        0      -      1      -     -
rl1   1500  80.___.___.0   LRouen-___-___-9-2     582      -      0      -     -
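To make "not very much" concrete, the two link-level rows can be subtracted from each other. This is a small sketch that hard-codes the two `<Link#2>` rows from the output here and assumes the last five columns are Ipkts, Ierrs, Opkts, Oerrs and Coll, as in the header. `link_counters` is a name invented for the example.

```python
FIELDS = ("Ipkts", "Ierrs", "Opkts", "Oerrs", "Coll")


def link_counters(row):
    """Read the five trailing counters of a netstat -i link-level row."""
    parts = row.split()
    return dict(zip(FIELDS, (int(x) for x in parts[-5:])))


# The two link-level rows, taken about 5 minutes apart.
before = link_counters("rl1 1500 <Link#2> 00:04:a7:__:__:__ 54513 0 27948 0 94")
after = link_counters("rl1 1500 <Link#2> 00:04:a7:__:__:__ 54520 0 27954 0 94")

delta = {name: after[name] - before[name] for name in FIELDS}
print(delta)
```

Over those five minutes the interface counted 7 packets in and 6 out, with no new errors and no new collisions.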
-
Ok. I have no more ideas.
No answer from Fabiatech after their "guaranteed 24 hour reply delay" (not too surprised).
"pciconf -lv" shows:
rl0@pci0:0:12:0: class=0x020000 card=0x813910ec chip=0x813910ec rev=0x10 hdr=0x00
    class = network
    subclass = ethernet
So apparently the 10/100 cards are "RT8139 (A/B/C/810x/813x/C+) Fast Ethernet Adapter" and they seem to be supported by the "rl" driver. I have found several posts showing problems with this card under FreeBSD 7.0 that were solved after upgrading to FreeBSD 7.1 (like this one: <http://daemonforums.org/showthread.php?p=10070>).
I'm not very willing to go back to 1.2, so any idea of when pfSense 1.2.3 will be released?
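For reference, the `chip=0x813910ec` value in the pciconf line can be split into its two halves. The sketch assumes the usual pciconf packing, with the device ID in the upper 16 bits and the vendor ID in the lower 16 bits. `split_pci_id` is a hypothetical helper name.

```python
def split_pci_id(value):
    """Split a combined 32-bit pciconf ID into (vendor, device)."""
    vendor = value & 0xFFFF          # lower 16 bits
    device = (value >> 16) & 0xFFFF  # upper 16 bits
    return vendor, device


vendor, device = split_pci_id(0x813910EC)
print(f"vendor=0x{vendor:04x} device=0x{device:04x}")
```

Vendor 0x10ec is Realtek and device 0x8139 matches the 8139 identification in the kernel log.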
-
Have you tried a current pfSense 1.2.3 Pre-Release snapshot to see if it helps with your problem?
http://blog.pfsense.org/?p=364
-
No. This is a "production environment" that keeps working even if it's somewhat handicapped and the statement "Please test in a non-production environment and let us know how it goes on the forum." scares me. Unfortunately, I have only one of these Fabiatech devices :-(
-
If it is an onboard NIC I would check for a BIOS update.
-
I am having a similar issue with a new 1.2.2 box. The build was done on an old Dell Optiplex 260 with a P4 and 512MB RAM plus the onboard NIC (LAN). It's a multi-WAN setup with PPPoE on the WAN and cable on opt1. Both are on PCI D-Link DFE-538TX cards (brand spanking new). The vr0 card (PPPoE) has a bunch of collisions. After 7 hours the total collision count is around 1700. I get the same style arpresolve messages in the logs and the load balancer is constantly up/downing links (in the logs). I plan to leave things alone overnight to see if it is a provider issue that sorts itself out, as I am not sure what else to look at. That is the only thing that is wrong with this box; everything else is working fine, it seems. For now I have added some outgoing LAN rules to direct the critical traffic toward the cable modem interface until I figure this out.
One more thing: the modem on the PPPoE connection is at least 4 years old, so I am wondering if it is starting to get flaky.
-
If it is an onboard NIC I would check for a BIOS update.
Done. I was already on the last version.
I was also having problems with DNS resolution (SLOOOOOW, in the order of several seconds), so I have transferred everything I could to the working WAN connection, and now I'm rebooting the pfSense box from time to time and just waiting for the 1.2.3 release, hoping that it will fix this problem.