Netgate Discussion Forum

    WAN interface keeps dropping after upgrade to 1.2.1

    General pfSense Questions
    43 Posts 6 Posters 21.5k Views
      wallabybob

      The arplookup/arpresolve messages suggest to me that you have lost communication with rl1_EXTERNAL_IP, which should be responding to the ARPs. Once that happens you won't get a DHCP response until (possibly) the DHCP broadcast, which does not need a mapping from IP address to MAC address. It's strange, though, that you then apparently get a DHCP response but no ARP response.

      When this happens, can you ssh into the pfSense box? Can you log in on the console? (Perhaps there is a kernel memory leak, and when DHCP fails in this way it's because there is a little free memory left but not enough to always get the DHCP/ARP responses.) If you can log in on the console or ssh in, there is at least a chance of testing this theory. If you can't log in, then some cunning will be needed.

      If you can log in while this is happening, type the command

      netstat -i -I rl1

      a few times and watch what happens to the rl1 error counters.

        rigius

        @wallabybob:

        So everything seems normal.

        Except that the lease extension requests are apparently ignored.

        Are the lease extension requests getting out on the wire? (May be hard to test without specialised equipment.)

        The lease extension requests are not getting out: I connected the pfSense box, the Zyxel modem and a laptop to a small hub. Monitoring with Wireshark on the laptop, I don't see any DHCPREQUEST packets from pfSense to the Zyxel modem.
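
For anyone repeating this capture without Wireshark's decoder at hand, the check comes down to reading the DHCP message-type option (option 53) in each BOOTP payload. The following is only a sketch under stated assumptions: the function name `dhcp_message_type` and the hand-built example payload are illustrative and were not part of the capture described above.

```python
# Sketch only: classify a BOOTP/DHCP UDP payload by its message-type option.
DHCP_MAGIC_COOKIE = bytes([0x63, 0x82, 0x53, 0x63])  # marks the start of the options
BOOTP_HEADER_LEN = 236  # fixed BOOTP fields that precede the magic cookie
OPTION_PAD, OPTION_MESSAGE_TYPE, OPTION_END = 0, 53, 255

MESSAGE_TYPES = {
    1: "DHCPDISCOVER", 2: "DHCPOFFER", 3: "DHCPREQUEST", 4: "DHCPDECLINE",
    5: "DHCPACK", 6: "DHCPNAK", 7: "DHCPRELEASE", 8: "DHCPINFORM",
}


def dhcp_message_type(udp_payload):
    """Return the DHCP message type name, or None if it cannot be determined."""
    options_start = BOOTP_HEADER_LEN + len(DHCP_MAGIC_COOKIE)
    if udp_payload[BOOTP_HEADER_LEN:options_start] != DHCP_MAGIC_COOKIE:
        return None  # too short, or not a DHCP packet
    i = options_start
    while i < len(udp_payload):
        code = udp_payload[i]
        if code == OPTION_END:
            break
        if code == OPTION_PAD:  # pad has no length byte
            i += 1
            continue
        if i + 1 >= len(udp_payload):
            break  # truncated option
        length = udp_payload[i + 1]
        value = udp_payload[i + 2 : i + 2 + length]
        if code == OPTION_MESSAGE_TYPE and len(value) == 1:
            return MESSAGE_TYPES.get(value[0])
        i += 2 + length
    return None


# Hypothetical payload: zeroed BOOTP header, cookie, option 53 = 3, end marker.
example_request = bytes(BOOTP_HEADER_LEN) + DHCP_MAGIC_COOKIE + bytes([53, 1, 3, 255])
print(dhcp_message_type(example_request))
```

Renewals that never show up as type 3 on the hub side point at the sending host or its NIC rather than at the modem.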

          wallabybob

          I don't see the DHCPREQUEST packets from pfsense to the zyxel modem.

          That would explain why you don't see responses.

          This sounds like a hardware problem or a driver bug. Do you get the same behaviour on other ports?

            jahonix

            Do you have a free RTL8100C port on your machine?
            I'd suggest using that one (at least temporarily) to see if it's a hardware issue. Sounds pretty much like it.
            Using another CAT cable might help as well.

              rigius

              @wallabybob:

              I don't see the DHCPREQUEST packets from pfsense to the zyxel modem.

              This sounds like a hardware problem or a driver bug. Do you get the same behaviour on other ports?

              I think the problem could come from the fact that our two WAN "modems" (on rl0 and rl1) have the same "internal" IP address (192.168.1.1).

              I cannot change it on our primary connection (the modem, a WiMAX interface with PPPoE, is owned by the ISP and they won't do it), and for the moment I have not been able to change it on the Zyxel P660R either (half-bridge PPPoA): after changing the internal IP of the P660R, it reverts to the default 192.168.1.1 when I turn half-bridge mode on, and if I turn on half-bridge first, it does the address change when I power cycle the modem!

              What puzzles me is that everything has worked fine with the 1.2 version of pfsense for almost a year :-?

              Could this be the origin of the problem?
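
Whether or not the shared address explains the drops, the overlap itself is easy to check mechanically. A minimal sketch, assuming the modem-side addresses are collected by hand into a dictionary; the helper name `shared_addresses` is hypothetical, and the values are the ones stated above.

```python
from collections import defaultdict
from ipaddress import ip_address


def shared_addresses(modem_by_interface):
    """Map each modem address used on more than one interface to those interfaces."""
    interfaces_by_address = defaultdict(list)
    for interface, address in modem_by_interface.items():
        # ip_address() normalises the text form and rejects malformed input.
        interfaces_by_address[ip_address(address)].append(interface)
    return {
        str(address): sorted(interfaces)
        for address, interfaces in interfaces_by_address.items()
        if len(interfaces) > 1
    }


# Addresses as described in the post: both modems answer on 192.168.1.1.
modems = {"rl0": "192.168.1.1", "rl1": "192.168.1.1"}
print(shared_addresses(modems))
```

An empty result would mean every WAN modem has its own address.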

                wallabybob

                Yesterday my rl0 interface got into a state where it either wasn't sending or wasn't receiving. My WAN interface to a Zyxel ADSL modem/router uses rl0.

                This is suspiciously like your configuration.

                When it was in this state the activity light would flash regularly, seemingly in time with the ping, but there was no ping response reported. I could log in on the console and I could ssh in over the LAN interface (vr0). I was unable to reactivate rl0 by an ifconfig down/up sequence, nor by reassigning interfaces. There were no errors reported on rl0 EXCEPT over 2000 collisions, which is an unexpected event on a full duplex link to a modem. I couldn't work out any other way of recovering than rebooting. The system has now been up over 21 hours with no errors and no collisions reported on rl0.

                Perhaps there is an rl driver bug or hardware error in dealing with an "unusual" situation.

                Are you sure your 10/100 interfaces are RTL8100C? I think the 8100C should be controlled by the re driver. I suspect that you have a number of 8139 ports.

                  rigius

                  @wallabybob:

                  Yesterday my rl0 interface got into a state where it either wasn't sending or wasn't receiving. My WAN interface to a Zyxel ADSL modem/router uses rl0.

                  This is suspiciously like your configuration.

                  Which model? Are you also using half-bridge?

                  When it was in this state the activity light would flash regularly, seemingly in time with the ping, but there was no ping response reported. I could log in on the console and I could ssh in over the LAN interface (vr0). I was unable to reactivate rl0 by an ifconfig down/up sequence, nor by reassigning interfaces. There were no errors reported on rl0 EXCEPT over 2000 collisions, which is an unexpected event on a full duplex link to a modem. I couldn't work out any other way of recovering than rebooting. The system has now been up over 21 hours with no errors and no collisions reported on rl0.

                  I left mine on over the weekend. As expected, the problem appeared again about a day after the previous reboot. After several hours of:

                  
                  Jan 25 05:54:18 pfsense kernel: arplookup EXTERNAL_IP_ADDRESS failed: host is not on local network
                  Jan 25 05:54:18 pfsense kernel: arpresolve: can't allocate route for EXTERNAL_IP_ADDRESS
                  
                  

                  The connection got reestablished:

                  
                  Jan 25 05:53:56 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
                  Jan 25 05:53:56 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
                  Jan 25 05:53:56 pfsense dhclient[52491]: DHCPDISCOVER on rl1 to 255.255.255.255 port 67 interval 1
                  Jan 25 05:53:56 pfsense dhclient[52491]: DHCPDISCOVER on rl1 to 255.255.255.255 port 67 interval 2
                  Jan 25 05:53:56 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
                  Jan 25 05:53:56 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
                  Jan 25 05:53:58 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
                  Jan 25 05:53:58 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
                  Jan 25 05:53:58 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
                  Jan 25 05:53:58 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
                  Jan 25 05:53:58 pfsense dhclient[52491]: DHCPDISCOVER on rl1 to 255.255.255.255 port 67 interval 4
                  Jan 25 05:54:00 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
                  Jan 25 05:54:00 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
                  Jan 25 05:54:00 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
                  Jan 25 05:54:00 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
                  Jan 25 05:54:00 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
                  Jan 25 05:54:00 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
                  Jan 25 05:54:00 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
                  Jan 25 05:54:00 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
                  Jan 25 05:54:00 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
                  Jan 25 05:54:00 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
                  Jan 25 05:54:01 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
                  Jan 25 05:54:01 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
                  Jan 25 05:54:02 pfsense dhclient[52491]: DHCPDISCOVER on rl1 to 255.255.255.255 port 67 interval 5
                  Jan 25 05:54:02 pfsense dhclient[52491]: DHCPOFFER from 192.168.1.1
                  Jan 25 05:54:03 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
                  Jan 25 05:54:03 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
                  Jan 25 05:54:04 pfsense dhclient[52491]: DHCPREQUEST on rl1 to 255.255.255.255 port 67
                  Jan 25 05:54:04 pfsense dhclient[52491]: DHCPACK from 192.168.1.1
                  Jan 25 05:54:04 pfsense slbd[5935]: ICMP poll failed for rl1_EXTERNAL_IP, marking service DOWN
                  Jan 25 05:54:04 pfsense slbd[5935]: Service WAN2FailsToWAN1 changed status, reloading filter policy
                  Jan 25 05:54:04 pfsense slbd[5935]: ICMP poll failed for rl1_EXTERNAL_IP, marking service DOWN
                  Jan 25 05:54:04 pfsense slbd[5935]: Service LoadBalance changed status, reloading filter policy
                  Jan 25 05:54:04 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
                  Jan 25 05:54:04 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
                  Jan 25 05:54:04 pfsense dhclient[52491]: bound to rl1_EXTERNAL_IP -- renewal in 1800 seconds.
                  Jan 25 05:54:06 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
                  Jan 25 05:54:06 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
                  Jan 25 05:54:07 pfsense check_reload_status: rc.newwanip starting
                  Jan 25 05:54:08 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
                  Jan 25 05:54:08 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
                  Jan 25 05:54:09 pfsense php: : Informational: rc.newwanip is starting .
                  Jan 25 05:54:09 pfsense php: : rc.newwanip working with (IP address: rl0_EXTERNAL_IP) (interface: wan) (interface real: rl0).
                  Jan 25 05:54:09 pfsense slbd[5935]: ICMP poll succeeded for rl1_EXTERNAL_IP, marking service UP
                  Jan 25 05:54:09 pfsense slbd[5935]: ICMP poll succeeded for rl1_EXTERNAL_IP, marking service UP
                  Jan 25 05:54:09 pfsense slbd[5935]: Service WAN2FailsToWAN1 changed status, reloading filter policy
                  Jan 25 05:54:09 pfsense slbd[5935]: Service LoadBalance changed status, reloading filter policy
                  Jan 25 05:54:10 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
                  Jan 25 05:54:10 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
                  Jan 25 05:54:10 pfsense slbd[5935]: ICMP poll failed for rl1_EXTERNAL_IP, marking service DOWN
                  Jan 25 05:54:11 pfsense slbd[5935]: Service WAN1FailsToWAN2 changed status, reloading filter policy
                  Jan 25 05:54:12 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
                  Jan 25 05:54:12 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
                  Jan 25 05:54:14 pfsense php: : Informational: DHClient spawned /etc/rc.newwanip and the new ip is wan - rl0_EXTERNAL_IP.
                  Jan 25 05:54:14 pfsense php: : Creating rrd update script
                  Jan 25 05:54:14 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
                  Jan 25 05:54:14 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
                  Jan 25 05:54:16 pfsense slbd[5935]: ICMP poll succeeded for rl1_EXTERNAL_IP, marking service UP
                  Jan 25 05:54:16 pfsense slbd[5935]: Service WAN1FailsToWAN2 changed status, reloading filter policy
                  Jan 25 05:54:16 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
                  Jan 25 05:54:16 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
                  Jan 25 05:54:17 pfsense php: : Resyncing configuration for all packages.
                  Jan 25 05:54:18 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
                  Jan 25 05:54:18 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
                  Jan 25 05:54:18 pfsense php: : Resyncing configuration for all packages.
                  Jan 25 05:54:18 pfsense php: : pfSense package system has detected an ip change rl1_EXTERNAL_IP ->  rl0_EXTERNAL_IP ... Restarting packages.
                  Jan 25 05:54:21 pfsense php: : Configuring slbd
                  Jan 25 05:54:21 pfsense check_reload_status: reloading filter
                  Jan 25 05:54:21 pfsense slbd[50991]: ICMP poll succeeded for rl0_GATEWAY_IP, marking service UP
                  Jan 25 05:54:21 pfsense slbd[50991]: ICMP poll succeeded for rl1_EXTERNAL_IP, marking service UP
                  Jan 25 05:54:21 pfsense slbd[50991]: ICMP poll succeeded for rl1_EXTERNAL_IP, marking service UP
                  Jan 25 05:54:22 pfsense slbd[50991]: ICMP poll succeeded for rl0_GATEWAY_IP, marking service UP
                  Jan 25 05:54:22 pfsense slbd[50991]: ICMP poll succeeded for rl0_GATEWAY_IP, marking service UP
                  Jan 25 05:54:22 pfsense slbd[50991]: ICMP poll succeeded for rl1_EXTERNAL_IP, marking service UP
                  Jan 25 05:54:26 pfsense check_reload_status: updating dyndns
                  
                  

                  And then everything began once again…
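
A log this noisy is easier to read once the arplookup/arpresolve repeats are separated from the dhclient exchange. The sketch below is one hedged way to do that: the regular expression assumes the syslog layout seen in the excerpt above, the name `summarize` is illustrative, and the sample is a handful of lines copied from that excerpt.

```python
import re
from collections import Counter

# Assumed layout: "Mon DD HH:MM:SS host process[pid]: message" (pid optional).
SYSLOG_LINE = re.compile(
    r"^(?P<time>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+(?P<host>\S+)\s+"
    r"(?P<process>[\w.]+)(?:\[\d+\])?:\s*(?P<message>.*)$"
)


def summarize(lines):
    """Count lines per process and list dhclient events in order of appearance."""
    per_process = Counter()
    dhclient_events = []
    for line in lines:
        match = SYSLOG_LINE.match(line.strip())
        if match is None:
            continue  # skip anything that is not a syslog line
        per_process[match["process"]] += 1
        words = match["message"].split()
        if match["process"] == "dhclient" and words:
            dhclient_events.append((match["time"], words[0]))
    return per_process, dhclient_events


SAMPLE_LOG = """\
Jan 25 05:54:02 pfsense dhclient[52491]: DHCPDISCOVER on rl1 to 255.255.255.255 port 67 interval 5
Jan 25 05:54:02 pfsense dhclient[52491]: DHCPOFFER from 192.168.1.1
Jan 25 05:54:03 pfsense kernel: arplookup rl1_EXTERNAL_IP failed: host is not on local network
Jan 25 05:54:03 pfsense kernel: arpresolve: can't allocate route for rl1_EXTERNAL_IP
Jan 25 05:54:04 pfsense dhclient[52491]: DHCPREQUEST on rl1 to 255.255.255.255 port 67
Jan 25 05:54:04 pfsense dhclient[52491]: DHCPACK from 192.168.1.1
Jan 25 05:54:04 pfsense dhclient[52491]: bound to rl1_EXTERNAL_IP -- renewal in 1800 seconds.
"""

counts, events = summarize(SAMPLE_LOG.splitlines())
print(dict(counts))
for time, event in events:
    print(time, event)
```

Run against the full system log, the dhclient list shows at a glance whether a lease was renewed or had to be rediscovered from scratch.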

                  Are you sure your 10/100 interfaces are RTL8100C? I think the 8100C should be controlled by the re driver. I suspect that you have a number of 8139 ports.

                  No. That's what page V "Chapter 4 Software Installation - LAN Utility & Driver (RTL8100C & RTL8110S)" of the manual says <http://download.fabiatech.com.tw/manual/m5620.pdf>. pfSense apparently detects them as RTL8139:

                  pfsense kernel: rl1: <RealTek 8139 10/100BaseTX> port 0xe800-0xe8ff mem 0xdffffe00-0xdffffeff irq 11 at device 13.0 on pci0

                  Why do you think that the RTL8100C should be controlled by the re driver?

                  I don't see any mention of the RTL8100C on the FreeBSD hardware pages <http://www.freebsd.org/releases/6.3r/hardware-i386.html> or <http://www.freebsd.org/relnotes/current/hardware/i386/support.html#ethernet>, but the other RTL8100C port that I have in use works perfectly (rl0, also detected as an 8139).

                    wallabybob

                    I'm using a Zyxel 660H-61 as a router, not half bridge.

                    My pfSense has a Jetway mini-ITX board and has a daughter board with an RTL8139 NIC on it. The same model daughter board is now promoted as having an RTL8100C NIC on it. I suspect the 8100C is promoted as a "plug-in" replacement for the 8139 and the newer variants of the daughterboard are the same as the older variants except that the 8100C has replaced the 8139. MAYBE a similar thing has happened with your box and you have an older box and have been looking at newer documentation.

                    I think the 8100C should be controlled by the rl driver because I've been looking at the re and rl driver sources recently. The rl driver appears to include recognition of an 8100 device. If you like we can go into the details, but I just wanted to warn you that your NICs may not be quite what you think they are.

                      rigius

                      @wallabybob:

                      I'm using a Zyxel 660H-61 as a router, not half bridge.

                      Ok. So it's pointless to try router mode here, something I had been considering. Thank you.

                      MAYBE a similar thing has happened with your box and you have an older box and have been looking at newer documentation.

                      Yes, that's possible. That's why I said that I was not sure of my network cards ;-)

                      I have sent the question to Fabiatech. They say on their site that they will reply in 24 hours…

                      I think the 8100C should be controlled by the rl driver because I've been looking at the re and rl driver sources recently. The rl driver appears to include recognition of an 8100 device.

                      Even if system.log says that our box has RTL8139 ports, they are controlled by the rl driver (rl0 & rl1). That seems to indicate that they are 8100C.

                        rigius

                        @wallabybob:

                        This sounds like a hardware problem or a driver bug. Do you get the same behaviour on other ports?

                        Yes, at least on the 10/100 ports (RTL8100C or RTL8139).

                          rigius

                          @wallabybob:

                          When this happens can you ssh into the pfSense box?

                          Yes.

                          If you can log in while this is happening, type the command

                          netstat -i -I rl1

                          a few times and watch what happens to the rl1 error counters.

                          Not very much, I think:

                          
                          Name    Mtu Network       Address              Ipkts Ierrs    Opkts Oerrs  Coll
                          rl1    1500 <Link#2>      00:04:a7:__:__:__    54513     0    27948     0    94
                          rl1    1500 fe80:2::___:_ fe80:2::___:____:        0     -        1     -     -
                          rl1    1500 80.___.___.0   LRouen-___-___-9-2      410     -        0     -     -
                          [... after about 5 minutes]
                          Name    Mtu Network       Address              Ipkts Ierrs    Opkts Oerrs  Coll
                          rl1    1500 <Link#2>      00:04:a7:__:__:__    54520     0    27954     0    94
                          rl1    1500 fe80:2::___:_ fe80:2::___:____:        0     -        1     -     -
                          rl1    1500 80.___.___.0   LRouen-___-___-9-2      582     -        0     -     -
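
Comparing two snapshots by eye is error-prone, so here is a small sketch that subtracts the link-level counters. It assumes the nine-column layout shown above (other FreeBSD versions print extra columns), the function names are illustrative, and the two samples are the `<Link#2>` rows from this post.

```python
COUNTER_FIELDS = ("Ipkts", "Ierrs", "Opkts", "Oerrs", "Coll")


def link_counters(netstat_output):
    """Return the counters from the <Link#N> row of `netstat -i -I <if>` output."""
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed columns: Name Mtu Network Address Ipkts Ierrs Opkts Oerrs Coll
        if len(fields) == 9 and fields[2].startswith("<Link#"):
            return dict(zip(COUNTER_FIELDS, map(int, fields[4:9])))
    raise ValueError("no <Link#N> row found")


def counter_deltas(before, after):
    """Subtract two counter snapshots field by field."""
    return {name: after[name] - before[name] for name in COUNTER_FIELDS}


FIRST = """\
Name    Mtu Network       Address              Ipkts Ierrs    Opkts Oerrs  Coll
rl1    1500 <Link#2>      00:04:a7:__:__:__    54513     0    27948     0    94
"""

SECOND = """\
Name    Mtu Network       Address              Ipkts Ierrs    Opkts Oerrs  Coll
rl1    1500 <Link#2>      00:04:a7:__:__:__    54520     0    27954     0    94
"""

print(counter_deltas(link_counters(FIRST), link_counters(SECOND)))
# {'Ipkts': 7, 'Ierrs': 0, 'Opkts': 6, 'Oerrs': 0, 'Coll': 0}
```

Over roughly five minutes that is only a handful of packets in each direction, with no new errors or collisions.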
                          
                            rigius

                            Ok. I have no more ideas.

                            No answer from Fabiatech after their "guaranteed 24 hour reply delay" (not too surprised).

                            "pciconf -lv" shows:

                            
                            rl0@pci0:0:12:0:	class=0x020000 card=0x813910ec chip=0x813910ec rev=0x10 hdr=0x00
                                class      = network
                                subclass   = ethernet
                            
                            

                            So apparently the 10/100 cards are "RT8139 (A/B/C/810x/813x/C+) Fast Ethernet Adapter" and they seem to be supported by the "rl" driver. I have found several posts showing problems with this card under FreeBSD 7.0 that were solved after upgrading to FreeBSD 7.1 (like this one: <http://daemonforums.org/showthread.php?p=10070>).

                            I'm not very willing to go back to 1.2, so any idea of when pfSense 1.2.3 will be released?
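
For reference, the `chip=` value in the pciconf line packs the PCI device ID into the upper 16 bits and the vendor ID into the lower 16 bits. A minimal sketch of decoding it follows; the helper name `decode_chip` is hypothetical, and the input is the rl0 line quoted above.

```python
import re


def decode_chip(pciconf_line):
    """Split the chip= field of a `pciconf -l` line into (vendor_id, device_id)."""
    match = re.search(r"chip=0x([0-9a-fA-F]{8})", pciconf_line)
    if match is None:
        raise ValueError("no chip= field found")
    chip = int(match.group(1), 16)
    device_id = chip >> 16      # upper 16 bits
    vendor_id = chip & 0xFFFF   # lower 16 bits
    return vendor_id, device_id


line = "rl0@pci0:0:12:0:\tclass=0x020000 card=0x813910ec chip=0x813910ec rev=0x10 hdr=0x00"
vendor, device = decode_chip(line)
print(f"vendor=0x{vendor:04x} device=0x{device:04x}")
# vendor=0x10ec device=0x8139
```

Vendor 0x10ec is Realtek and device 0x8139 is the ID under which the card is detected here, which agrees with the kernel attaching the rl driver.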

                              jahonix

                              Have you tried a current pfSense 1.2.3 Pre-Release snapshot to see if it helps with your problem?
                              http://blog.pfsense.org/?p=364

                                rigius

                                No. This is a "production environment" that keeps working even if it's somewhat handicapped and the statement "Please test in a non-production environment and let us know how it goes on the forum." scares me. Unfortunately, I have only one of these Fabiatech devices :-(

                                  Perry

                                  If it is an onboard NIC I would check for a BIOS update.

                                  /Perry
                                  doc.pfsense.org

                                    MrEmbedded

                                    I am having a similar issue with a new 1.2.2 box. The build was done on an old Dell Optiplex 260 with a P4 and 512MB RAM plus the onboard NIC (LAN). It's a multi-WAN setup with PPPoE on the WAN and cable on OPT1. Both are on PCI D-Link DFE-538TX cards (brand spanking new). The vr0 card (PPPoE) has a bunch of collisions. After 7 hours the total collision count is around 1700. I get the same style of arpresolve messages in the logs, and the load balancer is constantly marking links up and down (in the logs). I plan to leave things alone overnight to see if it is a provider issue that sorts itself out, as I am not sure what else to look at. That is the only thing that is wrong with this box; everything else seems to be working fine. For now I have added some outgoing LAN rules to direct the critical traffic toward the cable modem interface until I figure this out.

                                    One more thing, the modem is at least 4 years old on the pppoe connection so I am wondering if it is starting to get flakey.

                                      rigius

                                      @Perry:

                                      If it is an onboard NIC I would check for a BIOS update.

                                      Done. I was already on the latest version.

                                      I was also having problems with DNS resolution (SLOOOOOW, on the order of several seconds), so I have transferred everything I could to the working WAN connection, and now I'm rebooting the pfSense box from time to time and just waiting for the 1.2.3 release, hoping that it will fix this problem.

                                        MrEmbedded

                                        My issue was completely resolved by a modem replacement.  Thought I would mention that.

                                          rigius

                                          Hi,

                                          I tried to make half bridge work on OPT1 (apparently a Realtek 8139 recognised as rl1) with versions 1.2.1 and 1.2.3-RC1 using two different modems (Linksys AM200 and Zyxel P660R). I was not successful, even though everything worked perfectly with pfSense v1.2. Finally, I put one of the modems in router mode and pfSense in its DMZ, and since then everything has been working fine.

                                          Cheers,

                                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.