Netgate Discussion Forum

    PfSense randomly loses connection, and reboot is only solution.

    Hardware
    • rocketdog

      Forgot to mention that the USB NIC is an ASIX AX88772B, using axe_4.ko

      [2.1-RELEASE][admin@firewall.ninya.org]/root(4): usbconfig -u 4 dump_device_desc

      ugen4.2: <product 0x772a vendor 0x0b95> at usbus4, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=ON

        bLength = 0x0012
        bDescriptorType = 0x0001
        bcdUSB = 0x0200
        bDeviceClass = 0x00ff
        bDeviceSubClass = 0x00ff
        bDeviceProtocol = 0x0000
        bMaxPacketSize0 = 0x0040
        idVendor = 0x0b95
        idProduct = 0x772a
        bcdDevice = 0x0001
        iManufacturer = 0x0001  <ASIX Elec. Corp.>
        iProduct = 0x0002  <AX88x72A>
        iSerialNumber = 0x0003  <000002>
        bNumConfigurations = 0x0001
      
      • BeerHat

        Did you already do the MBUF tweak?  In /boot/loader.conf… you might try setting kern.ipc.nmbclusters="32768".  The default I believe is 0.  You'll need to reboot the firewall after the change.

        This fixed some goofy NIC behavior in 2 of my remote office deployments.
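A minimal sketch of applying the tweak described above, assuming a POSIX shell on the firewall and root access. The helper name `set_tunable` is hypothetical; it only rewrites a text file so that a tunable appears exactly once, in the quoted `name="value"` form that loader.conf(5) documents.

```shell
# set_tunable FILE NAME VALUE
# Write NAME="VALUE" into FILE, replacing any earlier line that sets NAME.
# (Hypothetical helper, not part of pfSense or FreeBSD.)
set_tunable() {
    st_file=$1
    st_name=$2
    st_value=$3
    touch "$st_file"
    # Keep every line that does not start with the literal text NAME=
    awk -v prefix="${st_name}=" 'index($0, prefix) != 1' "$st_file" > "${st_file}.new"
    # Append the new setting in quoted form
    printf '%s="%s"\n' "$st_name" "$st_value" >> "${st_file}.new"
    mv "${st_file}.new" "$st_file"
}

# Example (as root on the firewall, followed by a reboot):
#   set_tunable /boot/loader.conf kern.ipc.nmbclusters 32768
```

After the reboot, `sysctl kern.ipc.nmbclusters` should report the new limit and `netstat -m` shows how many mbuf clusters are actually in use.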

        • mikeisfly

          I had the same issue using a USB NIC. The fix was to ditch the USB NIC and do both LAN and WAN on the same port using VLANs. Of course you will need a switch capable of VLANs to make this work.

          • stephenw10 (Netgate Administrator)

            @rocketdog:

            Forgot to mention that the USB NIC is an ASIX AX88772B, using axe_4.ko

            Do you just mean it's using the axe driver, or have you actually loaded some alternative kernel module? I'm not familiar with 'axe_4.ko'.

            Steve

            • rocketdog

              @BeerHat:

              Did you already do the MBUF tweak?  In /boot/loader.conf… you might try setting kern.ipc.nmbclusters="32768".  The default I believe is 0.  You'll need to reboot the firewall after the change.

              This fixed some goofy NIC behavior in 2 of my remote office deployments.

              Thanks! I'll give this a try.

              P.S. The FW hasn't dropped in 24 hours now!

              @mikeisfly:

              I had the same issue using a USB NIC. The fix was to ditch the USB NIC and do both LAN and WAN on the same port using VLANs. Of course you will need a switch capable of VLANs to make this work.

              The thing is, I don't want to get rid of my USB-nic.

              @stephenw10:

              Do you just mean it's using the axe driver, or have you actually loaded some alternative kernel module? I'm not familiar with 'axe_4.ko'.

              Steve

              Yeah, I just meant it's using the axe driver. I dunno why I wrote "axe_4.ko"

              • rocketdog

                I noticed this during bootup:

                ukphy0: <Generic IEEE 802.3u media interface> PHY 16 on miibus1
                ukphy0:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
                ue0: <USB Ethernet> on axe0
                usb_alloc_device: set address 2 failed (USB_ERR_STALLED, ignored)
                usbd_setup_device_desc: getting device descriptor at addr 2 failed, USB_ERR_STALLED
                ZFS NOTICE: Prefetch is disabled by default on i386 -- to enable,
                            add "vfs.zfs.prefetch_disable=0" to /boot/loader.conf.
                ZFS WARNING: Recommended minimum kmem_size is 512MB; expect unstable behavior.
                             Consider tuning vm.kmem_size and vm.kmem_size_max
                             in /boot/loader.conf.
                

                Does anyone know what those STALLED errors mean?

                And how should loader.conf look?

                vm.kmem_size="4355443200"
                vm.kmem_size_max="4355443200"
                

                or

                vm.kmem_size=4355443200
                vm.kmem_size_max=4355443200

                • rocketdog

                  Google didn't give me much, so I'll give it a shot here:
                  Another problem I've found is that I can't reach http://192.168.0.1 with Mozilla (with or without safe mode), only through Chrome. With Mozilla it just keeps "reading 192.168.0.1", while in Chrome it loads in a matter of milliseconds.

                  And by the way, the computer I'm running the current FW on is a Dell SX280. I'm not sure if you guys can see it directly through dmesg, but I cannot get the thermal sensors to work. Any ideas?

                  • stephenw10 (Netgate Administrator)

                    The thermal sensors dashboard widget (I assume that's what you mean?) relies on the sensor selection in System: Advanced: Miscellaneous. Since the SX280 is pre-Core architecture, it can only use ACPI to read the CPU temperature, which relies on Dell having written a BIOS that passes that info to a non-Windows OS.
                    If you just want to know what the CPU temp is you can probably use the mbmon FreeBSD package to read it but it doesn't talk to the dashboard widget.

                    Steve

                    • rocketdog

                      @stephenw10:

                      The thermal sensors dashboard widget (I assume that's what you mean?) relies on the sensor selection in System: Advanced: Miscellaneous. Since the SX280 is pre-Core architecture, it can only use ACPI to read the CPU temperature, which relies on Dell having written a BIOS that passes that info to a non-Windows OS.
                      If you just want to know what the CPU temp is you can probably use the mbmon FreeBSD package to read it but it doesn't talk to the dashboard widget.

                      Steve

                      Exactly. After a lot of testing, I've come to realize that neither the ACPI approach nor mbmon works.

                      [2.1-RELEASE][admin@firewall.ninya.org]/root(4): mbmon
                      ioctl(smb0:open): No such file or directory
                      No Hardware Monitor found!!
                      InitMBInfo: Bad file descriptor
                      
                      
                      • rocketdog

                        I'm not sure what's wrong, but suddenly my FW starts to act like a monkey. It loses all connection to my gateway, and the only solution is, as the topic says, a reboot. 'dmesg' shows nothing. Does anyone have a clue? This is really frustrating since I'm running TOR, http, mumble etc.

                        • stephenw10 (Netgate Administrator)

                          So you can still access the webgui? And can you, like previously, still ping the gateway?

                          What is the gateway? Is the pfSense box behind another router or does it have a public IP on WAN? How is it connected?

                          You need to methodically go through and identify exactly which part of the connection is failing. What you can ping and what you can't. Is DNS working? Does the routing table look reasonable?
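The layer-by-layer check described above can be sketched as a small shell helper. This is only an illustration under stated assumptions: `classify_outage` and `probe` are hypothetical names, and the interface, gateway address and test hosts in the usage comment are placeholders to replace with your own.

```shell
# classify_outage LINK GATEWAY INTERNET DNS
# Each argument is "ok" or "fail"; the first failing layer is reported.
# (Hypothetical helper for illustration only.)
classify_outage() {
    if [ "$1" != ok ]; then
        echo "WAN link is down: check the NIC, cable and dmesg"
    elif [ "$2" != ok ]; then
        echo "gateway unreachable: check the WAN address, routes and the ISP side"
    elif [ "$3" != ok ]; then
        echo "gateway answers but nothing beyond it: check the default route"
    elif [ "$4" != ok ]; then
        echo "routing works but DNS fails: check the resolver settings"
    else
        echo "all checks passed"
    fi
}

# probe CMD...: run a command quietly and print ok or fail
probe() {
    if "$@" >/dev/null 2>&1; then echo ok; else echo fail; fi
}

# Usage on the firewall (WAN_IF, GATEWAY_IP, PUBLIC_IP and SOME_HOSTNAME
# are placeholders, and the availability of `host` is an assumption):
#   classify_outage \
#       "$(probe sh -c 'ifconfig WAN_IF | grep -q "status: active"')" \
#       "$(probe ping -c 3 GATEWAY_IP)" \
#       "$(probe ping -c 3 PUBLIC_IP)" \
#       "$(probe host SOME_HOSTNAME)"
```

Running it once while the connection is healthy and once during an outage shows which layer changed. `netstat -rn` prints the routing table for the manual check.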

                          It's interesting that neither ACPI nor mbmon works with that box. There must be a few people running identical hardware since they are so common. There will at least be people running FreeBSD on it. Do you have the most recent BIOS?

                          Steve

                          • rocketdog

                            Yeah, I can access the webGUI. The GW is down and cannot be pinged (red-flagged in the webGUI). The DNS servers can't be reached.

                            I have a static ip, static GW, and the FW is connected directly to the "wall-jack".

                            Not sure about the BIOS, but I could take a look. I've been thinking about overheating, but then why would just the GW drop?
                            The box doesn't seem to have any temperature sensor support, so I have no idea whether it is an overheating problem.

                            • stephenw10 (Netgate Administrator)

                              Ok so the gateway device is some box at your ISP?
                              When the gateway is marked down there will be something in the system logs. Often it will report the reason for marking it down as either excessive packet loss or delay. If it does not it usually means the connection has gone down. Is the WAN interface still showing as UP?
                              It may be that the remote box doesn't like being pinged continuously. You could try altering the ping interval (1s by default) or disabling apinger entirely in System: Routing: Gateways: (edit gateway - advanced section for apinger tuning).
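To see what apinger reported before a gateway was marked down, the alarm lines can be tallied per gateway. A minimal sketch, assuming log lines shaped like `apinger: ALARM: NAME(address) *** kind ***`; the function name `apinger_alarms` is hypothetical.

```shell
# apinger_alarms: read system log lines on stdin and print how many
# alarms of each kind were raised for each gateway.
# (Hypothetical helper; it only matches lines containing "apinger: ALARM:".)
apinger_alarms() {
    awk '/apinger: ALARM:/ {
             # The gateway is the field right after "ALARM:"
             for (i = 1; i < NF; i++)
                 if ($i == "ALARM:") { gw = $(i + 1); break }
             kind = $(NF - 1)          # the word between the two "***" markers
             count[gw " " kind]++
         }
         END { for (k in count) print k, count[k] }' | sort
}

# Usage, assuming the system log is a circular log read with clog:
#   clog /var/log/system.log | apinger_alarms
```

Mostly "loss" or "delay" alarms point at the monitoring thresholds or the remote box. A bare "down" right after apinger starts suggests the link itself is gone.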

                              Steve

                              • dreamslacker

                                @rocketdog:

                                Not sure about the BIOS, but I could take a look. I've been thinking about overheating, but then why would just the GW drop?

                                If you run dmesg after the GW drops, does it show the link flapping on the ue0 NIC?

                                If so, you either have a failing NIC or just general instability with the USB NIC (these aren't exactly what I would consider to be stable).

                                If the System Logs do not show Apinger alarm, restarting the NIC and followed by a filter reload, then the problem probably lies with the NIC (I've had this with a failing NIC before).
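Link flapping can be counted straight from the kernel messages. A minimal sketch, assuming the usual FreeBSD wording `ifname: link state changed to DOWN`; the function name `count_flaps` is hypothetical.

```shell
# count_flaps: read kernel messages on stdin and print, per interface,
# how many times the link went DOWN.
# (Hypothetical helper for illustration only.)
count_flaps() {
    awk '/link state changed to DOWN/ {
             iface = $1
             sub(/:$/, "", iface)      # "ue0:" becomes "ue0"
             down[iface]++
         }
         END { for (i in down) print i, down[i] }' | sort
}

# Usage on the firewall:
#   dmesg | count_flaps
```

A count that keeps growing for ue0 while the onboard NIC stays steady points at the USB NIC or its driver rather than the upstream connection.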

                                • rocketdog

                                  @dreamslacker:

                                  If you run dmesg after the GW drops, does it show the link flapping on the ue0 NIC?

                                  If so, you either have a failing NIC or just general instability with the USB NIC (these aren't exactly what I would consider to be stable).

                                  If the System Logs do not show Apinger alarm, restarting the NIC and followed by a filter reload, then the problem probably lies with the NIC (I've had this with a failing NIC before).

                                  Yeah, tons of ups and downs. The FW has worked like a charm for the last few days. I tuned some stuff on a server (did some ifconfig eth0 RX downtune, IIRC), and the TOR server went down about 20%. Before, when the FW problems occurred, the TOR relay was at full throttle. So I guess it's just this damn USB NIC. Too much traffic and it goes bananas.

                                  How do you reload filters?

                                  • rocketdog

                                    Looking at dmesg, I've lost the connection several times, probably while I wasn't using the internet. I have no idea how long ue0 stays down or why it comes back up again. Ideas?

                                    ukphy0: <Generic IEEE 802.3u media interface> PHY 16 on miibus1
                                    ukphy0:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
                                    ue0: <USB Ethernet> on axe0
                                    ZFS WARNING: Recommended minimum kmem_size is 512MB; expect unstable behavior.
                                                 Consider tuning vm.kmem_size and vm.kmem_size_max
                                                 in /boot/loader.conf.
                                    ZFS filesystem version 5
                                    ZFS storage pool version 28
                                    ue0: link state changed to DOWN
                                    bge0: link state changed to DOWN
                                    pflog0: promiscuous mode enabled
                                    ue0: link state changed to UP
                                    bge0: link state changed to UP
                                    ue0: link state changed to DOWN
                                    ue0: link state changed to UP
                                    ue0: link state changed to DOWN
                                    ue0: link state changed to UP
                                    ue0: promiscuous mode enabled
                                    ue0: link state changed to DOWN
                                    ue0: link state changed to UP
                                    ue0: link state changed to DOWN
                                    ue0: link state changed to UP
                                    ue0: link state changed to DOWN
                                    ue0: link state changed to UP
                                    ue0: link state changed to DOWN
                                    ue0: link state changed to UP
                                    

                                    Found this! What could cause this? 1-5 minutes downtime, but no "uplink" messages?

                                    Edit: I missed these lines in the screenshot.

                                    Feb 10 19:35:41 	apinger: Starting Alarm Pinger, apinger(17013)
                                    Feb 10 19:35:51 	apinger: ALARM: WANGW(188.133.122.1) *** down ***
                                    Feb 10 21:20:37 	apinger: ALARM: GW_WAN(188.122.133.1) *** down ***
                                    Feb 11 00:39:54 	apinger: Starting Alarm Pinger, apinger(13674)
                                    Feb 11 00:40:05 	apinger: ALARM: WANGW(188.133.122.1) *** down ***
                                    Feb 11 03:03:19 	apinger: ALARM: GW_WAN(188.122.133.1) *** down ***
                                    Feb 11 03:31:54 	apinger: Starting Alarm Pinger, apinger(15720)
                                    Feb 11 03:32:04 	apinger: ALARM: WANGW(188.133.122.1) *** down ***
                                    Feb 13 00:17:28 	apinger: ALARM: GW_WAN(188.122.133.1) *** loss ***
                                    Feb 13 00:47:10 	apinger: alarm canceled: GW_WAN(188.122.133.1) *** loss ***
                                    

                                    "Starting Alarm Pinger", then 10 seconds later I'm offline. Where can I find options about this thingy?

                                    • stephenw10 (Netgate Administrator)

                                      How far apart are these up/down events?
                                      Why is it using promiscuous mode? Is it bridged?

                                      Steve

                                      • rocketdog

                                        @stephenw10:

                                        How far apart are these up/down events?
                                        Why is it using promiscuous mode? Is it bridged?

                                        Steve

                                        Promiscuous mode? I have no idea. I've never seen options like that. It is not bridged.

                                        • stephenw10 (Netgate Administrator)

                                          Usually a NIC would only need to use promiscuous mode if it has to be able to process frames addressed to other MACs.  This is the case if it is part of a bridge or has been used for packet capturing among others.

                                          Steve

                                          • rocketdog

                                            @stephenw10:

                                            Usually a NIC would only need to use promiscuous mode if it has to be able to process frames addressed to other MACs.  This is the case if it is part of a bridge or has been used for packet capturing among others.

                                            Steve

                                            According to "Diagnostics > Packet Capture", promiscuous mode is disabled. By the way, did you see my edit on my previous post?

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.