Netgate Discussion Forum

    WAN interface cycle thought down and up state

    General pfSense Questions
    • D
      Draiget @Gertjan

      @gertjan
      Reasonable as far as the Mellanox goes; I'll try the Intel one. Still, this pfSense behavior does not look good, so if a NIC has problems, which may happen once in a while, the whole firewall will go offline "just because"? I believe such infinite loops could be rate-limited in check_reload_status, or in whatever calls that function.
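To make the rate-limiting idea concrete, here is a minimal sketch of a hold-down (debounce) filter for link events. It is not how check_reload_status works; the class name `LinkDebouncer`, the 5-second hold-down, and the event tuples are all assumptions made up for illustration.

```python
class LinkDebouncer:
    """Suppress reactions to link events while an interface is flapping.

    Hypothetical helper, not pfSense code: an event is acted on only if no
    event was accepted for that interface in the last `hold_down` seconds.
    """

    def __init__(self, hold_down=5.0):
        self.hold_down = hold_down
        self._last_accepted = {}  # interface name -> time of last accepted event

    def should_react(self, ifname, now):
        last = self._last_accepted.get(ifname)
        if last is not None and now - last < self.hold_down:
            return False  # still inside the hold-down window: ignore the flap
        self._last_accepted[ifname] = now
        return True


# Simulated flapping: (timestamp in seconds, interface, new state)
events = [(0.0, "mlxen0", "down"), (0.4, "mlxen0", "up"),
          (0.9, "mlxen0", "down"), (6.5, "mlxen0", "up")]
deb = LinkDebouncer(hold_down=5.0)
accepted = [(t, state) for t, ifname, state in events if deb.should_react(ifname, t)]
print(accepted)
```

A filter this simple drops the events inside the window, so a real implementation would also have to re-check the final link state once the window expires; otherwise the last transition could be missed.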

      • GertjanG
        Gertjan @Draiget

        @draiget

        Hummm.

        Imagine this: your "mlxen" UP and DOWN bouncing has nothing to do with "Could not connect to /var/run/php-fpm.socket".
        The latter is a 'socket file', created by the PHP process and used, among others, by nginx (the GUI), so it can 'use and speak' PHP.

        The PHP (php-fpm) process should be running since system boot.
        It (nginx, php-fpm) might get restarted when something happens with an interface, like a link going DOWN to UP, but these are rather rare events.

        I guess these

        Jun 20 17:46:42 fw1 check_reload_status[60338]: Could not connect to /var/run/php-fpm.socket

        will be gone as soon as you use NICs that work.

        @draiget said in WAN interface cycle thought down and up state:

        so if a NIC has problems, which may happen once in a while, the whole firewall will go offline "just because"?

        Like a car. Remove just one wheel (out of 4 or more) while speeding on the highway.
        This WILL influence your driving comfort.

        edit :

        Another - better ;) - example :

        A switch copes far more easily with you removing a cable or putting one back in: a switch does not contain 'programs', only shift, compare and lookup registers. These get reset, set or flushed within a clock cycle of the switch.
        A software router (as pfSense is) is another beast: a whole bunch of processes 'have to know' that an interface went down or came back. This often means that a process is restarted with the new situation as its initial parameters.
        Thus a very good reason to stop interfaces from flapping.

        No "help me" PM's please. Use the forum, the community will thank you.
        Edit : and where are the logs ??

        • stephenw10S
          stephenw10 Netgate Administrator

          @draiget said in WAN interface cycle thought down and up state:

          mlx4_en

          Mmm, I would definitely try a different NIC first. I have an older Mellanox card that initially seemed promising but it always behaved strangely. There's a lot going on with those cards. It could be a firmware or firmware config issue even.

          Steve

          • D
            Draiget @Gertjan

            @gertjan said in WAN interface cycle thought down and up state:

            Most easy solution : use another NIC.

            Which NIC will work fine as WAN?
            I have an Intel X520-DA2, but it does not work either (unsupported SFP; boot options have no effect on it).

            • stephenw10S
              stephenw10 Netgate Administrator

              Doesn't work in what way? What are you connecting to it?

              I would expect that NIC to work fine.

              Steve

              • D
                Draiget @stephenw10

                @stephenw10 said in WAN interface cycle thought down and up state:

                Doesn't work in what way? What are you connecting to it?

                I would expect that NIC to work fine.

                Steve

                I'm not sure it is supposed to work the way I use it, but I use it as the WAN for an ISP uplink (no more than 500 meters to the closest switch). A D-Link media converter and the MLX card worked fine that way before. I see this is a -DA card, which is probably meant only for SAN connections.
                Actually, I only have these problems with the MLX now that it is pretty hot outside; last year it was fine. It is just not able to handle up/down glitches properly, whereas the Intel one just stays silent.

                From dmesg it seems fine, and the interfaces are visible in both the UI and ifconfig:

                ix0: <Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 3.3.24> port 0xecc0-0xecdf mem 0xdf300000-0xdf37ffff,0xdf2f8000-0xdf2fbfff irq 38 at device 0.0 on pci6
                ix0: Using MSI-X interrupts with 9 vectors
                ix0: Ethernet address: 90:e2:ba:74:96:5c
                ix0: PCI Express Bus: Speed 5.0GT/s Width x8
                ix1: <Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 3.3.24> port 0xece0-0xecff mem 0xdf380000-0xdf3fffff,0xdf2fc000-0xdf2fffff irq 45 at device 0.1 on pci6
                ix1: Using MSI-X interrupts with 9 vectors
                ix1: Ethernet address: 90:e2:ba:74:96:5d
                ix1: PCI Express Bus: Speed 5.0GT/s Width x8
                

                But it stays in "no carrier" state, maybe because it needs different SFP modules.

                • stephenw10S
                  stephenw10 Netgate Administrator

                  What module are you trying to use?

                  Does it show the module is present in: ifconfig -vvvm ix0

                  • D
                    Draiget @stephenw10

                    @stephenw10 said in WAN interface cycle thought down and up state:

                    ifconfig -vvvm ix0

                    ix0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
                            description: WAN_IX0
                            options=e503bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
                            capabilities=e507bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
                            ether 90:e2:ba:74:96:5c
                            inet6 fe80::92e2:baff:fe74:965c%ix0 prefixlen 64 scopeid 0x5
                            media: Ethernet autoselect
                            status: no carrier
                            supported media:
                                    media autoselect
                            nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
                            plugged: SFP/SFP+/SFP28 Unknown (SC)
                            vendor: Gateray PN: GR-S1-W313S-D SN: W19090202129 DATE: 2019-09-03
                            module temperature: 53.00 C Voltage: 3.30 Volts
                            RX: 0.04 mW (-13.80 dBm) TX: 0.15 mW (-8.02 dBm)
                    
                            SFF8472 DUMP (0xA0 0..127 range):
                            03 04 01 00 00 00 00 12 00 01 01 01 0D 00 03 1E
                            00 00 00 00 47 61 74 65 72 61 79 20 20 20 20 20
                            20 20 20 20 00 00 00 00 47 52 2D 53 31 2D 57 33
                            31 33 53 2D 44 20 20 20 31 2E 30 20 05 1E 00 93
                            00 1A 00 00 57 31 39 30 39 30 32 30 32 31 32 39
                            20 20 20 20 31 39 30 39 30 33 20 20 68 F0 01 F3
                            2D 00 11 FB 5D 59 65 F4 D2 C7 92 AC 1A 76 D5 93
                            78 65 66 00 00 00 00 00 00 00 00 00 A1 AB DE E6
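The dump can also be decoded by hand to see what the module reports about itself. The sketch below is only an illustration: the byte offsets follow the SFF-8472 A0h layout as I understand it (verify against the specification), and it uses just the first 96 bytes of the dump.

```python
# First 96 bytes of the A0h page, copied from the dump above.
DUMP = """
03 04 01 00 00 00 00 12 00 01 01 01 0D 00 03 1E
00 00 00 00 47 61 74 65 72 61 79 20 20 20 20 20
20 20 20 20 00 00 00 00 47 52 2D 53 31 2D 57 33
31 33 53 2D 44 20 20 20 31 2E 30 20 05 1E 00 93
00 1A 00 00 57 31 39 30 39 30 32 30 32 31 32 39
20 20 20 20 31 39 30 39 30 33 20 20 68 F0 01 F3
"""

raw = bytes.fromhex("".join(DUMP.split()))


def text(lo, hi):
    """Decode a space-padded ASCII field occupying bytes lo..hi (inclusive)."""
    return raw[lo:hi + 1].decode("ascii").strip()


info = {
    "identifier": raw[0],                # 0x03 = SFP-family module
    "connector": raw[2],                 # 0x01 = SC
    "nominal_rate_mbd": raw[12] * 100,   # byte 12 is in units of 100 MBd
    "vendor": text(20, 35),
    "part_number": text(40, 55),
    "wavelength_nm": int.from_bytes(raw[60:62], "big"),
    "serial": text(68, 83),
    "date_code": text(84, 91),
}

# CC_BASE (byte 63) should equal the low 8 bits of the sum of bytes 0..62.
checksum_ok = (sum(raw[0:63]) & 0xFF) == raw[63]

for key, value in info.items():
    print(f"{key}: {value}")
print("checksum_ok:", checksum_ok)
```

The nominal rate decodes to 1300 MBd at 1310 nm, which is what a 1G single-mode optic typically reports, and the vendor, part number, serial and date match what ifconfig printed.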
                    
                    • stephenw10S
                      stephenw10 Netgate Administrator

                      Ok, well, good news: It allows the NIC to attach. It can talk to the module. The module sees incoming signal.

                      Bad news: It doesn't offer any fixed link speeds, and that looks like a 1G module. It's common for an ix card to require being set to a fixed 1G in order to link at 1G.

                      The only option you may have there is to set the available advertised speeds to 1G only:
                      Create the file /boot/loader.conf.local
                      Add to it:

                      hw.ix.advertise_speed=2
                      

                      Reboot. Then check sysctl -a | grep advertise_speed

                      It's not always effective though. For example:

                      [21.05-RELEASE][admin@7100.stevew.lan]/root: sysctl -a | grep advertise_speed
                      hw.ix.advertise_speed: 2
                      dev.ix.3.advertise_speed: 0
                      dev.ix.2.advertise_speed: 0
                      dev.ix.1.advertise_speed: 0
                      dev.ix.0.advertise_speed: 7
                      dev.ixl.1.advertise_speed: 6
                      dev.ixl.0.advertise_speed: 6
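For reading those values, here is a small sketch that decodes the ix bitmask and compares the loader tunable with the per-device values. The bit meanings (0x1 = 100M, 0x2 = 1G, 0x4 = 10G) are an assumption taken from the ix driver's sysctl description as I remember it, so check them against your driver version. The ixl entries use a different encoding and are deliberately skipped.

```python
# Assumed bit meanings for the ix driver's advertise_speed sysctl.
IX_SPEED_BITS = {0x1: "100M", 0x2: "1G", 0x4: "10G"}

SYSCTL_OUTPUT = """\
hw.ix.advertise_speed: 2
dev.ix.3.advertise_speed: 0
dev.ix.2.advertise_speed: 0
dev.ix.1.advertise_speed: 0
dev.ix.0.advertise_speed: 7
dev.ixl.1.advertise_speed: 6
dev.ixl.0.advertise_speed: 6
"""


def decode_ix_speeds(mask):
    """Return the speed names whose bits are set in an ix advertise_speed mask."""
    return [name for bit, name in IX_SPEED_BITS.items() if mask & bit]


def parse_ix_values(output):
    """Collect only the ix entries (not ixl) from `sysctl -a | grep advertise_speed`."""
    values = {}
    for line in output.splitlines():
        name, _, value = line.partition(":")
        if name.startswith(("hw.ix.", "dev.ix.")) and name.endswith("advertise_speed"):
            values[name] = int(value)
    return values


values = parse_ix_values(SYSCTL_OUTPUT)
for name, mask in values.items():
    print(f"{name} = {mask} -> {decode_ix_speeds(mask)}")
```

Read this way, the tunable asks for 1G only, while dev.ix.0 still advertises 100M, 1G and 10G, which is the "not always effective" case.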
                      

                      Steve

                      • D
                        Draiget @stephenw10

                        @stephenw10

                        There are some interesting messages in dmesg:

                        ix0: <Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 3.3.24> port 0xecc0-0xecdf mem 0xdf300000-0xdf37ffff,0xdf2f8000-0xdf2fbfff irq 34 at device 0.0 on pci4
                        ix0: Using MSI-X interrupts with 9 vectors
                        ix0: Ethernet address: 90:e2:ba:74:96:5c
                        ix0: PCI Express Bus: Speed 5.0GT/s Width x4
                        ix0: Advertised speed can only be set on copper or multispeed fiber media types.
                        Setting sysctl dev.ix.0.advertise_speed failed: 22
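Assuming the trailing 22 in that last message is a plain errno value, it can be looked up with a couple of lines of Python (a quick check, nothing pfSense-specific):

```python
import errno
import os

code = 22  # from "Setting sysctl dev.ix.0.advertise_speed failed: 22"
name = errno.errorcode[code]
print(name, "-", os.strerror(code))
```

That is EINVAL (invalid argument), which fits the driver message on the line before it: the value was rejected because of the media type, not because the tunable was ignored.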
                        

                        Looks like it doesn't work well:

                        ix0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
                                description: WAN_IX0
                                options=8500b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWFILTER,VLAN_HWTSO>
                                capabilities=e507bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
                                ether 90:e2:ba:74:96:5c
                                inet6 fe80::92e2:baff:fe74:965c%ix0 prefixlen 64 scopeid 0x5
                                media: Ethernet autoselect
                                status: no carrier
                                supported media:
                                        media autoselect
                                nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
                        

                        But in case of Mellanox it works fine:

                        mlxen0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
                                description: WAN_MLX0
                                options=ed03bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWFILTER,VLAN_HWTSO,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
                                ether f4:52:14:7a:0d:70
                                inet6 fe80::f652:14ff:fe7a:d70%mlxen0 prefixlen 64 scopeid 0xc
                                media: Ethernet autoselect (1000baseT <full-duplex,rxpause,txpause>)
                                status: active
                                nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
                        

                        I had an issue with the fiber optics. It was fixed yesterday (for a long time, I hope), and the MLX now works without issues. But since I have the Intel card, I think it is better to use it, to prevent such up/down problems in the future :)

                        Any ideas? Maybe patch the driver to use only 1G (and build it locally)?

                        • stephenw10S
                          stephenw10 Netgate Administrator

                          If you run ifconfig -vvvm against the Mellanox NIC does it show different media options available?

                          Anything is possible; it's just a small matter of programming. 😉
                          Not something I've seen attempted though.

                          Steve

                          • D
                            Draiget @stephenw10

                            @stephenw10

                            Yes, there are different options for mlx:

                            mlxen0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
                                    description: WAN_MLX0
                                    options=ed03bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWFILTER,VLAN_HWTSO,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
                                    capabilities=ed07bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
                                    ether f4:52:14:7a:0d:70
                                    inet6 fe80::f652:14ff:fe7a:d70%mlxen0 prefixlen 64 scopeid 0xc
                                    media: Ethernet autoselect (1000baseT <full-duplex,rxpause,txpause>)
                                    status: active
                                    supported media:
                                            media autoselect
                                            media 40Gbase-CR4 mediaopt full-duplex
                                            media 10Gbase-CX4 mediaopt full-duplex
                                            media 10Gbase-SR mediaopt full-duplex
                                            media 1000baseT mediaopt full-duplex
                                    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
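To compare the two cards side by side, the "supported media" sections can be pulled out of the ifconfig output with a small parser. This is a rough sketch: the function name `supported_media` is made up, and the sample strings are abbreviated excerpts of the outputs posted in this thread.

```python
def supported_media(ifconfig_output):
    """Return the entries listed under 'supported media:' in `ifconfig -m` output."""
    media = []
    in_section = False
    for line in ifconfig_output.splitlines():
        stripped = line.strip()
        if stripped == "supported media:":
            in_section = True
        elif in_section and stripped.startswith("media "):
            media.append(stripped[len("media "):])
        elif in_section:
            break  # the first non-media line ends the section
    return media


# Abbreviated excerpts from the outputs posted in this thread.
IX0 = """\
ix0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        media: Ethernet autoselect
        status: no carrier
        supported media:
                media autoselect
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
"""

MLXEN0 = """\
mlxen0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        media: Ethernet autoselect (1000baseT <full-duplex,rxpause,txpause>)
        status: active
        supported media:
                media autoselect
                media 40Gbase-CR4 mediaopt full-duplex
                media 10Gbase-CX4 mediaopt full-duplex
                media 10Gbase-SR mediaopt full-duplex
                media 1000baseT mediaopt full-duplex
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
"""

ix_media = supported_media(IX0)
mlx_media = supported_media(MLXEN0)
only_mlx = [m for m in mlx_media if m not in ix_media]
print("ix0:", ix_media)
print("only on mlxen0:", only_mlx)
```

The ix card offers nothing but autoselect with this module, while the Mellanox exposes fixed media types, including the 1000baseT entry it actually links at.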
                            
                            • stephenw10S
                              stephenw10 Netgate Administrator

                              Hmm, not sure why the ix NIC doesn't see it then. 😕

                              Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.