Netgate Discussion Forum

    Chinese I226-V on 23.05.1, problems

    • w0wW
      w0w
      last edited by

      I have two firewalls running as a CARP pair. Both are connected to one Buffalo BS-MP2008 managed switch on the LAN side, and both have Intel I226-V cards installed that are used for LAN and WAN. The WAN side is also connected to a cheap unmanaged 2.5Gbit switch. Statistics on the WAN side seem OK: no errors, no collisions. Statistics on the LAN side show errors and collisions. The error count differs between the two firewalls, which I accept because it has always been that way, but the collision count is the same on both firewalls, and collisions never appeared with the other cards I tried before (I225-V, X550 and Realtek); those reported only errors.
      I replaced the 225 with the 226 in order to diagnose why the WAN (igc0) port state is sometimes shown as down, about once a day, which breaks the connection. The funny thing is that the connection breaks on the WAN (igc0) interface only when Suricata is running on the LAN (igc1) interface in inline mode. Dunno, maybe some PCIe controller buffer overrun; it looks like these cards have some ASMedia PCIe controller on board.
      The cards are, of course, a Chinese product from AliExpress, but there are no problems with them under load, such as hours of iperf testing. With Suricata inline mode enabled, the connection usually breaks randomly once every day or two, regardless of the load. With Suricata disabled, everything can work for weeks without those disconnections.
      Yes, I know that buying network cards on AliExpress is not a good idea, but I'm not entirely sure that this is the problem here.
      Is anyone else using these 225/226 cards, maybe embedded ones, and have you seen something similar?

      • stephenw10S
        stephenw10 Netgate Administrator
        last edited by

        Never seen that on the i226 or i225 NICs in the Netgate devices.
        When it shows as down, is it actually down? Does the NIC link LED show down? Does the other end of the link show a disconnection?

        The LAN NIC still works? Even though that's the side running with netmap?

        Steve

        • w0wW
          w0w @stephenw10
          last edited by

          @stephenw10
          I probably explained that rather chaotically 🙄
          The down/up happens within milliseconds, I think, and at random, so I did not check the LED status. Catching it would only be possible if I set up a camera to record it.
          My second statement, about Suricata, was definitely wrong. It just changed the interval between those random disconnections, which is sometimes 48+ hours, but I have not tested this for long.

          @stephenw10 said in Chinese I226-V on 23.05.1, problems:

          The LAN NIC still works? Even though that's the side running with netmap?

          So far, igc1 has worked just fine. I am planning to do some tests. Since I have two WANs and different NIC brands, I want to swap igc0 with re0, so that PPPoE moves to re0, and see what happens then.

          BTW, on the main unit I have now installed an X550-T2 and have been running PPPoE on ix0 for more than 5 days, rock stable. All other settings are the same; I just reconfigured the interfaces and port speeds.

          • stephenw10S
            stephenw10 Netgate Administrator
            last edited by

            Hmm, so what actually happens when it loses link? You see it logged I assume? Does it instantly reconnect?

            • w0wW
              w0w @stephenw10
              last edited by

              @stephenw10
              In the logs I see an igc0 down event, then a bunch of other events that happen right after it, like the PPPoE disconnection and so on. Nothing unusual, as if you had just pulled the cable out for a second, maybe, I don't know. The port up event is somewhere in all this mess, I believe a bit later than the down event. I have no logs anymore, sorry, but I remember that it was within the same minute at least, if not the same second.

              • stephenw10S
                stephenw10 Netgate Administrator
                last edited by

                What's it actually connected to? Can you put a switch between as a test?

                • w0wW
                  w0w @stephenw10
                  last edited by

                  @stephenw10
                  Tried 3 different switches. Currently it is connected to a dumb 2.5G Zyxel switch; I also tried a 1G TP-Link and a 2.5G TP-Link. No difference at all.

                  • w0wW
                    w0w @stephenw10
                    last edited by w0w

                    @stephenw10
                    Hmm....
                    After six days, something similar has now happened on the ix0 interface on the main unit:

                    Jul 24 02:10:11 	kernel 		ix0: link state changed to UP
                    Jul 24 02:10:11 	check_reload_status 	480 	Linkup starting ix0
                    Jul 24 02:10:11 	check_reload_status 	480 	Reloading filter
                    Jul 24 02:10:11 	php-fpm 	15509 	/rc.linkup: Removing static route for monitor 8.8.8.8 and adding a new route through 199.0.100.1
                    Jul 24 02:10:11 	php-fpm 	15509 	/rc.linkup: Shutting down Router Advertisement daemon cleanly
                    Jul 24 02:10:11 	ppp 	39147 	[wan] IFACE: Set description "WAN"
                    Jul 24 02:10:11 	ppp 	39147 	[wan] IFACE: Rename interface pppoe0 to pppoe0
                    Jul 24 02:10:11 	ppp 	39147 	[wan] IFACE: Down event
                    Jul 24 02:10:11 	check_reload_status 	480 	Rewriting resolv.conf
                    Jul 24 02:10:09 	php-cgi 	1355 	rc.kill_states: rc.kill_states: Removing states for interface pppoe0
                    Jul 24 02:10:09 	php-cgi 	1355 	rc.kill_states: rc.kill_states: Removing states for IP fe80::a236:9fff:fec3:4a2c%pppoe0/32
                    Jul 24 02:10:09 	ppp 	39147 	[wan] IPV6CP: LayerDown
                    Jul 24 02:10:09 	ppp 	39147 	[wan] error writing len 8 frame to b0: Network is down
                    Jul 24 02:10:09 	ppp 	39147 	[wan] IPV6CP: SendTerminateReq #2
                    Jul 24 02:10:09 	ppp 	39147 	[wan] IPV6CP: state change Opened --> Closing
                    Jul 24 02:10:09 	ppp 	39147 	[wan] IPV6CP: Close event
                    Jul 24 02:10:09 	ppp 	39147 	[wan] IFACE: Removing IPv4 address from pppoe0 failed(IGNORING for now. This should be only for PPPoE friendly!): Can't assign requested address
                    Jul 24 02:10:09 	check_reload_status 	480 	Rewriting resolv.conf
                    Jul 24 02:10:09 	php-cgi 	98827 	rc.kill_states: rc.kill_states: Removing states for interface pppoe0
                    Jul 24 02:10:08 	php-cgi 	98827 	rc.kill_states: rc.kill_states: Removing states for IP xx.yy.21.204/32
                    Jul 24 02:10:08 	ppp 	39147 	[wan] IPCP: LayerDown
                    Jul 24 02:10:08 	ppp 	39147 	[wan] error writing len 8 frame to b0: Network is down
                    Jul 24 02:10:08 	ppp 	39147 	[wan] IPCP: SendTerminateReq #4
                    Jul 24 02:10:08 	ppp 	39147 	[wan] IPCP: state change Opened --> Closing
                    Jul 24 02:10:08 	ppp 	39147 	[wan] IPCP: Close event
                    Jul 24 02:10:08 	ppp 	39147 	[wan] IFACE: Close event
                    Jul 24 02:10:08 	ppp 	39147 	caught fatal signal TERM
                    Jul 24 02:10:07 	php-fpm 	15509 	/rc.linkup: DEVD Ethernet detached event for opt1
                    Jul 24 02:10:07 	php-fpm 	15509 	/rc.linkup: Hotplug event detected for ISP_LAN(opt1) dynamic IP address (4: dhcp)
                    Jul 24 02:10:05 	kernel 		ix0: link state changed to DOWN
                    Jul 24 02:10:05 	check_reload_status 	480 	Linkup starting ix0 
                    Jul 24 01:01:46 	php-cgi 	74734 	notify_monitor.php: Message sent to -@gmail.com OK 
                    

                    Those two lines in question…

                    Jul 24 02:10:05 	kernel 		ix0: link state changed to DOWN
                    Jul 24 02:10:05 	check_reload_status 	480 	Linkup starting ix0 
                    

                    Since the timestamp is the same…
                    I believe check_reload_status happened later, as a result of ix0 going down, didn't it? 🙄
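The system log view above lists the newest entries first, and two entries that share a one-second timestamp cannot be ordered from the timestamp alone. What the excerpt does show is the gap between the kernel's DOWN and UP messages (02:10:05 to 02:10:11). A minimal Python sketch, assuming the log format pasted above, that extracts the kernel link events and computes how long each outage lasted. The function names and the placeholder year are illustrative assumptions, not part of pfSense:

```python
import re
from datetime import datetime

# Matches kernel lines such as:
#   "Jul 24 02:10:05  kernel  ix0: link state changed to DOWN"
LINK_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+kernel\s+"
    r"(?P<ifname>\w+): link state changed to (?P<state>UP|DOWN)"
)

def link_events(log_text):
    """Return (timestamp, interface, state) tuples, oldest first."""
    events = []
    for line in log_text.splitlines():
        m = LINK_RE.match(line.strip())
        if m:
            # The log carries no year; 2000 is an arbitrary placeholder.
            ts = datetime.strptime("2000 " + m["ts"], "%Y %b %d %H:%M:%S")
            events.append((ts, m["ifname"], m["state"]))
    return sorted(events, key=lambda e: e[0])

def outages(events):
    """Pair each DOWN with the next UP on the same interface."""
    down_at = {}
    result = []
    for ts, ifname, state in events:
        if state == "DOWN":
            down_at[ifname] = ts
        elif ifname in down_at:
            seconds = (ts - down_at.pop(ifname)).total_seconds()
            result.append((ifname, seconds))
    return result

# Three lines copied from the log excerpt in this post (newest first).
SAMPLE = """\
Jul 24 02:10:11 kernel ix0: link state changed to UP
Jul 24 02:10:05 kernel ix0: link state changed to DOWN
Jul 24 02:10:05 check_reload_status 480 Linkup starting ix0
"""

print(outages(link_events(SAMPLE)))
```

With whole-second timestamps, this cannot tell which of two same-second entries came first; it only measures the DOWN-to-UP interval.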

                    • stephenw10S
                      stephenw10 Netgate Administrator
                      last edited by

                      Do the Suricata logs show it restarting when that happens? In inline mode (netmap) it will bounce the link if the netmap interface is recreated. But that should be on the LAN side....

                      • w0wW
                        w0w @stephenw10
                        last edited by w0w

                        @stephenw10
                        To clarify, please note that this is my main (primary) unit, where the igc card was replaced for testing with an ix (X550-T2) card. There is nothing in the Suricata logs. I do not think it's Suricata or netmap. It looks more like a card, driver or kernel failure… or I don't know what else it could be.

                        Is there something in FreeBSD that both ix and igc use at some level?

                        I am now testing igc0 on the secondary unit under a 100Mbit iperf load. The port speed is 2500, and a switch is placed between the test server and the igc0 port.

                        • stephenw10S
                          stephenw10 Netgate Administrator
                          last edited by

                          Well the logs show the link bouncing. So either it really did bounce in which case I'd be trying to confirm that from logs in the switch. Or it was a virtual interface of some sort like netmap. The only time I have seen the link bounced by something in software (other than a NIC config change) is when using Snort or Suricata in in-line mode.
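When the switch is unmanaged and keeps no logs, the link state could be sampled from the firewall side as a rough cross-check. A minimal Python sketch, assuming that `ifconfig <interface>` prints a `status:` line (for example `status: active` or `status: no carrier`) as it does on FreeBSD. The helper names, the polling interval, and the `reader` and `max_polls` parameters are illustrative assumptions:

```python
import re
import subprocess
import time

STATUS_RE = re.compile(r"^\s*status:\s*(.+?)\s*$", re.MULTILINE)

def link_status(ifconfig_output):
    """Extract the text after 'status:' from ifconfig output, or None."""
    m = STATUS_RE.search(ifconfig_output)
    return m.group(1) if m else None

def read_ifconfig(ifname):
    """Run ifconfig for one interface and return its stdout."""
    proc = subprocess.run(["ifconfig", ifname],
                          capture_output=True, text=True, check=False)
    return proc.stdout

def watch(ifname, interval=0.2, reader=read_ifconfig, max_polls=None):
    """Poll the link status and record every change that is seen.

    reader and max_polls exist so the loop can be exercised without
    real hardware; with max_polls=None it runs until interrupted.
    """
    last = None
    polls = 0
    changes = []
    while max_polls is None or polls < max_polls:
        status = link_status(reader(ifname))
        if status != last:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            print(f"{stamp} {ifname}: {last} -> {status}")
            changes.append((last, status))
            last = status
        polls += 1
        time.sleep(interval)
    return changes

# Example call on the firewall (runs until Ctrl+C):
#   watch("igc0")
```

A poller like this can miss a flap shorter than the polling interval, so the kernel's "link state changed" messages remain the more reliable record.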

                          • w0wW
                            w0w @stephenw10
                            last edited by

                            @stephenw10
                            Well, how can I disable netmap completely?

                            • stephenw10S
                              stephenw10 Netgate Administrator
                              last edited by

                              Put Suricata in legacy mode. It uses netmap for in-line mode.

                              • w0wW
                                w0w @stephenw10
                                last edited by

                                @stephenw10
                                I've removed Suricata completely, but I still see
                                ix1: netmap queues/slots: TX 4/2048, RX 4/2048 in the dmesg output for all the cards. Is this normal?

                                • w0wW
                                  w0w
                                  last edited by

                                  I don't want to jump to conclusions, but at the moment I suspect some kind of dependency between these connection breaks, netmap (which is built into the kernel, as I understand it) and PPPoE. I can't imagine what kind of dependency, but if the port on the 226 is used as a normal DHCP interface over a similar connection to the PPPoE one, the link is stable.
                                  At the moment, I have replaced both cards in both test-lab firewalls with original Intel X550-T2 cards running the latest firmware.
                                  I have also removed Suricata, and we will see whether the situation repeats itself over the next month.

                                  • stephenw10S
                                    stephenw10 Netgate Administrator @w0w
                                    last edited by

                                    @w0w said in Chinese I226-V on 23.05.1, problems:

                                    is this normal?

                                    Yes. The driver shows that as available queues when it attaches:

                                    ix0: <Intel(R) X553 N (SFP+)> mem 0x80400000-0x805fffff,0x80604000-0x80607fff at device 0.0 on pci9
                                    ix0: Using 2048 TX descriptors and 2048 RX descriptors
                                    ix0: Using 4 RX queues 4 TX queues
                                    ix0: Using MSI-X interrupts with 5 vectors
                                    ix0: allocated for 4 queues
                                    ix0: allocated for 4 rx queues
                                    ix0: Ethernet address: 00:08:a2:12:17:7e
                                    ix0: eTrack 0x8000084b PHY FW V65535
                                    ix0: netmap queues/slots: TX 4/2048, RX 4/2048
                                    

                                    That doesn't mean that netmap itself is in use.
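As stated above, the attach message only lists the queues and slots the driver makes available to netmap. A minimal Python sketch that pulls those figures out of dmesg text, assuming the line format shown above. The function name and dictionary keys are illustrative assumptions:

```python
import re

NETMAP_RE = re.compile(
    r"^(?P<ifname>\w+): netmap queues/slots: "
    r"TX (?P<txq>\d+)/(?P<txs>\d+), RX (?P<rxq>\d+)/(?P<rxs>\d+)"
)

def netmap_capabilities(dmesg_text):
    """Map interface name -> queue/slot counts advertised at attach time.

    This reflects what the driver offers; it says nothing about whether
    any application currently has the interface open in netmap mode.
    """
    caps = {}
    for line in dmesg_text.splitlines():
        m = NETMAP_RE.match(line.strip())
        if m:
            caps[m["ifname"]] = {
                "tx_queues": int(m["txq"]), "tx_slots": int(m["txs"]),
                "rx_queues": int(m["rxq"]), "rx_slots": int(m["rxs"]),
            }
    return caps

# Lines copied from the dmesg excerpt above.
SAMPLE = """\
ix0: Using 4 RX queues 4 TX queues
ix0: netmap queues/slots: TX 4/2048, RX 4/2048
"""

print(netmap_capabilities(SAMPLE))
```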

                                    • w0wW
                                      w0w
                                      last edited by

                                      Preliminary information from one of the firewalls: the X550-T2 works without problems. If the connection is interrupted, it is only from the provider side, at a known interval of days. The link never goes down as it did with igc.

                                      • stephenw10S
                                        stephenw10 Netgate Administrator
                                        last edited by

                                        Hmm, so the physical link stays up and only the PPPoE is restarted?

                                        • w0wW
                                          w0w @stephenw10
                                          last edited by

                                          @stephenw10
                                          Yes, exactly.

                                          • stephenw10S
                                            stephenw10 Netgate Administrator
                                            last edited by

                                            Hmm, I have no idea what would cause that on igc but not ix. Unless it's actually the igc link dropping causing PPPoE to reset. I've never seen it on any of our igc NICs but that is the reported symptom from those early igc NICs in Linux or Windows.

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.