Netgate Discussion Forum

    NIC issues and advise

    Hardware
    28 Posts 3 Posters 3.3k Views
      swansense

      I'm currently running pfSense 2.7.0-RELEASE (amd64), built on Wed Jun 28 03:53:34 UTC 2023, as a VM on ESXi 7 (it was 6.7 up until yesterday).

      I have a NIC passed through to the VM for the WAN connection, and the LAN connection is virtual. A few times over the past few months the internet has dropped out altogether, and the only thing that resolves it is restarting the ESXi host. I never have time to investigate fully, as the kids are giving out that they can't get on Facebook, PlayStation, etc., but I suspect the issue is due to the passthrough NIC.

      The output of pciconf -lv for the current NIC is:

      em2@pci0:3:0:0:	class=0x020000 rev=0x06 hdr=0x00 vendor=0x8086 device=0x105e subvendor=0x8086 subdevice=0x135e
          vendor     = 'Intel Corporation'
          device     = '82571EB/82571GB Gigabit Ethernet Controller D0/D1 (copper applications)'
          class      = network
          subclass   = ethernet
      

      I also have a Realtek NIC, but it doesn't seem to be fully recognized. As you can see, it shows up as none1:

      none1@pci0:11:0:0:	class=0x020000 rev=0x05 hdr=0x00 vendor=0x10ec device=0x8125 subvendor=0x10ec subdevice=0x0123
          vendor     = 'Realtek Semiconductor Co., Ltd.'
          device     = 'RTL8125 2.5GbE Controller'
          class      = network
          subclass   = ethernet
      

      So now to my questions.

      1. If the connection drops again, where should I look to figure out what the issue is?
      2. Is there any way to get the Realtek card working? I have been upgraded to a 2 Gbps connection, so I am not fully availing of it with the current NIC. I also need the x16 slot the current NIC is using for something else, so I need a PCIe x1 card, as that's the only free slot.
      3. I have read other threads here where people recommend only Intel NICs and suggest the QNAP QXG-2G1T-I225 Single Port 2.5GbE 4-Speed Network Card. Are there any known issues with this card that I should be concerned about, or any alternative cards?
      4. Is there any advantage to getting a dual-port card and using it for both WAN and LAN?

      Thanks for any help in advance.

        stephenw10 Netgate Administrator

        Check to see if it loses link or throws any errors in the system logs.

        The 2.5G Realtek NIC needs the realtek-re-kmod driver package.

        Some early i225 chips had known issues with losing link. I think that card is an i225-LM though which should be OK.

        A dual NIC card only needs one PCIe slot. 😉 But other than that not really.

        Steve
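
One way to act on the advice above is to filter the system log for kernel link-state messages from the WAN NIC. This is only a minimal sketch: the function name `link_events` is made up for illustration, the log path in the usage note assumes a pfSense release that keeps plain-text logs in /var/log/system.log, and the wording of watchdog messages varies by driver, so treat the pattern as a starting point rather than a complete list.

```shell
# Print link-state changes and watchdog timeouts for one interface.
# Reads log text on stdin; $1 is the interface name (for example em2).
# The message patterns are assumptions based on typical FreeBSD kernel logs.
link_events() {
    grep -iE "[[:space:]]$1: (link state changed to (up|down)|watchdog timeout)"
}
```

Possible usage from a shell on the firewall: `link_events em2 < /var/log/system.log`. If nothing is printed around the time of a dropout, the physical link probably stayed up and the problem is more likely at the PPPoE or routing level.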

          swansense @stephenw10

          @stephenw10 said in NIC issues and advise:

          realtek-kmod

          Thank you.

          When trying to install it I was getting the error '"libssl.so.30" not found', so I updated to 2.7.2 and was able to install the driver, but even after a restart the NIC still shows as none.

          pkg install -f -y realtek-re-kmod
          Updating pfSense-core repository catalogue...
          Fetching meta.conf:   0%
          Fetching packagesite.pkg:   0%
          pfSense-core repository is up to date.
          Updating pfSense repository catalogue...
          Fetching meta.conf:   0%
          Fetching packagesite.pkg:   0%
          pfSense repository is up to date.
          All repositories are up to date.
          Checking integrity... done (0 conflicting)
          The following 1 package(s) will be affected (of 0 checked):
          
          Installed packages to be REINSTALLED:
          	realtek-re-kmod-198.00_3 [pfSense]
          
          Number of packages to be reinstalled: 1
          [1/1] Reinstalling realtek-re-kmod-198.00_3...
          [1/1] Extracting realtek-re-kmod-198.00_3: 100%      4 B   0.0kB/s    00:01
          

          There does seem to be a newer version of the driver in the FreeBSD 13 repos: https://pkg.opnsense.org/FreeBSD:13:amd64/snapshots/latest/All/realtek-re-kmod-199.00_1.pkg

          Any other suggestions?

          UPDATE

          Ignore that, I'm a dumbass and forgot to add the driver to the bootloader config:

          if_re_load="YES"
          if_re_name="/boot/modules/if_re.ko"
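
The two loader settings above only take effect if they sit in a file the boot loader reads. A small sketch for confirming that follows. The function name `has_re_loader_lines` is hypothetical, and the file used in the usage note is an assumption: the post does not say which file was edited, though /boot/loader.conf.local is the usual place for custom loader settings on pfSense.

```shell
# Succeed only if both Realtek loader settings quoted in the post are
# present, each at the start of a line, in the file given as $1.
has_re_loader_lines() {
    grep -q '^if_re_load="YES"' "$1" &&
        grep -q '^if_re_name="/boot/modules/if_re.ko"' "$1"
}
```

For example: `has_re_loader_lines /boot/loader.conf.local && echo "loader settings present"`. After a reboot, `kldstat | grep if_re` should then list the module if it was loaded.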

            swansense

            After installing the driver I can get the Realtek NIC working with the same settings as the Intel NIC, but if I do something like a speed test the connection drops.

            It looks like the connection is stopped from the WAN side. Maybe they don't like my NIC?

            The symptoms are exactly the same as the dropout I mentioned in my first post, but in this case I can switch the WAN NIC from the Realtek to the Intel and it works again. If I leave it on the Realtek it won't work.

            Attached are the PPP logs from when the Realtek is connected and when the Intel is connected.
            ppp-logs.zip

            Should I be changing the frame size maybe, or are there any other suggestions?

              stephenw10 Netgate Administrator

              Hmm, so it just stops responding on the server side after 2mins? Is it possible you have some other client trying to login with the same credentials somewhere?

              Do you see the link physically go down at any point or is it just the ppp session that fails?

                swansense @stephenw10

                @stephenw10 said in NIC issues and advise:

                Hmm, so it just stops responding on the server side after 2mins? Is it possible you have some other client trying to login with the same credentials somewhere?

                Do you see the link physically go down at any point or is it just the ppp session that fails?

                The link shows as flashing and it shows as up in the UI.

                No, there is definitely nothing else connected to the WAN port. The WAN cable comes from the ISP's Huawei module, which you can see here: https://forum.netgate.com/post/1125275

                The credentials don't actually matter here, as I found that the connection still comes up even with gibberish entered.

                Can you suggest any other logs to check, or anything else to try?

                  stephenw10 Netgate Administrator

                  I have seen PPPoE providers that will connect you to a test account with the wrong credentials, but you still need the real login to get the correct bandwidth. And perhaps other restrictions.

                  Can you see what login the Huawei router was using?

                    swansense @stephenw10

                    @stephenw10

                    I should have been clearer here: it's not a Huawei router, it's the fiber break point. Have a look at the screenshot I linked to in my last post.

                    I have been using the Intel NIC for WAN for months without much issue, and I am using the exact same credentials with the Realtek NIC.

                      stephenw10 Netgate Administrator

                      Hmm, hard to say then. I could imagine the NIC losing link for some reason but that would be logged. Perhaps the ONT is trying to negotiate some N-base rate and the NIC falls back to 1G?

                        swansense @stephenw10

                        @stephenw10 said in NIC issues and advise:

                        Hmm, hard to say then. I could imagine the NIC losing link for some reason but that would be logged. Perhaps the ONT is trying to negotiate some N-base rate and the NIC falls back to 1G?

                        If that were the case, would it not be logged, and would the NIC not be able to fall back to 1G?

                        As I said, something similar happens with the Intel NIC, but it's rare.

                        Can you think of anything else I can do to debug what's going on? My ISP's support is absolutely useless, so I am not even going to waste time contacting them.

                          stephenw10 Netgate Administrator

                          Is there nothing additionally logged in the system log when it disconnects?

                          You could run a pcap on the parent NIC when it fails and see if the server is responding at all.
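
A capture like the one suggested above could be limited to PPPoE frames so the file stays small. The following is a sketch under assumptions, not a prescribed procedure: the helper names are invented here, `em2` in the usage note is taken from the pciconf output earlier in the thread as the NIC carrying the PPPoE session, and the output path is arbitrary. EtherTypes 0x8863 and 0x8864 are the PPPoE discovery and session frame types.

```shell
# BPF filter matching PPPoE discovery (0x8863) and session (0x8864) frames.
pppoe_filter() {
    printf '%s' 'ether proto 0x8863 or ether proto 0x8864'
}

# Capture PPPoE frames on the parent interface ($1) into a pcap file ($2).
# Stop with Ctrl-C once the failure has been reproduced.
capture_pppoe() {
    tcpdump -ni "$1" -w "$2" "$(pppoe_filter)"
}
```

Possible usage: `capture_pppoe em2 /tmp/pppoe-fail.pcap`, then open the file in Wireshark. If the far end has gone quiet, the capture would be expected to show frames leaving the firewall with no replies coming back.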

                            Gertjan @swansense

                            @swansense

                            Swap the two interfaces.
                            If the Intel was LAN, and the Realtek was WAN, make the Intel now WAN, etc.
                            If the issue follows the NIC, you know it's neither the cable nor the equipment in front of the NIC.

                            Also, keep in mind: when a connection becomes unstable (see the system log for the UP and DOWN events) and a Realtek NIC is involved, stop whatever you are doing, throw the Realtek out of the window, get an Intel NIC, and become a member of the club "should have done that earlier".

                            No "help me" PM's please. Use the forum, the community will thank you.
                            Edit : and where are the logs ??

                              stephenw10 Netgate Administrator

                              To be fair the Realtek 2.5G NICs seem OK so far. I haven't seen any catastrophic behaviour...yet.

                                swansense

                                So the disconnection issue happened again with the Intel NIC, and I was lucky enough to get a few minutes to look into it. It seems to be a different issue from the Realtek one.

                                From the logs it looks like the PPPoE connection went down and came back up. After it came back up, pfSense itself still had internet access (tested with a ping to Google DNS), while everything on the LAN was unable to ping Google DNS.

                                After I changed the WAN interface and changed it back to the Intel again, everything worked as expected.

                                Attached are the PPP logs and the system logs from around that time.

                                Archive.zip

                                  Gertjan @swansense

                                  @swansense

                                  ...
                                  Feb 14 04:10:28 pfSense php-fpm[28297]: /rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - 10.16.8.10 -> 10.16.8.10 - Restarting packages.
                                  Feb 14 04:10:28 pfSense check_reload_status[425]: Starting packages
                                  Feb 14 04:10:28 pfSense check_reload_status[425]: Reloading filter
                                  Feb 14 04:10:29 pfSense php-fpm[386]: /rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - 10.199.8.3 -> 10.199.8.3 - Restarting packages.
                                  ...
                                  Within one second: 2 WAN IPs?
                                  10.16.8.10 and 10.199.8.3?

                                  Are you using a VPN?

                                    swansense @Gertjan

                                    @Gertjan
                                    Yes, there are a number of OpenVPN clients running on this machine. Those connections are separate and only some traffic is routed through them. The IPs in that log are from the OpenVPN clients:

                                    ovpnc3: flags=1008043<UP,BROADCAST,RUNNING,MULTICAST,LOWER_UP> metric 0 mtu 1500
                                    	options=80000<LINKSTATE>
                                    	inet 10.16.8.10 netmask 0xffffff00 broadcast 10.16.8.255
                                    	inet6 fe80::20c:29ff:fe82:caec%ovpnc3 prefixlen 64 scopeid 0xe
                                    	groups: tun openvpn
                                    	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
                                    	Opened by PID 55805
                                    ovpnc5: flags=1008043<UP,BROADCAST,RUNNING,MULTICAST,LOWER_UP> metric 0 mtu 1500
                                    	options=80000<LINKSTATE>
                                    	inet 10.199.8.3 netmask 0xffffff00 broadcast 10.199.8.255
                                    	inet6 fe80::20c:29ff:fe82:caec%ovpnc5 prefixlen 64 scopeid 0xf
                                    	groups: tun openvpn
                                    	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
                                    	Opened by PID 71191
                                    
                                      Gertjan @swansense

                                      @swansense

                                      Ah, OK, thanks, that explains the rapid sequential IP assignment.
                                      Your real WAN was/is

                                      ...
                                      110.79.143.248 -> 110.79.210.9-four
                                      ....

                                      and the assignment of that one looks normal.

                                      As the logs show, when your PPPoE session times out and reconnects, many (like, a lot) of the 'packages' (== system processes) restart.
                                      And because you have not one WAN IP but three (counting the two VPNs), processes get restarted even more often and faster.

                                      No need to be a specialist to draw a simple conclusion from this:

                                      ...
                                      Feb 14 04:10:24 pfSense tail_pfb[83338]: [pfBlockerNG] Firewall Filter Service stopped
                                      Feb 14 04:10:24 pfSense php_pfb[84182]: [pfBlockerNG] filterlog daemon stopped
                                      Feb 14 04:10:24 pfSense php-fpm[11175]: /rc.start_packages: The command '/usr/local/etc/rc.d/pfb_filter.sh stop' returned exit code '1', the output was 'kill: 15788: No such process'
                                      Feb 14 04:10:24 pfSense php-fpm[11175]: [pfBlockerNG] Restarting firewall filter daemon
                                      Feb 14 04:10:25 pfSense tail_pfb[89730]: [pfBlockerNG] Firewall Filter Service stopped
                                      Feb 14 04:10:25 pfSense php_pfb[93021]: [pfBlockerNG] filterlog daemon stopped
                                      Feb 14 04:10:25 pfSense lighttpd_pfb[97157]: [pfBlockerNG] DNSBL Webserver stopped
                                      Feb 14 04:10:25 pfSense tail_pfb[97178]: [pfBlockerNG] Firewall Filter Service stopped
                                      Feb 14 04:10:25 pfSense php_pfb[99329]: [pfBlockerNG] filterlog daemon stopped
                                      Feb 14 04:10:25 pfSense tail_pfb[22]: [pfBlockerNG] Firewall Filter Service started
                                      Feb 14 04:10:25 pfSense lighttpd_pfb[3769]: [pfBlockerNG] DNSBL Webserver started
                                      Feb 14 04:10:25 pfSense tail_pfb[6878]: [pfBlockerNG] Firewall Filter Service started
                                      Feb 14 04:10:25 pfSense php_pfb[2261]: [pfBlockerNG] filterlog daemon started
                                      Feb 14 04:10:25 pfSense php[6957]: [pfBlockerNG] DNSBL parser daemon started
                                      Feb 14 04:10:25 pfSense check_reload_status[425]: Rewriting resolv.conf
                                      Feb 14 04:10:25 pfSense php_pfb[7747]: [pfBlockerNG] filterlog daemon started
                                      ...

                                      Note that pfSense couldn't even keep up with the pace: a first instance of "pfb_filter.sh" (a standalone PHP process) was already gone before it could be signaled to stop ...

                                      I know, I'm not very helpful here.

                                      Your logs by themselves show me a successful reconnect.
                                      I advise you to check if all processes that are restarted are actually 'up and running' :
                                      Example :

                                      dig @127.0.0.1 google.fr +short
                                      dig @192.168.1.1 google.fr +short
                                      

                                      If 192.168.1.1 is your pfSense LAN interface.
                                      Etc.

                                        stephenw10 Netgate Administrator

                                        Make sure you have set the default IPv4 gateway to PPPOE_GW in System > Routing > Gateways. Otherwise it may be switching to one of the VPNs as default which might not work.

                                          swansense @Gertjan

                                          @Gertjan

                                          Thanks. With the Intel NIC, the connection has dropped out at 4 am on each of the last 2 nights. My ISP must be resetting connections at that time or something.

                                          From the little testing I had time for, the connection from pfSense itself is fine, and it's only LAN hosts that cannot reach the internet (no ICMP, HTTP, etc.).

                                          I changed the WAN to the Realtek NIC (not connected) and changed it back again. That usually works, but for some reason not this morning, until I changed the default gateway under System > Routing > Gateways to Automatic.

                                          On previous versions of pfSense I used to have a lot of issues with unbound if the connection was down for more than 30 seconds. Restarting unbound would usually resolve the issue, but it was a pain in the backside, as at the time my connection came from a bridged mobile broadband router that couldn't hold a connection for more than 24 hours at a time.

                                          @stephenw10 said in NIC issues and advise:

                                          Make sure you have set the default IPv4 gateway to PPPOE_GW in System > Routing > Gateways. Otherwise it may be switching to one of the VPNs as default which might not work.

                                          Thanks for the suggestion, but it was always PPPOE_GW. I changed it to Automatic this morning and that seemed to bring the connection back up. I'll leave it at Automatic and see if the same issue happens again tonight.

                                          Is it possible to run a script or something when the PPPoE connection drops like this, so that services restart after the connection comes back up?

                                            stephenw10 Netgate Administrator

                                            Just saving that page would likely have brought it back by re-applying the default gateway. It should be set to PPPoE_GW though to avoid it trying to default to one of the VPNs.
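
To see which gateway the firewall is actually using as its default after a reconnect, the IPv4 routing table can be inspected from a shell. The sketch below is illustrative only: `default_route` is a made-up helper name, and it assumes the usual FreeBSD `netstat -rn` column order (Destination, Gateway, Flags, Netif).

```shell
# Read `netstat -rn` style output on stdin and print the gateway and
# interface of the default route, one "gateway interface" pair per line.
default_route() {
    awk '$1 == "default" { print $2, $4 }'
}
```

Possible usage: `netstat -rn -f inet | default_route`. If the interface printed is one of the ovpnc interfaces rather than the PPPoE interface, the default route has moved to a VPN, which is the situation described above.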

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.