Netgate Discussion Forum

    Intel Interface Issues

    20 Posts 5 Posters 2.9k Views
      rediske

      Do I need legal.intel_wpi.license_ack=1 in /boot/loader.conf.local?

      Is this accepting that Intel will own all the data the interface passes? :)

        rediske

        I tried that setting in /boot/loader.conf.local, rebooted, left to run some errands and when I came back two hours later the interface was down again:

        Oct 13 12:45:49 kernel em0: Watchdog timeout Queue[0]-- resetting
        Oct 13 12:45:49 kernel Interface is RUNNING and ACTIVE
        Oct 13 12:45:49 kernel em0: TX Queue 0 ------
        Oct 13 12:45:49 kernel em0: hw tdh = -1, hw tdt = -1
        Oct 13 12:45:49 kernel em0: Tx Queue Status = -2147483648
        Oct 13 12:45:49 kernel em0: TX descriptors avail = 117
        Oct 13 12:45:49 kernel em0: Tx Descriptors avail failure = 0
        Oct 13 12:45:49 kernel em0: RX Queue 0 ------
        Oct 13 12:45:49 kernel em0: hw rdh = -1, hw rdt = -1
        Oct 13 12:45:49 kernel em0: RX discarded packets = 0
        Oct 13 12:45:49 kernel em0: RX Next to Check = 440
        Oct 13 12:45:49 kernel em0: RX Next to Refresh = 439

        I still get this in my dmesg:

        ipw_bss: You need to read the LICENSE file in /usr/share/doc/legal/intel_ipw.LICENSE.
        ipw_bss: If you agree with the license, set legal.intel_ipw.license_ack=1 in /boot/loader.conf

        I'm guessing my /boot/loader.conf.local wasn't processed? So I added the line to /boot/loader.conf; not even bothering to cross my fingers at this point.

          rediske

          I just crashed again and on reboot, I still get:

          iwi_bss: If you agree with the license, set legal.intel_iwi.license_ack=1 in /boot/loader.conf.

          As noted in a previous post, I added that to /boot/loader.conf.local and I also added it to /boot/loader.conf this morning, too.

          Here's the syslog from the last crash and reboot. I had continuous cross pings in two windows, one from my PC to the router and the other from a ssh shell on the router to my PC. They both dropped at the same time of course, but the only error in the syslog was that my gateway dropped 22%... ???

          Oct 13 18:07:05 rc.gateway_alarm 31127 >>> Gateway alarm: WAN_DHCP (Addr:173.21.160.1 Alarm:1 RTT:13.081ms RTTsd:4.581ms Loss:22%)
          Oct 13 18:07:05 check_reload_status updating dyndns WAN_DHCP
          Oct 13 18:07:05 check_reload_status Restarting ipsec tunnels
          Oct 13 18:07:05 check_reload_status Restarting OpenVPN tunnels/interfaces
          Oct 13 18:07:05 check_reload_status Reloading filter
          Oct 13 18:07:22 login login on ttyv0 as root
          Oct 13 18:07:27 php-cgi rc.initial.reboot: Stopping all packages.
          Oct 13 18:07:31 reboot rebooted by root

          On a side note, when I looked just before one of the system crashes last night, I saw I had filled up all 6 GB of RAM and dipped into swap a few percent. I think maybe it was due to ntopng, so I disabled it and haven't been over probably 10-20% memory since. I sort of think I have 2-3 or maybe even more different issues that I can't seem to find. I'm going to uninstall Snort, even though I never started to configure it. I'm also going to disable darkstat as well. It's probably unrelated, but at this point I'm starting to try anything...

          What in the world is so unstable about using an HP PC, Intel card and a fresh 2.4.4 pfSense install???

            rediske

            Additional symptoms

            When I enabled em3 and assigned it a 192.168 address for my MikroTik router, I could not see it in the list of interfaces in the DHCP Server menu on the GUI. I had to set a DHCP pool manually from the console.

            I was going to simply disable Snort instead of uninstalling it, but the Snort option was not available in the Services menu on the web GUI.

            I might have to wind up blaming Russian hackers! This is getting crazy. Do I need to reformat the drive and reinstall from scratch?

              rediske

              Just happened again and the only error in the system log was:

              Oct 13 21:23:04 rc.gateway_alarm 6087 >>> Gateway alarm: WAN_DHCP (Addr:173.21.160.1 Alarm:1 RTT:13.090ms RTTsd:5.287ms Loss:21%)

              em0 just stopped passing traffic, but this time I could still access the LAN port on the router. I unplugged the ethernet cable on em0, plugged it back in, nothing. I rebooted my cable modem and no luck there, either. I decided to re-enable ACPI in the BIOS since it doesn't seem to matter. I also enabled powerd in the misc section of the pfSense advanced config.

                stephenw10 Netgate Administrator

                The Intel license ACKs only stop those messages from appearing. They relate only to the wireless drivers iwi(4) and ipw(4), so they will have no effect here.

                Those watchdog errors should never normally appear. The interface is failing, trying to recover, and failing to do so.
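One way to read the raw numbers in those watchdog dumps: the values look like 32-bit quantities printed as signed integers, so it can help to reinterpret them as unsigned hex. The following is only a minimal sketch in Python; the helper name `as_u32` is made up for illustration, and the interpretation afterwards is a general rule of thumb, not something confirmed in this thread.

```python
def as_u32(value: int) -> str:
    """Reinterpret a signed 32-bit integer as unsigned and format it as hex."""
    return f"0x{value & 0xFFFFFFFF:08X}"

# Values copied from the em0 watchdog dump earlier in the thread
print(as_u32(-1))            # hw tdh / tdt / rdh / rdt
print(as_u32(-2147483648))   # Tx Queue Status
```

A register that reads back as all ones (0xFFFFFFFF) is often what a read from a PCIe device that has stopped responding looks like. If that is what is happening here, it would point toward the card, slot or bus rather than a pfSense setting.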

                I would start trying to solve this by making the most basic install possible and checking that it runs OK before adding any packages etc.
                You said you installed 2 quad-port NICs, but it looks like you're using only 4 ports. I would remove the second NIC if you're not using it. Or even swap it out with the first one to be sure it's not a hardware issue.

                Steve

                  rediske @stephenw10

                  @stephenw10 Thanks much for the reply! I did drop to one NIC a couple days ago. These last few crashes are particularly puzzling because the only symptom I see is a log entry about the WAN interface losing some packets (15-25% loss). A few seconds after that, I notice the WAN link is totally unresponsive. I don't think I've had an instance yet where dpinger notes dropped packets and the device recovers. Sometimes the LAN side drops, too, but other times I can still ping, ssh and use the web interface. I "think" the LAN side stays responsive as long as there aren't watchdog timeouts and additional interface related log entries.
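To line up the dpinger loss figures with the later interface failures, the gateway alarm entries can be pulled out of the system log programmatically. This is a rough Python sketch; the regular expression is written against the two alarm lines quoted in this thread only, and the function name `parse_gateway_alarm` is hypothetical, so other pfSense versions may format the line differently.

```python
import re

# Pattern derived from the alarm lines quoted in this thread (an assumption,
# not an official log format specification).
ALARM_RE = re.compile(
    r"Gateway alarm: (?P<gateway>\S+) \(Addr:(?P<addr>\S+) Alarm:(?P<alarm>\d+) "
    r"RTT:(?P<rtt_ms>[\d.]+)ms RTTsd:(?P<rttsd_ms>[\d.]+)ms Loss:(?P<loss_pct>\d+)%\)"
)

def parse_gateway_alarm(line: str):
    """Return the fields of a gateway alarm line as a dict, or None if no match."""
    m = ALARM_RE.search(line)
    if m is None:
        return None
    return {
        "gateway": m["gateway"],
        "addr": m["addr"],
        "alarm": int(m["alarm"]),
        "rtt_ms": float(m["rtt_ms"]),
        "rttsd_ms": float(m["rttsd_ms"]),
        "loss_pct": int(m["loss_pct"]),
    }

line = ("Oct 13 21:23:04 rc.gateway_alarm 6087 >>> Gateway alarm: WAN_DHCP "
        "(Addr:173.21.160.1 Alarm:1 RTT:13.090ms RTTsd:5.287ms Loss:21%)")
print(parse_gateway_alarm(line))
```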

                  It does seem slightly more stable with ACPI/powerd enabled, as it's been staying up longer, but that could be random too.

                  I think you're right, my next two steps are:

                  Change NICs, even use the other PCI slot
                  Reinstall from scratch, don't use the old config and don't install any packages

                  Currently I have darkstat, iftop, nmap, ntopng and pfBlockerNG. I had Snort, but something was wrong with it as it didn't show up in the menus, so I dropped it. This might also be a symptom of something wrong in the install.

                    rediske

                    I forgot to note, I started out bridging 7 of the 8 ports in the 2 quad NICs. It was working about the same as now, so my first step in troubleshooting was to disable the bridge and also change out the WAN cable and the cable to my PC.

                    I kind of wanted to use all 8 ports in the pfSense box as it saves cables, extra hardware, interfaces and allows more port by port management overall, not that I need it in a home network. I'm kind of testing this to see if it's something I want to install at work, where there may be some use in connecting 6 home locations via VPN.

                      stephenw10 Netgate Administrator

                      It should work, even if bridging is usually a bad idea.
                      Yes, try to rule out hardware initially if you can. As I said, start out with a super basic two-NIC config and make sure that works.
                      Disable anything you don't need in the BIOS, soundcards etc.

                      Steve

                        rediske

                        Just found something, pciconf -l -c em0 gives some PCI info, including the line:

                        ecap 0001[100] = AER 1 1 fatal 3 non-fatal 5 corrected

                        AER is Advanced Error Reporting and this notes some PCI bus errors. Next time I crash, I'll run this command at the console and see what it reveals.
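For anyone wanting to track these values over time, the AER line can be split into numbers. Below is a small Python sketch; the function name `parse_aer` is made up, and the pattern is based solely on the single line quoted above. As far as I understand pciconf, the first number after "AER" is the capability version and the other three count error status bits currently set rather than individual error events, but treat that reading as an assumption.

```python
import re

# Pattern based on the one AER line quoted in this thread.
AER_RE = re.compile(
    r"AER (?P<version>\d+) (?P<fatal>\d+) fatal "
    r"(?P<non_fatal>\d+) non-fatal (?P<corrected>\d+) corrected"
)

def parse_aer(line: str):
    """Return the AER fields as integers, or None if the line has no AER entry."""
    m = AER_RE.search(line)
    if m is None:
        return None
    return {name: int(value) for name, value in m.groupdict().items()}

print(parse_aer("ecap 0001[100] = AER 1 1 fatal 3 non-fatal 5 corrected"))
```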

                          rediske

                          EDIT: Changed script for all interfaces

                          And just 'cause it's Sunday, I wrote a little Perl script:

                          #!/usr/local/bin/perl
                          use strict;
                          use warnings;

                          $| = 1;  # unbuffered, so the blank line lands before the child output when redirected

                          # Up to 604800 samples (about one week at one per second): print a
                          # timestamp, then the AER line from pciconf for each interface.
                          my @ifaces = qw(em0 em1 em2 em3 bge0);
                          for (my $i = 1; $i <= 604800; $i++) {
                              print "\n";
                              system('date');
                              # system() returns the exit status; the commands write to stdout themselves
                              system("/usr/sbin/pciconf -l -c $_ | grep AER") for @ifaces;
                              sleep(1);
                          }

                          which outputs:

                          Sun Oct 14 13:13:41 CDT 2018
                          ecap 0001[100] = AER 1 1 fatal 3 non-fatal 5 corrected
                          ecap 0001[100] = AER 1 1 fatal 3 non-fatal 5 corrected
                          ecap 0001[100] = AER 1 0 fatal 2 non-fatal 5 corrected
                          ecap 0001[100] = AER 1 0 fatal 3 non-fatal 5 corrected
                          ecap 0001[100] = AER 1 0 fatal 0 non-fatal 0 corrected

                          I redirected the output to a text file so I can have a second by second account of the state of the em0-em3 and bge0 interfaces, to see if PCI errors (and what kind and how many) occur second(s) before dpinger makes its syslog entry about the gateway dropping.
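A log in that format can be scanned for the moment any interface's counts change. The Python sketch below assumes each sample is one `date` line followed by the AER lines in the order em0, em1, em2, em3, bge0; since the `grep AER` output does not name the interface, the mapping is purely positional and will be wrong for any sample where an interface printed no AER line. The name `find_changes` is made up, and the second sample is a reconstruction from figures reported later in the thread, not a copy of the actual log.

```python
import re

IFACES = ["em0", "em1", "em2", "em3", "bge0"]  # order assumed from the script
AER_RE = re.compile(r"AER \d+ (\d+) fatal (\d+) non-fatal (\d+) corrected")

def find_changes(lines):
    """Return (timestamp, iface, old, new) for every change in the AER counts."""
    changes = []
    last = {}          # iface -> (fatal, non_fatal, corrected) from the previous sample
    timestamp = None
    idx = 0
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        m = AER_RE.search(line)
        if m is None:
            # Any other non-blank line is treated as the `date` output
            timestamp, idx = line, 0
            continue
        if idx >= len(IFACES):
            continue   # more AER lines than interfaces; ignore the extras
        iface = IFACES[idx]
        idx += 1
        counts = tuple(int(g) for g in m.groups())
        if iface in last and last[iface] != counts:
            changes.append((timestamp, iface, last[iface], counts))
        last[iface] = counts
    return changes

sample = """
Sun Oct 14 13:13:41 CDT 2018
ecap 0001[100] = AER 1 1 fatal 3 non-fatal 5 corrected
ecap 0001[100] = AER 1 1 fatal 3 non-fatal 5 corrected
ecap 0001[100] = AER 1 0 fatal 2 non-fatal 5 corrected
ecap 0001[100] = AER 1 0 fatal 3 non-fatal 5 corrected
ecap 0001[100] = AER 1 0 fatal 0 non-fatal 0 corrected

Sun Oct 14 14:18:43 CDT 2018
ecap 0001[100] = AER 1 1 fatal 3 non-fatal 5 corrected
ecap 0001[100] = AER 1 1 fatal 3 non-fatal 5 corrected
ecap 0001[100] = AER 1 1 fatal 2 non-fatal 5 corrected
ecap 0001[100] = AER 1 1 fatal 3 non-fatal 5 corrected
ecap 0001[100] = AER 1 0 fatal 0 non-fatal 0 corrected
""".splitlines()

for change in find_changes(sample):
    print(change)
```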

                            bfeitell

                            Take a look at the info about MSI/MSIX here:
                            https://www.netgate.com/docs/pfsense/hardware/tuning-and-troubleshooting-network-cards.html

                              rediske

                              So I waited a while until a crash. dpinger says the interface crashed at 16:57:23. My script stopped logging a full minute earlier at 16:56:10; maybe it was hanging on the system call to pciconf? The log I made found 2 additional fatal errors though, on em2 (nothing plugged in) and em3 (MikroTik router). So we went from:

                              em0 - 1 fatal 3 non-fatal 5 corrected
                              em1 - 1 fatal 3 non-fatal 5 corrected
                              em2 - 0 fatal 2 non-fatal 5 corrected
                              em3 - 0 fatal 3 non-fatal 5 corrected
                              bge0 - 0 fatal 0 non-fatal 0 corrected

                              to

                              em0 - 1 fatal 3 non-fatal 5 corrected
                              em1 - 1 fatal 3 non-fatal 5 corrected
                              em2 - 1 fatal 2 non-fatal 5 corrected
                              em3 - 1 fatal 3 non-fatal 5 corrected
                              bge0 - 0 fatal 0 non-fatal 0 corrected

                              But this error happened at 14:18:43, 2.5 hours before the eventual crash. After I rebooted again, without any changes (my son was trying to play time sensitive games), the machine crashed 2 more times inside 10 minutes.

                              Oh well, after the third crash/reboot, I swapped the NIC out and put it in a different PCI slot. dpinger logged packet loss on the WAN interface after that, but it hasn't dropped the interface altogether yet after 30 min (knock on wood).

                              @BFEITELL I thought about MSI maybe causing problems. The dmesg I have above shows the USB device having trouble:

                              xhci0: Unable to map MSI-X table

                              but I don't know if that would matter? I could disable all the USB for that matter, I only need it for booting to install.

                                rediske @rediske

                                I neglected to say, my little perl script logged PCI status once every second (57,000+ lines) until mysteriously hanging/stopping one minute short of the crash. I doubt that's a coincidence.
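To check whether the logger stalled at other points before it stopped, the `date` lines in the log can be compared pairwise. This is a sketch in Python under a few assumptions: the timestamps look like the `date` output shown earlier, the timezone token is simply ignored, and the 5-second threshold and the function names are arbitrary choices for illustration.

```python
from datetime import datetime

def parse_date_line(line: str) -> datetime:
    """Parse `date` output such as 'Sun Oct 14 13:13:41 CDT 2018', ignoring the zone."""
    parts = line.split()
    # Drop the timezone token (fifth field); strptime's %Z accepts only a few names
    return datetime.strptime(" ".join(parts[:4] + parts[5:]), "%a %b %d %H:%M:%S %Y")

def find_gaps(date_lines, max_gap_s=5.0):
    """Return (previous, current) pairs where consecutive samples are too far apart."""
    gaps = []
    prev = None
    for line in date_lines:
        now = parse_date_line(line)
        if prev is not None and (now - prev).total_seconds() > max_gap_s:
            gaps.append((prev, now))
        prev = now
    return gaps

# Times from the post above; the calendar date is assumed to be the same Sunday.
last_sample = parse_date_line("Sun Oct 14 16:56:10 CDT 2018")
alarm = parse_date_line("Sun Oct 14 16:57:23 CDT 2018")
print((alarm - last_sample).total_seconds())  # seconds between last sample and alarm
```

With the two times quoted in the thread, the last sample comes 73 seconds before the dpinger alarm.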

                                  Gertjan @rediske

                                  @rediske said in Intel Interface Issues:

                                  I neglected to say, my little perl script logged PCI status once every second (57,000+ lines) until mysteriously hanging/stopping one minute short of the crash. I doubt that's a coincidence.

                                  Well, let's say the device driver, and most probably the related NIC, goes down at that moment, or at least becomes very busy.
                                  The NIC takes the system with it a few moments later.

                                  Just to rule out outside issues (DDoS): is it possible for you to change your "real" WAN IP?
                                  Or leave the WAN disconnected for a while.

                                  No "help me" PM's please. Use the forum, the community will thank you.
                                  Edit : and where are the logs ??

                                    rediske @Gertjan

                                    @gertjan said in Intel Interface Issues:

                                    Well, let's say the device driver, and most probably the related NIC, goes down at that moment, or at least becomes very busy.
                                    The NIC takes the system with it a few moments later.

                                    Just to rule out outside issues (DDoS): is it possible for you to change your "real" WAN IP?
                                    Or leave the WAN disconnected for a while.

                                    I'm sorry, I got a little fast and loose with the term crash. The pfSense router never actually crashes, the ethernet interfaces become unresponsive to network traffic (ping, web configurator, etc).

                                    Since I swapped the NIC out and changed PCI slots, em3 on the second NIC has died twice now. On the first NIC it was em0 that kept dropping. Same config as before: em0 WAN, em1 LAN, em2 empty, em3 MikroTik router for wireless. I see I got a different WAN IP after the reboot last night, but this morning em3 is already down again, and that's on an internal network with very little traffic (wireless for 2 phones and 2 tablets) while my son and I were sleeping.

                                    Right now it shows em2 and em3 have single fatal PCI errors and the ethernet connection and activity lights on em3 both went dark. I'm writing this on a PC plugged into a switch that's connected to em1 and the WAN is on em0, and those seem to work fine.

                                    When this happened last night, I unplugged the em3 cable and plugged it back in and got link lights back, but it still wouldn't talk. This morning when I unplugged it and plugged it back in, the lights stayed dark.

                                    At this point, I think I'm going to reinstall pfSense and maybe try messing with MSI settings. But I'm betting nothing I do will get either of these Intel cards to be stable with this HP PC/mobo. I don't think it's traffic related, as I imaged 2 VMs on my PC at the same time, 60 GB of traffic in 40 min (about 200 Mbit/s), and that went fine.
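The throughput figure is easy to sanity-check. A tiny Python sketch, assuming decimal units (1 GB = 8000 Mbit); the function name is made up:

```python
def average_mbit_per_s(gigabytes: float, minutes: float) -> float:
    """Average transfer rate in Mbit/s, using decimal units (1 GB = 8000 Mbit)."""
    return gigabytes * 8000 / (minutes * 60)

print(average_mbit_per_s(60, 40))  # 200.0
```

So 60 GB in 40 minutes averages 200 Mbit/s, well below what a gigabit port should handle.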

                                    It just seems after some period of time, anything from an hour to 12 hours, it shuts off one or more ethernet interfaces, sometimes putting messages in the system log and sometimes not.

                                    I saw these from the latest crash:

                                    Oct 15 07:24:43 kernel em3: Watchdog timeout Queue[0]-- resetting
                                    Oct 15 07:24:43 kernel Interface is RUNNING and ACTIVE
                                    Oct 15 07:24:43 kernel em3: TX Queue 0 ------
                                    Oct 15 07:24:43 kernel em3: hw tdh = -1, hw tdt = -1
                                    Oct 15 07:24:43 kernel em3: Tx Queue Status = -2147483648
                                    Oct 15 07:24:43 kernel em3: TX descriptors avail = 40
                                    Oct 15 07:24:43 kernel em3: Tx Descriptors avail failure = 5
                                    Oct 15 07:24:43 kernel em3: RX Queue 0 ------
                                    Oct 15 07:24:43 kernel em3: hw rdh = -1, hw rdt = -1
                                    Oct 15 07:24:43 kernel em3: RX discarded packets = 0
                                    Oct 15 07:24:43 kernel em3: RX Next to Check = 525
                                    Oct 15 07:24:43 kernel em3: RX Next to Refresh = 524

                                    That repeated a few times, the last time being:

                                    Oct 15 07:27:12 kernel em3: Watchdog timeout Queue[0]-- resetting
                                    Oct 15 07:27:12 kernel Interface is RUNNING and ACTIVE
                                    Oct 15 07:27:12 kernel em3: TX Queue 0 ------
                                    Oct 15 07:27:12 kernel em3: hw tdh = -1, hw tdt = -1
                                    Oct 15 07:27:12 kernel em3: Tx Queue Status = -2147483648
                                    Oct 15 07:27:12 kernel em3: TX descriptors avail = 58
                                    Oct 15 07:27:12 kernel em3: Tx Descriptors avail failure = 119
                                    Oct 15 07:27:12 kernel em3: RX Queue 0 ------
                                    Oct 15 07:27:12 kernel em3: hw rdh = -1, hw rdt = -1
                                    Oct 15 07:27:12 kernel em3: RX discarded packets = 0
                                    Oct 15 07:27:12 kernel em3: RX Next to Check = 0
                                    Oct 15 07:27:12 kernel em3: RX Next to Refresh = 0

                                    And now it's 10 AM and there have been no kernel errors since.

                                      rediske

                                      I left the machine with em3 down, since I don't need wifi anyway, and it's been functioning fine as far as I can tell. Only 4 entries on the system log:

                                      Oct 15 10:01:07 check_reload_status Syncing firewall
                                      Oct 15 10:01:07 syslogd exiting on signal 15
                                      Oct 15 10:01:07 syslogd kernel boot file is /boot/kernel/kernel
                                      Oct 15 10:01:07 pfsense.localdomain nginx: 2018/10/15 10:01:07 [error] 58467#100412: send() failed (54: Connection reset by peer)

                                      It's been at 1-3% cpu usage and 7% memory, totally normal for a home network with just 1 PC using the web.

                                      As a refresher, I'm using an AMD A4 PRO-7300B processor (3.8 GHz) in an HP EliteDesk 705 G1 SFF, 6GB RAM 500GB HDD. I did not disable the on board bge0 ethernet and it has nothing plugged into it. I have a single Intel PRO 1000 PT Quad Port 1Gb PCIe Ethernet card and I've tried two different cards in two different slots.

                                      It'll be a bummer if I can't use the Intel cards. When I researched it, I heard they're usually wonderful for pfSense and I got the pair for $70. There's something sexy about having 8 MAC addresses numbered in a row ;)

                                        Mats

                                        One idea.

                                        What happens if you plug the wireless router into the mainboard NIC?
                                        My idea is that if it's an issue between the MikroTik and the Intel card, it might help to run the MikroTik against another NIC type.

                                          rediske

                                          I did not try putting the MikroTik on another port; however, I did try having only two of the Intel interfaces up as WAN and LAN, and I still wound up having problems.

                                          For fun, I tried installing ESXi on the machine to put pfSense inside that. ESXi wouldn’t recognize the Intel card at all.

                                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.