Watchdog timeout on queue 0



  • Hello,

    I see lots of the following errors in the log:

    kernel vmx0: watchdog timeout on queue 0

    Can anyone tell me what this is? I see it on both the stable and the development releases of pfSense.
    It is running on VMware (ESXi 6) using vmxnet3 (vmx0 is my LAN).
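
    To see how often the timeout fires and on which interface, the messages can be tallied with a short script. This is only a sketch: the regular expression is modeled on the log line quoted above, and the function name `count_watchdog_timeouts` and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Pattern modeled on the quoted message: "<ifname>: watchdog timeout on queue <n>".
WATCHDOG_RE = re.compile(r"(\w+): watchdog timeout on queue (\d+)")


def count_watchdog_timeouts(lines):
    """Return a Counter keyed by (interface, queue) for matching log lines."""
    counts = Counter()
    for line in lines:
        match = WATCHDOG_RE.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample; in practice the lines would come from the system log.
    sample = [
        "kernel vmx0: watchdog timeout on queue 0",
        "kernel vmx0: watchdog timeout on queue 0",
        "some unrelated message",
    ]
    for (ifname, queue), hits in count_watchdog_timeouts(sample).items():
        print(f"{ifname} queue {queue}: {hits} timeout(s)")
```

    A count that climbs on a single interface points at that NIC or its backing port group, while timeouts spread across every interface suggest a host-level problem.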

    I tried enabling the following options under System/Advanced/Networking:

    CHECKED Disable hardware checksum offload
    CHECKED Disable hardware TCP segmentation offload
    CHECKED Disable hardware large receive offload

    But the error still comes back.

    Are there any other options I can try?
    Hope someone can help me out.

    Regards.
    Donald.


  • Rebel Alliance Developer Netgate

    Very little detail in your message, but it's possible you might be hitting a known issue on 2.3: https://redmine.pfsense.org/issues/6296



  • I'm still having this issue, on 3 different ESXi hosts. The affected VMs, including pfSense, are all BSD variants.

    What I've done:

    Updated all ESXi hosts to v6.0U2
    Updated pfSense to the 2.3.3 development branch, now on FreeBSD 10.3-RELEASE-p7

    What else can I try?


  • Rebel Alliance Developer Netgate

    If you are seeing it on all BSD variants, then the odds are it's nothing you can address in pfSense. I and many others are using vmx NICs with success and no errors of the sort you're showing. It's possible it's still something related to your ESX installation or the hardware behind it.

    Personally I have close to a dozen pfSense and FreeBSD VMs that run on ESX 24/7 without such problems.

    If it happens on plain FreeBSD, it may also be an issue in the FreeBSD vmx driver that is beyond our control. If you can reproduce it there reliably, you might want to raise the issue on a FreeBSD forum directly.



  • I just experienced this for the first time, and it wasn't precipitated by anything in particular. The interface stopped processing traffic and it took me a while to figure that out. I saw the following in my system log:

    kernel vtnet2: watchdog timeout on queue 0
    

    This was on pfSense 2.4.3-RELEASE-p1 running on KVM with paravirtual NICs defined like this in the VM definition XML:

    <interface type='bridge'>
      <mac address='12:23:34:45:56:67'/>
      <source bridge='brteam0.1111'/>
      <target dev='vnet1'/>
      <model type='virtio'/>
      <alias name='net3'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </interface>
    

    Disabling and reenabling the interface resolved the issue. I'll edit this post if this becomes a recurring problem.
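
    For anyone who wants to script that workaround, here is a rough sketch. It assumes the shell equivalent of disabling and reenabling the interface is `ifconfig <interface> down` followed by `ifconfig <interface> up` (the toggle could just as well be done in the web interface), and the helper names are hypothetical. By default it only returns the commands without running them.

```python
import subprocess


def bounce_commands(ifname):
    """Build the ifconfig commands that take an interface down and back up."""
    return [["ifconfig", ifname, "down"], ["ifconfig", ifname, "up"]]


def bounce_interface(ifname, dry_run=True):
    """Run (or, with dry_run, only return) the down/up command pair."""
    commands = bounce_commands(ifname)
    if not dry_run:
        for command in commands:
            # Needs root on the firewall; raises CalledProcessError on failure.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    # Dry run: print what would be executed for the interface from the log.
    for command in bounce_interface("vtnet2"):
        print(" ".join(command))
```

    This only clears the symptom. It does not explain why the queue stalled in the first place.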



  • @rulerof Has the issue come back after doing what you outlined? I'm encountering this all of a sudden, and the interval between occurrences seems random. It seems to have started after I added a 3rd NIC to the VM to connect to a DVR. I disabled the NIC (in pfSense) and the issue seemed to subside. That was about 10 days ago. Then it just occurred again. I'm not sure what's going on, but I'll look into implementing your steps.



  • @erutan409 it happened one more time in the last month, and just like before I really don't have any explanation as to why. Sorry :(



  • I resolved this issue, running pfSense as a VM in vSphere. I changed the NICs assigned to the VM from VMXNET 3 to E1000e. Apparently, there's an issue with using more than two VMXNET 3 NICs together in the same environment. It has been up and running for about a month without the issue recurring.
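
    The adapter type is normally changed in the vSphere client, but the same change can be sketched as an edit to the VM's `.vmx` file. The sketch below assumes the adapter type is stored as `ethernetN.virtualDev = "vmxnet3"` and that the VM is powered off before the file is touched. The function name and the sample text are made up for illustration.

```python
import re

# Matches the start of lines such as: ethernet0.virtualDev = "vmxnet3"
VIRTUAL_DEV_RE = re.compile(
    r'^(ethernet\d+\.virtualDev[ \t]*=[ \t]*)"vmxnet3"',
    re.IGNORECASE | re.MULTILINE,
)


def switch_to_e1000e(vmx_text):
    """Return vmx_text with every vmxnet3 adapter type replaced by e1000e."""
    return VIRTUAL_DEV_RE.sub(r'\1"e1000e"', vmx_text)


if __name__ == "__main__":
    # Hypothetical excerpt of a .vmx file with two VMXNET 3 adapters.
    sample = (
        'ethernet0.present = "TRUE"\n'
        'ethernet0.virtualDev = "vmxnet3"\n'
        'ethernet1.virtualDev = "vmxnet3"\n'
    )
    print(switch_to_e1000e(sample), end="")
```

    After the switch, the guest sees different device names (for example `em0` instead of `vmx0`), so interface assignments in pfSense need to be redone.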



  • @Erutan409 if this happens to me again, I'll change my virtual NICs over to fully-emulated hardware. I hate to have to do that, but I'm glad it seems to be a viable fix!
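
    On KVM, that switch would amount to changing the `<model type='virtio'/>` element in the interface definition. Here is a minimal sketch, assuming `e1000` as the emulated model (one of several that QEMU offers) and using a trimmed copy of the interface XML posted earlier in the thread. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the interface definition quoted earlier in the thread.
INTERFACE_XML = """<interface type='bridge'>
  <mac address='12:23:34:45:56:67'/>
  <source bridge='brteam0.1111'/>
  <model type='virtio'/>
</interface>"""


def use_emulated_model(interface_xml, model="e1000"):
    """Return the interface XML with any virtio model replaced by `model`."""
    root = ET.fromstring(interface_xml)
    for node in root.iter("model"):
        if node.get("type") == "virtio":
            node.set("type", model)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(use_emulated_model(INTERFACE_XML))
```

    The result would be pasted back into the VM definition (for example through `virsh edit`). Fully emulated NICs usually cost some throughput compared with virtio.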



  • Was there ever a fix for this? I have the latest version of pfSense running on an ESXi host. I have a 4-port Intel gigabit NIC and was using two of its ports for LAN and WAN. That caused the watchdog timeout on queue 0 to happen every couple of hours. I have since switched to the onboard motherboard NIC for LAN and use the quad-port card for WAN. That seemed to have fixed it, but then it happened again today. Any idea why this is happening?



  • @clifford64 I mentioned in my last response what my fix was. If your environment is similar, I'd suspect it's the type of virtual adapter you're using. I'm not sure it's something that can be fixed within pfSense. It might be in the OS itself.



  • So, I may have found a fix for my configuration - accounting for bufferbloat:

    • vSphere - 6.7 - VMXNET3 on all 4 network adapters I have configured on the VM
    • Internet speed - 250 Mbps/30 Mbps
    • pfSense version - latest (2.4.4_3)

    After following this guide, I seem to have avoided the issue. Previously it recurred whenever I set a couple of downloads going on my PS4 and maxed out the download speed overnight. That was the most reliable way to reproduce it: without fail, I'd find the problem when I woke up in the morning.

    The Internet would get sluggish and eventually stop working altogether. Then I'd log into vSphere and see the error in the VM console.

    But, after following that guide, it seems like my Internet is overall stable. I'll update if it still happens in the future. Hoping this IS the fix.

