watchdog timeout on Pfsense 2.4.5p1



  • Kernel: vmx(x=0,1,3,4,5,6): watchdog timeout on queue 0, on VMware 6.7.
    My WAN (PPPoE) stops working and I can't access the internet. Any ideas? Please help me fix this. Thanks.


  • Netgate Administrator

    You see that on all 6 NICs? But only WAN stops responding?

    You see any other errors before that?

    Steve



  • @stephenw10 said in watchdog timeout on Pfsense 2.4.5p1:

    You see that on all 6 NICs? But only WAN stops responding?
    You see any other errors before that?

    Hi Stephenw10,
    Not all NICs. vmx2 (LAN) still works fine. I have seen the crash twice before. Here is the new crash report from pfSense:
    <118>
    <118>
    <118>Welcome to pfSense 2.4.5-RELEASE (Patch 1)...
    <118>
    <118>No core dumps found.
    <118>...ELF ldconfig path: /lib /usr/lib /usr/lib/compat /usr/local/lib /usr/local/lib/ipsec /usr/local/lib/mysql /usr/local/lib/nss /usr/local/lib/perl5/5.30/mach/CORE
    <118>32-bit compatibility ldconfig path:
    <118>done.
    <118>>>> Removing vital flag from php72... done.
    <118>External config loader 1.0 is now starting... da0s1 da0s1a da0s1b
    <118>Launching the init system...Updating CPU Microcode...
    CPU: Intel(R) Xeon(R) CPU E5-2689 0 @ 2.60GHz (2593.50-MHz K8-class CPU)
    Origin="GenuineIntel" Id=0x206d7 Family=0x6 Model=0x2d Stepping=7
    Features=0xf8bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,SS>
    Features2=0x9fba2203<SSE3,PCLMULQDQ,SSSE3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,HV>
    AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
    AMD Features2=0x1<LAHF>
    Structured Extended Features=0x2<TSCADJ>
    Structured Extended Features3=0xbc000000<IBPB,STIBP,L1DFL,ARCH_CAP,SSBD>
    IA32_ARCH_CAPS=0xc<RSBA,SKIP_L1DFL_VME>
    TSC: P-state invariant
    Hypervisor: Origin = "VMwareVMware"
    <118>Done.
    <118>...... done.
    <118>Initializing.................. done.
    <118>Starting device manager (devd)...done.
    <118>Loading configuration......done.
    <118>Updating configuration...done.
    <118>Checking config backups consistency.................................done.
    <118>Setting up extended sysctls...done.
    padlock0: No ACE support.
    aesni0: <AES-CBC,AES-XTS,AES-GCM,AES-ICM> on motherboard
    <118>Setting timezone...done.
    <118>Configuring loopback interface...
    <5>lo0: link state changed to UP
    <118>done.
    <118>Starting syslog...done.
    <118>Starting Secure Shell Services...done.
    <118>Setting up interfaces microcode...done.
    <118>Configuring loopback interface...done.
    <118>Creating wireless clone interfaces...done.
    <118>Configuring LAGG interfaces...done.
    <118>Configuring VLAN interfaces...done.
    <118>Configuring QinQ interfaces...done.
    <118>Configuring IPsec VTI interfaces...done.
    <118>Configuring WAN1_RSM_2 interface...
    <5>vmx2: link state changed to UP
    <5>vmx6: link state changed to UP
    <6>ng0: changing name to 'pppoe1'
    <118>done.
    <118>Configuring LAN interface...done.
    <118>Configuring WAN2_CTYRSM_2019 interface...
    <6>ng1: changing name to 'pppoe7'
    <118>done.
    <118>Configuring WAN3_T008_FTTH_NAMCTKTVTVRV0 interface...
    <6>ng2: changing name to 'pppoe3'
    <118>done.
    <118>Configuring WAN4_T008_GFTTH_DTLCTTKT interface...
    <6>ng3: changing name to 'pppoe4'
    <118>done.
    <118>Configuring WAN5_T008_GTFFH_DTL interface...done.
    <118>Configuring WAN6_RSM_1 interface...
    <6>ng4: changing name to 'pppoe6'
    <118>done.
    <6>vmx6: promiscuous mode enabled
    <6>carp: 2@vmx6: INIT -> BACKUP (initialization complete)
    <6>vmx2: promiscuous mode enabled
    <6>carp: 1@vmx2: INIT -> BACKUP (initialization complete)
    <118>Configuring CARP settings...done.
    <118>Configuring CARP settings...done.
    <118>Syncing OpenVPN settings...
    <6>tun1: changing name to 'ovpns1'
    <5>ovpns1: link state changed to UP
    <6>tun2: changing name to 'ovpns2'
    <5>ovpns2: link state changed to UP
    <6>tun3: changing name to 'ovpns3'
    <5>ovpns3: link state changed to UP
    <6>tun4: changing name to 'ovpns4'
    <5>ovpns4: link state changed to UP
    <6>tun5: changing name to 'ovpns5'
    <5>ovpns5: link state changed to UP
    <6>tun6: changing name to 'ovpns6'
    <5>ovpns6: link state changed to UP
    <118>done.
    <6>pflog0: promiscuous mode enabled
    <118>Configuring firewall......done.
    <118>Starting PFLOG...done.
    <118>Setting up gateway monitors...done.
    <118>Setting up static routes...done.
    <118>Setting up DNSs...
    <118>Starting DNS Resolver...
    <6>carp: 2@vmx6: BACKUP -> MASTER (master timed out)
    <6>carp: 1@vmx2: BACKUP -> MASTER (master timed out)
    <118>done.
    <118>Synchronizing user settings...done.
    <118>Starting webConfigurator...done.
    <118>Configuring CRON...done.
    <118>Starting NTP time client...done.
    <118>Starting DHCP service...done.
    <118>Configuring firewall......done.
    <118>Generating RRD graphs...done.
    <118>Starting syslog...done.
    <118>Starting CRON... done.
    <118> Starting package OpenVPN Client Export Utility...done.
    <118> Starting package Open-VM-Tools...done.
    <118> Starting package Backup...done.
    <118> Starting package squid3...done.
    <118> Starting package suricata...done.
    <118> Starting package squidGuard...done.
    <118> Starting /usr/local/etc/rc.d/sqp_monitor.sh...done.
    <118> Starting /usr/local/etc/rc.d/vmware-guestd.sh...done.
    <118> Starting /usr/local/etc/rc.d/vmware-kmod.sh...done.
    VMware memory control driver initialized
    <118>pfSense 2.4.5-RELEASE (Patch 1) amd64 Tue Jun 02 17:51:17 EDT 2020
    <118>Bootup complete
    <6>ng0: changing name to 'pppoe1'
    <6>ng0: changing name to 'pppoe1'
    <6>ng0: changing name to 'pppoe1'
    <6>ng0: changing name to 'pppoe1'
    <6>ng4: changing name to 'pppoe6'
    <6>ng0: changing name to 'pppoe1'
    <6>carp: 1@vmx2: MASTER -> INIT (hardware interface down)
    <6>carp: demoted by 240 to 240 (interface down)
    <6>carp: 2@vmx6: MASTER -> INIT (hardware interface down)
    <6>carp: demoted by 240 to 480 (interface down)
    <7>ifa_maintain_loopback_route: deletion failed for interface vmx6: 3
    <7>ifa_maintain_loopback_route: deletion failed for interface vmx6: 3
    <6>carp: demoted by -240 to 240 (vhid removed)
    <6>vmx6: promiscuous mode disabled
    <7>ifa_maintain_loopback_route: deletion failed for interface vmx2: 3
    <7>ifa_maintain_loopback_route: deletion failed for interface vmx2: 3
    <6>carp: demoted by -240 to 0 (vhid removed)
    <6>vmx2: promiscuous mode disabled
    <5>vmx5: link state changed to UP
    <5>vmx3: link state changed to UP
    <5>vmx8: link state changed to UP
    <5>vmx1: link state changed to UP
    <5>vmx4: link state changed to UP
    <6>ng0: changing name to 'pppoe4'
    <6>ng1: changing name to 'pppoe3'
    <6>ng2: changing name to 'pppoe1'
    <7>arpresolve: can't allocate llinfo for 192.168.1.1 on vmx6
    <6>ng0: changing name to 'pppoe1'
    <6>ng0: changing name to 'pppoe1'
    <5>ovpns1: link state changed to DOWN
    <5>ovpns1: link state changed to UP
    <6>ng1: changing name to 'pppoe7'
    <5>ovpns2: link state changed to DOWN
    <5>ovpns2: link state changed to UP
    <6>ng2: changing name to 'pppoe3'
    <5>ovpns3: link state changed to DOWN
    <5>ovpns3: link state changed to UP
    <6>ng3: changing name to 'pppoe4'
    <6>ng4: changing name to 'pppoe6'
    <5>ovpns4: link state changed to DOWN
    <5>ovpns4: link state changed to UP
    <5>ovpns6: link state changed to DOWN
    <5>ovpns6: link state changed to UP
    <6>ng4: changing name to 'pppoe6'
    <5>ovpns5: link state changed to DOWN
    <5>ovpns5: link state changed to UP
    <5>ovpns5: link state changed to DOWN
    <6>vmx6: promiscuous mode enabled
    <6>carp: demoted by 240 to 240 (interface down)
    <6>carp: 2@vmx6: INIT -> BACKUP (initialization complete)
    <6>carp: demoted by -240 to 0 (interface up)
    <5>ovpns5: link state changed to UP
    <6>carp: 2@vmx6: BACKUP -> INIT (hardware interface down)
    <6>carp: demoted by 240 to 240 (interface down)
    <7>ifa_maintain_loopback_route: deletion failed for interface vmx6: 3
    <7>ifa_maintain_loopback_route: deletion failed for interface vmx6: 3
    <7>ifa_maintain_loopback_route: deletion failed for interface vmx6: 3
    <6>carp: demoted by -240 to 0 (vhid removed)
    <6>vmx6: promiscuous mode disabled
    <6>vmx6: promiscuous mode enabled
    <6>carp: demoted by 240 to 240 (interface down)
    <6>vmx2: promiscuous mode enabled
    <6>carp: demoted by 240 to 480 (interface down)
    <6>carp: 1@vmx2: INIT -> BACKUP (initialization complete)
    <6>carp: demoted by -240 to 240 (interface up)
    <6>carp: 2@vmx6: INIT -> BACKUP (initialization complete)
    <6>carp: demoted by -240 to 0 (interface up)
    <6>carp: 1@vmx2: BACKUP -> MASTER (master timed out)
    <6>carp: 2@vmx6: BACKUP -> MASTER (master timed out)
    <6>carp: 2@vmx6: MASTER -> BACKUP (more frequent advertisement received)
    <7>ifa_maintain_loopback_route: deletion failed for interface vmx6: 3
    <5>ovpns1: link state changed to DOWN
    <5>ovpns1: link state changed to UP
    <5>ovpns2: link state changed to DOWN
    <5>ovpns2: link state changed to UP
    <5>ovpns3: link state changed to DOWN
    <5>ovpns3: link state changed to UP
    <5>ovpns5: link state changed to DOWN
    <5>ovpns5: link state changed to UP
    <5>ovpns4: link state changed to DOWN
    <5>ovpns4: link state changed to UP
    <5>ovpns2: link state changed to DOWN
    <5>ovpns2: link state changed to UP
    <6>arp: 192.168.2.115 moved from 34:40:b5:86:c5:b2 to 78:ac:c0:56:2c:74 on vmx2
    <6>arp: 192.168.2.115 moved from 78:ac:c0:56:2c:74 to 34:40:b5:86:c5:b2 on vmx2
    [zone: pf frag entries] PF frag entries limit reached
    <6>arp: 192.168.2.163 moved from c4:65:16:b7:26:9a to 8c:04:ba:25:2d:b9 on vmx2
    <6>ng2: changing name to 'pppoe3'
    <6>ng2: changing name to 'pppoe3'
    <5>ovpns3: link state changed to DOWN
    <5>ovpns3: link state changed to UP
    vmx5: watchdog timeout on queue 0
    vmx3: watchdog timeout on queue 0
    vmx4: watchdog timeout on queue 0
    vmx6: watchdog timeout on queue 0
    vmx5: watchdog timeout on queue 0
    vmx3: watchdog timeout on queue 0
    vmx4: watchdog timeout on queue 0
    vmx6: watchdog timeout on queue 0
    vmx5: watchdog timeout on queue 0
    vmx3: watchdog timeout on queue 0
    vmx4: watchdog timeout on queue 0
    vmx6: watchdog timeout on queue 0
    vmx5: watchdog timeout on queue 0
    vmx3: watchdog timeout on queue 0
    vmx4: watchdog timeout on queue 0
    vmx6: watchdog timeout on queue 0
    vmx5: watchdog timeout on queue 0

    Fatal trap 12: page fault while in kernel mode
    cpuid = 15; apic id = 1e
    fault virtual address = 0x0
    fault code = supervisor read data, page not present
    instruction pointer = 0x20:0xffffffff80d579c3
    stack pointer = 0x28:0xfffffe0861c7a8a0
    frame pointer = 0x28:0xfffffe0861c7a8b0
    code segment = base 0x0, limit 0xfffff, type 0x1b
    = DPL 0, pres 1, long 1, def32 0, gran 1
    processor eflags = interrupt enabled, resume, IOPL = 0
    current process = 12 (irq280: vmx5)
    trap number = 12
    panic: page fault
    cpuid = 15
    KDB: enter: panic


  • Netgate Administrator

    OK, so there are a few things there. It helps to see the backtrace from the crash report, not just the message buffer.

    It looks like you have 5 PPPoE connections?

    But you only see timeout errors on 4 interfaces, was one PPPoE link down?
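A quick way to check which interfaces report the timeout, and how often, is to tally the `watchdog timeout` lines in the message buffer. The following is a minimal sketch in Python; the function name `count_watchdog_timeouts` and the sample excerpt are illustrative, and the pattern assumes lines shaped like the ones pasted above.

```python
import re
from collections import Counter

# Matches lines such as "vmx5: watchdog timeout on queue 0",
# optionally prefixed by a syslog priority tag like "<6>".
WATCHDOG_RE = re.compile(r"^(?:<\d+>)?(\w+): watchdog timeout on queue \d+")


def count_watchdog_timeouts(lines):
    """Return a Counter mapping interface name -> number of timeout lines."""
    counts = Counter()
    for line in lines:
        match = WATCHDOG_RE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


# Short excerpt in the same shape as the message buffer in this thread.
sample = """\
<5>ovpns3: link state changed to UP
vmx5: watchdog timeout on queue 0
vmx3: watchdog timeout on queue 0
vmx4: watchdog timeout on queue 0
vmx6: watchdog timeout on queue 0
vmx5: watchdog timeout on queue 0
"""

counts = count_watchdog_timeouts(sample.splitlines())
for nic, hits in sorted(counts.items()):
    print(f"{nic}: {hits}")
```

Run against the full buffer, a tally like this makes it easy to see that only vmx3, vmx4, vmx5 and vmx6 appear, which is the "4 interfaces" observation above.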

    It looks like that is part of an HA pair but PPPoE connections are not supported by HA, how do you have the secondary configured?

    This error is the most likely cause: [zone: pf frag entries] PF frag entries limit reached

    You can increase the size of the frags table in System > Advanced > Firewall & NAT. However, hitting that limit is usually a sign that there are far too many fragmented packets on the network. Do you have an MTU mismatch somewhere, maybe?

    Steve



  • Correct, I have 5 PPPoE connections, but the timeout error sometimes appears on random interfaces.
    I have 1 connection, WAN5 (vmx6), that uses CARP for HA. I know PPPoE is not supported by HA, so I disabled the PPPoE WAN interface on the secondary firewall.
    Now, please let me know how to find out how many fragmented packets there are on the network. I use the default MTU of 1500.



  • @chungnp said in watchdog timeout on Pfsense 2.4.5p1:

    I used MTU, default 1500

    PPPoE will always be lower than 1500, something like 1472 (to be tested), because it includes packet overhead.


  • Netgate Administrator

    Yup PPPoE is usually 1492 by default.
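The arithmetic behind these numbers can be sketched in a few lines of Python. This is an illustration only, assuming plain IPv4 with no header options; the function names are made up for the example. PPPoE adds 8 bytes of overhead (a 6-byte PPPoE header plus a 2-byte PPP protocol field), which is where 1492 comes from. The value 1472 is the largest ping payload that fits a 1500-byte MTU, which may be the origin of that figure.

```python
import math

ETHERNET_MTU = 1500   # default Ethernet MTU mentioned in the thread
PPPOE_OVERHEAD = 8    # 6-byte PPPoE header + 2-byte PPP protocol field
IPV4_HEADER = 20      # IPv4 header without options (assumption)
ICMP_HEADER = 8       # ICMP echo request header


def pppoe_mtu(ethernet_mtu: int = ETHERNET_MTU) -> int:
    """MTU left for IP packets once PPPoE overhead is subtracted."""
    return ethernet_mtu - PPPOE_OVERHEAD


def max_ping_payload(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def fragment_count(packet_size: int, mtu: int) -> int:
    """Number of IPv4 fragments needed to carry packet_size over a link with this MTU."""
    if packet_size <= mtu:
        return 1
    payload = packet_size - IPV4_HEADER
    # Every fragment except the last must carry a multiple of 8 payload bytes.
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    return math.ceil(payload / per_fragment)


print(pppoe_mtu())                   # MTU on the PPPoE link
print(max_ping_payload(1500))        # ping payload for a 1500-byte MTU
print(max_ping_payload(1492))        # ping payload for a 1492-byte MTU
print(fragment_count(1500, 1492))    # full-size Ethernet packet sent over PPPoE
```

Under these assumptions, a full 1500-byte packet that has to cross a 1492-byte PPPoE link is split into two fragments, so an MTU mismatch can roughly double the number of entries the firewall has to track for reassembly.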

    I would start by doubling the default value for fragment entries and see if that makes any difference. So set it to 10000.

    Steve



  • Hi Stephenw10,

    I have changed the value for fragment entries to 10000 and it's working fine. I haven't seen the WAN go dead again since the change.

    Thank you for your support.



  • @Gertjan

    Thank you for your suggestion. I will change it and check. The RTT is in the 10 ms to 120 ms range, which I feel is too high.

