pf_test: kif == NULL, if_xname on multi-WAN and "Reset all states if WAN IP Address changes"



  • Spamming terminal and logs with

    Mar  2 16:41:37	10.0.72.1		kern	crit	kernel	pf_test: kif == NULL, if_xname  v—.
    Mar  2 16:41:37	10.0.72.1		kern	crit	kernel	pf_test: kif == NULL, if_xname  v—.
    Mar  2 16:41:37	10.0.72.1		kern	crit	kernel	pf_test: kif == NULL, if_xname  v—.
    Mar  2 16:41:37	10.0.72.1		kern	crit	kernel	pf_test: kif == NULL, if_xname  v—.
    Mar  2 16:41:37	10.0.72.1		kern	crit	kernel	pf_test: kif == NULL, if_xname  v—.
    Mar  2 16:41:37	10.0.72.1		kern	crit	kernel	pf_test: kif == NULL, if_xname  v—.
    Mar  2 16:41:37	10.0.72.1		kern	crit	kernel	pf_test: kif == NULL, if_xname  v—.
    Mar  2 16:41:37	10.0.72.1		kern	crit	kernel	pf_test: kif == NULL, if_xname  v—.
    

    Do somebody else have it?
    As a result sometimes pfSense becomes
    inaccessible.



  • Looks similar to https://redmine.pfsense.org/issues/9796
    gif present as Hurricane Ipv6, that uses PPPoE as main internet connection.

    10.0.72.1	Mar  2 09:38:32		daemon	info	ppp[27697]	waiting for process 8608 to die...
    10.0.72.1	Mar  2 09:38:32		daemon	info	ppp[27697]	waiting for process 8608 to die...
    10.0.72.1	Mar  2 09:38:32		daemon	info	charon[6152]	07[KNL] interface gif0 activated
    10.0.72.1	Mar  2 09:38:32		kern	info	kernel	gif0: link state changed to DOWN
    10.0.72.1	Mar  2 09:38:32		daemon	info	charon[6152]	07[KNL] interface gif0 deactivated
    10.0.72.1	Mar  2 09:38:32		daemon	info	charon[6152]	08[KNL] interface gif0 disappeared
    10.0.72.1	Mar  2 09:38:32		daemon	err	php-fpm[640]	/rc.linkup: The command '/sbin/ifconfig gif0 tunnel  '226.58.80.90'' returned exit code '1', the output was 'ifconfig: 'tunnel' requires 2 arguments' 
    10.0.72.1	Mar  2 09:38:32		daemon	err	php-fpm[640]	/rc.linkup: Default gateway setting Interface HENETV6_TUNNELV6 Gateway as default.
    10.0.72.1	Mar  2 09:38:32		user	notice	check_reload_status[653]	Restarting ipsec tunnels
    10.0.72.1	Mar  2 09:38:32		user	notice	check_reload_status[653]	Rewriting resolv.conf
    10.0.72.1	Mar  2 09:38:32		daemon	info	ppp[8608]	[wan] IFACE: Removing IPv4 address from pppoe0 failed(IGNORING for now. This should be only for PPPoE friendly!): Can't assign requested address
    10.0.72.1	Mar  2 09:38:32		daemon	info	ppp[8608]	[wan] IFACE: Removing IPv4 address from pppoe0 failed(IGNORING for now. This should be only for PPPoE friendly!): Can't assign requested address
    10.0.72.1	Mar  2 09:38:32		daemon	info	ppp[8608]	[wan] IPV6CP: Close event
    10.0.72.1	Mar  2 09:38:32		daemon	info	ppp[8608]	[wan] IPV6CP: Close event
    10.0.72.1	Mar  2 09:38:32		daemon	info	ppp[8608]	[wan] IPV6CP: state change Opened --> Closing
    10.0.72.1	Mar  2 09:38:32		daemon	info	ppp[8608]	[wan] IPV6CP: state change Opened --> Closing
    10.0.72.1	Mar  2 09:38:32		daemon	info	ppp[8608]	[wan] IPV6CP: SendTerminateReq #4
    10.0.72.1	Mar  2 09:38:32		daemon	info	ppp[8608]	[wan] IPV6CP: SendTerminateReq #4
    10.0.72.1	Mar  2 09:38:32		daemon	info	ppp[8608]	[wan] IPV6CP: LayerDown
    10.0.72.1	Mar  2 09:38:32		daemon	info	ppp[8608]	[wan] IPV6CP: LayerDown
    10.0.72.1	Mar  2 09:38:32		user	warning	dpinger[42889]	HENETV6_TUNNELV6 2003:471:27:191::1: sendto error: 50
    10.0.72.1	Mar  2 09:38:32		user	notice	check_reload_status[653]	Rewriting resolv.conf
    10.0.72.1	Mar  2 09:38:32		daemon	info	ppp[8608]	[wan] IFACE: Down event
    10.0.72.1	Mar  2 09:38:32		daemon	info	ppp[8608]	[wan] IFACE: Down event
    10.0.72.1	Mar  2 09:38:32		daemon	info	charon[6152]	14[KNL] fe80::a236:9fff:fec3:4a2c disappeared from pppoe0
    10.0.72.1	Mar  2 09:38:32		daemon	info	ppp[8608]	[wan] IFACE: Rename interface pppoe0 to pppoe0
    10.0.72.1	Mar  2 09:38:32		daemon	info	ppp[8608]	[wan] IFACE: Rename interface pppoe0 to pppoe0
    10.0.72.1	Mar  2 09:38:32		user	warning	dpinger[42889]	HENETV6_TUNNELV6 2003:471:27:191::1: sendto error: 50
    10.0.72.1	Mar  2 09:38:33		user	err	dhcpleases[14114]	kqueue error: unknown
    10.0.72.1	Mar  2 09:38:33		user	err	dhcpleases[14114]	kqueue error: unknown
    10.0.72.1	Mar  2 09:38:33		user	info	dhcpleases[14114]	Sending HUP signal to dns daemon(67233)
    10.0.72.1	Mar  2 09:38:33		daemon	info	ppp[27697]	waiting for process 8608 to die...
    10.0.72.1	Mar  2 09:38:33		daemon	info	ppp[27697]	waiting for process 8608 to die...
    10.0.72.1	Mar  2 09:38:33		user	warning	dpinger[42889]	HENETV6_TUNNELV6 2003:471:27:191::1: sendto error: 50
    10.0.72.1	Mar  2 09:38:33		daemon	err	upsmon[67868]	Poll UPS [eaton@10.0.72.200:3493] failed - Write error: Permission denied
    10.0.72.1	Mar  2 09:38:33		daemon	notice	upsmon[67868]	Communications with UPS eaton@10.0.72.200:3493 lost
    10.0.72.1	Mar  2 09:38:34		user	warning	dpinger[42889]	HENETV6_TUNNELV6 2003:471:27:191::1: sendto error: 50
    10.0.72.1	Mar  2 09:38:34		daemon	info	dhcpd[75177]	Internet Systems Consortium DHCP Server 4.4.1
    10.0.72.1	Mar  2 09:38:34		daemon	info	dhcpd[75177]	Copyright 2004-2018 Internet Systems Consortium.
    10.0.72.1	Mar  2 09:38:34		daemon	info	dhcpd[75177]	All rights reserved.
    10.0.72.1	Mar  2 09:38:34		daemon	info	dhcpd[75177]	For info, please visit https://www.isc.org/software/dhcp/
    10.0.72.1	Mar  2 09:38:34		daemon	info	dhcpd[75177]	Config file: /etc/dhcpd.conf
    10.0.72.1	Mar  2 09:38:34		daemon	info	dhcpd[75177]	Database file: /var/db/dhcpd.leases
    10.0.72.1	Mar  2 09:38:34		local7	info	dhcpd[75177]	Internet Systems Consortium DHCP Server 4.4.1
    10.0.72.1	Mar  2 09:38:34		daemon	info	dhcpd[75177]	PID file: /var/run/dhcpd.pid
    10.0.72.1	Mar  2 09:38:34		local7	info	dhcpd[75177]	Copyright 2004-2018 Internet Systems Consortium.
    10.0.72.1	Mar  2 09:38:34		local7	info	dhcpd[75177]	All rights reserved.
    10.0.72.1	Mar  2 09:38:34		local7	info	dhcpd[75177]	For info, please visit https://www.isc.org/software/dhcp/
    10.0.72.1	Mar  2 09:38:34		local7	info	dhcpd[75177]	Wrote 0 class decls to leases file.
    10.0.72.1	Mar  2 09:38:34		local7	info	dhcpd[75177]	Wrote 0 deleted host decls to leases file.
    10.0.72.1	Mar  2 09:38:34		local7	info	dhcpd[75177]	Wrote 0 new dynamic host decls to leases file.
    10.0.72.1	Mar  2 09:38:34		local7	info	dhcpd[75177]	Wrote 4 leases to leases file.
    10.0.72.1	Mar  2 09:38:34		local7	info	dhcpd[75177]	Listening on BPF/ix1/a0:36:9f:c3:4a:2e/10.0.72.0/24
    10.0.72.1	Mar  2 09:38:34		local7	info	dhcpd[75177]	Sending on   BPF/ix1/a0:36:9f:c3:4a:2e/10.0.72.0/24
    10.0.72.1	Mar  2 09:38:34		local7	info	dhcpd[75177]	Sending on   Socket/fallback/fallback-net
    10.0.72.1	Mar  2 09:38:34		local7	info	dhcpd[75177]	Server starting service.
    10.0.72.1	Mar  2 09:38:34		user	info	dhcpleases[14114]	Sending HUP signal to dns daemon(67233)
    10.0.72.1	Mar  2 09:38:34		user	info	dhcpleases[14114]	Sending HUP signal to dns daemon(67233)
    10.0.72.1	Mar  2 09:38:34		daemon	info	ppp[27697]	waiting for process 8608 to die...
    10.0.72.1	Mar  2 09:38:34		daemon	info	ppp[27697]	waiting for process 8608 to die...
    10.0.72.1	Mar  2 09:38:34		user	err	php[51805]	notify_monitor.php: Could not send the message to anonymous@gmail.com -- Error: Failed to connect to ssl://smtp.gmail.com:465 [SMTP: Failed to connect socket: Operation timed out (code: -1, response: )]
    10.0.72.1	Mar  2 09:38:34		daemon	info	ppp[8608]	[wan] Bundle: Shutdown
    10.0.72.1	Mar  2 09:38:34		daemon	info	ppp[8608]	[wan] Bundle: Shutdown
    10.0.72.1	Mar  2 09:38:34		daemon	info	ppp[8608]	[wan_link0] Link: Shutdown
    10.0.72.1	Mar  2 09:38:34		daemon	info	ppp[8608]	[wan_link0] Link: Shutdown
    10.0.72.1	Mar  2 09:38:34		daemon	info	charon[6152]	10[KNL] interface pppoe0 disappeared
    10.0.72.1	Mar  2 09:38:34		daemon	info	ppp[8608]	process 8608 terminated
    10.0.72.1	Mar  2 09:38:34		daemon	info	ppp[8608]	process 8608 terminated
    10.0.72.1	Mar  2 09:38:34		user	warning	dpinger[42889]	HENETV6_TUNNELV6 2003:471:27:191::1: sendto error: 50
    10.0.72.1	Mar  2 09:38:34		kern	crit	kernel	pf_test: kif == NULL, if_xname pppoe0
    10.0.72.1	Mar  2 09:38:34		user	warning	dpinger[42363]	WAN_PPPOE 213.77.29.236: sendto error: 13
    10.0.72.1	Mar  2 09:38:34		kern	crit	kernel	pf_test: kif == NULL, if_xname аv—.
    10.0.72.1	Mar  2 09:38:34		kern	crit	kernel	pf_test: kif == NULL, if_xname аv—.
    10.0.72.1	Mar  2 09:38:34		kern	crit	kernel	pf_test: kif == NULL, if_xname аv—.
    10.0.72.1	Mar  2 09:38:34		kern	crit	kernel	pf_test: kif == NULL, if_xname аv—.
    10.0.72.1	Mar  2 09:38:34		kern	crit	kernel	pf_test: kif == NULL, if_xname аv—.
    

  • Rebel Alliance Developer Netgate

    Probably the same as that redmine, but I wouldn't worry about it unless it continues to happen on 2.5.0 once we move 2.5.0 up to FreeBSD 12-STABLE. Could be any number of problems on 12.0 causing that.



  • @jimp
    Thanks, will see when it's ready and out.
    Now I've sequence to replicate this bug, planning to test it on 2.4.x, if I have free time in the weekend.



  • It's looks like I've different issue, caused by option "Reset all states if WAN IP Address changes".
    Even having gif/IPv6 interfaces deleted this option triggers those "pf_test: ..." messages if I manipulate with WAN interface or playing with media where WAN (PPPoE) based on.
    Looks like having this option not enabled makes the system stable again.



  • Tested 2.4.5 — not affected. ☺



  • @w0w said in pf_test: kif == NULL, if_xname:

    Tested 2.4.5 — not affected. ☺

    If you can provide me how to set this up and can see if my environment would permit me to test it on my version of 12-STABLE. Its current and just finished rebasing it.

    I'm currently using using a PPPOE connection and using the stf to implement ipv6 via 6RD, The stf has been modified with commits that I pulled forward from 2.4.x.



  • @rschell
    Actually I have multiple WANs, but I can test it in single WAN config later, to see if bug still there or not.
    If you have WAN PPPoE, go to the System-Advanced-Networking and tick the option "Reset all states if WAN IP Address changes" reboot your firewall to make things clean and just try to re-plug the ethernet cable on PPPoE WAN port, unplug, wait for a 15 sec and put it back. If you don't see those pf_test messages in console or in system logs, system does not crashing, it's good.
    Tested, it fails only with multiple WANs and Gateway group configured. So may be I need to test your 12-STABLE version by myself, can I download it somewhere?



  • @w0w I currently don't have any file sharing set up.



  • @rschell
    https://filebin.net/ without registration.





  • Sad to say but I have same error on the same config. Regardless of option "Reset all states if WAN IP Address changes" enabled or not after known sequence I get those pf_test: kif == NULL, if_xname and then Fatal trap 12 a bit later.

    trap12_lily2.jpg



  • @w0w did you try again after rebooting? I seen a page fault occur on my bare metal instance but not VM, but much earlier in the boot process. Second boot it doesn't occur.

    Not likely to fix the original error however.



  • It was rebooted multiple times before I did test sequence to crash it ☺, so it is some bug. I'll wait for Netgate version, when it is out or if somebody else find something. It can be also related to ix driver, not sure, will test it also a bit later



  • On the igb0 I have similar situation, but different message on 2.5.0-DEVELOPMENT (amd64)
    built on Fri Mar 13. Instead of pf_test spam I have this one...

    kernel: igb0: TX(3) desc avail = 1024, pidx = 0
    

    Looks like it's this one
    https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=239240

    rschell do you have the patch mentioned in the thread included in your firewall version?



  • @w0w The patches mentioned in 239240 (MFC 356310) and have been incorporated. If its posted in 12-stable branch of pfsense/FreeBSD-src, as of the build date, its in there .



  • I had that fatal trap 12: page fault message happen on a test machine with 2.5.

    I tested mb,drive,memory,switched cpu everything tested good.

    I finally swapped memory even though the previous mem passed memtest for 24 hrs
    and the machine has been running fine since.



  • @Impatient
    I do not think it's memory, could be Intel card driver, I don't have any problem until I do known sequences on known configuration. Also 2.4.5 is not affected by this bug on the same hardware.



  • If it was the network card would it not mention that in the error?

    It does mention what is not in memory (supervisor read data)

    Then the result of that is the (panic: page fault).

    I run dual wan PPPoE with Reset all states if WAN IP Address changes enabled
    and get no error's running 2.5.0.a.20200313.1633 as of today on above system.
    Previously in 2.5.0.a.20200312.something.

    It has only been running for about 12 hrs. with 1 reboot.



  • @Impatient
    My firewall is overall stable on 2.5.0, I can run it for a month until something triggers the fatal trap or "pf_test: kif == NULL", sometimes it crash on boot and 1 of 3 times crashing or/and this "pf_test: kif == NULL" message spamming, when there is the problem on the PPPoE ethernet port — for example ISP router is out of power or I have unplugged the cable or changed the link speed.
    I repeat, I do not have any of the mentioned problems on 2.4.5 version, so I think it's software problem.



  • Hello all,

    As was announced 2.5 to be soon mainstreamed, I jumped onto the dev branch.

    15 minutes after successful upgrade net went down, I couldnt ping or load pfsense web.
    Error streaming on screen without stop was

    igb1: TX(2) desc avail = 1024, pidx = 0

    Running 2.5.0.a.20200330.1049 (latest x64)
    Potential Solution based on power settings posted here? Not sure how to issue these commands. (from terminal at firewall, thru web, ssh...???)
    https://forum.netgate.com/topic/149643/igbx-bug-tx-1-desc-avail-no-carrier

    Bug report mentioned elsewhere here:
    https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=239240

    Had been running stable until now and never had a single issue.
    Only way to get the firewall working (tried ifconfig up down, bridge up down, reset web config, etc) was to reboot the firewall. Has been working for over 50 minutes now without issue...

    Heres the logs at time of event.
    Mar 30 15:15:15 kernel igb1: TX(2) desc avail = 1024, pidx = 0
    Mar 30 15:15:15 check_reload_status 356 Reloading filter
    Mar 30 15:15:15 php-fpm 343 /rc.linkup: Hotplug event detected for LAN(lan) static IP (192.168.1.1 )
    Mar 30 15:15:14 kernel igb1: link state changed to DOWN
    Mar 30 15:15:14 check_reload_status 356 Linkup starting $igb1
    Mar 30 15:15:14 kernel igb1: TX(1) desc avail = 42, pidx = 712

    Additional Log Hardware Information for comparison to yours if you suffer from this too...
    Mar 30 15:21:04 kernel wlan: mac acl policy registered
    Mar 30 15:21:04 kernel random: entropy device external interface
    Mar 30 15:21:04 kernel Timecounter "TSC" frequency 1600051300 Hz quality 1000
    Mar 30 15:21:04 kernel Launching APs: 3 2 1
    Mar 30 15:21:04 kernel ioapic0 <Version 2.0> irqs 0-114 on motherboard
    Mar 30 15:21:04 kernel random: unblocking device.
    Mar 30 15:21:04 kernel FreeBSD/SMP: 1 package(s) x 4 core(s)
    Mar 30 15:21:04 kernel FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
    Mar 30 15:21:04 kernel WARNING: L1 data cache covers fewer APIC IDs than a core (0 < 1)
    Mar 30 15:21:04 kernel ACPI APIC Table: <ALASKA A M I >
    Mar 30 15:21:04 kernel Event timer "LAPIC" quality 600
    Mar 30 15:21:04 kernel avail memory = 8161169408 (7783 MB)
    Mar 30 15:21:04 kernel real memory = 8589934592 (8192 MB)
    Mar 30 15:21:04 kernel TSC: P-state invariant, performance statistics
    Mar 30 15:21:04 kernel VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
    Mar 30 15:21:04 kernel Structured Extended Features3=0xc000400<IBPB,STIBP>
    Mar 30 15:21:04 kernel Structured Extended Features=0x2282<TSCADJ,SMEP,ERMS,NFPUSG>
    Mar 30 15:21:04 kernel AMD Features2=0x101<LAHF,Prefetch>
    Mar 30 15:21:04 kernel AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
    Mar 30 15:21:04 kernel Features2=0x43d8e3bf<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,MOVBE,POPCNT,TSCDLT,AESNI,RDRAND>
    Mar 30 15:21:04 kernel Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
    Mar 30 15:21:04 kernel Origin="GenuineIntel" Id=0x406c4 Family=0x6 Model=0x4c Stepping=4
    Mar 30 15:21:04 kernel CPU: Intel(R) Celeron(R) CPU J3160 @ 1.60GHz (1600.05-MHz K8-class CPU)
    Mar 30 15:21:04 kernel VT(efifb): resolution 800x600
    Mar 30 15:21:04 kernel FreeBSD clang version 6.0.1 (tags/RELEASE_601/final 335540) (based on LLVM 6.0.1)
    Mar 30 15:21:04 kernel FreeBSD 12.0-RELEASE-p10 ce9563d5729(RELENG_2_5) pfSense amd64

    Suggestions? Recommend I go back to stable?



  • I am waiting for 2.5.0 based on STABLE. Expected to be soon. I hope that it will be released before the end of the world. ☺



  • Good news is it hasnt failed on me again since the one time 15m after original upgrade.

    So... scary start, but stable for nearly 24hours now.

    Hopefully that release is SOON.


Log in to reply