Netgate Discussion Forum

    pfTop hangs my GUI in 2.5.2 RC

    2.5.2 Release Candidate Snapshots (Retired)
    35 Posts 4 Posters 7.0k Views
      Trey @jimp

      Block bogon networks was now on before reboot and is still...

      pfctl shows up in top every now and then and uses a lot of cpu...

        jimp Rebel Alliance Developer Netgate

        It's not likely related to bogons or aliases/tables at this point, but something in the state table.

        I've started https://redmine.pfsense.org/issues/12045 for this, and one of the other devs has a lead on a possible solution.

        We're still trying to find a way to replicate it locally, but no luck yet.

        Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

        Need help fast? Netgate Global Support!

        Do not Chat/PM for help!

          jimp Rebel Alliance Developer Netgate

          OK, I can't quite get up to the number of states you had with a quick and dirty test, but I was able to get up to about 900 states and I definitely saw a slowdown.

          20-50 states: 0.01s
          300-450 states: 3s
          850 states: 20s

          I could easily see it degrading fast. I need more data points, but that certainly appears to be significant growth. The FreeBSD commit I linked in the Redmine issue above mentions factorial time (O(N!)), which would be quite bad in terms of efficiency.
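
To get a rough feel for how fast the runtime grows, the three timings quoted above can be put on a log-log scale. This is only a sketch: the state counts are the midpoints of the quoted ranges (an assumption), and three coarse measurements cannot settle the actual complexity.

```python
import math

# Rough timings reported in the post above. Each state count is the
# midpoint of the quoted range, which is an assumption made here.
samples = [
    (35, 0.01),   # "20-50 states: 0.01s"
    (375, 3.0),   # "300-450 states: 3s"
    (850, 20.0),  # "850 states: 20s"
]

def local_exponents(points):
    """Return the log-log slope between each pair of consecutive points.

    If runtime grows like c * N**k, the slope equals k.
    """
    slopes = []
    for (n0, t0), (n1, t1) in zip(points, points[1:]):
        slopes.append(math.log(t1 / t0) / math.log(n1 / n0))
    return slopes

exponents = local_exponents(samples)
for (n0, _), (n1, _), k in zip(samples, samples[1:], exponents):
    print(f"{n0} -> {n1} states: runtime grows roughly like N^{k:.2f}")
```

With these midpoints the two slopes come out at about 2.4 and 2.3, so over this small range the slowdown looks steeper than quadratic. More data points at higher state counts would be needed to say anything firmer.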

          We're working on getting a fix into a new build, it should be available soon.

            Trey @jimp

            Sounds great. I did a ktrace and kdump on pfctl -ss:

            62033 pfctl    0.126404 CALL  mmap(0,0xa01000,0x3<PROT_READ|PROT_WRITE>,0x1002<MAP_PRIVATE|MAP_ANON>,0xffffffff,0)
             62033 pfctl    0.126446 RET   mmap 34374418432/0x800e00000
             62033 pfctl    0.126700 CALL  ioctl(0x3,DIOCGETSTATESNV,0x7fffffffe410)
             62033 pfctl    71.020411 RET   ioctl 0
             62033 pfctl    71.020973 CALL  mmap(0,0x5000,0x3<PROT_READ|PROT_WRITE>,0x1002<MAP_PRIVATE|MAP_ANON>,0xffffffff,0)
             62033 pfctl    71.020984 RET   mmap 34368081920/0x8007f5000
            

            It spends about 71 seconds (I'm not sure how to read the kdump time output) in ioctl(0x3,DIOCGETSTATESNV,0x7fffffffe410). Does this confirm the nvlist bottleneck, or is the ioctl something else?
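
One way to check where the time goes is to pair each CALL with its RET and subtract the timestamps. The sketch below does that for the six lines pasted above. It assumes the third column is elapsed time in seconds since the start of the trace, and the helper name `syscall_durations` is made up for this example.

```python
import re
from collections import defaultdict

# The six trace lines quoted in the post above.
TRACE = """\
62033 pfctl    0.126404 CALL  mmap(0,0xa01000,0x3<PROT_READ|PROT_WRITE>,0x1002<MAP_PRIVATE|MAP_ANON>,0xffffffff,0)
 62033 pfctl    0.126446 RET   mmap 34374418432/0x800e00000
 62033 pfctl    0.126700 CALL  ioctl(0x3,DIOCGETSTATESNV,0x7fffffffe410)
 62033 pfctl    71.020411 RET   ioctl 0
 62033 pfctl    71.020973 CALL  mmap(0,0x5000,0x3<PROT_READ|PROT_WRITE>,0x1002<MAP_PRIVATE|MAP_ANON>,0xffffffff,0)
 62033 pfctl    71.020984 RET   mmap 34368081920/0x8007f5000
"""

# pid, command, timestamp, record type, syscall name
LINE = re.compile(r"^\s*(\d+)\s+(\S+)\s+(\d+\.\d+)\s+(CALL|RET)\s+(\w+)")

def syscall_durations(trace):
    """Sum the time between each CALL and its matching RET, per syscall."""
    pending = {}                 # (pid, syscall) -> timestamp of open CALL
    totals = defaultdict(float)  # syscall -> total seconds
    for line in trace.splitlines():
        m = LINE.match(line)
        if not m:
            continue  # skip NAMI, STRU, GIO and data lines
        pid, _comm, ts, kind, name = m.groups()
        key = (pid, name)
        if kind == "CALL":
            pending[key] = float(ts)
        elif key in pending:
            totals[name] += float(ts) - pending.pop(key)
    return dict(totals)

durations = syscall_durations(TRACE)
for name, seconds in sorted(durations.items(), key=lambda kv: -kv[1]):
    print(f"{name:8s} {seconds:.6f} s")
```

On these lines the single DIOCGETSTATESNV ioctl accounts for about 70.89 seconds, while the two mmap calls together take well under a millisecond. If the timestamp assumption holds, nearly all of the wait is inside that one ioctl.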

              Trey @Trey

              Okay, that is easy to answer:

              Add DIOCGETSTATESNV, an nvlist-based alternative to DIOCGETSTATES.

              Also from Netgate: https://reviews.freebsd.org/D30243

              So this should be it...

                Trey @Trey

                Okay, just from scrolling through the kdump output of pfctl -ss: does Netgate have a developer for the pfctl binary? From my limited understanding, it touches two files on the filesystem for each state, so 2400 states x 2 file accesses.

                To me it looks a little insane not to cache "/etc/nsswitch.conf" and "/etc/protocols", as it must waste a lot of time. Perhaps not, because the OS is caching them, but it still seems like a good idea to cache all that stuff.
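
The caching idea in the paragraph above can be sketched as follows. This is not pfctl's code (pfctl is written in C). It only illustrates parsing a protocols-style file once and reusing the table for every state. The excerpt, the function names, and the second entry in `states` are made up for the example.

```python
# A short excerpt in /etc/protocols format (name, number, aliases, comment).
PROTOCOLS_TEXT = """\
# Internet protocols
ip      0   IP      # internet protocol, pseudo protocol number
icmp    1   ICMP    # internet control message protocol
tcp     6   TCP     # transmission control protocol
udp     17  UDP     # user datagram protocol
"""

def parse_protocols(text):
    """Build a {number: name} table from /etc/protocols-style text."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        fields = line.split()
        if len(fields) >= 2 and fields[1].isdigit():
            # Keep the first name seen for each protocol number.
            table.setdefault(int(fields[1]), fields[0])
    return table

# Parse once, before the per-state loop, instead of once per state.
PROTO_NAMES = parse_protocols(PROTOCOLS_TEXT)

def proto_name(number):
    """Look up a protocol name in the cached table; fall back to the number."""
    return PROTO_NAMES.get(number, str(number))

# Hypothetical state list: (protocol number, rest of the state line).
states = [
    (6, "192.168.0.68:80 <- 192.168.11.7:50425"),
    (17, "example udp state"),
]
for proto, desc in states:
    print("all", proto_name(proto), desc)
```

Each lookup is then a dictionary access rather than a fresh open, read, and close of the file.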

                The following sequence really is executed for each state:

                
                 62033 pfctl    71.128083 CALL  fstatat(AT_FDCWD,0x80032b331,0x7fffffffe240,0)
                 62033 pfctl    71.128086 NAMI  "/etc/nsswitch.conf"
                 62033 pfctl    71.128091 STRU  struct stat {dev=159, ino=26564847, mode=0100644, nlink=1, uid=0, gid=0, rdev=53065152, atime=1623785215.468672000, mtime=1623781736.833527000, ctime=1623781736.833527000, birthtime=1623619006, size=188, blksize=32768, blocks=8, flags=0x0 }
                 62033 pfctl    71.128093 RET   fstatat 0
                 62033 pfctl    71.128096 CALL  open(0x80032e6b6,0x100000<O_RDONLY|O_CLOEXEC>)
                 62033 pfctl    71.128099 NAMI  "/etc/protocols"
                 62033 pfctl    71.128103 RET   open 4
                 62033 pfctl    71.128106 CALL  fstat(0x4,0x7fffffffdea0)
                 62033 pfctl    71.128108 STRU  struct stat {dev=159, ino=26564791, mode=0100644, nlink=1, uid=0, gid=0, rdev=53064232, atime=1623785215.488007000, mtime=1623619006, ctime=1623752821.339423000, birthtime=1623619006, size=6394, blksize=32768, blocks=16, flags=0x0 }
                 62033 pfctl    71.128110 RET   fstat 0
                 62033 pfctl    71.128112 CALL  read(0x4,0x80226bf80,0x8000)
                 62033 pfctl    71.128117 GIO   fd 4 read 4096 bytes
                       "#
                	# Internet protocols
                	#
                	# $FreeBSD$
                	#	from: @(#)protocols	5.1 (Berkeley) 4/17/89
                	#
                	# See also http://www.iana.org/assignments/protocol-numbers
                	#
                	ip	0	IP		# internet protocol, pseudo protocol number
                	#hopopt	0	HOPOPT		# hop-by-hop options for ipv6
                	icmp	1	ICMP		# internet control message protocol
                	igmp	2	IGMP		# internet group management protocol
                	ggp	3	GGP		# gateway-gateway protocol
                	ipencap	4	IP-ENCAP	# IP encapsulated in IP (officially ``IP'')
                	st2	5	ST2		# ST2 datagram mode (RFC 1819) (officially ``ST'')
                	tcp	6	TCP		# transmission control protocol
                	cbt	7	CBT		# CBT, Tony Ballardie <A.Ballardie@cs.ucl.ac.uk>
                	egp	8	EGP		# exterior gateway protocol
                	igp	9	IGP		# any private interior gateway (Cisco: for IGRP)
                	bbn-rcc	10	BBN-RCC-MON	# BBN RCC Monitoring
                	nvp	11	NVP-II		# Network Voice Protocol
                	pup	12	PUP		# PARC universal packet protocol
                	argus	13	ARGUS		# ARGUS
                	emcon	14	EMCON		# EMCON
                	xnet	15	XNET		# Cross Net Debugger
                	chaos	16	CHAOS		# Chaos
                	udp	17	UDP		# user datagram protocol
                	mux	18	MUX		# Multiplexing protocol
                	dcn	19	DCN-MEAS	# DCN Measurement Subsystems
                	hmp	20	HMP		# host monitoring protocol
                	prm	21	PRM		# packet radio measurement protocol
                	xns-idp	22	XNS-IDP		# Xerox NS IDP
                	trunk-1	23	TRUNK-1		# Trunk-1
                	trunk-2	24	TRUNK-2		# Trunk-2
                	leaf-1	25	LEAF-1		# Leaf-1
                	leaf-2	26	LEAF-2		# Leaf-2
                	rdp	27	RDP		# "reliable datagram" protocol
                	irtp	28	IRTP		# Internet Reliable Transaction Protocol
                	iso-tp4	29	ISO-TP4		# ISO Transport Protocol Class 4
                	netblt	30	NETBLT		# Bulk Data Transfer Protocol
                	mfe-nsp	31	MFE-NSP		# MFE Network Services Protocol
                	merit-inp	32	MERIT-INP	# MERIT Internodal Protocol
                	dccp	33	DCCP		# Datagram Congestion Control Protocol
                	3pc	34	3PC		# Third Party Connect Protocol
                	idpr	35	IDPR		# Inter-Domain Policy Routing Protocol
                	xtp	36	XTP		# Xpress Transfer Protocol
                	ddp	37	DDP		# Datagram Delivery Protocol
                	idpr-cmtp	38	IDPR-CMTP	# IDPR Control Message Transport Proto
                	tp++	39	TP++		# TP++ Transport Protocol
                	il	40	IL		# IL Transport Protocol
                	ipv6	41	IPV6		# ipv6
                	sdrp	42	SDRP		# Source Demand Routing Protocol
                	ipv6-route	43	IPV6-ROUTE	# routing header for ipv6
                	ipv6-frag	44	IPV6-FRAG	# fragment header for ipv6
                	idrp	45	IDRP		# Inter-Domain Routing Protocol
                	rsvp	46	RSVP		# Resource ReSerVation Protocol
                	gre	47	GRE		# Generic Routing Encapsulation
                	dsr	48	DSR		# Dynamic Source Routing Protocol
                	bna	49	BNA		# BNA
                	esp	50	ESP		# encapsulating security payload
                	ah	51	AH		# authentication header
                	i-nlsp	52	I-NLSP		# Integrated Net Layer Security TUBA
                	swipe	53	SWIPE		# IP with Encryption
                	narp	54	NARP		# NBMA Address Resolution Protocol
                	mobile	55	MOBILE		# IP Mobility
                	tlsp	56	TLSP		# Transport Layer Security Protocol
                	skip	57	SKIP		# SKIP
                	ipv6-icmp	58	IPV6-ICMP	icmp6	# ICMP for IPv6
                	ipv6-nonxt	59	IPV6-NONXT	# no next header for ipv6
                	ipv6-opts	60	IPV6-OPTS	# destination options for ipv6
                	#	61			# any host internal protocol
                	cftp	62	CFTP		# CFTP
                	#	63			# any local network
                	sat-expak	64	SAT-EXPAK	# SATNET and Backroom EXPAK
                	kryptolan	65	KRYPTOLAN	# Kryptolan
                	rvd	66	RVD		# MIT Remote Virtual Disk Protocol
                	ippc	67	IPPC		# Internet Pluribus Packet Core
                	#	68			# any distributed filesystem
                	sat-mon	69	SAT-MON		# SATNET Monitoring
                	visa	70	VISA		# VISA Protocol
                	ipcv	71	IPCV		# Internet Packet Core Utility
                	cpnx	72	CPNX		# Computer Protocol Network Executive
                	cphb	73	CPHB		# Computer Protocol Heart Beat
                	wsn	74	WSN		# Wang Span Network
                	pvp	75	PVP		# Packet Video Protocol
                	br-sat-mon	76	BR-SAT-MON	# Backroom SATNET Monitoring
                	sun-nd	77	SUN-ND		# SUN ND PROTOCOL-Temporary
                	wb-mon	78	WB-MON		# WIDEBAND Monitoring
                	wb-expak	79	WB-EXPAK	# WIDEBAND EXPAK
                	iso-ip	80	ISO-IP		# ISO Internet Protocol
                	vmtp	81	VMTP		# Versatile Message Transport
                	secure-vmtp	82	SECURE-VMTP	# SECURE-VMTP
                	vines	83	VINES		# VINES
                	ttp	84	TTP		# TTP
                	#iptm	84	IPTM		# Protocol Internet Protocol Traffic
                	nsfnet-igp	85	NSFNET-IGP	# NSFNET-IGP
                	dgp	86	DGP		# Dissimilar Gateway Protocol
                	tcf	87	TCF		# TCF
                	eigrp	88	EIGRP		# Enhanced Interior Routing Protocol (Cisco)
                	ospf	89	OSPFIGP		# Open Shortest Path First IGP
                	sprite-rpc	90	Sprite-RPC	# Sprite RPC Protocol
                	larp	91	LARP		# Locus Address Resolution Protocol
                	mtp	"
                 62033 pfctl    71.128119 RET   read 6394/0x18fa
                 62033 pfctl    71.128125 CALL  close(0x4)
                 62033 pfctl    71.128129 RET   close 0
                 62033 pfctl    71.128137 CALL  write(0x1,0x802252000,0x4c)
                 62033 pfctl    71.128142 GIO   fd 1 wrote 76 bytes
                       "all tcp 192.168.0.68:80 <- 192.168.11.7:50425       ESTABLISHED:ESTABLISHED
                       "
                
                  mfld LAYER 8 @maverick_slo

                  I was about to open a thread here on the same before daring to touch redmine.

                  I noticed this thread on Reddit.

                  Lo and behold, I upgraded a 2.4.5-p1 to 2.5.2-RC and now the CPU is locked with high load from pfctl -ss.

                  The system is a virtual instance in KVM with a single core and 1GB RAM. Less than half the RAM is used. No fancy packages. The normal workload, serving DNS recursion over tcp/853 and udp/53, averages 30-40k states out of 100k, so it is not yet scaling.

                  In my case the states are mostly UDP connections from DNS clients. The system used to handle 30-40k states like this without breaking a sweat, with CPU utilization below 10%. Now it handles around 1k states OK; once there are around 5k states, it falls over.

                  screenshot at just under 5000 states

                  I cloned the instance, upgraded it to a 2.6.0 snapshot, and see the same there.

                  I will try updating to today's 2.5.2-RC build and see if it helps.

                  Current Base System: 2.5.2.r.20210613.1712
                  Latest Base System: 2.5.2.r.20210615.1851

                  Edit2: Initially I saw high load after updating to 2.5.2.r.20210615.1851, but that was because I had ~20k new connections coming in all at once after the reboot, and this is a 5 dollar VPS, so... Happy to report that, having been live on 2.5.2.r.20210615.1851 for about 20 minutes now, I am back to tens of thousands of states with "normal for my spec" load.

                    maverick_slo

                    I just updated to the latest build and the issue is gone :)
                    Thanks!

                      mfld LAYER 8 @maverick_slo

                      And now I pray that https://redmine.pfsense.org/issues/11545 will also make it into 2.5.2 🙏

                      Big improvements from 2.5.0 to 2.5.2. I'd hazard a guess the crew did not 9-5 these past few weeks. #insomnia

                        jimp Rebel Alliance Developer Netgate

                        Looks good for me here too on the latest build. I can't replicate any slowness on the latest build. I only had it up to about 1k states, but that was enough to see a significant impact previously.

                          Trey @jimp

                          @jimp

                          After updating 4 days ago, no more problems. Thanks.

                            jimp Rebel Alliance Developer Netgate

                            There is a new snapshot up now (2.5.2.r.20210629.1350) which should be much better here. Update and give it a try.

                              Trey @jimp

                              @jimp

                              I just updated. Anything outstanding to test or try? What specifically changed in this version regarding the pfctl problem?

                                jimp Rebel Alliance Developer Netgate

                                We rolled back all the recent pf changes so it's closer to what's in 21.05/2.5.1 (but with multi-WAN fixed). The new code needs some optimization work yet and we didn't want to hold up 2.5.2 any longer.

                                  Trey @jimp

                                  @jimp Great news! After a 10-minute test drive, no problems. Hope 2.5.2 is released very soon. :-) The multi-WAN problem sometimes drives my IPsec VPN nuts: 10 packets go through, then 10 packets get dropped.

                                  Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.