2.4.5-p1 HA carp setup on Hyper-V, high hvevent0 CPU usage
-
We have an HA setup with CARP running on Hyper-V, and it has been running like that for years without any real issues. We are on 2.4.5-p1, on Server 2019 (core) with the Hyper-V role, some VLAN trunk ports, and functionally everything works well. I noticed though that the primary pfSense node consumes a lot of CPU in kernel{hvevent0}, top extract:
PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
11 root 155 ki31 0K 64K RUN 1 513:56 87.41% [idle{idle: cpu1}]
11 root 155 ki31 0K 64K RUN 3 505:40 86.37% [idle{idle: cpu3}]
11 root 155 ki31 0K 64K CPU2 2 499:32 85.76% [idle{idle: cpu2}]
11 root 155 ki31 0K 64K CPU0 0 489:10 70.66% [idle{idle: cpu0}]
0 root -92 - 0K 800K - 0 73:35 26.15% [kernel{hvevent0}]
32077 www 27 0 22016K 11672K CPU1 1 37:05 12.46% /usr/local/sbin/haproxy -f /var/etc/haproxy/haproxy.cf
0 root -92 - 0K 800K - 2 21:36 8.74% [kernel{hvevent2}]
0 root -92 - 0K 800K - 3 20:04 7.87% [kernel{hvevent3}]
At the time of this copy/paste the CPU is about 26%, but I've seen it over 30% plenty of times as well. The second node (running on another Hyper-V box, of course) does not have that problem, even when it is made active and has work to do. Both hosts running these VMs are on the same patch level, KB4577668 as of now. As far as I'm aware nothing obvious has been changed in the pfSense VMs apart from the operational firewall rule changes we need from time to time. As per https://forum.netgate.com/topic/121407/solved-with-workaround-higher-cpu-load-after-upgrade-to-2-4-on-hyper-v/3 and https://redmine.pfsense.org/issues/6882 I did some checks, but neither VM has a CD drive attached at all, nor do they have SNMP enabled. I tried disabling the Hyper-V integration services from the Hyper-V side, powered down the primary pfSense VM and restarted it, and tried enabling and disabling SNMP just to make sure that wasn't the issue. No cigar.
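For reference, from inside the guest those two checks boil down to something like this (the grep is just the standard SNMP port):
camcontrol devlist          # no cd0 / optical device listed
sockstat -4l | grep :161    # nothing listening on the SNMP port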
How to troubleshoot this further?
-
I would check
vmstat -i
and
ps -auxwwd
to see if anything is showing high there. It may not though. That seems to be load from servicing the hv_vmbus.
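For example, from a shell (option 8 on the console menu); the sort/head bits are just a rough filter to float the busiest entries to the top:
vmstat -i | sort -rnk3 | head        # interrupt sources, highest rate column first
ps -auxwwd | sort -rnk3 | head       # processes, highest %CPU column first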
Steve
-
This post is deleted!
-
Yes, that's exactly what I would try; failover to the secondary and see if the load exists there when it is master.
Steve
-
As stated before, when I do a CARP failover to the second node, or even shut down the primary node, the second node shows practically no load at all. However, with the secondary node being active, the primary still shows that load on the hvevent process.
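For clarity, besides Status > CARP (failover) in the GUI, I'm verifying the CARP state on each node from the shell with a rough check like:
ifconfig | grep carp        # shows MASTER / BACKUP per CARP VIP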
So my next step, I guess, will be to mount the pfSense disk in a new VM with the same NIC config and see if anything is wrong from that perspective. If that fails, I can still try rebuilding the whole VM.
Or would there be any other clues?
-
I had to delete my second post, as it turns out I forgot to mask the IP addresses.
I'll post it again below, but keep in mind that it is actually a response to stephenw10's first reply.
vmstat -i
interrupt                          total       rate
cpu0:timer                      18539648        160
cpu1:timer                      14713803        127
cpu2:timer                      15964083        138
cpu3:timer                      15701489        136
cpu0:hyperv                    478580870       4130
cpu1:hyperv                     99633883        860
cpu2:hyperv                     99907288        862
cpu3:hyperv                     90973736        785
Total                          834014800       7198
ps -auxwwd
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
root 0 88.8 0.1 0 800 - DLs Wed01 499:07.14 [kernel]
root 11 292.9 0.0 0 64 - RNL Wed01 6830:15.44 - [idle]
root 12 1.0 0.0 0 224 - WL Wed01 8:17.97 - [intr]
root 1 0.0 0.1 5016 848 - ILs Wed01 0:00.01 - /sbin/init --
www 32077 16.0 0.8 22016 11932 - Ss Wed01 124:37.03 |-- /usr/local/sbin/haproxy -f /var/etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -D
root 344 0.0 1.7 94900 25024 - Ss Wed01 0:02.65 |-- php-fpm: master process (/usr/local/lib/php-fpm.conf) (php-fpm)
root 10455 0.0 2.6 99124 39452 - S 23:31 0:41.73 | |-- php-fpm: pool nginx (php-fpm)
root 55168 0.0 2.6 99124 39256 - S 23:34 0:41.27 | |-- php-fpm: pool nginx (php-fpm)
root 98139 0.0 2.6 99124 39256 - S 23:08 0:42.40 | `-- php-fpm: pool nginx (php-fpm)
root 359 0.0 0.2 6756 2640 - INs Wed01 0:00.02 |-- /usr/local/sbin/check_reload_status
root 361 0.0 0.2 6756 2396 - IN Wed01 0:00.00 | `-- check_reload_status: Monitoring daemon of check_reload_status
root 414 0.0 0.3 9160 4984 - Is Wed01 0:00.06 |-- /sbin/devd -q -f /etc/pfSense-devd.conf
root 7792 0.0 0.4 14460 6796 - Is Wed01 0:00.00 |-- /usr/local/sbin/mpd5 -b -d /var/etc/l2tp-vpn -p /var/run/l2tp-vpn.pid -s l2tps l2tps
root 8312 0.0 0.2 6380 2376 - Is Wed01 0:00.18 |-- /usr/sbin/cron -s
root 13350 0.0 0.4 12472 5948 - Ss Wed01 0:08.50 |-- /usr/local/sbin/ntpd -g -c /var/etc/ntpd.conf -p /var/run/ntpd.pid
proxy 14438 0.0 0.1 6268 2212 - Ss Wed01 0:02.25 |-- /usr/sbin/ftp-proxy -a <masked ip> -v
root 46084 0.0 0.5 21636 7396 - Is 11:02 0:00.00 |-- nginx: master process /usr/local/sbin/nginx -c /var/etc/nginx-webConfigurator.conf (nginx)
root 46259 0.0 0.6 23684 8360 - S 11:02 0:14.40 | |-- nginx: worker process (nginx)
root 46466 0.0 0.5 23684 8280 - I 11:02 0:03.23 | `-- nginx: worker process (nginx)
root 49364 0.0 0.1 6208 2220 - Is 11:07 0:04.66 |-- /usr/sbin/hv_kvp_daemon
root 49818 0.0 0.1 6196 2004 - Is 11:07 0:00.00 |-- /usr/sbin/hv_vss_daemon
root 51968 0.0 0.2 6976 2612 - IN 00:01 0:00.00 |-- /bin/sh /etc/rc.update_pkg_metadata
root 52083 0.0 0.1 4144 1824 - INC 00:01 0:00.00 | `-- sleep 78926
root 60998 0.0 0.2 8168 3736 - Is Wed01 0:00.00 |-- /usr/local/libexec/ipsec/starter --daemon charon
root 61228 0.0 1.1 55900 17208 - Is Wed01 0:10.75 | `-- /usr/local/libexec/ipsec/charon --use-syslog
root 73334 0.0 0.5 12676 7344 - Is 11:03 0:00.00 |-- /usr/sbin/sshd
root 7545 0.0 0.5 12968 7828 - Ss 09:53 0:00.27 | `-- sshd: <user>@pts/0 (sshd)
root 37777 0.0 0.2 6976 2640 0 Is 09:53 0:00.00 | `-- /bin/sh /etc/rc.initial
root 48222 0.0 0.2 7284 3484 0 S 09:53 0:00.03 | `-- /bin/tcsh
root 2424 0.0 0.2 6824 2672 0 R+ 10:00 0:00.00 | `-- ps -auxwwd
root 77295 0.0 0.2 6908 2344 - Is Wed01 0:15.63 |-- /usr/local/bin/dpinger -S -r 0 -i <gateway> -B <masked ip> -p /var/run/dpinger_<gateway>~<masked ip>~<masked ip>.pid -u /var/run/dpinger_<gateway>~<masked ip>~<masked ip>.sock -C /etc/rc.gateway_alarm -d 1 -s 500 -l 2000 -t 60000 -A 1000 -D 500 -L 20 <masked ip>
root 86712 0.0 0.4 10312 5692 - Ss Wed01 208:14.88 |-- /usr/local/sbin/openvpn --config /var/etc/openvpn/server1.conf
root 88850 0.0 0.2 6968 2808 - Ss Wed01 0:52.38 |-- /usr/local/sbin/filterlog -i pflog0 -p /var/run/filterlog.pid
root 90697 0.0 0.2 6412 2476 - Ss Wed01 0:40.06 |-- /usr/sbin/syslogd -s -c -c -l /var/dhcpd/var/run/log -l /tmp/haproxy_chroot/var/run/log -P /var/run/syslog.pid -f /etc/syslog.conf
root 43415 0.0 0.2 6976 2488 - Is 10:59 0:00.00 | `-- /bin/sh /usr/local/sbin/sshguard -i /var/run/sshguard.pid
root 43635 0.0 0.1 6196 1876 - I 10:59 0:00.00 | |-- /bin/cat
root 43907 0.0 0.3 12016 5184 - IC 10:59 0:00.00 | |-- /usr/local/libexec/sshg-parser
root 44184 0.0 0.2 6536 2372 - IC 10:59 0:00.05 | |-- /usr/local/libexec/sshg-blocker
root 44469 0.0 0.2 6976 2488 - I 10:59 0:00.00 | `-- /bin/sh /usr/local/sbin/sshguard -i /var/run/sshguard.pid
root 44518 0.0 0.2 6976 2476 - I 10:59 0:00.00 | `-- /bin/sh /usr/local/libexec/sshg-fw-pf
root 91192 0.0 0.1 6192 1896 - Is Wed01 0:00.00 |-- /usr/local/bin/minicron 240 /var/run/ping_hosts.pid /usr/local/bin/ping_hosts.sh
root 91342 0.0 0.1 6192 1912 - I Wed01 0:00.04 | `-- minicron: helper /usr/local/bin/ping_hosts.sh (minicron)
root 91357 0.0 0.1 6192 1896 - Is Wed01 0:00.00 |-- /usr/local/bin/minicron 3600 /var/run/expire_accounts.pid /usr/local/sbin/fcgicli -f /etc/rc.expireaccounts
root 91586 0.0 0.1 6192 1912 - I Wed01 0:00.00 | `-- minicron: helper /usr/local/sbin/fcgicli -f /etc/rc.expireaccounts (minicron)
root 91798 0.0 0.1 6192 1896 - Is Wed01 0:00.00 |-- /usr/local/bin/minicron 86400 /var/run/update_alias_url_data.pid /usr/local/sbin/fcgicli -f /etc/rc.update_alias_url_data
root 92231 0.0 0.1 6192 1912 - I Wed01 0:00.00 | `-- minicron: helper /usr/local/sbin/fcgicli -f /etc/rc.update_alias_url_data (minicron)
root 95054 0.0 1.9 241920 28616 - Is Wed01 0:13.11 |-- /usr/local/sbin/filterdns -p /var/run/filterdns.pid -i 300 -c /var/etc/filterdns.conf -d 1
root 47601 0.0 0.2 6976 2536 v0- SN 11:02 0:15.75 |-- /bin/sh /var/db/rrd/updaterrd.sh
root 91893 0.0 0.1 4144 1824 - SNC 10:00 0:00.00 | `-- sleep 60
root 69685 0.0 0.1 6316 2032 v0 Is+ 11:26 0:00.00 |-- /usr/libexec/getty Pc ttyv0
root 61754 0.0 0.1 6316 2032 v1 Is+ Wed01 0:00.00 |-- /usr/libexec/getty Pc ttyv1
root 62014 0.0 0.1 6316 2032 v2 Is+ Wed01 0:00.00 |-- /usr/libexec/getty Pc ttyv2
root 62204 0.0 0.1 6316 2032 v3 Is+ Wed01 0:00.00 |-- /usr/libexec/getty Pc ttyv3
root 62552 0.0 0.1 6316 2032 v4 Is+ Wed01 0:00.00 |-- /usr/libexec/getty Pc ttyv4
root 62819 0.0 0.1 6316 2032 v5 Is+ Wed01 0:00.00 |-- /usr/libexec/getty Pc ttyv5
root 62865 0.0 0.1 6316 2032 v6 Is+ Wed01 0:00.00 |-- /usr/libexec/getty Pc ttyv6
root 63124 0.0 0.1 6316 2032 v7 Is+ Wed01 0:00.00 `-- /usr/libexec/getty Pc ttyv7
root 2 0.0 0.0 0 16 - DL Wed01 0:00.00 - [crypto]
root 3 0.0 0.0 0 16 - DL Wed01 0:00.00 - [crypto returns 0]
root 4 0.0 0.0 0 16 - DL Wed01 0:00.00 - [crypto returns 1]
root 5 0.0 0.0 0 16 - DL Wed01 0:00.00 - [crypto returns 2]
root 6 0.0 0.0 0 16 - DL Wed01 0:00.00 - [crypto returns 3]
root 7 0.0 0.0 0 32 - DL Wed01 0:00.00 - [cam]
root 8 0.0 0.0 0 16 - DL Wed01 0:00.02 - [soaiod1]
root 9 0.0 0.0 0 16 - DL Wed01 0:00.02 - [soaiod2]
root 10 0.0 0.0 0 16 - DL Wed01 0:00.00 - [audit]
root 13 0.0 0.0 0 64 - DL Wed01 0:00.00 - [ng_queue]
root 14 0.0 0.0 0 48 - DL Wed01 0:00.03 - [geom]
root 15 0.0 0.0 0 16 - DL Wed01 0:00.00 - [sequencer 00]
root 16 0.0 0.0 0 16 - DL Wed01 0:00.02 - [soaiod3]
root 17 0.0 0.0 0 16 - DL Wed01 0:00.01 - [soaiod4]
root 18 0.0 0.0 0 16 - DL Wed01 0:00.00 - [sctp_iterator]
root 19 0.0 0.0 0 16 - DL Wed01 0:26.85 - [pf purge]
root 20 0.0 0.0 0 16 - DL Wed01 0:40.00 - [rand_harvestq]
root 21 0.0 0.0 0 48 - DL Wed01 0:02.03 - [pagedaemon]
root 22 0.0 0.0 0 16 - DL Wed01 0:00.00 - [vmdaemon]
root 23 0.0 0.0 0 16 - DNL Wed01 0:00.00 - [pagezero]
root 24 0.0 0.0 0 32 - DL Wed01 0:02.41 - [bufdaemon]
root 25 0.0 0.0 0 16 - DL Wed01 0:00.54 - [bufspacedaemon]
root 26 0.0 0.0 0 16 - DL Wed01 0:04.46 - [syncer]
root 27 0.0 0.0 0 16 - DL Wed01 0:00.60 - [vnlru]
root 66 0.0 0.0 0 16 - DL Wed01 0:00.09 - [md0]
No obvious culprits to me, just kernel and specifically hvevent0 being high.
The only obvious difference between the two pfSense VMs is that the primary one runs on a dedicated HPE DL360 Gen8 which, even with the latest firmware and BIOS, has "Current PTI status: Enabled" by means of 'auto detect'. The second node runs on a Gen10 with loads of other VMs, but that one has PTI disabled. That shouldn't make too much of a difference, especially as it's the Hyper-V services that are taking the load, rather than some actual BSD OS related stuff.
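If it matters, the guest-side mitigation state on the two VMs can also be compared directly, assuming these sysctls are exposed in this pfSense/FreeBSD version:
sysctl vm.pmap.pti          # page table isolation inside the guest kernel (1 = enabled)
sysctl hw.ibrs_active       # IBRS mitigation state inside the guest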
I'll try moving the passive box to the same host as the primary, and see what happens.
-
I've had no success so far. In the past I had one VM with something of a corrupted configuration (from the hypervisor's perspective), which made it very slow to live-migrate to another host. In the end I removed that VM, created a new VM with the same disk / network config, and that issue was gone.
I did the same with this pfSense VM, but alas, no cigar. The next thing I'm going to try is moving this first node to a Gen10 host, like the second node. While pfSense shouldn't be that CPU intensive, who knows, there may be some kind of hardware issue, even if virtualized. If THAT doesn't work, I guess I'll reinstall it from scratch and restore the config.
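Pulling the config off beforehand is trivial at least (path as on a stock install, destination just an example, assuming SSH is enabled):
scp root@<primary-node>:/conf/config.xml ./config-primary.xml
Restoring it afterwards would go through Diagnostics > Backup & Restore.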
It'll take a while though; I can't do a live failover of pfSense unfortunately, as most connections go through haproxy, which doesn't sync its state.
-
It could well be something like that. That load appears to be all hypervisor generated; it's unclear what could be causing it.
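A rough way to see whether that load tracks the vmbus interrupt rate is to watch it live for a minute while there is traffic, e.g.:
systat -vmstat 1            # live interrupt rates; the hyperv sources should show up there
or just run vmstat -i twice a few seconds apart and compare the hyperv counters.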
https://www.freebsd.org/cgi/man.cgi?query=hv_vmbus&sektion=4
Steve
-
Coming back to this, I have to come clean. Next weekend is my monthly maintenance window, and to rule some things out I did a CARP failover to the second node tonight. Right after the failover there was practically no CPU load on the second node. However, now that there is actually a lot of traffic going through it, I see the same phenomenon on the second node as well. I suspect I had earlier run top -a on that second node rather than top -aSH, which showed no kernel usage at all. Now, on the second node, haproxy is consuming about 9-11% and kernel{hvevent0} about 15%.
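For reference, the difference that tripped me up, as I understand the flags:
top -a         # full command lines, but system/kernel processes are hidden by default
top -aSH       # -S adds system processes, -H lists threads individually (which is where kernel{hvevent0} shows up)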
Even though I wonder why there is so much kernel usage at a rather low overall load (about 10,000 states, which is peanuts), it seems to be a hypervisor integration issue, or maybe even just regular kernel load. I can't try another hypervisor or even a physical machine, as these pfSense boxes are in production. In my test setup I can't generate that amount of load, which is probably why I never saw the issue there.
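The state count I'm quoting is just what pf reports, e.g.:
pfctl -si | grep -i 'current entries'    # current state table entries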
-
Ah, OK, so when you fail over you are actually seeing about the same load on the other node?
It's probably just where the load for that shows up. Unless you're actually seeing a problem there, I wouldn't worry too much.
Steve
-
I'll leave it as it is. Thanks for the insights!