Haproxy 100% cpu usage

stephenw10

@coreybrett said in Haproxy 100% cpu usage:

No panics for me

Was it panicking in 2.9.1?

coreybrett

@stephenw10 no, that was never an issue for me

coreybrett

These sites have very little traffic, but they serve webhooks that must work when called.

Luca De Andreis

@coreybrett

Before upgrading HAProxy, while I was still on version 2.9.1, I had instability, pfSense reboots, and high CPU load on the production system (which carries the higher network load).
On an identical system under low load I saw only wasted CPU, with no system instability.
In my opinion, with HAProxy 2.9.1 the crash occurs only once system load exceeds a certain level.
With 2.9.7 everything was OK after 11 days of uptime.

Luca De Andreis

@stephenw10

Hi !

This morning my pfSense Plus 24.03 system with HAProxy crashed.
dump.tar.gz

stephenw10

Hmm, disappointing. Looks like exactly the same crash.

You didn't see the high CPU loading before it panicked?

Luca De Andreis

@stephenw10

No, CPU usage was low or very low on HAProxy 2.9.7; the abnormally high CPU usage was on 2.9.1.
Just a note: pfSense Plus runs in a VM on more than 40 of our firewalls, and only pfSense with HAProxy shows this behavior.

coreybrett

I was not able to use more than one backend server. With only one backend, it would run without any issues, but with more than one backend it wouldn't last 24 hours before it consumed 100% of the CPU and stopped accepting new connections.

The HAP logs did not reveal anything. However, I was looking through the system logs and noticed that one of my WAN interfaces was glitching, and that coincided with HAP locking up.

The WAN interface was not really necessary (it had a static IP and was connected to a Cradlepoint cellular modem), so I just disabled it to see what would happen. This interface was not used by the HAP config at all. HAP has been solid since and is working fine with my full config that has 4 backends. This also fixed another issue I had with "loading the rules: pfctl: DIOCADDRULENV" errors.
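
The check described above (interface glitches in the system log lining up with the HAProxy lockups) could be scripted roughly as follows. This is only a minimal sketch, not something posted in the thread. It assumes BSD-style syslog lines that begin with a timestamp such as `Mar 14 03:12:45`, that interface flaps appear as lines containing `link state changed`, and that the approximate lockup time is already known. The function names, the hostname `fw`, and the interface `igb1` are hypothetical.

```python
from datetime import datetime, timedelta


def parse_link_events(lines, year):
    """Return (timestamp, line) pairs for lines that mention a link state change.

    Assumption: each line starts with a 15-character BSD syslog timestamp,
    e.g. "Mar 14 03:12:45". That format carries no year, so the caller supplies it.
    """
    events = []
    for line in lines:
        if "link state changed" not in line:
            continue
        stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        events.append((stamp, line.strip()))
    return events


def events_near(events, incident, window_minutes=5):
    """Keep only the events within +/- window_minutes of the incident time."""
    window = timedelta(minutes=window_minutes)
    return [(t, text) for t, text in events if abs(t - incident) <= window]


if __name__ == "__main__":
    # Hypothetical sample lines; real input would come from the firewall's system log.
    sample = [
        "Mar 14 03:12:45 fw kernel: igb1: link state changed to DOWN",
        "Mar 14 03:12:49 fw kernel: igb1: link state changed to UP",
        "Mar 14 09:00:00 fw kernel: igb1: link state changed to DOWN",
    ]
    lockup = datetime(2024, 3, 14, 3, 15, 0)  # when HAProxy stopped accepting connections
    for stamp, text in events_near(parse_link_events(sample, 2024), lockup):
        print(stamp.isoformat(), text)
```

If several lockups each have a link flap within a few minutes, that supports the interface as the trigger. A single match proves little on its own.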

Sergei_Shablovsky

@Luca-De-Andreis said in Haproxy 100% cpu usage:

@stephenw10
The site in production with a fair number of accesses, stayed UP 3-5 days, then crashed.

????
It’s a VERY BAD idea to APPLY A FRESH UPDATE ON A PRODUCTION node!!!

What stops you from keeping a balancer (the same HAProxy, for example, or a cluster of HAProxy instances behind master-slave/slave-slave LVS pairs) in front of your 30 pfSense nodes, so that you can seamlessly redirect part of the total traffic to a node that you need to test under real load after a fresh update?

A sketch of this structure is in the picture below:

(You can replace “Mobile Frontend” with “pfSense”, but the structure stays the same: you have HA and load balancing in front of your set of pfSense nodes.)

Or maybe this topic would also be helpful to you in building your infrastructure.

If you are this serious about your business and have 30+ firewalls, building the whole architecture so that you can seamlessly pull nodes out and put them back in (for updates, hardware maintenance, and upgrades) is the only way to avoid headaches.

All PfSense works in VM and we have about 30 of them,

I strongly suggest not running pfSense in a VM under high load in production, and ALWAYS USING BARE METAL servers (whether that is Netgate hardware or a DIY server from IBM, Dell, Fujitsu, Supermicro,…), even if you have fast and costly NICs like Mellanox, Intel…

By running pfSense in a VM you start down a path with a lot of different troubles. Some of them you hit at the start, but most appear only once your business is well underway and changing the infrastructure is very costly (if possible at all).

Sergei_Shablovsky

@Luca-De-Andreis said in Haproxy 100% cpu usage:

@stephenw10

In this regard I had also opened a ticket via "professional" support, which was closed with the response... "HAProxy is a third-party package, its update is managed in best effort" .... closed.

Hm. VERY BAD BEHAVIOR for PAID SUPPORT!!

Just copy Apple’s strategy: even if we are not the developer of this particular software/hardware, WE STILL TRY TO SOLVE THE CUSTOMER’S PROBLEM!

Sergei_Shablovsky

@coreybrett

Please SAVE THE CONFIG.XML and try a fresh install from scratch (and of course put config.xml back in place; don't forget to do a cold reboot at the end of the install). Maybe this will help…