Load balance nit-picks (post-success questions)

  • Hi!

    I've had load balancing configured and working for a few months now, and more recently multiple pools as well, all doing their job fine.

    I'm on v1.2.3, so this may be an old issue, but I need to know whether it's worth some downtime to upgrade.
    My setup has 9 gateways, configured by editing the backup file.


    My issue is that, as you'll spot from the deliberate error, the gateway-alive field cannot be set to anything else because:

    • If they all use the same address, the result from the first gateway to check it is applied to ALL gateways (so if GW1 fails, they all fail).
    • If each has a unique external address - say, a different Google server each - they all fail for no reason.

    Currently, I have each gateway test itself - fine for detecting gateway failure, bad for detecting WAN failure.
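
    To make the three cases above concrete, here is a minimal Python sketch that audits a gateway-to-monitor-address mapping for the two patterns I can explain: a shared address, and a gateway that tests itself. The function name `audit_monitor_ips` and all the addresses are made up for illustration (nothing here comes from pfSense itself), and it does not explain why unique external addresses fail.

```python
from collections import defaultdict


def audit_monitor_ips(gateways):
    """Return warnings for a dict mapping gateway IP -> monitor IP.

    Hypothetical helper: it only inspects the mapping and never
    talks to pfSense or sends any pings.
    """
    warnings = []
    by_monitor = defaultdict(list)

    for gw, monitor in gateways.items():
        by_monitor[monitor].append(gw)
        # A gateway that pings itself stays "up" even when its WAN is down.
        if monitor == gw:
            warnings.append(f"{gw}: monitors itself, so a WAN failure goes unnoticed")

    # One monitor address shared by several gateways ties their fates together.
    for monitor, gws in by_monitor.items():
        if len(gws) > 1:
            warnings.append(f"{monitor}: shared by {', '.join(gws)}, so they fail together")

    return warnings


if __name__ == "__main__":
    # Illustrative addresses only (203.0.113.0/24 is a documentation range).
    example = {
        "192.168.0.200": "192.168.0.200",
        "192.168.0.201": "203.0.113.1",
        "192.168.0.202": "203.0.113.1",
    }
    for line in audit_monitor_ips(example):
        print(line)
```

    An empty result only means neither of those two patterns is present; it says nothing about whether the monitor addresses are actually reachable through each gateway.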

    Live example:
    In attached image 'pfsense-lb-0.jpg', gateway 206 has no WAN connection, but because it pings the gateway directly, it can't tell that the WAN is down.
    In attached image 'pfsense-lb-1.jpg', each gateway pings a unique external IP that is pingable (tested as I type) with low loss - yet all fail with 100% loss.

    Is this a bug?
    Is it fixed in later versions?
    Any workaround otherwise?


  • I have noted the high ping times there - this is due to a bit of load. They can be as low as 20-30ms and it will still 'fail' with the 'correct' config.

    Each GW feeds into a switch and then into the pfSense box, no more than 3 feet apart in total. I've tested with different switches and routers, and the ping sits at this figure under load.

    Pinging a Google server from a PC routed through the pfSense box gives an 18ms ping.
    Pinging the same IP via the pfSense diagnostics also gives an 18ms ping.
    Rather oddly, pinging a GW via the diagnostics gives a 0.5ms ping - so why is it in the 70s with the LB tool?

    More 'oddly':
    As I type this, I tried half and half: the first 5 gateways each have a unique external IP to ping. The first, the pfSense gateway (200), is now responding with a 20ms ping. The following 3 show 100% loss. The fifth shows 100% loss but a 217ms ping.
    The last four are 'live', as they still point to themselves.

    Changing the pfSense GW to another IP makes the first in the list go offline - with a 19ms ping.

    There's some randomness too, with some gateways changing state with no correlation to ping time.

