Outbound load balancer failover problems. Not usable in production.

  • We still have three problems with the load balancer in the latest 1.2.3 snapshot:

    1. After a failover, we sometimes need to manually release / renew the WAN interfaces to get connectivity back. It seems to be needed only for WAN interfaces whose modem has been powered off. Our setup is an ADSL modem in half-bridge mode (the modem establishes a PPP session and delivers a public IP to pfSense through DHCP). I will run more tests and report the results for each failure case.

    2. During a failover, the state table is not reset, even when using Fit123 AFC, so all VoIP trunks are lost. We need to reset the table manually to get VoIP connectivity back.

    3. During a failover, port forwarding on the backup WAN interface does not seem to work for UDP traffic: we lose audio for SIP phones connecting from the internet to the pfSense box (the SIP phones are configured with a backup proxy and backup registrar pointing to our failover WAN interface).
      This last problem needs to be checked more thoroughly, as I suspect a bug in the phone's state table itself.

    In the end, using outbound failover is less reliable than having no failover at all, and it is not usable in production.

    It is interesting to see that most complex router setups are only tested with HTTP traffic. As soon as other traffic is tested, such as VPN or VoIP, failover becomes unusable.

  • Try the fit package again; I found a path bug.
    I also added logging to the system logs.

  • I've updated to the latest snapshot.

    Still no luck. Here is what I get in the system log after a failover:

    slbd[539]: Service Nerim-Free changed status, reloading filter policy

    check_reload_status: reloading filter

    I do not see a state table reset.

    How can I be sure that I have the updated Fit123?

    I'm now going to run tests with a separate rule for UDP traffic that uses a short state timeout.
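
The log check described above can be sketched in Python. This is a rough illustration only: the regular expression is modelled on the single slbd line quoted in this thread, and `reset_marker` is a hypothetical placeholder, because the thread does not show what Fit123 actually writes to the system log when it resets the state table.

```python
import re

# Pattern modelled on the slbd line quoted in the thread.
SLBD_CHANGE = re.compile(r"slbd\[\d+\]: Service (?P<service>\S+) changed status")


def failovers_without_reset(lines, reset_marker, window=10):
    """Return the services whose status change is NOT followed by a line
    containing reset_marker within the next `window` log lines.

    reset_marker is an assumption: substitute whatever text the reset
    actually logs on your system.
    """
    missing = []
    for i, line in enumerate(lines):
        match = SLBD_CHANGE.search(line)
        if match is None:
            continue
        following = lines[i + 1 : i + 1 + window]
        if not any(reset_marker in entry for entry in following):
            missing.append(match.group("service"))
    return missing


log = [
    "slbd[539]: Service Nerim-Free changed status, reloading filter policy",
    "check_reload_status: reloading filter",
]
print(failovers_without_reset(log, reset_marker="fit123"))
```

With the two log lines from the post, the status change for Nerim-Free is reported as having no reset entry after it, which matches what was observed by eye.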

  • How can I be sure that I have the updated Fit123?

    untick afc and press save twice
    uninstall the fit123 package and install it again
    tick afc and press save twice

  • I've created a rule to catch UDP traffic on the LAN interface, reducing the state timeout to 60 seconds. This rule is linked to the load balancer failover gateway.

    Something is weird. Normally, with this rule, all UDP traffic states should be reset within 60 seconds after a failover.

    That is not the case, perhaps because Asterisk never stops sending UDP traffic, so the states never expire?

    So I decided to modify this rule, setting the state type to "none". With this setting, no state at all should be kept for UDP traffic.

    With this rule enabled, we should not see any reference to UDP traffic in the state table.

    The very strange thing is that the state table still keeps states for UDP traffic, without any difference. And the failover still does not work correctly.
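
One possible explanation for the 60-second observation above: if the state timeout behaves as an idle timeout (each matching packet pushes the expiry forward), a stream that never pauses keeps its state alive indefinitely. The Python sketch below is a simplified model under that assumption, not pf's actual implementation; the function name and the 20 ms packet interval are illustrative.

```python
def first_expiry(packet_times_ms, idle_timeout_ms):
    """Return the time (ms) at which a state first expires, assuming the
    state is created by the first packet and every later packet that
    arrives before expiry resets the idle timer."""
    times = sorted(packet_times_ms)
    if not times:
        raise ValueError("at least one packet is needed to create a state")
    expires_at = times[0] + idle_timeout_ms
    for t in times[1:]:
        if t >= expires_at:
            break  # the state had already expired when this packet arrived
        expires_at = t + idle_timeout_ms  # packet refreshes the idle timer
    return expires_at


# A continuous stream: one packet every 20 ms for 5 minutes.
stream = range(0, 300_000, 20)
print(first_expiry(stream, idle_timeout_ms=60_000))

# Two packets 100 s apart: the state expires 60 s after the first one.
print(first_expiry([0, 100_000], idle_timeout_ms=60_000))
```

In this model the continuous stream only lets its state expire 60 seconds after the last packet, so a failover in the middle of the stream would not clear the state by timeout alone.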

  • FIT123 is now working. I get a state table reset on each failover, and the VoIP trunks are OK.

    Nevertheless, it is quite strange that the state table keeps states when the "None" state mode is used in the firewall rules. Is this setting really working?

    Lastly, is resetting the full state table after a failover the best solution?

    Doing this cuts the TCP traffic bound to a dedicated WAN interface as a side effect of the table reset.
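
To illustrate the trade-off raised in the last question, here is a small Python model of a selective reset: only the states bound to the failed WAN are dropped, and the states on the healthy WAN survive. Everything in it is hypothetical (the `State` record, the interface names `wan1` / `wan2`, the addresses); it does not talk to the firewall and is not how Fit123 works, it only shows the selection logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    """Hypothetical, simplified view of one state table entry."""
    iface: str
    proto: str
    src: str
    dst: str


def split_states(states, failed_iface):
    """Split states into (to_kill, to_keep) based on the failed interface."""
    to_kill = [s for s in states if s.iface == failed_iface]
    to_keep = [s for s in states if s.iface != failed_iface]
    return to_kill, to_keep


table = [
    State("wan1", "udp", "192.0.2.10", "198.51.100.5"),   # VoIP trunk on failed WAN
    State("wan2", "tcp", "192.0.2.11", "198.51.100.80"),  # TCP session on healthy WAN
    State("wan1", "tcp", "192.0.2.12", "198.51.100.22"),
]
to_kill, to_keep = split_states(table, failed_iface="wan1")
print(len(to_kill), len(to_keep))
```

In this model the TCP session on `wan2` is left untouched, whereas a full table reset would cut it along with everything else.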