Failover didn't fall back to tier1 after downtime
-
-Tier1 came back online
-Gateway group showed both 'online'
-Default route = Tier1
-Default-gateway switching = enabled
Still, for whatever reason, it kept sending "some" clients out through Tier2. This was still happening 10 hours after the last gateway event.
To fix it I clicked "Reset all states" in the GUI.
(It's impossible that the states were still alive from before the gateway event, because nobody was around at 2 am.)
May 2 13:38:35 dpinger send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 40% dest_addr 195.130.130.11 bind_addr 81.82.213.131 identifier "WAN_TELENET0 "
May 2 13:38:35 dpinger send_interval 500ms loss_interval 10000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 40% dest_addr 195.238.2.21 bind_addr 192.168.5.2 identifier "WAN_SCARLETGW "
May 1 02:14:26 dpinger WAN_TELENET0 195.130.130.11: Clear latency 9788us stddev 1395us loss 29%
May 1 02:13:26 dpinger WAN_TELENET0 195.130.130.11: Alarm latency 9607us stddev 1263us loss 41%
The values set for dpinger are the ones that made apinger work somewhat reliably.
Perhaps they need to be set to sane values, now that we have a good pinger?
That said, the system has been up for 20 days and a couple of failover events took place … this is the first time it didn't fall back.
suggestions?
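For anyone comparing notes: a rough sketch for pulling the most recent Alarm/Clear event per gateway out of dpinger log lines like the ones quoted above, to confirm that dpinger itself considers Tier1 healthy again. The regular expression is only fitted to those two sample lines, and the helper name `last_events` is made up for illustration. It assumes the lines are fed in chronological order (oldest first), so reverse them if your log view shows the newest entries on top.

```python
import re

# Fitted to the Alarm/Clear lines quoted in this thread; other dpinger
# versions may log a different format (assumption).
EVENT_RE = re.compile(
    r"dpinger\s+(?P<gw>\S+)\s+(?P<addr>\S+):\s+(?P<event>Alarm|Clear)\s+"
    r"latency\s+(?P<latency_us>\d+)us\s+stddev\s+(?P<stddev_us>\d+)us\s+"
    r"loss\s+(?P<loss_pct>\d+)%"
)


def last_events(lines):
    """Return {gateway: (event, loss_pct)} for the latest event per gateway.

    Assumes `lines` is in chronological order (oldest first).
    Lines that are not Alarm/Clear events (e.g. startup lines) are skipped.
    """
    latest = {}
    for line in lines:
        m = EVENT_RE.search(line)
        if m:
            latest[m.group("gw")] = (m.group("event"), int(m.group("loss_pct")))
    return latest


sample = [
    "May 1 02:13:26 dpinger WAN_TELENET0 195.130.130.11: Alarm latency 9607us stddev 1263us loss 41%",
    "May 1 02:14:26 dpinger WAN_TELENET0 195.130.130.11: Clear latency 9788us stddev 1395us loss 29%",
]

print(last_events(sample))  # {'WAN_TELENET0': ('Clear', 29)}
```

If the last event for the Tier1 gateway is a Clear and traffic still leaves via Tier2, the monitoring side looks fine and the problem is more likely in how the clear is acted on.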
-
Since the first post it has happened again, 3 times to be exact.
I've reset all dpinger values to their default settings via the GUI.
dpinger clears the error, but pfSense keeps sending traffic towards the Tier2 gateway (identical to the first post).
I'm thinking the 'clear' isn't (always/under every circumstance) picked up by the backend code.
Today I changed the trigger level from 'member down' to 'packet loss or high latency'.
Will update this thread in the next couple of days/weeks.
-
Maybe you're hitting this: https://redmine.pfsense.org/issues/6110 ?
-
Perhaps, but there's no PPP(oE) involved. It is possible that default gateway switching is still enabled (from back in the day when there was a transparent proxy running). Will check if disabling it makes a difference.
-
Guessing it's already-established connections that are staying there maybe? That'd be expected.
Two things influence traffic routing. Guessing your clients are being routed via a gateway group, which you can verify on the back end with:
grep route-to /tmp/rules.debug
The other thing would be the default gateway, for traffic matching firewall rules set to "default" rather than a gateway group. Check Diag>Routes to verify that.
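A rough sketch of what to look for in that grep output, for anyone who wants to compare it across a failover event. It assumes the route-to entries show up as quoted macro definitions containing "( interface gateway )" pairs. The sample text below is made up for illustration (macro names, interface names and addresses are all hypothetical), and the real /tmp/rules.debug may be laid out differently.

```python
import re

# Assumed shape of a macro definition that carries a route-to target.
MACRO_RE = re.compile(r'^(?P<name>\w+)\s*=\s*"(?P<body>[^"]*route-to[^"]*)"')
# Assumed shape of one "( interface gateway )" pair inside the macro body.
HOP_RE = re.compile(r"\(\s*(?P<iface>\S+)\s+(?P<gw>\S+)\s*\)")


def route_to_macros(text):
    """Return {macro_name: [(interface, gateway), ...]} for route-to macros."""
    result = {}
    for line in text.splitlines():
        m = MACRO_RE.match(line.strip())
        if m:
            result[m.group("name")] = [
                (h.group("iface"), h.group("gw"))
                for h in HOP_RE.finditer(m.group("body"))
            ]
    return result


# Made-up sample, NOT real output from the system in this thread.
sample_rules = '''
GWTIER1_EXAMPLE = " route-to ( em0 192.0.2.1 ) "
GWTIER2_EXAMPLE = " route-to ( em1 198.51.100.1 ) "
GWFAILOVER_EXAMPLE = " route-to { ( em1 198.51.100.1 ) } "
pass in quick on $LAN $GWFAILOVER_EXAMPLE inet from any to any keep state
'''

for name, hops in route_to_macros(sample_rules).items():
    print(name, hops)
```

In this invented sample the failover group still points at the second gateway. Seeing something like that while the gateway status page reports Tier1 as online would suggest the ruleset was not regenerated after the clear, rather than old states lingering.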
-
@cmb:
Guessing it's already-established connections that are staying there maybe? That'd be expected.
it kept sending new clients towards tier2 for days after the gateway event … can't have been (all) established connections
@cmb:
Two things influence traffic routing. Guessing your clients are being routed via a gateway group, which you can verify on the back end with:
grep route-to /tmp/rules.debug
The other thing would be the default gateway, for traffic matching firewall rules set to "default" rather than a gateway group. Check Diag>Routes to verify that.
will check the rules.debug when/if it happens next