Replaced Soekris with Netgate 4860-1U ??
Short story: the Soekris box took a hit (surge) in a storm via one of the internal connections, OPT1.
Ordered a replacement Netgate, dumped the configuration and restored it to the new box.
Updated the interface assignments as appropriate and thought we were done. But I missed a step: the Comcast link (OPT2) did not get an upstream gateway address, and I did not notice until much later in the day. This setting is apparently critical and really should be above the fold. :-)
My question is more about why traffic did not work, since the WAN gateway was fine and the two are combined into a gateway group for the customer, with OPT2 as Tier 1 and WAN as Tier 2 (the WAN is a T1 from Cbeyond/Birch and is used mostly for SIP traffic). The problem was most evident in that DNS resolution was failing, and I could not ping any of the external DNS servers from devices using the gateway group instead of specific routing rules.
Rather than failing outright, I would have thought traffic aimed at the Tier 1 gateway would have timed out and failed over to Tier 2. NOTE: this is not load sharing. Also, the status screen showed both as UP when in fact the OPT2 interface was effectively down, as there was no upstream gateway defined.
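To illustrate the expectation above, here is a minimal Python sketch of tier-based selection in a failover group. It is only a model of the general idea, not pfSense's actual code; the `Member` class, the `pick_gateways` function and the gateway names are made up for the example. The point it shows: selection depends on the status the monitor reports, so a Tier 1 member that is shown as UP keeps getting the traffic even if its link cannot actually pass any.

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str    # hypothetical gateway name
    tier: int    # lower number = higher priority
    status: str  # "up" or "down", as reported by the gateway monitor


def pick_gateways(members):
    """Return the members of the lowest-numbered tier that has anything marked up."""
    up = [m for m in members if m.status == "up"]
    if not up:
        return []
    best_tier = min(m.tier for m in up)
    return [m.name for m in up if m.tier == best_tier]


# Scenario from the post: OPT2 cannot pass traffic but is still reported as UP.
group = [Member("OPT2_COMCAST", 1, "up"), Member("WAN_T1", 2, "up")]
print(pick_gateways(group))  # Tier 1 is still selected

# Only once the monitor marks Tier 1 as down does Tier 2 take over.
group[0].status = "down"
print(pick_gateways(group))
```

Under that model, failover never triggers while the status screen says both gateways are UP, which would match what I saw.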
Anyway, all is well; just trying to learn for next time.
When DNS fails like that, it's usually because the clients are using one of the DNS services on pfSense and that service is not configured to use both WANs.
By default pfSense runs Unbound in resolving mode. In that configuration Unbound itself always uses the default route, so if that was the Comcast link in this case, it would have failed and no clients using it could resolve hostnames.
To avoid that, either use forwarding mode in Unbound or switch to the DNS forwarder, and make sure you have upstream DNS servers defined against both WANs in System > General. Alternatively, enable default gateway switching in System > Advanced > Misc.
Using DNS forwarding is usually preferable to avoid traffic on the wrong WAN after a failover.
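As a rough illustration of the difference, here is a small Python model. It is a sketch under assumptions, not how pfSense or Unbound is implemented: `can_resolve` is a made-up function, the WAN names and server addresses are just examples, and it assumes each upstream DNS server defined in System > General is tied to one WAN's gateway.

```python
def can_resolve(mode, default_wan, dns_servers, wan_ok):
    """Model whether DNS lookups on the firewall can succeed.

    mode:        "resolver" (queries follow the default route) or
                 "forwarder" (queries go to the configured upstream servers)
    default_wan: name of the WAN holding the default route
    dns_servers: mapping of upstream DNS server -> WAN it is reached through
    wan_ok:      mapping of WAN name -> True if that link can pass traffic
    """
    if mode == "resolver":
        # Everything leaves via the default route, so only that WAN matters.
        return wan_ok[default_wan]
    if mode == "forwarder":
        # One reachable upstream server on any WAN is enough.
        return any(wan_ok[wan] for wan in dns_servers.values())
    raise ValueError(f"unknown mode: {mode}")


# Example values only: OPT2 (Comcast) is the default route and is broken.
wan_ok = {"OPT2": False, "WAN": True}
dns_servers = {"75.75.75.75": "OPT2", "8.8.8.8": "WAN"}

print(can_resolve("resolver", "OPT2", dns_servers, wan_ok))   # resolving mode
print(can_resolve("forwarder", "OPT2", dns_servers, wan_ok))  # forwarding mode
```

In this model, resolving mode fails as soon as the default-route WAN is broken, while forwarding mode keeps working as long as one server on a healthy WAN is defined.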