Intermittent packet loss related to DHCP with Multi-WAN
-
I'm confused about why two WAN interfaces using DHCP can affect each other. I've been on the stable release most of the time, but I see the same behavior with the current 2.7.0 development branch as well:
2.7.0-DEVELOPMENT (amd64) built on Wed Jan 04 06:05:22 UTC 2023
I searched for answers online and tried setting "Protocol timing" on the WAN2 interface to "pfSense Default" (the fields were all clear before), but it didn't seem to fix the issue. It may have improved things a bit, but it is hard to tell for sure.
Any thoughts on what else I can debug here? I contacted the ISP, and according to their logs everything looks perfectly fine on their end.
-
Any thoughts from the community? Some days it is bad, others it is terrible.
-
@nazar-pc Is the guide linked below of any help to you?
Link: https://www.cyberciti.biz/faq/howto-configure-dual-wan-load-balance-failover-pfsense-router/
I was successful with this guide. Redo it from scratch.
-
@bavcon22 I don't see how that would help, or why I would bother wiping a configuration I have maintained for many years to start from scratch.
It works for me as well; it is just buggy and unstable, which seems unlikely to be caused by misconfiguration.
-
I reported this as an issue a while back, but there wasn't much help there either: https://redmine.pfsense.org/issues/14237
Can anyone suggest anything else to check or debug? I'm on 2.7.0 stable right now.
-
I looked at a packet capture and noticed that the packet loss happens when the interface is considered down, and it is considered down when the ISP's DHCP server doesn't respond to a DISCOVER within a second. Is there a way to increase the timeout beyond one second so that it doesn't fail anymore?
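To put numbers on that observation, the DISCOVER/OFFER timing from a capture can be checked with a small script. This is only a sketch: it assumes the timestamps and message types have already been extracted from the capture by hand into a list, and the function names and the 1-second threshold are illustrative, not anything pfSense provides.

```python
def offer_latencies(events):
    """Pair each DISCOVER with the first OFFER that follows it.

    events: list of (timestamp_seconds, message_type) tuples sorted by time,
    assumed to be extracted manually from a packet capture.
    Returns a list of (discover_time, latency), where latency is None if a
    new DISCOVER (or the end of the capture) arrived before any OFFER.
    """
    results = []
    pending = None  # timestamp of the DISCOVER still waiting for an OFFER
    for timestamp, kind in events:
        if kind == "DISCOVER":
            if pending is not None:
                results.append((pending, None))
            pending = timestamp
        elif kind == "OFFER" and pending is not None:
            results.append((pending, timestamp - pending))
            pending = None
    if pending is not None:
        results.append((pending, None))
    return results


def slow_or_missing(events, timeout=1.0):
    """Return the DISCOVERs that got no OFFER within `timeout` seconds."""
    return [
        (sent, latency)
        for sent, latency in offer_latencies(events)
        if latency is None or latency > timeout
    ]
```

Feeding it the events from a bad period should show whether the ISP's OFFERs are merely slow or missing entirely.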
-
@nazar-pc said in Intermittent packet loss related to DHCP with Multi-WAN:
when interface is considered down and it is considered down when DHCP server of an ISP doesn't respond to DISCOVER within a second. Is there a way to increase the timeout beyond one second so that it doesn't fail anymore?
You could try increasing "select-timeout" in the advanced DHCP options:
https://man.freebsd.org/cgi/man.cgi?query=dhclient.conf&sektion=5#PROTOCOL_TIMING
Is this system a VM?
Are you sure you don't have clock issues? (time skipping, jumping back and forth)
-
@heper said in Intermittent packet loss related to DHCP with Multi-WAN:
Is this system a VM?
Are you sure you don't have clock issues? (time skipping, jumping back and forth)
It is a KVM machine, but I don't think it is a clock issue; that would be weird. It is just running on regular desktop-grade hardware without much load on it. I'm not sure how I would verify that, though.
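One rough way to check for clock jumps is to compare the wall clock against the monotonic clock for a while and flag any interval where the two disagree. The following is a minimal sketch, assuming Python 3 is available on the guest; the function names and the 0.5-second threshold are arbitrary choices, and a hypervisor pause can affect both clocks, so a clean result is not conclusive.

```python
import time


def find_clock_jumps(samples, threshold=0.5):
    """Flag intervals where wall-clock time and monotonic time disagree.

    samples: list of (wall_clock, monotonic) pairs taken at the same moments.
    Returns a list of (sample_index, drift_seconds) for every interval in
    which the wall clock advanced more or less than the monotonic clock by
    more than `threshold` seconds.
    """
    jumps = []
    for i in range(1, len(samples)):
        wall_delta = samples[i][0] - samples[i - 1][0]
        mono_delta = samples[i][1] - samples[i - 1][1]
        drift = wall_delta - mono_delta
        if abs(drift) > threshold:
            jumps.append((i, drift))
    return jumps


def collect_samples(count=60, interval=1.0):
    """Sample both clocks `count` times, `interval` seconds apart."""
    samples = []
    for _ in range(count):
        samples.append((time.time(), time.monotonic()))
        time.sleep(interval)
    return samples


if __name__ == "__main__":
    for index, drift in find_clock_jumps(collect_samples()):
        print(f"sample {index}: wall clock moved {drift:+.3f}s relative to monotonic")
```

Leaving it running across a period when the packet loss occurs would show whether the wall clock stepped at that moment.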
@heper I just got a response from an ISP representative; they pointed out several issues with the DHCP client in pfSense. It basically doesn't follow the spec, and that causes problems.
I'll try to translate and summarize it in a bug report.
For instance, https://datatracker.ietf.org/doc/html/rfc2131#section-4.4.1:
The client begins in INIT state and forms a DHCPDISCOVER message.
The client SHOULD wait a random time between one and ten seconds to
desynchronize the use of DHCP at startup.
At the same time, according to my observations, it sends another DHCPDISCOVER after just 1 second, which it is not supposed to do. Several DHCP servers could respond within 10 seconds, but the client just gives up after 1 second.
There are a few others; I'm not sure whether I should create one bug report for each or one bigger bug report with all of them together.
Is the DHCP client pfSense-specific, or does it just come from FreeBSD as is?
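For comparison with the 1-second behavior, the timing that RFC 2131 recommends can be sketched in a few lines. The startup delay is the 1 to 10 seconds quoted above; the retransmission schedule follows the example in section 4.1 of the same RFC (4 seconds, doubling up to 64, each randomized by up to one second either way). The function names are made up for illustration, and this is not how dhclient is implemented.

```python
import random


def initial_delay(rng=random):
    """Random startup delay of 1 to 10 seconds (RFC 2131, section 4.4.1)."""
    return rng.uniform(1.0, 10.0)


def retransmission_delays(attempts, rng=random):
    """Retransmission delays following the example in RFC 2131, section 4.1.

    The base delay starts at 4 seconds and doubles up to a maximum of
    64 seconds; each delay is randomized by a value between -1 and +1 second.
    """
    delays = []
    base = 4.0
    for _ in range(attempts):
        delays.append(base + rng.uniform(-1.0, 1.0))
        base = min(base * 2, 64.0)
    return delays


if __name__ == "__main__":
    print(f"wait {initial_delay():.1f}s before the first DHCPDISCOVER")
    for attempt, delay in enumerate(retransmission_delays(6), start=1):
        print(f"retransmission {attempt}: wait {delay:.1f}s")
```

Under this schedule the first retransmission would come no sooner than about 3 seconds after the initial DHCPDISCOVER, which is noticeably longer than the 1 second seen in the capture.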
-
@heper Whatever it is, I feel like I'm finally one step closer to figuring it out. It has been happening to me for over 6 months, and at the same time I have no issues when connecting the ISP to a Linux machine directly.
So whoever's fault it is, I'd like to find out and help resolve it.
UPD: It is a different story entirely, though, that this failure in the DHCP client on one interface results in all networking in pfSense going down.
-
Posted DHCP client issues here: https://redmine.pfsense.org/issues/14604