DHCP WAN Problem after installing 2.2.4
-
I used pfSense 2.1 for a long time on an ALIX board. Because the ALIX is too slow, I upgraded to an APU board.
Doing so, I did a fresh install of 2.2.4 on the APU board and put the ALIX out of service.
My LAN setup (didn't change, just upgraded the firewall):
[Cable Modem]---[WAN Switch A]---[WAN Switch B]---[pfSense]---[LAN Switch]---[Client(s)]
The Cable Modem operates in bridge mode, the WAN IP is acquired by pfSense itself through DHCP.
Since 2.2.4, I've noticed a severe problem:
Whenever WAN Switch A or B or the Cable Modem goes down and comes back up, pfSense completely and permanently loses all WAN connectivity.
apinger then shows the gateway as offline.
The only things I can do to restore connectivity afterwards are:
a) reboot
or
b) release the DHCP lease (Status|Interfaces), then renew
What might be amiss?
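As a stopgap while the cause is unknown, the manual release/renew step could in principle be driven by a small watchdog. The following is only a sketch under stated assumptions: `RenewWatchdog`, `run`, and the `renew` callback are hypothetical names, and the thread does not say which command actually releases and renews the lease, so that action is left to the caller.

```python
import subprocess
import time
from typing import Callable, Optional


def gateway_reachable(gateway_ip: str, timeout_s: float = 3.0) -> bool:
    """Send one ICMP echo request via the system ping and report success."""
    try:
        result = subprocess.run(
            ["ping", "-c", "1", gateway_ip],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


class RenewWatchdog:
    """Signals a renew after N consecutive failed gateway probes."""

    def __init__(self, failure_threshold: int = 3) -> None:
        if failure_threshold < 1:
            raise ValueError("failure_threshold must be at least 1")
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0

    def observe(self, reachable: bool) -> bool:
        """Record one probe result; return True when a renew is due."""
        if reachable:
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.failure_threshold:
            # Start counting again so a renew is not triggered on every probe.
            self.consecutive_failures = 0
            return True
        return False


def run(
    gateway_ip: str,
    renew: Callable[[], None],
    probe: Callable[[str], bool] = gateway_reachable,
    interval_s: float = 10.0,
    max_checks: Optional[int] = None,
) -> int:
    """Probe the gateway periodically and call renew() when the watchdog fires.

    `renew` is an assumption: whatever performs the DHCP release/renew on the
    firewall must be supplied by the caller. Returns the number of renewals.
    """
    watchdog = RenewWatchdog()
    renewals = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        if watchdog.observe(probe(gateway_ip)):
            renew()
            renewals += 1
        checks += 1
        time.sleep(interval_s)
    return renewals
```

Requiring several consecutive failures keeps a single lost ping from triggering a renew. This only masks the symptom; it does not explain why connectivity is lost in the first place.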
-
Having two switches in between seems really bad, or at least unnecessary. Situations like the one you describe with cable modems are often caused by the modem seeing the MAC of a managed switch via STP and locking on to it until you do something that sends a gratuitous ARP (reboot, release/renew), a new DHCP request, or similar.
Take the switches out of the mix entirely and see if it's still replicable. From that description it is highly likely that it isn't (the APU has no clue when you cycle the modem or switch A, so the cause is almost certainly something outside of it).
-
Alas, removing the two switches is not possible.
For a test, I routed via two powerline adapters, but it's the same problem there. It only comes back online after I reconnect the WAN interface plug, do a release/renew, or reboot.
Interestingly, this particular network's layout didn't change in over two years and the problem was not there before the update.
Even more curious: the same issue arises if I reconnect the WAN (RF) side of the cable modem.
I can confirm from the shell that pinging the gateway doesn't work in that case, so I don't think it's a problem with apinger.
At the same time, I can see the gateway's MAC in pfSense's ARP table.
-
You get an ARP cache entry because inbound traffic works. Nothing further happens because outbound doesn't: your modem is picking up some other MAC as its authorized MAC, probably that of one of the switches, from STP or similar traffic they're sending out. Then it doesn't allow your APU out.
-
That makes sense in a way, yes: the cable modem would not listen to us unless we have obtained a valid DHCP lease.
But then again, it makes no sense at all: it worked with my old ALIX-based pfSense (2.2.1, I think)!
I will try putting the ALIX box back to check.
Giving this a second thought, how could the cable modem even see any of the WAN switches' MAC addresses?
- The WAN circuit is on a VLAN.
- No other devices are connected to that VLAN.
- The management IPs of both WAN switches are on a different VLAN.
- The cable modem is not on the management VLAN.
pfSense certainly does not have an ARP entry on the WAN interface for any of the WAN switches.
Also: why does the same issue appear when the WAN/RF side of the cable modem is reconnected?
Note that the modem is not power cycled, i.e. the Ethernet link never goes down.
-
Managed switches send out STP BPDUs, which can cause the cable modem to pick up that source MAC as the authorized device. Where you're authorized for a single device on your account, your cable modem often grabs the first MAC it sees, and that's the authorized one. From the sounds of it, losing the upstream link and regaining it makes your modem re-learn the first MAC it sees.
Don't bother putting the ALIX back in. There is absolutely no question that the device sitting behind two switches has no impact whatsoever on your cable modem when you do anything to the modem.
Disable STP on the switch port(s) connected to your cable modem and the problem probably goes away. Same issue comes up here once a month or so and that always fixes it.
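The MAC-locking theory in this reply can be written down as a toy model. This is only a sketch of the hypothesis, not of any real modem firmware: `ModemModel`, its `max_macs` parameter, and the MAC addresses are made up for illustration.

```python
class ModemModel:
    """Toy model: the modem authorizes only the first `max_macs` source MACs
    it sees, and forgets them whenever the upstream (RF) link re-syncs."""

    def __init__(self, max_macs: int = 1) -> None:
        self.max_macs = max_macs
        self.authorized: list[str] = []

    def upstream_resync(self) -> None:
        """Upstream link lost and regained: learned MACs are discarded."""
        self.authorized.clear()

    def frame_from_lan(self, src_mac: str) -> bool:
        """Return True if a frame from src_mac would be forwarded upstream."""
        if src_mac in self.authorized:
            return True
        if len(self.authorized) < self.max_macs:
            # Still room: the modem learns this MAC as an authorized device.
            self.authorized.append(src_mac)
            return True
        return False


FIREWALL = "02:00:00:00:00:01"  # illustrative, locally administered MAC
SWITCH = "02:00:00:00:00:02"    # illustrative MAC of a switch sending BPDUs

modem = ModemModel(max_macs=1)
modem.frame_from_lan(FIREWALL)           # firewall is learned first: forwarded
modem.upstream_resync()                  # coax replugged or provider outage
modem.frame_from_lan(SWITCH)             # a switch frame happens to arrive first
print(modem.frame_from_lan(FIREWALL))    # False: firewall is now locked out
```

In this model the firewall stays blocked until the modem forgets its learned MAC again. With `max_macs` larger than the number of devices sending frames, the same sequence no longer locks the firewall out.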
-
By STP, you mean RSTP PDUs? Those switches on the WAN side do not support (R)STP at all.
Also, my subscription allows for up to 4 dynamic IP Addresses. Shouldn't the modem allow up to 4 "authorized" MACs then, according to that theory?
Something doesn't add up I think…
-
Either STP or RSTP BPDUs, yes. Or LLDP, CDP, or other possibilities for initiating traffic, if they are managed switches. Where you have 4 dynamic IPs, that shouldn't be the case. But it's still definitely a problem on the modem, given that only unplugging and replugging the coax, which has no relation to or impact on anything inside your network, resolves it for some period of time.
-
I've spoken to the cable provider, and they have no clue.
Thus, I resorted to purchasing a 600x14mm drill and a 100m spool of Cat6 cable and pulled a second Ethernet wire, so as to ditch the switches on the WAN side.
To no avail :( Even with a dedicated wire now going from the pfSense WAN port to the cable modem, with zero intermediate network switches, the problem persists.
I would also think that it's likely to be a problem with the modem. However, the issue was non-existent before upgrading both pfSense software and APU hardware (was ALIX before)… :-\
To clarify: Re-plugging the coax doesn't solve anything. Au contraire, re-plugging the coax on the modem does indeed create the very problem!
My observation is this:
If the WAN link is severed AND the pfSense WAN interface does not go down when that happens (examples: provider-side network problems / disconnecting the coax at the cable modem in my current no-WAN-switch setup / disconnecting the Ethernet at the far-from-pfSense side in my previous WAN-switch setup), IP connectivity is permanently lost. Although the WAN link may eventually come back up, IP connectivity can only be restored by somehow getting the pfSense DHCP client to release and renew:
a) reboot pfSense
or
b) physically reconnect the Ethernet wire at the pfSense WAN port
or
c) disable the WAN interface in pfSense GUI, then reenable it
or
d) click release/renew for the WAN interface (status page) in the pfSense GUI
-
Did you ever get it resolved? Did you revert to 2.1.x? I'm facing a similar issue, but with a static IP on the WAN.
-
Same issue here. My ISP originally stated that they were receiving the MAC of the powerline adapter, so I mimicked it in the firewall. The problems went away for about a week, then came back. I suspect some configuration error on the pfSense side, as it seems to relate to DHCP not renewing properly.