cannot connect to internet with new install behind PPPoE gateway in bridge mode
-
As the title says, I cannot connect to the internet after setting up my new SG-2100 behind my Frontier DSL gateway, an Arris NVG443b set to transparent bridge mode, with the pfsense WAN configured as PPPoE and the proper username/password entered. The only similar situation I found on the forums is here https://forum.netgate.com/post/921479, so I am thinking that changing the default gateway to WAN_PPPOE may be the fix. Since I won't be able to work on this until later tonight or tomorrow, I thought I would see if anyone else has any thoughts. There is also a possible confounding issue related to the Arris box. A detailed explanation follows.
First, the issue with the Arris box is that there is apparently a problem with the firmware version I am running that affects transparent bridge mode. When transparent bridge is enabled it is supposed to disable the wireless, firewall and DHCP. However, with the current firmware the wireless is not disabled; I am not sure if the firewall and DHCP are also left enabled or if there are other issues. Information online is limited, and no one mentions whether this issue prevents them from connecting to the internet in bridge mode. The latest firmware reportedly fixes these issues, but Frontier does not provide a way to manually check for and install firmware updates. I am not sure I can get them to provide me with the latest firmware, since they don't support transparent bridge mode. If this turns out to be the culprit then I will go the DMZ route, but I am hoping to avoid that.
On to pfsense. As noted, I am using an SG-2100, running pfsense 2.4.5-release-p1. This is my first pfsense install, although I am familiar with the software, which we use at my office (I did not configure/install it, but have some experience configuring rules). I ran the setup wizard after reading through the manual several times and watching the Lawrence Systems YouTube tutorial video. The 2100 was not connected via the WAN port during setup; I ran it off my laptop at work, and this is for my home network. The relevant settings I ended up with are as follows (if not noted, settings were left at default; let me know if there are other settings you need to know):
hostname: pfsense
domain: home.arpa
primary dns server: 208.67.222.123 (opendns familyshield) (gateway not set for either dns server)
secondary dns server: 208.67.220.123 (opendns familyshield)
WAN type: PPPoE
PPPoE username and password configured correctly (triple checked this)
Override DNS: unchecked
disable DNS forwarder: unchecked
block private networks: checked
block bogon networks: checked
LAN: 192.168.147.1
subnet mask: 24
LAN DHCP server: enabled (no issues noted on LAN - clients being assigned addresses)
Also changed the following advanced settings after watching Lawrence Systems video:
changed the TCP port for the webGUI
disabled webGUI redirect
set NAT reflection to Pure NAT (I don't think this was necessary at this point as I have not configured any port forwards or other rules, but it sounded like something I may want once I get up and running)
After running the wizard I took the 2100 home to test. As noted, the box had not been connected to the internet as yet, so no packages were installed/running and I had not configured any additional rules. I wanted to check the basic configuration before doing anything more advanced.
On the Arris, I disabled wireless, DHCP, firewall and all but 1 of the LAN ports. I then enabled transparent bridge mode and rebooted. Once rebooted, I noted the wireless was re-enabled - not sure what else might have been re-enabled as I could not access web interface on the Arris at that point.
I then connected WAN port on the 2100 to LAN1 on the Arris and booted up the Netgate box. Once booted, I connected my laptop to LAN1 on the 2100. I could connect to the pfsense webgui, but laptop reported no internet connection. Pfsense dashboard showed WAN down and LAN up. After a few minutes WAN showed as up with a public IP address, but still no internet connection. I noted as I navigated through pfsense GUI to check settings that WAN would sometimes be up, with public IP, and sometimes down. Not sure if public IP was the same each time or if it changed. I did restart both devices, with no change in behavior. I ran the Windows network troubleshooter, which said it could not resolve DNS servers. Dashboard shows DNS servers as 127.0.0.1 as well as the opendns familyshield servers.
As far as other troubleshooting, I didn't have time to do much as internet was now down for rest of family, but did try toggling the following off and on (individually) with no change:
block private networks (on WAN)
enable DNS query forwarding (under Services->DNS resolver)
disabled DNSSEC (Services->DNS resolver) - I believe I have read some posts indicating issues with opendns servers and DNSSEC
Thanks for reading through that. Any help is certainly appreciated. Let me know if there are any other details you need.
-
Ok, if it showed the WAN as UP with a public IP at all it must be passing the PPPoE traffic through the modem at least partially.
Check the System logs and the PPP logs, which will likely look mostly the same. There should be something there to indicate why it disconnected.
Look for the remote PPPoE server passing you a valid gateway to use.
You will only have one gateway and it will be the auto-created gateway for WAN. You can leave the default gateway set as automatic for now.
Steve
-
sys log for transparent bridge fail.zip
@stephenw10 edit - I clicked submit before I was done
Thanks for the advice. I checked the logs and see several instances of connecting and disconnecting. I get the correct WAN gateway IP from my ISP and then a WAN IP. It seems that within a few seconds to a minute the WAN IP changes and then the connection is lost. I have attached a segment of the sys log showing 1 connect/disconnect cycle, since each cycle appears almost identical except for the IPs. I am not exactly sure how to interpret this and am hoping you or someone else here can take a look and give me some advice.
I suppose my main question is whether this is due to issues with the Arris gateway from my ISP or configuration issues in pfsense. I am leaning toward the former, since it is known that transparent bridging is not behaving correctly with the current firmware I have installed in the gateway, as I described in my original post. I should note that before enabling transparent bridge mode, I manually disabled the wireless, firewall and DHCP on the gateway, as well as disabling all but 1 LAN port, and when I rebooted after enabling transparent bridge the wireless was back on (but with default SSIDs) and all LAN ports were active. So I am assuming that DHCP and the firewall on the gateway were also re-enabled, and who knows what else, since I cannot access the GUI in bridge mode.
After typing the paragraph below I realized there is one other aspect of the Arris I should note. The default DNS config (Frontier DSL) is set to dynamic, which I had changed to static in order to use the opendns familyshield servers. Given the other issues with bridge mode I am assuming that this got reset as well when bridge mode was enabled.
I should also note that in the meantime I reset the SG-2100 to factory default and reconfigured it so that it is now in the DMZ of the gateway, and that seems to be working fine. The only issue I had was getting it to use the opendns familyshield DNS servers, but I got that working by enabling DNS query forwarding in DNS resolver and DNS Server override in General setup. I can live with this setup, but would prefer to use transparent bridge mode if possible.
Please let me know if any other info/details would be helpful. And thanks in advance for any help.
-
Well it's disconnecting after LCP stops responding:
Jan 30 15:23:46 pfSense ppp: [wan_link0] LCP: no reply to 1 echo request(s)
Jan 30 15:23:56 pfSense ppp: [wan_link0] LCP: no reply to 2 echo request(s)
Jan 30 15:24:06 pfSense ppp: [wan_link0] LCP: no reply to 3 echo request(s)
Jan 30 15:24:16 pfSense ppp: [wan_link0] LCP: no reply to 4 echo request(s)
Jan 30 15:24:26 pfSense ppp: [wan_link0] LCP: no reply to 5 echo request(s)
Jan 30 15:24:26 pfSense ppp: [wan_link0] LCP: peer not responding to echo requests
Jan 30 15:24:26 pfSense ppp: [wan_link0] LCP: state change Opened --> Stopping
Like ~30s after it connects. We can't see the gateway IP but I assume it's valid. The latency looks like it's actually going over the WAN:
Jan 30 15:23:37 pfSense rc.gateway_alarm[69096]: >>> Gateway alarm: WAN_PPPOE (Addr:74.42.148.170 Alarm:1 RTT:12.255ms RTTsd:3.627ms Loss:21%
That matches this:
Jan 30 15:21:55 pfSense ppp: [wan] 50.107.165.xxx -> 74.42.148.170
Is that the expected gateway?
Try comparing it with the logs from a successful connection.
Nothing there really looks like a problem other than the LCP timeouts.
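To put numbers on the LCP timeline quoted above, here is a minimal Python sketch that parses those log lines and reports how many echo requests went unanswered, the spacing between them, and the time from the first missed echo to the link going down. The function name `summarize_lcp`, the regular expressions, and the year prepended to the timestamps are assumptions made for illustration (syslog lines carry no year); only the log lines themselves come from this thread.

```python
import re
from datetime import datetime

# Log excerpt copied from the post above.
LOG = """\
Jan 30 15:23:46 pfSense ppp: [wan_link0] LCP: no reply to 1 echo request(s)
Jan 30 15:23:56 pfSense ppp: [wan_link0] LCP: no reply to 2 echo request(s)
Jan 30 15:24:06 pfSense ppp: [wan_link0] LCP: no reply to 3 echo request(s)
Jan 30 15:24:16 pfSense ppp: [wan_link0] LCP: no reply to 4 echo request(s)
Jan 30 15:24:26 pfSense ppp: [wan_link0] LCP: no reply to 5 echo request(s)
Jan 30 15:24:26 pfSense ppp: [wan_link0] LCP: peer not responding to echo requests
Jan 30 15:24:26 pfSense ppp: [wan_link0] LCP: state change Opened --> Stopping
"""

STAMP = r"(\w{3} +\d+ \d\d:\d\d:\d\d)"
ECHO_RE = re.compile(STAMP + r" .*LCP: no reply to (\d+) echo request")
STOP_RE = re.compile(STAMP + r" .*LCP: state change Opened --> Stopping")


def parse_stamp(stamp, year=2021):
    # The year is an assumption; it only matters for computing differences.
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def summarize_lcp(log_text):
    """Summarize missed LCP echoes and the time until the link stops."""
    misses = []
    stopped = None
    for line in log_text.splitlines():
        echo = ECHO_RE.match(line)
        if echo:
            misses.append(parse_stamp(echo.group(1)))
            continue
        stop = STOP_RE.match(line)
        if stop:
            stopped = parse_stamp(stop.group(1))
    # Seconds between consecutive unanswered echo requests.
    gaps = [int((b - a).total_seconds()) for a, b in zip(misses, misses[1:])]
    first_miss_to_stop = None
    if misses and stopped:
        first_miss_to_stop = int((stopped - misses[0]).total_seconds())
    return {
        "missed": len(misses),
        "gaps": gaps,
        "first_miss_to_stop": first_miss_to_stop,
    }


if __name__ == "__main__":
    print(summarize_lcp(LOG))
```

For this excerpt it reports five missed echoes spaced 10 seconds apart, with 40 seconds between the first missed echo and the Opened --> Stopping transition.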
Steve
-
@stephenw10
To answer your question, that is the correct gateway IP and the WAN IPs are in the range for Frontier in my area. When you say to compare it to a log of a successful connection, I assume you mean one that lasts more than 30 secs?
I have not been able to achieve that with the PPPoE connection. I have with the current set up using DMZ and setting the WAN interface to DHCP, but I'm thinking you mean a successful PPPoE connection. If any successful connection will do, let me know.
I did search for "pppoe lcp" here, and it looks like a number of people have reported this or a similar problem over the last couple of years. I haven't had a chance to look at them in detail, but will take a look and see what others have done.
Any thoughts on the issues I report regarding the Arris modem and transparent bridging? I suppose I am just looking for confirmation of my thinking that I can't be sure there's an issue with pfsense if I know the Arris is acting flaky. I will try to pull those logs as well and see what I can find out.
Thanks again.
-
Hmm, well something there is not right. As I understand it LCP echo requests should only be sent when no other traffic is on the WAN. Do you still see that if you have something running across it?
You might want something like this:
https://redmine.pfsense.org/issues/3552
However you can manually configure a conf file to use with whatever options you want.
Copy the existing ppp conf file from /var/etc/. It's named after the interface, so probably mpd_wan.conf. Copy it to /conf, e.g. /conf/mpd_wan.conf
That file will be used in preference to the autogenerated one for that interface so edit it as you wish.
So setting:
set link keep-alive 0 60
Should disable LCP echo requests completely which may or may not help you!
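As a rough illustration of that edit, the Python sketch below rewrites the keep-alive line in the text of a conf file. The helper name `disable_lcp_echo` and the `SAMPLE` contents are hypothetical; the sample is not copied from a real pfSense-generated file, so check what your own /var/etc/mpd_wan.conf actually contains before changing anything.

```python
import re

# Matches an existing keep-alive line, preserving its leading indentation.
KEEPALIVE_RE = re.compile(r"^([ \t]*)set link keep-alive\b.*$", re.MULTILINE)

# Hypothetical conf fragment, for illustration only.
SAMPLE = """\
pppoeclient:
\tset link keep-alive 10 60
\tset link max-redial 0
"""


def disable_lcp_echo(conf_text, max_wait=60):
    """Return conf_text with the keep-alive interval set to 0 (echo disabled)."""
    if not KEEPALIVE_RE.search(conf_text):
        # Refuse to guess where a new line should go in the file.
        raise ValueError("no 'set link keep-alive' line found")
    new_line = f"set link keep-alive 0 {max_wait}"
    return KEEPALIVE_RE.sub(lambda m: m.group(1) + new_line, conf_text)


if __name__ == "__main__":
    print(disable_lcp_echo(SAMPLE))
```

In practice you would read the text from the copy in /conf (probably /conf/mpd_wan.conf), pass it through the function, and write it back, or simply make the same one-line change in a text editor.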
Steve
-
> Hmm, well something there is not right. As I understand it LCP echo requests should only be sent when no other traffic is on the WAN. Do you still see that if you have something running across it?
Forgive my ignorance, but is there a way to get something running across the WAN when it is up for such a short time? When I was working on this it seemed that by the time I could see the WAN connection was up (in the dashboard) and tried to test by bringing up a webpage, the connection was already down again. And the network icon in the taskbar never showed an internet connection.
> You might want something like this:
> https://redmine.pfsense.org/issues/3552
> However you can manually configure a conf file to use with whatever options you want.
> Copy the existing ppp conf file from /var/etc/. It's named after the interface, so probably mpd_wan.conf. Copy it to /conf, e.g. /conf/mpd_wan.conf
> That file will be used in preference to the autogenerated one for that interface so edit it as you wish.
> So setting:
> set link keep-alive 0 60
> Should disable LCP echo requests completely which may or may not help you!
Sounds like that is worth a shot. I still need to look through the forum and see what others with this or a similar problem have found as well. Then I'll need to find some time to test, since it means taking down the internet for the family. Hopefully I can get on it this weekend.
Let me ask 2 things.
First, it seems like the issue is the ISP not responding to the LCP echo requests, so isn't disabling the LCP echo requests more of a workaround than a fix? I don't have a problem with that if it works, but I would always prefer to get at the root cause if possible.
Second, I looked through my full sys log from the other day and did find a period of about 30 minutes with 6 connect/disconnect cycles. It looks like essentially the same info repeated 6 times, but I am wondering if there would be any utility in posting that section of the log. Would seeing that many instances back to back help at all?
Once again, thanks for the help.
-
You shouldn't need to send anything since by default the gateway monitoring in pfSense would be pinging across the link continually anyway. Something there is clearly broken. It might well be at the ISP end.
If there is any new info in the log showing an error it might be useful to see. If it just fails in the same way there's nothing much more we can do with that info.
Steve
-
Thanks. Not sure when I'm going to be able to get back to actively troubleshooting this. I'll stick with the DMZ setup for now and continue to research. Once I have some answers, or more likely new questions, I will start a new thread.