No internet connection after upgrade from April 20 snapshot to May 19
-
https://redmine.pfsense.org/issues/8504#change-36668
I thought it was a similar problem, but it turned out to be the wrong suspect as well.
-
@w0w said in No internet connection after upgrade from April 20 snapshot to May 19:
https://redmine.pfsense.org/issues/8504#change-36668
That is a completely different gateway issue, unrelated to this one. That was a failed gateway upgrade problem; this is a separate issue having to do with purely dynamic gateways that are not present in config.xml. If there isn't already a ticket for it, it should have one.
-
@jimp, in my case, if I follow the guidelines in that closed ticket and install 2.4.3, everything works like a charm. But if I upgrade this working installation to the latest 2.4.4 version, it no longer has a default gateway. That does not mean your upgrade code fails; yes, it is a different issue, caused by the different commit mentioned above. I also understand that you have more information about the closed issue and that you are right in your conclusions, but from my side it looked like a similar issue, and my conclusions were based only on the guidelines posted in the Redmine ticket.
If I understand correctly, the removed code that maintained dynamic gateways, making them default without placing them into config.xml, caused some other problems?
-
Completely dynamic gateways do not need an entry in config.xml, they never have. They are dynamic and maintained internally. There is nothing in config.xml so there is nothing for the upgrade code to touch.
Again, it's apples and oranges. The symptoms are similar but they are completely different problems.
-
@jimp, yes it's absolutely different issue, I see it.
If dynamic gateways don't need an entry, then why was the code that maintained them removed? I mean those system.inc changes.
I am also not sure about Redmine tickets regarding this problem. I don't see any right now, but that could just be because I'm blind.
-
I don't know, it happened as a part of a larger feature merge, that's what still needs to be investigated. It's possible that it was not intentional, or it wasn't compatible with the new features, etc. That is what remains to be debugged and tested for this specific issue.
-
Looking through the code the same actions seem to be taken now but they are split into separate functions. It's still possible there is some difference but I haven't spotted it yet.
I've also tried replicating the problem on several VMs here but I haven't been able to make one end up without a default gateway in the OS routing table. I have a few purely dynamic gateway VMs but they are all DHCP. Since I don't currently have a way to test one that is purely PPPoE, it may be that the issue is specific to only having PPPoE dynamic gateways.
The only thing I have noticed is that despite actually having a default gateway at the OS, these show 'None' in the Default Gateway IPv4 drop-down and nothing in the Default column of the gateway list. There is likely a need to improve the handling of that field. Though I'm still not sure we should even offer a 'None' option for IPv4 at this point. The previous code forced one default at all times.
I need to check into a couple unrelated things now but I'll try to setup a PPPoE only VM if I can and see what happens.
@PiBa may need to have a look as well, he developed the default gateway changes that were merged into 2.4.4.
-
@jimp
Thank you! I hope you will find something.
-
Tried a few things but have not yet found a scenario where the default route goes 'missing'.
Currently testing with 2 pfSense boxes, 1 being the PPPoE server, the other being the client. I can change the settings on the server, and the client loses its connection for a few seconds and then picks it up again, including the new gateway IP and a different client IP.
The PPPoE interface gateway is selected as the default in my config in the webgui. Even though it is an 'automatically generated' gateway, it does show in the selection boxes, and should be selected to be the default.
The 'non-local gateway' option should not be related to PPPoE connections. It was first created for OVH datacenter usage, where they have 'regular' network connections but do need to route traffic out over the specified interface.
Either way, the gateway selection box being empty would be strange.
I've also installed a new pfSense and went through the 'wizard', in which case the default gateway indeed stays as 'none', but the other options are shown as available when it is clicked. I suppose the wizard needs to be changed to fill the new selection option. After selecting that WAN_PPPOE manually, though, the default route gets added properly.
Only thing I could find so far is that the 'installation wizard' needs a little fix. When configured, the actual code handling those settings 'seems' to work properly in my tests.
-
@piba said in No internet connection after upgrade from April 20 snapshot to May 19:
The ‘non-local gateway’ option
I am also not sure why it's needed in my case, but I think it triggers something else and the connection is restored. I do think this could be some other problem that is also related to those changes, because restoring system.inc fixes everything. One scenario is that when you select this gateway manually, it is stored in config.xml and is then processed in some other way that also needs this option, but I am not sure about that. Or maybe a second one: manually adding WAN_PPPoE creates an entry in config.xml, and checking this ‘non-local gateway’ option just saves my dynamic gateway into config.xml, and that's why it works.
I can do some tests tomorrow, I hope.
The one thing that also confuses me is that if I select that WAN_PPPOE manually, it also works without this ‘non-local gateway’ option, but only until a re-connection or reboot.
@piba said in No internet connection after upgrade from April 20 snapshot to May 19:
Only thing i could find sofar is that the ‘installation wizard’ needs a little fix…
But this little fix does not solve the problem of having no default gateway after an upgrade, since you have already configured everything and config.xml does not contain a dynamic gateway entry.
-
A quick test showed that the second scenario posted above is true.
There is no need for this option at all; just pressing 'save' in the gateway options does the same thing.
-
Okay, it seems I can reproduce it now with a reboot while the WAN NIC is disconnected. After enabling the WAN NIC, it does get an IP, but the default route isn't set. Then when the PPPoE server side is restarted, the PPPoE client automatically re-connects and does set its default route that time.
Now trying to figure out where and how the first connection doesn't set the gateway while the second does. It seems to me that both should follow almost the same steps when the connection is established.
-
FYI
What I'm seeing in my system log is this: both times rc.newwanip is running, but only the second time does it say "Default gateway setting Interface WAN_PPPOE Gateway as default.".
I presume you have something similar to the first one in your log?
Jun 9 15:51:30 php-fpm 321 /rc.start_packages: Restarting/Starting all packages.
Jun 9 15:51:28 check_reload_status Starting packages
Jun 9 15:51:28 php-fpm 322 /rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - 10.30.20.163 -> 10.30.20.163 - Restarting packages.
Jun 9 15:51:26 php-fpm 322 /rc.newwanip: Creating rrd update script
Jun 9 15:51:26 php-fpm 322 /rc.newwanip: Resyncing OpenVPN instances for interface WAN.
Jun 9 15:51:23 php-fpm 322 /rc.newwanip: Default gateway setting Interface WAN_PPPOE Gateway as default.
Jun 9 15:51:23 php-fpm 322 /rc.newwanip: rc.newwanip: on (IP address: 10.30.20.163) (interface: WAN[wan]) (real interface: pppoe0).
Jun 9 15:51:23 php-fpm 322 /rc.newwanip: rc.newwanip: Info: starting on pppoe0.
Jun 9 15:51:22 ppp [wan] IFACE: Rename interface ng0 to pppoe0

Jun 9 15:47:55 php-fpm 73857 /rc.start_packages: Restarting/Starting all packages.
Jun 9 15:47:54 check_reload_status Starting packages
Jun 9 15:47:54 php-fpm 321 /rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - 10.30.20.163 -> 10.30.20.163 - Restarting packages.
Jun 9 15:47:52 php-fpm 321 /rc.newwanip: Creating rrd update script
Jun 9 15:47:52 php-fpm 321 /rc.newwanip: Resyncing OpenVPN instances for interface WAN.
Jun 9 15:47:48 php-fpm 321 /rc.newwanip: rc.newwanip: on (IP address: 10.30.20.163) (interface: WAN[wan]) (real interface: pppoe0).
Jun 9 15:47:48 php-fpm 321 /rc.newwanip: rc.newwanip: Info: starting on pppoe0.
Jun 9 15:47:47 ppp [wan] IFACE: Rename interface ng0 to pppoe0
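
For anyone who wants to check their own system log the same way, here is a rough Python sketch. It groups rc.newwanip lines by php-fpm PID and reports whether each group contains the "Default gateway setting" message. Grouping by PID is only a heuristic I am assuming here (php-fpm workers get reused, so a PID is not a unique run ID), and the function name is made up for illustration. The sample lines are taken from the two logs above.

```python
import re
from collections import defaultdict

# Matches syslog lines written by rc.newwanip under php-fpm, for example:
# "Jun 9 15:51:23 php-fpm 322 /rc.newwanip: <message>"
LINE_RE = re.compile(r"php-fpm (\d+) /rc\.newwanip: (.*)$")


def default_gateway_set_per_run(lines):
    """Map php-fpm PID -> True if that rc.newwanip run logged a default gateway.

    Hypothetical helper: PID grouping is an assumption, not a pfSense concept.
    """
    runs = defaultdict(bool)
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # not an rc.newwanip line
        pid, message = m.groups()
        runs[pid] = runs[pid] or message.startswith("Default gateway setting")
    return dict(runs)


log = [
    "Jun 9 15:51:23 php-fpm 322 /rc.newwanip: Default gateway setting Interface WAN_PPPOE Gateway as default.",
    "Jun 9 15:51:23 php-fpm 322 /rc.newwanip: rc.newwanip: Info: starting on pppoe0.",
    "Jun 9 15:47:52 php-fpm 321 /rc.newwanip: Resyncing OpenVPN instances for interface WAN.",
    "Jun 9 15:47:48 php-fpm 321 /rc.newwanip: rc.newwanip: Info: starting on pppoe0.",
]

result = default_gateway_set_per_run(log)
print(result)
```

A PID that maps to False corresponds to a run like the 15:47 one, where rc.newwanip ran but never reported setting the default gateway.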
-
@piba
Yep, looks exactly like that. Both. I've just found that on the latest snapshot it works a little bit differently, or I just missed it before.
After setting my dynamic gateway manually as default I see this line:
/rc.newwanip: Default gateway setting Interface WAN_PPPOE Gateway as default.
Then I rebooted the firewall and looked at the system log again. When PPPoE starts on boot, I don't see this line anymore; the system log is exactly like your second one and the default route is missing, of course. If I manually re-connect PPPoE without changing anything else, then I see this line again and everything works as it should.
-
It seems the dynamic PPPoE gateway does not have a status yet when it hasn't connected before, and the code assumes it's a gateway group because it cannot find the status that a normal gateway would have.
Would you be able to try this little patch with 1 changed line?
 src/etc/inc/gwlb.inc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/etc/inc/gwlb.inc b/src/etc/inc/gwlb.inc
index 9a059a7..e0b94be 100644
--- a/src/etc/inc/gwlb.inc
+++ b/src/etc/inc/gwlb.inc
@@ -1032,7 +1032,7 @@ function fixup_default_gateway($ipprotocol, $gateways_status, $gateways_arr) {
 	} else {
 		$gwdefault = $config['gateways']['defaultgw6'];
 	}
-	if (isset($gateways_status[$gwdefault])) {
+	if (isset($gateways_arr[$gwdefault])) {
 		// the configured gateway is a regular one. (not a gwgroup) use it as is..
 		$dfltgwname = $gwdefault;
 	} else {
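
To illustrate what the one-line change is meant to do, here is a very simplified Python model of just that check. It is not the pfSense code: the dictionaries stand in for $gateways_arr and $gateways_status, the function names are hypothetical, and the "state right after boot" (gateway known, no status entry yet) is an assumption based on the description above. Also note that Python's `in` is only an approximation of PHP's isset(), which additionally returns false for null values.

```python
def is_regular_gateway_old(gwdefault, gateways_status, gateways_arr):
    # Pre-patch check: only true once the gateway already has a status entry.
    return gwdefault in gateways_status


def is_regular_gateway_new(gwdefault, gateways_status, gateways_arr):
    # Patched check: true as soon as the gateway is known at all.
    return gwdefault in gateways_arr


# Assumed state right after boot: the dynamic gateway is listed,
# but it has never connected, so there is no status for it yet.
gateways_arr = {"WAN_PPPOE": {"interface": "pppoe0", "ipprotocol": "inet"}}
gateways_status = {}

old = is_regular_gateway_old("WAN_PPPOE", gateways_status, gateways_arr)
new = is_regular_gateway_new("WAN_PPPOE", gateways_status, gateways_arr)
print("old check:", old, "| new check:", new)
```

With the old check, a gateway in this state is not recognised as a regular gateway and falls through to the other branch; with the patched check it is used as is.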
-
Redmine ticket and fix submitted for review: https://redmine.pfsense.org/issues/8561
-
@piba
I have manually patched this line, but the problem still exists.
I need to select the gateway manually, and it works until a reboot; after a reboot I need to 'Disconnect'/'Connect' PPPoE to make it work.
Also, I am confused by this:
-
That it says 'Default (IPv6)' seems just a 'display issue'. It is determined by a function that doesn't take dynamic gateways into account currently. Anyhow added a fix to the PR for that https://github.com/pfsense/pfsense/pull/3947/commits/092abdb6005072365bc860966b0e2ffce8d85e1b
Can you add the logging below and run another test? Please let me know what it tells you. My 'reproduction' is no longer failing with the patch above and I'm currently out of ideas where to look. Sorry, but this is probably going to take a few rounds of trial and error:
 src/etc/inc/gwlb.inc | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/etc/inc/gwlb.inc b/src/etc/inc/gwlb.inc
index 7b157b5..3a70fe3 100644
--- a/src/etc/inc/gwlb.inc
+++ b/src/etc/inc/gwlb.inc
@@ -1105,6 +1105,8 @@ function fixup_default_gateway($ipprotocol, $gateways_status, $gateways_arr) {
 			}
 		}
 	}
+	log_error("fixup_default_gateway dfltgwdown:{$dfltgwdown} upgw:{$upgw} dfltgwname:{$dfltgwname} , gwdefault:{$gwdefault}");
+	log_error("gateways_arr:".print_r($gateways_arr,true));
 	if ($dfltgwdown == true && !empty($upgw)) {
 		setdefaultgateway($gateways_arr[$upgw]);
 	} else if (!empty($dfltgwname)) {
-
@piba
That's fun! This comes on system boot BEFORE the PPPoE connection is established:
Jun 10 19:49:56 php-cgi rc.bootup: gateways_arr:Array ( [WAN_PPPOE] => Array ( [dynamic] => [ipprotocol] => inet [gateway] => [interface] => pppoe0 [friendlyiface] => wan [name] => WAN_PPPOE [attribute] => system [descr] => Interface WAN_PPPOE Gateway ) [Null4] => Array ( [name] => Null4 [interface] => lo0 [ipprotocol] => inet [gateway] => 127.0.0.1 ) [Null6] => Array ( [name] => Null6 [interface] => lo0 [ipprotocol] => inet6 [gateway] => ::1 ) )
Jun 10 19:49:56 php-cgi rc.bootup: fixup_default_gateway dfltgwdown: upgw: dfltgwname: , gwdefault:
Jun 10 19:49:56 php-cgi rc.bootup: gateways_arr:Array ( [WAN_PPPOE] => Array ( [dynamic] => [ipprotocol] => inet [gateway] => [interface] => pppoe0 [friendlyiface] => wan [name] => WAN_PPPOE [attribute] => system [descr] => Interface WAN_PPPOE Gateway ) [Null4] => Array ( [name] => Null4 [interface] => lo0 [ipprotocol] => inet [gateway] => 127.0.0.1 ) [Null6] => Array ( [name] => Null6 [interface] => lo0 [ipprotocol] => inet6 [gateway] => ::1 ) )
PPPoE is established 2 seconds later, at the very start of the session!
Jun 10 19:49:58 ppp [wan_link0] PPPoE: connection successful
And this comes when I manually re-connect PPPoE:
Jun 10 19:53:28 php-fpm 335 /rc.newwanip: gateways_arr:Array ( [WAN_PPPOE] => Array ( [dynamic] => 1 [ipprotocol] => inet [gateway] => 212.7.29.236 [interface] => pppoe0 [friendlyiface] => wan [name] => WAN_PPPOE [attribute] => system [monitor] => 212.7.29.236 [descr] => Interface WAN_PPPOE Gateway ) [Null4] => Array ( [name] => Null4 [interface] => lo0 [ipprotocol] => inet [gateway] => 127.0.0.1 ) [Null6] => Array ( [name] => Null6 [interface] => lo0 [ipprotocol] => inet6 [gateway] => ::1 ) )
Jun 10 19:53:28 php-fpm 335 /rc.newwanip: fixup_default_gateway dfltgwdown: upgw: dfltgwname: , gwdefault:
Jun 10 19:53:28 php-fpm 335 /rc.newwanip: Default gateway setting Interface WAN_PPPOE Gateway as default.
Jun 10 19:53:28 php-fpm 335 /rc.newwanip: gateways_arr:Array ( [WAN_PPPOE] => Array ( [dynamic] => 1 [ipprotocol] => inet [gateway] => 212.7.29.236 [interface] => pppoe0 [friendlyiface] => wan [name] => WAN_PPPOE [attribute] => system [monitor] => 212.7.29.236 [descr] => Interface WAN_PPPOE Gateway ) [Null4] => Array ( [name] => Null4 [interface] => lo0 [ipprotocol] => inet [gateway] => 127.0.0.1 ) [Null6] => Array ( [name] => Null6 [interface] => lo0 [ipprotocol] => inet6 [gateway] => ::1 ) )
The PPPoE session is completely established 1 second before that:
Jun 10 19:53:27 ppp [wan] IFACE: Rename interface ng0 to pppoe0
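
The difference between the two dumps is easier to see once the flattened print_r output is turned back into fields. Below is a small Python sketch for that, assuming the single-line format shown in the log above. The helper name wan_fields is made up, and the parsing is naive: it would break on values that themselves contain " )" or "[key] =>". The two sample strings are excerpts of the log lines above with the Null4/Null6 entries left out.

```python
import re


def wan_fields(log_line, name="WAN_PPPOE"):
    """Extract the flattened print_r fields of one gateway from a log line."""
    m = re.search(r"\[" + re.escape(name) + r"\] => Array \( (.*?) \)", log_line)
    if m is None:
        return {}
    # Splitting on "[key] =>" yields: prefix, key1, value1, key2, value2, ...
    parts = re.split(r"\[(\w+)\] =>", m.group(1))
    return {key: value.strip() for key, value in zip(parts[1::2], parts[2::2])}


# Excerpts from the log above (Null4/Null6 entries omitted).
BOOT = (
    "Jun 10 19:49:56 php-cgi rc.bootup: gateways_arr:Array ( "
    "[WAN_PPPOE] => Array ( [dynamic] => [ipprotocol] => inet "
    "[gateway] => [interface] => pppoe0 [friendlyiface] => wan "
    "[name] => WAN_PPPOE [attribute] => system "
    "[descr] => Interface WAN_PPPOE Gateway ) )"
)
RECONNECT = (
    "Jun 10 19:53:28 php-fpm 335 /rc.newwanip: gateways_arr:Array ( "
    "[WAN_PPPOE] => Array ( [dynamic] => 1 [ipprotocol] => inet "
    "[gateway] => 212.7.29.236 [interface] => pppoe0 [friendlyiface] => wan "
    "[name] => WAN_PPPOE [attribute] => system [monitor] => 212.7.29.236 "
    "[descr] => Interface WAN_PPPOE Gateway ) )"
)

boot = wan_fields(BOOT)
reconnect = wan_fields(RECONNECT)
for key in ("dynamic", "gateway", "monitor"):
    print(key, "| boot:", repr(boot.get(key)), "| reconnect:", repr(reconnect.get(key)))
```

At boot the dynamic and gateway fields are empty and there is no monitor entry; after the manual re-connect all three are filled in.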
-
I suppose between the 'Rename interface ng0 to pppoe0' and the manual re-connect you do have these 2 lines?
Jun 10 19:49:13 php-fpm 322 /rc.newwanip: rc.newwanip: on (IP address: 10.30.20.163) (interface: WAN[wan]) (real interface: pppoe0).
Jun 10 19:49:13 php-fpm 322 /rc.newwanip: rc.newwanip: Info: starting on pppoe0.
And then "rc.newwanip" just stops running? While it should log a few more lines and set the gateway like in the previous posts? I think there is a race condition there where the script exits during bootup...