No internet connection after upgrade from April 20 snapshot to May 19



  • I am pretty sure that changes made in this commit https://github.com/pfsense/pfsense/commit/43a9b03deb9db482713dfa1218662bce8b6360ce#diff-1332c372788c9e1a8c6c9bae9ebb55a5
    are the root of the problem with the default dynamic gateway on PPPoE, in my case. I think it's all in system.inc, within deleted lines 647-705, because restoring them fixed the problem with the default gateway.
    Finally, steps to reproduce the problem. Set up a pfSense firewall, virtual or on real hardware; you will need two Ethernet cards. I used Auto install on ZFS, and after a successful install I just ran the wizard that starts automatically after logging into the web GUI. I changed nothing except selecting PPPoE as the WAN type for IPv4. After finishing the wizard steps you will already have a configuration with NO default ROUTE and therefore NO internet connection.



  • https://redmine.pfsense.org/issues/8504#change-36668
    I thought it was a similar problem, but that turned out to be the wrong suspect too.
    ☺


  • Rebel Alliance Developer Netgate

    @w0w said in No internet connection after upgrade from April 20 snapshot to May 19:

    https://redmine.pfsense.org/issues/8504#change-36668

    That is a completely different gateway issue, unrelated to this one. That was a failed gateway upgrade problem; this is a separate issue having to do with purely dynamic gateways not present in config.xml. If there isn't already a ticket for it, it should have one.



  • @jimp, in my case, if I follow the guidelines from that closed ticket and install 2.4.3, everything works like a charm, but if I upgrade this working installation to the latest 2.4.4 version, it no longer has a default gateway. That does not mean your upgrade code fails; yes, it is a different issue, caused by the different commit mentioned above. I also understand that you have more information about the closed issue and that your conclusions are right, but from my side it looked like a similar issue, and my conclusions were based only on the guidelines posted in the Redmine ticket.
    If I understand correctly, the removed code, which maintained dynamic gateways and made them default without placing them into config.xml, caused some other problems?


  • Rebel Alliance Developer Netgate

    Completely dynamic gateways do not need an entry in config.xml, they never have. They are dynamic and maintained internally. There is nothing in config.xml so there is nothing for the upgrade code to touch.

    Again, it's apples and oranges. The symptoms are similar but they are completely different problems.



  • @jimp, yes, it's an absolutely different issue, I see that now.
    If dynamic gateways don't need an entry, then why was the code that maintained them removed? I mean those system.inc changes.
    I am also not sure about Redmine tickets regarding this problem. I don't see any right now, but that could also be just because I'm blind ☺


  • Rebel Alliance Developer Netgate

    I don't know, it happened as a part of a larger feature merge, that's what still needs to be investigated. It's possible that it was not intentional, or it wasn't compatible with the new features, etc. That is what remains to be debugged and tested for this specific issue.


  • Rebel Alliance Developer Netgate

    Looking through the code the same actions seem to be taken now but they are split into separate functions. It's still possible there is some difference but I haven't spotted it yet.

    I've also tried replicating the problem on several VMs here but I haven't been able to make one end up without a default gateway in the OS routing table. I have a few purely dynamic gateway VMs but they are all DHCP. Since I don't currently have a way to test one that is purely PPPoE, it may be that the issue is specific to only having PPPoE dynamic gateways.

    The only thing I have noticed is that despite actually having a default gateway at the OS, these show None in the Default Gateway IPv4 drop-down and nothing in the Default column of the gateway list. There is likely a need to improve the handling of that field. Though I'm still not sure we should even offer a None option for IPv4 at this point. The previous code forced one default at all times.

    I need to check into a couple of unrelated things now, but I'll try to set up a PPPoE-only VM if I can and see what happens.

    @PiBa may need to have a look as well, he developed the default gateway changes that were merged into 2.4.4.



  • @jimp
    Thank you! I hope you will find something. 👍



  • Tried a few things but have not yet found a scenario where the default route goes 'missing'.

    Currently testing with 2 pfSense boxes, 1 being the PPPoE server, the other being the client. I can change the settings on the server, and the client loses its connection for a few seconds and then picks it up again, including the new gateway IP and a different client IP.

    The PPPoE interface gateway is selected as the default in my config in the web GUI. Even though it is an 'automatically generated' gateway, it does show in the selection boxes, and it should be selected as the default.

    The 'non-local gateway' option should not be related to PPPoE connections. It was first created for OVH datacenter usage, where they have 'regular' network connections but do need to route traffic out over the specified interface.

    Either way, the gateway selection box being empty would be strange.

    I've also installed a new pfSense and went through the wizard, in which case the default gateway indeed stays as 'none', but other options are shown as available when the box is clicked. I suppose the wizard needs to be changed to fill the new selection option. After selecting WAN_PPPOE manually, though, the default route gets added properly.

    The only thing I could find so far is that the installation wizard needs a little fix. Once configured, the actual code handling those settings 'seems' to work properly in my tests.



  • @piba said in No internet connection after upgrade from April 20 snapshot to May 19:

    The ‘non-local gateway’ option

    I am also not sure why it's needed in my case, but I think it triggers something else and the connection is restored. I do think this could be some other problem, but one also related to those changes, because restoring system.inc fixes everything. I see two possible scenarios. In the first, selecting this gateway manually stores it in config.xml, and it is then processed in some other way that also needs this option, but I am not sure about that. Maybe it is the second one: manually selecting WAN_PPPoE creates an entry in config.xml, and checking the ‘non-local gateway’ option just saves my dynamic gateway into config.xml, and that's why it works.
    I can do some tests tomorrow, I hope.
    The one thing that also confuses me is that if I select WAN_PPPOE manually, it also works without the ‘non-local gateway’ option, but only until a reconnection or reboot.

    @piba said in No internet connection after upgrade from April 20 snapshot to May 19:

    Only thing i could find sofar is that the ‘installation wizard’ needs a little fix…

    But this little fix does not solve the problem of having no default gateway after an upgrade, since everything is already configured and config.xml does not contain a dynamic gateway entry.



  • A quick test showed that the second scenario posted above is the right one.
    There is no need for this option at all; just pressing 'Save' in the gateway options does the same thing.



  • Okay, it seems I can reproduce it now by rebooting while the WAN NIC is disconnected. After enabling the WAN NIC it does get an IP, but the default route isn't set. Then, when the PPPoE server side is restarted, the PPPoE client automatically reconnects and does set its default route that time.

    Trying to figure out now where and how the first connection fails to set the gateway when the second one does. It seems to me that both should follow almost the same steps when the connection is established.



  • FYI
    This is what I'm seeing in my system log: rc.newwanip runs both times, but only the second time does it log "Default gateway setting Interface WAN_PPPOE Gateway as default".
    I presume you have something similar to the first run in your log? (Entries are newest first, so the first run is the lower block.)

    Jun 9 15:51:30	php-fpm	321	/rc.start_packages: Restarting/Starting all packages.
    Jun 9 15:51:28	check_reload_status		Starting packages
    Jun 9 15:51:28	php-fpm	322	/rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - 10.30.20.163 -> 10.30.20.163 - Restarting packages.
    Jun 9 15:51:26	php-fpm	322	/rc.newwanip: Creating rrd update script
    Jun 9 15:51:26	php-fpm	322	/rc.newwanip: Resyncing OpenVPN instances for interface WAN.
    Jun 9 15:51:23	php-fpm	322	/rc.newwanip: Default gateway setting Interface WAN_PPPOE Gateway as default.
    Jun 9 15:51:23	php-fpm	322	/rc.newwanip: rc.newwanip: on (IP address: 10.30.20.163) (interface: WAN[wan]) (real interface: pppoe0).
    Jun 9 15:51:23	php-fpm	322	/rc.newwanip: rc.newwanip: Info: starting on pppoe0.
    Jun 9 15:51:22	ppp		[wan] IFACE: Rename interface ng0 to pppoe0
    
    Jun 9 15:47:55	php-fpm	73857	/rc.start_packages: Restarting/Starting all packages.
    Jun 9 15:47:54	check_reload_status		Starting packages
    Jun 9 15:47:54	php-fpm	321	/rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - 10.30.20.163 -> 10.30.20.163 - Restarting packages.
    Jun 9 15:47:52	php-fpm	321	/rc.newwanip: Creating rrd update script
    Jun 9 15:47:52	php-fpm	321	/rc.newwanip: Resyncing OpenVPN instances for interface WAN.
    Jun 9 15:47:48	php-fpm	321	/rc.newwanip: rc.newwanip: on (IP address: 10.30.20.163) (interface: WAN[wan]) (real interface: pppoe0).
    Jun 9 15:47:48	php-fpm	321	/rc.newwanip: rc.newwanip: Info: starting on pppoe0.
    Jun 9 15:47:47	ppp		[wan] IFACE: Rename interface ng0 to pppoe0
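
    To compare the two runs without reading the log by eye, something like the following could help. It is only a sketch: `newwanip_runs_setting_default()` is a hypothetical helper, not part of pfSense, and it assumes the log fields are separated by whitespace as in the excerpt above. It groups rc.newwanip lines by php-fpm PID, so two runs handled by the same PID would be merged.

    ```php
    <?php
    // Hypothetical helper, not part of pfSense: group rc.newwanip log lines by
    // php-fpm PID and record whether each run logged the default-gateway message.
    function newwanip_runs_setting_default(array $lines): array {
        $runs = [];
        foreach ($lines as $line) {
            // Match "php-fpm <pid> /rc.newwanip: <message>" with any whitespace between fields.
            if (!preg_match('/php-fpm\s+(\d+)\s+\/rc\.newwanip: (.*)$/', $line, $m)) {
                continue;
            }
            $pid = (int)$m[1];
            $runs[$pid] = ($runs[$pid] ?? false)
                || strpos($m[2], 'Default gateway setting') === 0;
        }
        return $runs;
    }

    // Three lines taken from the log excerpt above (newest first).
    $log = [
        "Jun 9 15:51:23\tphp-fpm\t322\t/rc.newwanip: Default gateway setting Interface WAN_PPPOE Gateway as default.",
        "Jun 9 15:51:23\tphp-fpm\t322\t/rc.newwanip: rc.newwanip: Info: starting on pppoe0.",
        "Jun 9 15:47:48\tphp-fpm\t321\t/rc.newwanip: rc.newwanip: Info: starting on pppoe0.",
    ];
    $runs = newwanip_runs_setting_default($log);
    var_export($runs);
    ```

    A PID mapped to false marks a run that started but never reported setting the default gateway, which is the failing case in this thread.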
    


  • @piba
    Yep, it looks exactly like that, both of them. I've just found that on the latest snapshot it works a little differently, or I just missed it before:
    After manually setting my dynamic gateway as default I see this line:

    /rc.newwanip: Default gateway setting Interface WAN_PPPOE Gateway as default.
    

    Then I rebooted the firewall and looked at the system log again: when PPPoE starts on boot, I don't see this line anymore. The system log is exactly like your second one, and the default route is missing, of course. If I manually reconnect PPPoE without changing anything else, I see this line again and everything works as it should.



  • It seems the dynamic PPPoE gateway does not have a status yet when it hasn't connected before, and the code assumes it's a gateway group because it cannot find the status that a normal gateway has.
    Would you be able to try this little patch with 1 changed line?

     src/etc/inc/gwlb.inc | 2 +-
     1 file changed, 1 insertion(+), 1 deletion(-)
    
    diff --git a/src/etc/inc/gwlb.inc b/src/etc/inc/gwlb.inc
    index 9a059a7..e0b94be 100644
    --- a/src/etc/inc/gwlb.inc
    +++ b/src/etc/inc/gwlb.inc
    @@ -1032,7 +1032,7 @@ function fixup_default_gateway($ipprotocol, $gateways_status, $gateways_arr) {
     	} else {
     		$gwdefault = $config['gateways']['defaultgw6'];
     	}
    -	if (isset($gateways_status[$gwdefault])) {
    +	if (isset($gateways_arr[$gwdefault])) {
     		// the configured gateway is a regular one. (not a gwgroup) use it as is..
     		$dfltgwname = $gwdefault;
     	} else {
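
    The effect of that one-line change can be seen in isolation with a minimal sketch. The array shapes below are simplified assumptions made up for illustration, not the real pfSense structures; only the two `isset()` checks mirror the patch.

    ```php
    <?php
    // Simplified stand-ins for the two arrays passed to fixup_default_gateway().
    // The field names and values here are assumptions for illustration only.
    $gateways_arr = [
        'WAN_PPPOE' => ['name' => 'WAN_PPPOE', 'interface' => 'pppoe0', 'gateway' => ''],
    ];
    $gateways_status = []; // no monitoring status yet: PPPoE has never connected

    $gwdefault = 'WAN_PPPOE';

    // Old check: fails, so the gateway falls through to the gateway-group branch.
    $old_check = isset($gateways_status[$gwdefault]);
    // Patched check: succeeds, because the gateway is known even without a status.
    $new_check = isset($gateways_arr[$gwdefault]);

    var_dump($old_check, $new_check);
    ```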
    
    


  • Redmine ticket created and fix submitted for review: https://redmine.pfsense.org/issues/8561



  • @piba
    I have patched this line manually, but the problem still exists.
    I need to select the gateway manually, and it works until reboot; after a reboot I need to 'Disconnect'/'Connect' PPPoE to make it work.
    0_1528603240024_routes.png
    I am also confused by this:
    0_1528602483450_default_route.png



  • That it says 'Default (IPv6)' seems to be just a display issue. It is determined by a function that currently doesn't take dynamic gateways into account. Anyhow, I added a fix for that to the PR: https://github.com/pfsense/pfsense/pull/3947/commits/092abdb6005072365bc860966b0e2ffce8d85e1b

    Can you add the logging below and run another test? Please let me know what it reports. My reproduction no longer fails with the patch above and I'm currently out of ideas on where to look. Sorry, but this is probably going to take a few rounds of trial and error:

     src/etc/inc/gwlb.inc | 2 ++
     1 file changed, 2 insertions(+)
    
    diff --git a/src/etc/inc/gwlb.inc b/src/etc/inc/gwlb.inc
    index 7b157b5..3a70fe3 100644
    --- a/src/etc/inc/gwlb.inc
    +++ b/src/etc/inc/gwlb.inc
    @@ -1105,6 +1105,8 @@ function fixup_default_gateway($ipprotocol, $gateways_status, $gateways_arr) {
     			}
     		}
     	}
    +	log_error("fixup_default_gateway  dfltgwdown:{$dfltgwdown} upgw:{$upgw}  dfltgwname:{$dfltgwname} , gwdefault:{$gwdefault}");
    +	log_error("gateways_arr:".print_r($gateways_arr,true));
     	if ($dfltgwdown == true && !empty($upgw)) {
     		setdefaultgateway($gateways_arr[$upgw]);
     	} else if (!empty($dfltgwname)) {
    


  • @piba
    That's fun! This appears on system boot BEFORE the PPPoE connection is established:

    Jun 10 19:49:56 	php-cgi 		rc.bootup: gateways_arr:Array ( [WAN_PPPOE] => Array ( [dynamic] => [ipprotocol] => inet [gateway] => [interface] => pppoe0 [friendlyiface] => wan [name] => WAN_PPPOE [attribute] => system [descr] => Interface WAN_PPPOE Gateway ) [Null4] => Array ( [name] => Null4 [interface] => lo0 [ipprotocol] => inet [gateway] => 127.0.0.1 ) [Null6] => Array ( [name] => Null6 [interface] => lo0 [ipprotocol] => inet6 [gateway] => ::1 ) )
    Jun 10 19:49:56 	php-cgi 		rc.bootup: fixup_default_gateway dfltgwdown: upgw: dfltgwname: , gwdefault:
    Jun 10 19:49:56 	php-cgi 		rc.bootup: gateways_arr:Array ( [WAN_PPPOE] => Array ( [dynamic] => [ipprotocol] => inet [gateway] => [interface] => pppoe0 [friendlyiface] => wan [name] => WAN_PPPOE [attribute] => system [descr] => Interface WAN_PPPOE Gateway ) [Null4] => Array ( [name] => Null4 [interface] => lo0 [ipprotocol] => inet [gateway] => 127.0.0.1 ) [Null6] => Array ( [name] => Null6 [interface] => lo0 [ipprotocol] => inet6 [gateway] => ::1 ) )
    
    

    PPPoE is established 2 seconds later; this is just the start of the session:

    Jun 10 19:49:58 	ppp 		[wan_link0] PPPoE: connection successful 
    

    And this appears when I manually reconnect PPPoE:

    Jun 10 19:53:28 	php-fpm 	335 	/rc.newwanip: gateways_arr:Array ( [WAN_PPPOE] => Array ( [dynamic] => 1 [ipprotocol] => inet [gateway] => 212.7.29.236 [interface] => pppoe0 [friendlyiface] => wan [name] => WAN_PPPOE [attribute] => system [monitor] => 212.7.29.236 [descr] => Interface WAN_PPPOE Gateway ) [Null4] => Array ( [name] => Null4 [interface] => lo0 [ipprotocol] => inet [gateway] => 127.0.0.1 ) [Null6] => Array ( [name] => Null6 [interface] => lo0 [ipprotocol] => inet6 [gateway] => ::1 ) )
    Jun 10 19:53:28 	php-fpm 	335 	/rc.newwanip: fixup_default_gateway dfltgwdown: upgw: dfltgwname: , gwdefault:
    Jun 10 19:53:28 	php-fpm 	335 	/rc.newwanip: Default gateway setting Interface WAN_PPPOE Gateway as default.
    Jun 10 19:53:28 	php-fpm 	335 	/rc.newwanip: gateways_arr:Array ( [WAN_PPPOE] => Array ( [dynamic] => 1 [ipprotocol] => inet [gateway] => 212.7.29.236 [interface] => pppoe0 [friendlyiface] => wan [name] => WAN_PPPOE [attribute] => system [monitor] => 212.7.29.236 [descr] => Interface WAN_PPPOE Gateway ) [Null4] => Array ( [name] => Null4 [interface] => lo0 [ipprotocol] => inet [gateway] => 127.0.0.1 ) [Null6] => Array ( [name] => Null6 [interface] => lo0 [ipprotocol] => inet6 [gateway] => ::1 ) ) 
    

    The PPPoE session is fully established 1 second before that:

    Jun 10 19:53:27 	ppp 		[wan] IFACE: Rename interface ng0 to pppoe0 
    


  • I suppose that between the 'Rename interface ng0 to pppoe0' line and the manual reconnect you do have these 2 lines?

    Jun 10 19:49:13	php-fpm	322	/rc.newwanip: rc.newwanip: on (IP address: 10.30.20.163) (interface: WAN[wan]) (real interface: pppoe0).
    Jun 10 19:49:13	php-fpm	322	/rc.newwanip: rc.newwanip: Info: starting on pppoe0.
    

    And then rc.newwanip just stops running? It should log a few more lines and set the gateway, like in the previous posts. I think there is a race condition there, where the script exits during bootup...



  • @piba
    Here is more:

    Jun 10 19:50:05 	upsd 	28485 	User monuser@::1 logged into UPS [SMK-1000A]
    Jun 10 19:50:03 	upsd 	28485 	Startup successful
    Jun 10 19:50:03 	upsd 	28483 	Can't connect to UPS [SMK-1000A] (snmp-ups-SMK-1000A): No such file or directory
    Jun 10 19:50:03 	upsd 	28483 	listening on 127.0.0.1 port 3493
    Jun 10 19:50:03 	upsd 	28483 	listening on ::1 port 3493
    Jun 10 19:50:02 	php-cgi 		notify_monitor.php: Could not send the message to w***n@gmail.com -- Error: Failed to connect to mail.in***.**:25 [SMTP: Failed to connect socket: Network is unreachable (code: -1, response: )]
    Jun 10 19:50:02 	upsmon 	27593 	Startup successful
    Jun 10 19:50:02 	php-fpm 	334 	/rc.start_packages: Starting service nut
    Jun 10 19:50:02 	php-fpm 	334 	/rc.start_packages: Restarting/Starting all packages.
    Jun 10 19:50:02 	kernel 		done.
    Jun 10 19:50:02 	syslogd 		kernel boot file is /boot/kernel/kernel
    Jun 10 19:50:02 	syslogd 		exiting on signal 15
    Jun 10 19:50:02 	syslogd 		Logging subprocess 9161 (exec /usr/local/sbin/sshlockout_pf 15) exited due to signal 15.
    Jun 10 19:50:00 	root 		/etc/rc.d/hostid: WARNING: hostid: unable to figure out a UUID from DMI data, generating a new one
    Jun 10 19:50:00 	php-fpm 	334 	/rc.newwanip: rc.newwanip: on (IP address: 84.52.**.200) (interface: WAN[wan]) (real interface: pppoe0).
    Jun 10 19:50:00 	php-fpm 	334 	/rc.newwanip: rc.newwanip: Info: starting on pppoe0.
    Jun 10 19:50:00 	kernel 		done.
    Jun 10 19:49:59 	check_reload_status 		Linkup starting igb1
    Jun 10 19:49:59 	kernel 		igb1: link state changed to UP
    Jun 10 19:49:59 	ppp 		[wan] IFACE: Rename interface ng0 to pppoe0
    Jun 10 19:49:59 	ppp 		[wan] IFACE: Up event
    Jun 10 19:49:59 	check_reload_status 		rc.newwanip starting pppoe0
    Jun 10 19:49:58 	php-fpm 	335 	/rc.dyndns.update: Curl error occurred: Could not resolve host: freedns.afraid.org
    Jun 10 19:49:58 	kernel 		ng_pppoe[12]: no matching session
    Jun 10 19:49:58 	kernel 		ng_pppoe[12]: no matching session
    Jun 10 19:49:58 	check_reload_status 		Rewriting resolv.conf
    Jun 10 19:49:58 	ppp 		[wan] 84.52.**.200 -> 212.7.29.236
    Jun 10 19:49:58 	ppp 		[wan] IPCP: LayerUp
    Jun 10 19:49:58 	ppp 		[wan] IPCP: state change Ack-Sent --> Opened
    Jun 10 19:49:58 	ppp 		[wan] SECDNS 212.7.9.34
    Jun 10 19:49:58 	ppp 		[wan] PRIDNS 212.7.0.33
    Jun 10 19:49:58 	ppp 		[wan] IPADDR 84.52.xx.200
    Jun 10 19:49:58 	ppp 		[wan] IPCP: rec'd Configure Ack #3 (Ack-Sent)
    Jun 10 19:49:58 	ppp 		[wan] SECDNS 212.7.9.34
    Jun 10 19:49:58 	ppp 		[wan] PRIDNS 212.7.0.33
    Jun 10 19:49:58 	ppp 		[wan] IPADDR 84.52.xx.200
    Jun 10 19:49:58 	ppp 		[wan] IPCP: SendConfigReq #3
    Jun 10 19:49:58 	ppp 		[wan] SECDNS 212.7.9.34
    Jun 10 19:49:58 	ppp 		[wan] PRIDNS 212.7.0.33
    Jun 10 19:49:58 	ppp 		[wan] 84.52.xx.200 is OK
    Jun 10 19:49:58 	ppp 		[wan] IPADDR 84.52.xx.200
    Jun 10 19:49:58 	ppp 		[wan] IPCP: rec'd Configure Nak #2 (Ack-Sent)
    Jun 10 19:49:58 	ppp 		[wan] IPV6CP: LayerFinish
    Jun 10 19:49:58 	ppp 		[wan] IPV6CP: state change Req-Sent --> Stopped
    Jun 10 19:49:58 	ppp 		[wan] IPV6CP: protocol was rejected by peer
    Jun 10 19:49:58 	ppp 		[wan_link0] LCP: protocol IPV6CP was rejected
    Jun 10 19:49:58 	ppp 		[wan_link0] LCP: rec'd Protocol Reject #2 (Opened)
    Jun 10 19:49:58 	ppp 		[wan] SECDNS 0.0.0.0
    Jun 10 19:49:58 	ppp 		[wan] PRIDNS 0.0.0.0
    Jun 10 19:49:58 	ppp 		[wan] IPADDR 0.0.0.0
    Jun 10 19:49:58 	ppp 		[wan] IPCP: SendConfigReq #2
    Jun 10 19:49:58 	ppp 		[wan] COMPPROTO VJCOMP, 16 comp. channels, no comp-cid
    Jun 10 19:49:58 	ppp 		[wan] IPCP: rec'd Configure Reject #1 (Ack-Sent)
    Jun 10 19:49:58 	ppp 		[wan] IPCP: state change Req-Sent --> Ack-Sent
    Jun 10 19:49:58 	ppp 		[wan] IPADDR 212.7.29.236
    Jun 10 19:49:58 	ppp 		[wan] IPCP: SendConfigAck #1
    Jun 10 19:49:58 	ppp 		[wan] 212.7.29.236 is OK
    Jun 10 19:49:58 	ppp 		[wan] IPADDR 212.7.29.236
    Jun 10 19:49:58 	ppp 		[wan] IPCP: rec'd Configure Request #1 (Req-Sent)
    Jun 10 19:49:58 	ppp 		[wan] IPV6CP: SendConfigReq #1
    Jun 10 19:49:58 	ppp 		[wan] IPV6CP: state change Starting --> Req-Sent
    Jun 10 19:49:58 	ppp 		[wan] IPV6CP: Up event
    Jun 10 19:49:58 	ppp 		[wan] SECDNS 0.0.0.0
    Jun 10 19:49:58 	ppp 		[wan] PRIDNS 0.0.0.0
    Jun 10 19:49:58 	ppp 		[wan] COMPPROTO VJCOMP, 16 comp. channels, no comp-cid
    Jun 10 19:49:58 	ppp 		[wan] IPADDR 0.0.0.0
    Jun 10 19:49:58 	ppp 		[wan] IPCP: SendConfigReq #1
    Jun 10 19:49:58 	ppp 		[wan] IPCP: state change Starting --> Req-Sent
    Jun 10 19:49:58 	ppp 		[wan] IPCP: Up event
    Jun 10 19:49:58 	ppp 		[wan] IPV6CP: LayerStart
    Jun 10 19:49:58 	ppp 		[wan] IPV6CP: state change Initial --> Starting
    Jun 10 19:49:58 	ppp 		[wan] IPV6CP: Open event
    Jun 10 19:49:58 	ppp 		[wan] IPCP: LayerStart
    Jun 10 19:49:58 	ppp 		[wan] IPCP: state change Initial --> Starting
    Jun 10 19:49:58 	ppp 		[wan] IPCP: Open event
    Jun 10 19:49:58 	ppp 		[wan] Bundle: Status update: up 1 link, total bandwidth 64000 bps
    Jun 10 19:49:58 	ppp 		[wan_link0] Link: Join bundle "wan"
    Jun 10 19:49:58 	ppp 		[wan_link0] Link: Matched action 'bundle "wan" ""'
    Jun 10 19:49:58 	ppp 		[wan_link0] LCP: authorization successful
    Jun 10 19:49:58 	ppp 		[wan_link0] CHAP: rec'd SUCCESS #1 len: 4
    Jun 10 19:49:58 	ppp 		[wan_link0] CHAP: sending RESPONSE #1 len: 33
    Jun 10 19:49:58 	ppp 		[wan_link0] CHAP: Using authname "yyyyy"
    Jun 10 19:49:58 	ppp 		[wan_link0] Name: "k18-29-236"
    Jun 10 19:49:58 	ppp 		[wan_link0] CHAP: rec'd CHALLENGE #1 len: 31
    Jun 10 19:49:58 	ppp 		[wan_link0] LCP: LayerUp
    Jun 10 19:49:58 	ppp 		[wan_link0] LCP: auth: peer wants CHAP, I want nothing
    Jun 10 19:49:58 	ppp 		[wan_link0] LCP: state change Ack-Sent --> Opened
    Jun 10 19:49:58 	ppp 		[wan_link0] MAGICNUM 0x2e5af1e0
    Jun 10 19:49:58 	ppp 		[wan_link0] MRU 1492
    Jun 10 19:49:58 	ppp 		[wan_link0] PROTOCOMP
    Jun 10 19:49:58 	ppp 		[wan_link0] LCP: rec'd Configure Ack #1 (Ack-Sent)
    Jun 10 19:49:58 	ppp 		[wan_link0] LCP: state change Req-Sent --> Ack-Sent
    Jun 10 19:49:58 	ppp 		[wan_link0] MAGICNUM 0xcde7b797
    Jun 10 19:49:58 	ppp 		[wan_link0] AUTHPROTO CHAP MD5
    Jun 10 19:49:58 	ppp 		[wan_link0] MRU 1492
    Jun 10 19:49:58 	ppp 		[wan_link0] LCP: SendConfigAck #1
    Jun 10 19:49:58 	ppp 		[wan_link0] MAGICNUM 0xcde7b797
    Jun 10 19:49:58 	ppp 		[wan_link0] AUTHPROTO CHAP MD5
    Jun 10 19:49:58 	ppp 		[wan_link0] MRU 1492
    Jun 10 19:49:58 	ppp 		[wan_link0] LCP: rec'd Configure Request #1 (Req-Sent)
    Jun 10 19:49:58 	ppp 		[wan_link0] MAGICNUM 0x2e5af1e0
    Jun 10 19:49:58 	ppp 		[wan_link0] MRU 1492
    Jun 10 19:49:58 	ppp 		[wan_link0] PROTOCOMP
    Jun 10 19:49:58 	ppp 		[wan_link0] LCP: SendConfigReq #1
    Jun 10 19:49:58 	ppp 		[wan_link0] LCP: state change Starting --> Req-Sent
    Jun 10 19:49:58 	ppp 		[wan_link0] LCP: Up event
    Jun 10 19:49:58 	ppp 		[wan_link0] Link: UP event
    Jun 10 19:49:58 	ppp 		[wan_link0] PPPoE: connection successful
    Jun 10 19:49:58 	ppp 		PPPoE: rec'd ACNAME "k18-29-236"
    Jun 10 19:49:57 	kernel 		done
    Jun 10 19:49:57 	php-cgi 		rc.bootup: IPsec ERROR: Could not find phase 1 source for connection . Omitting from configuration file.
    Jun 10 19:49:57 	kernel 		.done.
    Jun 10 19:49:57 	kernel 		....
    Jun 10 19:49:57 	check_reload_status 		Updating all dyndns
    Jun 10 19:49:57 	kernel 		done.
    Jun 10 19:49:57 	kernel 		done.
    Jun 10 19:49:57 	php-cgi 		rc.bootup: NTPD is starting up.
    Jun 10 19:49:57 	check_reload_status 		Linkup starting igb0
    Jun 10 19:49:57 	kernel 		igb0: link state changed to UP
    Jun 10 19:49:56 	kernel 		done.
    Jun 10 19:49:56 	php-cgi 		rc.bootup: sync unbound done.
    Jun 10 19:49:56 	kernel 		Starting DNS Resolver...done.
    Jun 10 19:49:56 	php-cgi 		rc.bootup: gateways_arr:Array ( [WAN_PPPOE] => Array ( [dynamic] => [ipprotocol] => inet [gateway] => [interface] => pppoe0 [friendlyiface] => wan [name] => WAN_PPPOE [attribute] => system [descr] => Interface WAN_PPPOE Gateway ) [Null4] => Array ( [name] => Null4 [interface] => lo0 [ipprotocol] => inet [gateway] => 127.0.0.1 ) [Null6] => Array ( [name] => Null6 [interface] => lo0 [ipprotocol] => inet6 [gateway] => ::1 ) )
    Jun 10 19:49:56 	php-cgi 		rc.bootup: fixup_default_gateway dfltgwdown: upgw: dfltgwname: , gwdefault:
    Jun 10 19:49:56 	php-cgi 		rc.bootup: gateways_arr:Array ( [WAN_PPPOE] => Array ( [dynamic] => [ipprotocol] => inet [gateway] => [interface] => pppoe0 [friendlyiface] => wan [name] => WAN_PPPOE [attribute] => system [descr] => Interface WAN_PPPOE Gateway ) [Null4] => Array ( [name] => Null4 [interface] => lo0 [ipprotocol] => inet [gateway] => 127.0.0.1 ) [Null6] => Array ( [name] => Null6 [interface] => lo0 [ipprotocol] => inet6 [gateway] => ::1 ) )
    Jun 10 19:49:56 	kernel 		load_dn_aqm dn_aqm PIE loaded
    Jun 10 19:49:56 	kernel 		load_dn_aqm dn_aqm CODEL loaded
    Jun 10 19:49:56 	kernel 		load_dn_sched dn_sched FQ_PIE loaded
    Jun 10 19:49:56 	kernel 		load_dn_sched dn_sched FQ_CODEL loaded
    Jun 10 19:49:56 	kernel 		load_dn_sched dn_sched PRIO loaded
    Jun 10 19:49:56 	kernel 		load_dn_sched dn_sched WF2Q+ loaded
    Jun 10 19:49:56 	kernel 		load_dn_sched dn_sched RR loaded
    Jun 10 19:49:56 	kernel 		load_dn_sched dn_sched QFQ loaded
    Jun 10 19:49:56 	kernel 		load_dn_sched dn_sched FIFO loaded
    Jun 10 19:49:56 	kernel 		DUMMYNET 0 with IPv6 initialized (100409)
    Jun 10 19:49:56 	kernel 		pflog0: promiscuous mode enabled
    Jun 10 19:49:56 	php-cgi 		rc.bootup: fixup_default_gateway dfltgwdown: upgw: dfltgwname:WAN_PPPOE , gwdefault:WAN_PPPOE
    Jun 10 19:49:56 	php-cgi 		rc.bootup: The command '/sbin/route delete -host 80.72.146.2' returned exit code '1', the output was 'route: route has not been found delete host 80.72.146.2 fib 0: not in table'
    Jun 10 19:49:56 	php-cgi 		rc.bootup: The command '/sbin/route delete -host 74.82.42.42' returned exit code '1', the output was 'route: route has not been found delete host 74.82.42.42 fib 0: not in table'
    Jun 10 19:49:56 	php-cgi 		rc.bootup: The command '/sbin/route delete -host 8.8.4.4' returned exit code '1', the output was 'route: route has not been found delete host 8.8.4.4 fib 0: not in table'
    Jun 10 19:49:55 	php-cgi 		rc.bootup: Resyncing OpenVPN instances.
    Jun 10 19:49:53 	syslogd 		sendto: Network is unreachable
    Jun 10 19:49:53 	sshlockout 	9161 	sshlockout/webConfigurator v3.0 starting up
    Jun 10 19:49:53 	syslogd 		sendto: Network is unreachable
    Jun 10 19:49:53 	sshd 	9034 	Server listening on 0.0.0.0 port 22.
    Jun 10 19:49:53 	syslogd 		sendto: Network is unreachable
    Jun 10 19:49:53 	sshd 	9034 	Server listening on :: port 22.
    Jun 10 19:49:52 	syslogd 		sendto: Network is unreachable
    Jun 10 19:49:52 	ppp 		[wan_link0] PPPoE: Connecting to ''
    Jun 10 19:49:52 	syslogd 		sendto: Network is unreachable
    Jun 10 19:49:52 	ppp 		[wan_link0] LCP: LayerStart
    Jun 10 19:49:52 	syslogd 		sendto: Network is unreachable
    Jun 10 19:49:52 	ppp 		[wan_link0] LCP: state change Initial --> Starting
    Jun 10 19:49:52 	syslogd 		sendto: Network is unreachable
    Jun 10 19:49:52 	ppp 		[wan_link0] LCP: Open event
    Jun 10 19:49:52 	syslogd 		sendto: Network is unreachable
    Jun 10 19:49:52 	ppp 		[wan_link0] Link: OPEN event
    Jun 10 19:49:52 	syslogd 		sendto: Network is unreachable
    Jun 10 19:49:52 	kernel 		ng0: changing name to 'pppoe0'
    Jun 10 19:49:52 	syslogd 		sendto: Network is unreachable
    Jun 10 19:49:52 	syslogd 		sendto: Network is unreachable
    Jun 10 19:49:52 	ppp 		[wan] Bundle: Interface ng0 created
    Jun 10 19:49:52 	syslogd 		sendto: Network is unreachable
    Jun 10 19:49:52 	ppp 		web: web is not running
    Jun 10 19:49:52 	syslogd 		sendto: Network is unreachable
    Jun 10 19:49:52 	ppp 		process 7276 started, version 5.8 (nobody@pfSense_master_amd64-pfSense_devel-job-02 16:31 21-Oct-2017)
    Jun 10 19:49:52 	syslogd 		sendto: Network is unreachable
    Jun 10 19:49:52 	syslogd 		sendto: Network is unreachable
    Jun 10 19:49:52 	ppp 		Multi-link PPP daemon for FreeBSD
    Jun 10 19:49:52 	syslogd 		sendto: Network is unreachable  
    


  • Yep, those are the 2 lines right there:

    Jun 10 19:50:00 	php-fpm 	334 	/rc.newwanip: rc.newwanip: on (IP address: 84.52.**.200) (interface: WAN[wan]) (real interface: pppoe0).
    Jun 10 19:50:00 	php-fpm 	334 	/rc.newwanip: rc.newwanip: Info: starting on pppoe0.
    

    So indeed, in your case that script is being aborted before it's done running.
    It seems this 'exit' is supposed to avoid a race condition, while it's actually causing one:
    https://github.com/pfsense/pfsense/blob/d84eec807d6216cfbc073438ce57e01f1c52a2f4/src/etc/rc.newwanip#L192

    @jimp any insights on what would be the right order to call things?

    • initialize PPPOE-client while booting
    • initialize routes and packages and rules and stuff after PPPOE gets connected
    • finish rc.newwanip to configure routes and other things that depend on the newly assigned WAN IP
    • finish booting..

    I think it was previously working because every call to return_gateway_groups_array() would try to fix the gateways again, and eventually one attempt did catch. Now return_gateway_groups_array() skips these actions most of the time, as fixing the gateway should not happen as a side effect of requesting information. But that does seem to cause this problem.



  • Perhaps the fix could be as simple as removing that 'exit' from rc.newwanip? (In addition to the PR already made.)



  • @piba
    I've deleted this line, and it really does fix the problem with the connection on boot.



  • @w0w
    I've discussed this with jimp, and he didn't like removing the exit completely, as that could bring the old race condition back. We agreed on a specific 'ppp check' to avoid the exit. Can you put the exit back in but apply this extra check instead? That should still fix the issue. Does it?

    https://github.com/pfsense/pfsense/pull/3947/commits/627b0941889f4b19ad419ddfe01d329bef2c2bd6



  • @piba
    It looks like with BOTH fixes applied (gwlb.inc and rc.newwanip) I now have an internet connection. One thing still needs to be fixed: after upgrading from the previous version and applying those patches, I also need to select the default gateway in the drop-down list on the system_gateways.php page.
    The same thing happens if I restore an old configuration on the patched system: I need to select the gateway manually on system_gateways.php via the GUI. Is it a feature or a bug? ☺



  • Yes, the first link https://github.com/pfsense/pfsense/pull/3947 contains 3 fixes/commits: the 2 you refer to, and one that fixes the display of 'IPv6' for an IPv4 gateway.

    The GUI 'none' selection is still something to fix. Let's call that an 'unwanted feature' ;). I'll probably work on that soon.



  • @piba
    Thank you, let me know if you need further testing.



  • @w0w
    Thank you for testing and reporting back each time :).

    Can you try the latest snapshot? It's built with these (slightly modified) fixes included.

    P.S. Not the 'none' selection yet though, still working on that.



  • @piba
    I have tried this one:

    2.4.4-DEVELOPMENT (amd64)
    built on Sat Jun 23 01:57:55 EDT 2018
    FreeBSD 11.2-RC3
    

    I've found that the internet does not work after a reboot with this snapshot, even though both fixes are present.
    I think the cause is the modified line, which previously looked like

    if (platform_booting() && strpos($interface_real, "ppp") !== 0) {
    

    before,
    and now it's

    if (platform_booting() && in_array(substr($interface_real, 0, 3), array("ppp", "ppt", "l2t"))) {
    

    because if I change it back, it works again.



  • @w0w
    Ah crap, I forgot to do the inverse, '!'.

    if (platform_booting() && !in_array(substr($interface_real, 0, 3), array("ppp", "ppt", "l2t"))) {
    

    sorry, thanks for testing!

    new PR made: https://github.com/pfsense/pfsense/pull/3956
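
    For anyone following along, the difference between the three versions of the condition can be checked side by side. This is a sketch with hypothetical wrapper functions: `$booting` stands in for `platform_booting()`, and it assumes the `if` guards the early exit in rc.newwanip discussed above, so a result of true means the script stops before configuring the gateway.

    ```php
    <?php
    // Hypothetical wrappers around the three conditions quoted in this thread.
    // A return value of true means rc.newwanip would take the early exit.
    function exits_early_original(bool $booting, string $interface_real): bool {
        return $booting && strpos($interface_real, "ppp") !== 0;
    }
    function exits_early_snapshot(bool $booting, string $interface_real): bool {
        return $booting && in_array(substr($interface_real, 0, 3), array("ppp", "ppt", "l2t"));
    }
    function exits_early_fixed(bool $booting, string $interface_real): bool {
        return $booting && !in_array(substr($interface_real, 0, 3), array("ppp", "ppt", "l2t"));
    }

    // Compare a PPPoE interface with a plain Ethernet one while booting.
    foreach (["pppoe0", "igb1"] as $if) {
        printf("%-7s original=%d snapshot=%d fixed=%d\n", $if,
            exits_early_original(true, $if),
            exits_early_snapshot(true, $if),
            exits_early_fixed(true, $if));
    }
    ```

    With the missing '!', the snapshot version exits early for pppoe0 during boot and lets everything else through, which is the opposite of what the original and the fixed check do.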



  • @piba This one is much better 👍 😀
    It works!



  • As expected, latest snapshot containing this PR boots normally.



  • @w0w
    Thanks, added your confirmation to the Redmine ticket so it can be closed.



  • @piba
    Please let me know when you need the GUI 'none' selection fix tested ☺



  • @w0w
    The commit was merged but not built yet. The next snapshot should contain a new 'automatic' option, which will be the default for the gateway selection, and also an option to move gateways up/down in the list with checkboxes and an anchor click. The 'none' option should now also actually be none. I guess all of those things can use a little testing once it gets built 😉



  • @piba
    Here is my test result. If I select the "Automatic" option in the GUI, everything looks OK: I have internet and a default gateway. But when I reboot the firewall, on the next boot there is no internet and no default gateway anymore. Sounds familiar? 😉
    This is easy to fix: just press "Save" on system_gateways.php and then "Apply changes". It immediately starts working and I see this line in the log:

    /system_gateways.php: Default gateway setting Interface WAN_PPPOE Gateway as default. 
    

    There is no such or similar line during boot, so automatic selection does not trigger for some reason.
    How can I help to debug this?



  • @w0w
    Thanks again for testing. We seem to have hit a catch-22 here.

    'Automatic' selection takes the gateway 'status' into account. That status depends on dpinger, which depends on routes to the monitor targets being configured beforehand, so that traffic to each monitor target uses the correct gateway. But configuring routes also configures the default route, which in turn depends on dpinger already running.

    Hmm... This needs a bit more thought..



  • @piba
    Maybe these questions sound silly, but I'll ask anyway 🙄
    If we have a bunch of gateways, e.g. multiple WANs, what is this "automatic" selection generally supposed to do?
    When only a single gateway is found on a system, we can skip the status check and just select it as the default, no?
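
    Roughly what I mean, as a sketch only. `pick_automatic_default()` is a made-up function, not pfSense code, the 'online' status string is an assumed value, and the array shape just copies a few fields from the gateways_arr log output earlier in the thread:

    ```php
    <?php
    // Hypothetical sketch of the single-gateway shortcut; this is not pfSense code.
    // $gateways_arr maps gateway names to settings, $gateways_status maps names
    // to a monitor status string ("online" is an assumed value for illustration).
    function pick_automatic_default(array $gateways_arr, array $gateways_status, string $ipprotocol): ?string {
        // Keep only real gateways of the requested protocol (skip the lo0 placeholders).
        $candidates = [];
        foreach ($gateways_arr as $name => $gw) {
            if ($gw['ipprotocol'] === $ipprotocol && $gw['interface'] !== 'lo0') {
                $candidates[] = $name;
            }
        }
        // Single candidate: no status needed, so no dependency on the monitor daemon.
        if (count($candidates) === 1) {
            return $candidates[0];
        }
        // Several candidates: take the first one reported as up.
        foreach ($candidates as $name) {
            if (($gateways_status[$name] ?? '') === 'online') {
                return $name;
            }
        }
        return null;
    }

    // Fields copied from the boot-time gateways_arr log output; the status array is empty.
    $gateways_arr = [
        'WAN_PPPOE' => ['ipprotocol' => 'inet',  'interface' => 'pppoe0'],
        'Null4'     => ['ipprotocol' => 'inet',  'interface' => 'lo0'],
        'Null6'     => ['ipprotocol' => 'inet6', 'interface' => 'lo0'],
    ];
    var_dump(pick_automatic_default($gateways_arr, [], 'inet'));
    ```

    With several WANs the status would still be needed, so this would only sidestep the boot-time problem for single-gateway setups like mine.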

