PPPoE stopped working. I blame the ISP. How do I prove it?



  • I've run some sort of firewall device for 16 years, of various sorts as the WAN technology changed: Freesco, then m0n0wall, then pfSense.  I migrated to pfSense when I needed UPnP and m0n0wall didn't have it.

    For the last two years it's been pfSense in a VM on 2012 R2 Hyper-V.
    It has consistently been updated within a day or two of each release.
    pfSense has always worked for me.

    Prior to the VM it was on various machines.

    On November 30 at 11:38 pm local time, it stopped passing PPPoE traffic.  I was asleep at the time.  pfSense wasn't updated, and neither was my Hyper-V host.  Neither restarted.  Nothing I've done since has brought the pfSense WAN link back up.

    My PPPoE logs show

    Dec 3 16:43:58 ppp Multi-link PPP daemon for FreeBSD
    Dec 3 16:43:58 ppp process 6347 started, version 5.8 (nobody@pfSense_v2_4_1_amd64-pfSense_v2_4_1-job-01 16:32 21-Oct-2017)
    Dec 3 16:43:58 ppp web: web is not running
    Dec 3 16:43:58 ppp [wan] Bundle: Interface ng0 created
    Dec 3 16:43:58 ppp [wan_link0] Link: OPEN event
    Dec 3 16:43:58 kernel ng0: changing name to 'pppoe1'
    Dec 3 16:43:58 ppp [wan_link0] LCP: Open event
    Dec 3 16:43:58 ppp [wan_link0] LCP: state change Initial --> Starting
    Dec 3 16:43:58 ppp [wan_link0] LCP: LayerStart
    Dec 3 16:43:58 ppp [wan_link0] PPPoE: Connecting to ''
    Dec 3 16:43:58 ppp PPPoE: rec'd ACNAME "bras1.mel10"
    Dec 3 16:43:58 ppp [wan_link0] PPPoE: connection successful
    Dec 3 16:43:58 ppp [wan_link0] Link: UP event
    Dec 3 16:43:58 ppp [wan_link0] LCP: Up event
    Dec 3 16:43:58 ppp [wan_link0] LCP: state change Starting --> Req-Sent
    Dec 3 16:43:58 ppp [wan_link0] LCP: SendConfigReq #1
    Dec 3 16:43:58 ppp [wan_link0] PROTOCOMP
    Dec 3 16:43:58 ppp [wan_link0] MRU 1492
    Dec 3 16:43:58 ppp [wan_link0] MAGICNUM 0xd0e8b707
    Dec 3 16:44:00 ppp [wan_link0] LCP: SendConfigReq #2
    Dec 3 16:44:00 ppp [wan_link0] PROTOCOMP
    Dec 3 16:44:00 ppp [wan_link0] MRU 1492
    Dec 3 16:44:00 ppp [wan_link0] MAGICNUM 0xd0e8b707

    and that is it.
    It just keeps cycling the last 4 lines.

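    Clearly there is some traffic in both directions: PPPoE discovery gets an ACNAME back and reports success.  It's the LCP Configure-Requests that never seem to get an answer.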
    I've tried disabling protocomp, because it doesn't seem like it's required.
    I also set the MRU & MTU lower. 
    Because MAGICNUM is the last thing listed I tried disabling that too.
    Nothing helped.

    I've set up pfSense 2.4.2 in a Gen 2 VM (different virtual NICs).
    I've set up pfSense 2.4 in a Gen 1 VM (reverting to a slightly earlier version of pfSense).
    I grabbed (shhhh, dirty word) OPNsense and put that in a Gen 1 VM (different OS).

    I then grabbed a Dell T620 with Intel NICs and rebuilt 2.4.2 on the metal.

    Everything gave the same logs.  And nothing worked.

    Now, here is the HUGE problem.  This is what will get the ISP's back up, and why I'm here first and not with the ISP.

    Windows 10 and Windows 2012 R2 PPPoE connections work.  I created one on the Hyper-V host, so the same physical hardware as the virtual pfSense.  It also works on a crappy Lenovo laptop running Windows 10 with a Realtek NIC.

    I realize this is asking a lot, but does anyone have any idea what Windows does differently at this point from what pfSense (and I guess FreeBSD, and probably all the BSDs) does?

    TL;DR
    PFSense bad
    Windows good

    ISP will say "Windows good, go away, it's not us".

    I'm a Windows admin guy.  I expect BSD stuff to just work.  My first, second and third thoughts were all "Windows dropped it".  Well, it isn't Windows.  And I really don't think it's PFSense.

    Any suggestions, thoughts, comments or considerations will be greatly appreciated.

    One specific question: is there any way to get more verbose logging out of pfSense for PPPoE, LCP, etc.?
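
    The log excerpt in this post can be summarized mechanically, which may help when presenting it to the ISP.  The following is only a rough sketch in Python: the function name `summarize_lcp` is made up for illustration, the sample text is trimmed from the log above, and it assumes that replies from the ISP would be logged with an "LCP: rec'd" prefix (by analogy with the "PPPoE: rec'd ACNAME" line), which is not confirmed by anything in the thread.

```python
import re

# Sample trimmed from the log quoted in the post above.
SAMPLE_LOG = """\
Dec 3 16:43:58 ppp [wan_link0] PPPoE: connection successful
Dec 3 16:43:58 ppp [wan_link0] LCP: state change Starting --> Req-Sent
Dec 3 16:43:58 ppp [wan_link0] LCP: SendConfigReq #1
Dec 3 16:43:58 ppp [wan_link0] MRU 1492
Dec 3 16:44:00 ppp [wan_link0] LCP: SendConfigReq #2
Dec 3 16:44:00 ppp [wan_link0] MRU 1492
"""

# Outgoing LCP Configure-Requests, as they appear in the log above.
SENT_RE = re.compile(r"LCP: SendConfigReq #(\d+)")
# ASSUMPTION: anything received from the peer at the LCP layer would be
# logged with an "LCP: rec'd" prefix. Adjust if your logs differ.
RECV_RE = re.compile(r"LCP: rec'd")


def summarize_lcp(log_text):
    """Count LCP requests sent versus LCP messages received in a log."""
    sent_ids = [int(m.group(1)) for m in SENT_RE.finditer(log_text)]
    received = len(RECV_RE.findall(log_text))
    discovery_ok = "PPPoE: connection successful" in log_text
    return {
        "discovery_ok": discovery_ok,
        "config_requests_sent": len(sent_ids),
        "lcp_messages_received": received,
        # Discovery worked, requests went out, nothing came back.
        "lcp_unanswered": discovery_ok and bool(sent_ids) and received == 0,
    }


if __name__ == "__main__":
    print(summarize_lcp(SAMPLE_LOG))
```

    If `lcp_unanswered` comes out true over a long log, that supports the reading that PPPoE discovery succeeds but the LCP negotiation gets no reply.  It does not by itself say which side is dropping the frames.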



  • Do you have another firewall, from D-Link, Linksys, etc. available?  If so, connect it to the line and see what happens.  You might also try running packet capture on the interface to see what's happening.
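
    If a packet capture is taken, the frames of interest can be told apart by EtherType: 0x8863 is PPPoE discovery and 0x8864 is PPPoE session traffic, and inside a session frame PPP protocol 0xC021 is LCP.  Below is a small sketch in Python for labelling raw Ethernet frames.  The name `classify_frame` is hypothetical, it assumes untagged frames (no 802.1Q VLAN header), and it expects the raw frame bytes to have already been pulled out of a capture file by some other tool.

```python
import struct

ETH_PPPOE_DISCOVERY = 0x8863  # PADI / PADO / PADR / PADS / PADT
ETH_PPPOE_SESSION = 0x8864    # PPP frames carried inside a PPPoE session
PPP_LCP = 0xC021              # PPP Link Control Protocol

PPPOE_CODES = {0x09: "PADI", 0x07: "PADO", 0x19: "PADR",
               0x65: "PADS", 0xA7: "PADT"}
LCP_CODES = {1: "Configure-Request", 2: "Configure-Ack",
             3: "Configure-Nak", 4: "Configure-Reject",
             5: "Terminate-Request", 6: "Terminate-Ack",
             9: "Echo-Request", 10: "Echo-Reply"}


def classify_frame(frame: bytes) -> str:
    """Label one untagged Ethernet frame (assumes no VLAN tag)."""
    # 14 bytes of Ethernet header + 6 bytes of PPPoE header.
    if len(frame) < 20:
        return "too short"
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype == ETH_PPPOE_DISCOVERY:
        code = frame[15]  # PPPoE code field follows the ver/type byte
        return "PPPoE discovery " + PPPOE_CODES.get(code, f"code 0x{code:02x}")
    if ethertype == ETH_PPPOE_SESSION:
        if len(frame) < 22:
            return "PPPoE session (truncated)"
        (ppp_proto,) = struct.unpack("!H", frame[20:22])
        if ppp_proto == PPP_LCP and len(frame) >= 24:
            code, ident = frame[22], frame[23]
            name = LCP_CODES.get(code, f"code {code}")
            return f"LCP {name} id={ident}"
        return f"PPP protocol 0x{ppp_proto:04x}"
    return "not PPPoE"
```

    In a capture of the failing link, one would expect to see the outgoing LCP Configure-Request frames; whether any LCP frames come back from the ISP side is the interesting part.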



  • Which version are you running?
    Remember that at least 2.4.1 had "problems" with PPPoE on VLANs, which aren't uncommon in a virtualized environment.



  • 2.4.2
    But I also used 2.4.1 & 2.4.0 with no problems.

    I also tried using previous versions of 2.4.0

    I didn't try earlier than that.
    I don't have any VLANs and never have had.



  • Trying a packet capture got me nothing.

    Untangle works and shares traffic as required.

    So, for the moment I've shut down pfSense and will run Untangle.  I'll revisit after an update to pfSense.

    End result
    Windows 2012 R2 PPPoE, Windows 10 & Untangle all work.
    PFSense isn't functional for me, at this point in time.



  • Did you try a fresh pfSense install without restoring your config? Just install it vanilla and configure manually only the PPPoE interface, forget your config for now.



  • @robi:

    Did you try a fresh pfSense install without restoring your config? Just install it vanilla and configure manually only the PPPoE interface, forget your config for now.

    Yes, repeatedly.
    I also used a new install of OPNsense.
    I even went so far as to install pfSense 2.4.2 fresh on different hardware.

    I never actually restored an existing configuration in my testing.  I just created new PPPoE connections, because it's not hard.



  • Tried pfSense and VMWare ?



  • @fredfox_uk:

    Tried pfSense and VMWare ?

    No.
    I have no experience with VMware.

    And it isn't the virtualization in any case: the same failure shows up on bare metal.

