PPPoE stops working on ESXi



  • I'm running pfSense on ESXi. My modem is bridged with the vSwitch that pfSense is on. Most of the time everything works fine but if I reboot the ESXi host then my PPPoE no longer connects. It will refuse to connect for several hours. No amount of rebooting can get it working again. The log shows the following:

    May 30 11:27:24 	ppp: [wan_link0] Link: reconnection attempt 97
    May 30 11:27:24 	ppp: [wan_link0] PPPoE: Connecting to ''
    May 30 11:27:33 	ppp: [wan_link0] PPPoE connection timeout after 9 seconds
    May 30 11:27:33 	ppp: [wan_link0] Link: DOWN event
    May 30 11:27:33 	ppp: [wan_link0] LCP: Down event
    May 30 11:27:33 	ppp: [wan_link0] Link: reconnection attempt 98 in 2 seconds
    May 30 11:27:35 	ppp: [wan_link0] Link: reconnection attempt 98
    May 30 11:27:35 	ppp: [wan_link0] PPPoE: Connecting to ''
    May 30 11:27:44 	ppp: [wan_link0] PPPoE connection timeout after 9 seconds
    May 30 11:27:44 	ppp: [wan_link0] Link: DOWN event
    May 30 11:27:44 	ppp: [wan_link0] LCP: Down event
    May 30 11:27:44 	ppp: [wan_link0] Link: reconnection attempt 99 in 4 seconds
    

    During this time I am able to successfully connect from my desktop PC by creating a PPPoE connection in Windows, so it's not a problem with the ISP/modem. I am also unable to connect from other Windows VMs running on the same ESXi host (they just say Error 651), so it looks like some kind of problem with ESXi not passing through PPPoE properly.

    Has anyone experienced this and found a solution?

    EDIT:
    I used Wireshark to compare the PPPoED packets from ESXi and my (working) desktop PC. The PADI frames from ESXi (both pfSense and Windows VMs dialing PPPoE) are 60 bytes long, while the one from the desktop is 40 bytes. The PPPoE Payload Length field is 20 in both, but the ESXi frames have 20 extra zero bytes padded at the end of the frame.

    No.     Time        Source                Destination           Protocol Length Info
          1 0.000000    Vmware_a3:aa:13       Broadcast             PPPoED   60     Active Discovery Initiation (PADI)
    
    Frame 1: 60 bytes on wire (480 bits), 60 bytes captured (480 bits)
        WTAP_ENCAP: 1
        Arrival Time: May 30, 2013 13:43:51.213474000 India Standard Time
        Epoch Time: 1369901631.213474000 seconds
        Frame Number: 1
        Frame Length: 60 bytes (480 bits)
        Capture Length: 60 bytes (480 bits)
        [Frame is marked: True]
        [Frame is ignored: False]
        [Protocols in frame: eth:pppoed]
        [Coloring Rule Name: Broadcast]
        [Coloring Rule String: eth[0] & 1]
    Ethernet II, Src: Vmware_a3:aa:13 (00:0c:29:a3:aa:13), Dst: Broadcast (ff:ff:ff:ff:ff:ff)
        Destination: Broadcast (ff:ff:ff:ff:ff:ff)
        Source: Vmware_a3:aa:13 (00:0c:29:a3:aa:13)
        Type: PPPoE Discovery (0x8863)
    PPP-over-Ethernet Discovery
        0001 .... = Version: 1
        .... 0001 = Type: 1
        Code: Active Discovery Initiation (PADI) (0x09)
        Session ID: 0x0000
        Payload Length: 20
        PPPoE Tags
            Host-Uniq: 020000000000000002000000
    
    0000  ff ff ff ff ff ff 00 0c 29 a3 aa 13 88 63 11 09   ........)....c..
    0010  00 00 00 14 01 01 00 00 01 03 00 0c 02 00 00 00   ................
    0020  00 00 00 00 02 00 00 00 00 00 00 00 00 00 00 00   ................
    0030  00 00 00 00 00 00 00 00 00 00 00 00               ............
    
    No.     Time        Source                Destination           Protocol Length Info
          2 8.445874    AsustekC_73:36:98     Broadcast             PPPoED   40     Active Discovery Initiation (PADI)
    
    Frame 2: 40 bytes on wire (320 bits), 40 bytes captured (320 bits)
        WTAP_ENCAP: 1
        Arrival Time: May 30, 2013 13:43:59.659348000 India Standard Time
        Epoch Time: 1369901639.659348000 seconds
        Frame Number: 2
        Frame Length: 40 bytes (320 bits)
        Capture Length: 40 bytes (320 bits)
        [Frame is marked: True]
        [Frame is ignored: False]
        [Protocols in frame: eth:pppoed]
        [Coloring Rule Name: Broadcast]
        [Coloring Rule String: eth[0] & 1]
    Ethernet II, Src: AsustekC_73:36:98 (00:1f:c6:73:36:98), Dst: Broadcast (ff:ff:ff:ff:ff:ff)
        Destination: Broadcast (ff:ff:ff:ff:ff:ff)
        Source: AsustekC_73:36:98 (00:1f:c6:73:36:98)
        Type: PPPoE Discovery (0x8863)
    PPP-over-Ethernet Discovery
        0001 .... = Version: 1
        .... 0001 = Type: 1
        Code: Active Discovery Initiation (PADI) (0x09)
        Session ID: 0x0000
        Payload Length: 20
        PPPoE Tags
            Host-Uniq: 11000000000000001d000000
    
    0000  ff ff ff ff ff ff 00 1f c6 73 36 98 88 63 11 09   .........s6..c..
    0010  00 00 00 14 01 01 00 00 01 03 00 0c 11 00 00 00   ................
    0020  00 00 00 00 1d 00 00 00                           ........
    
    [PADI.pcap.txt](/public/_imported_attachments_/1/PADI.pcap.txt)
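
The comparison of the two PADI frames can be checked with a short script. This is only a sketch: the function name `padi_padding` is made up for illustration, and the hex strings are copied by hand from the two hex dumps in the capture. It reads the PPPoE Payload Length field and counts how many bytes follow the declared end of the PPPoE payload.

```python
import struct

ETH_HEADER_LEN = 14    # dst MAC (6) + src MAC (6) + EtherType (2)
PPPOE_HEADER_LEN = 6   # ver/type (1) + code (1) + session ID (2) + length (2)
ETHERTYPE_PPPOE_DISCOVERY = 0x8863

# Hex copied from the two frames in the capture (hypothetical variable names).
ESXI_PADI = (
    "ff ff ff ff ff ff 00 0c 29 a3 aa 13 88 63 11 09 "
    "00 00 00 14 01 01 00 00 01 03 00 0c 02 00 00 00 "
    "00 00 00 00 02 00 00 00 00 00 00 00 00 00 00 00 "
    "00 00 00 00 00 00 00 00 00 00 00 00"
)
DESKTOP_PADI = (
    "ff ff ff ff ff ff 00 1f c6 73 36 98 88 63 11 09 "
    "00 00 00 14 01 01 00 00 01 03 00 0c 11 00 00 00 "
    "00 00 00 00 1d 00 00 00"
)


def padi_padding(frame_hex):
    """Return frame length, declared PPPoE payload length and trailer info."""
    frame = bytes.fromhex(frame_hex)
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETHERTYPE_PPPOE_DISCOVERY:
        raise ValueError("not a PPPoE Discovery frame")
    # PPPoE header: version/type, code, session ID, payload length (big-endian).
    _ver_type, code, _session_id, payload_len = struct.unpack_from(
        "!BBHH", frame, ETH_HEADER_LEN
    )
    declared_end = ETH_HEADER_LEN + PPPOE_HEADER_LEN + payload_len
    trailer = frame[declared_end:]
    return {
        "frame_len": len(frame),
        "code": code,
        "payload_len": payload_len,
        "trailer_len": len(trailer),
        "trailer_all_zero": all(b == 0 for b in trailer),
    }


if __name__ == "__main__":
    for name, frame_hex in (("ESXi", ESXI_PADI), ("desktop", DESKTOP_PADI)):
        print(name, padi_padding(frame_hex))
```

Both frames declare the same 20-byte payload; the only difference is the 20-byte all-zero trailer on the ESXi frame. Since 60 bytes is the minimum Ethernet frame size without the FCS, that trailer may just be ordinary padding, and the desktop frame may show 40 bytes only because it was captured on the sending machine before the NIC padded it.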
    


  • It has now started working on pfSense again but when I disconnect pfSense my desktop PC can no longer connect! Is it possible that my ISP is doing some kind of temporary MAC binding, like once a PPPoE connection is established from a MAC address, no other MAC address can connect for a specified amount of time or until the first one is gracefully terminated? The problem mostly occurs after a non-graceful disconnect so I suspect the MAC lock is not released.

