Netgate Discussion Forum

    Not getting a DHCP WAN IP Address on netgate hardware.

      Austin 0 @Austin 0

      @Austin-0 Looks like I spoke too soon. It worked for 5-10 minutes or so and then I got 100% packet loss according to the gateway monitor. I rebooted, and the same thing happened: it worked for 5-10 minutes and then dropped all of the packets. Below are the logs from after the reboot. As you can see, it came back up from the reboot at 16:38, and dpinger sent the alarm at 16:45.

      System Logs 2023-09-10.png

        stephenw10 Netgate Administrator

        For the 4100 specifically?

          Austin 0 @stephenw10

          @stephenw10 Yes

            stephenw10 Netgate Administrator @Austin 0

            @Austin-0 said in Not getting a DHCP WAN IP Address on netgate hardware.:

            Looks like I spoke too soon

            Did the switch lose link? pfSense only shows the pings to 1.1.1.1 started to fail.

              Austin 0 @stephenw10

              @stephenw10 The switch did not lose the link. 1.1.1.1 is what I have gateway monitoring set to. However, I can confirm that all internet access was lost at that time, not just access to 1.1.1.1.

                stephenw10 Netgate Administrator

                Even to the actual gateway? Is it still in the ARP table?

                  Austin 0 @stephenw10

                  @stephenw10 I ran a tracert from one of the computers at the time of the failure and it only got as far as the pfSense box, so yes, even the connection to the gateway was down. As for the ARP table, I have unfortunately left the building and won't be back until at least Friday. I will test again ASAP and look at the ARP table this time.

                    stephenw10 Netgate Administrator

                    The ISP gateway may not appear in a traceroute. If it showed up when you ran one previously then, of course, it still should.

                      Austin 0 @stephenw10

                      @stephenw10 Okay, I have confirmed that the ISP gateway does normally appear in tracerts. Also, the ISP gateway stays in the ARP table.

                        stephenw10 Netgate Administrator

                        Hmm, hard to say then. pfSense still has an IP on the WAN I assume? But it cannot ping the WAN gateway even though it's in the ARP table?

                        Do you see the pings in a pcap on WAN?

                        "Works for a few minutes, then stops" sure seems like it could be an ARP issue.

                          Austin 0 @stephenw10

                          @stephenw10 Here is the packet capture. I have replaced the public IP with xxx.xxx.xxx.xxx. You can see the ICMP echo requests that dpinger is making to 1.1.1.1. I noticed a lot of these are getting flagged for a bad checksum, but I am not quite sure what to do about that.

                          10:45:51.905405 90:ec:77:34:73:8e > 78:ba:f9:30:82:33, ethertype IPv4 (0x0800), length 77: (tos 0x0, ttl 127, id 38590, offset 0, flags [none], proto UDP (17), length 63, bad cksum 0 (->861d)!)
                              xxx.xxx.xxx.xxx.35343 > 1.1.1.1.53: [udp sum ok] 49570+ A? forum.netgate.com. (35)
                          10:45:51.905451 90:ec:77:34:73:8e > 78:ba:f9:30:82:33, ethertype IPv4 (0x0800), length 77: (tos 0x0, ttl 127, id 38591, offset 0, flags [none], proto UDP (17), length 63, bad cksum 0 (->861c)!)
                              xxx.xxx.xxx.xxx.42560 > 1.1.1.1.53: [udp sum ok] 51250+ Type65? forum.netgate.com. (35)
                          10:45:51.921752 90:ec:77:34:73:8e > 78:ba:f9:30:82:33, ethertype IPv4 (0x0800), length 83: (tos 0x0, ttl 127, id 38592, offset 0, flags [none], proto UDP (17), length 69, bad cksum 0 (->8615)!)
                              xxx.xxx.xxx.xxx.60130 > 1.1.1.1.53: [udp sum ok] 19119+ A? signaler-pa.youtube.com. (41)
                          10:45:51.921848 90:ec:77:34:73:8e > 78:ba:f9:30:82:33, ethertype IPv4 (0x0800), length 83: (tos 0x0, ttl 127, id 38593, offset 0, flags [none], proto UDP (17), length 69, bad cksum 0 (->8614)!)
                              xxx.xxx.xxx.xxx.19205 > 1.1.1.1.53: [udp sum ok] 21591+ Type65? signaler-pa.youtube.com. (41)
                          10:45:51.934131 90:ec:77:34:73:8e > 78:ba:f9:30:82:33, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 127, id 21954, offset 0, flags [DF], proto TCP (6), length 40, bad cksum 0 (->c8e1)!)
                              xxx.xxx.xxx.xxx.8358 > 52.226.139.121.443: Flags [R.], cksum 0xb229 (correct), seq 3206783296, ack 1699559528, win 0, length 0
                          10:45:52.200962 60:22:32:46:45:0d > 01:80:c2:00:00:00, 802.3, length 39: LLC, dsap STP (0x42) Individual, ssap STP (0x42) Command, ctrl 0x03: STP 802.1w, Rapid STP, Flags [Learn, Forward, Agreement], bridge-id 8000.60:22:32:46:45:0c.8010, length 43
                          	message-age 0.00s, max-age 20.00s, hello-time 2.00s, forwarding-delay 15.00s
                          	root-id 8000.60:22:32:46:45:0c, root-pathcost 0, port-role Designated
                          10:45:52.216361 90:ec:77:34:73:8e > 78:ba:f9:30:82:33, ethertype IPv4 (0x0800), length 43: (tos 0x0, ttl 64, id 63816, offset 0, flags [none], proto ICMP (1), length 29, bad cksum 0 (->62c5)!)
                              xxx.xxx.xxx.xxx > 1.1.1.1: ICMP echo request, id 47797, seq 1577, length 9
                          10:45:52.240408 90:ec:77:34:73:8e > 78:ba:f9:30:82:33, ethertype IPv4 (0x0800), length 86: (tos 0x0, ttl 64, id 15133, offset 0, flags [none], proto UDP (17), length 72, bad cksum 0 (->12a8)!)
                              xxx.xxx.xxx.xxx.6424 > 8.8.8.8.53: [bad udp cksum 0x2d26 -> 0xc857!] 13895+ PTR? 8.179.243.104.in-addr.arpa. (44)
                          10:45:52.297247 90:ec:77:34:73:8e > 78:ba:f9:30:82:33, ethertype IPv4 (0x0800), length 55: (tos 0x0, ttl 127, id 13743, offset 0, flags [DF], proto TCP (6), length 41, bad cksum 0 (->d40b)!)
                              xxx.xxx.xxx.xxx.50716 > 172.64.41.3.443: Flags [.], cksum 0x25e3 (correct), seq 110458962:110458963, ack 572074752, win 1028, length 1
                          10:45:52.720360 90:ec:77:34:73:8e > 78:ba:f9:30:82:33, ethertype IPv4 (0x0800), length 43: (tos 0x0, ttl 64, id 23499, offset 0, flags [none], proto ICMP (1), length 29, bad cksum 0 (->43)!)
                              xxx.xxx.xxx.xxx > 1.1.1.1: ICMP echo request, id 47797, seq 1578, length 9
                          10:45:52.835580 90:ec:77:34:73:8e > 78:ba:f9:30:82:33, ethertype IPv4 (0x0800), length 1514: (tos 0x0, ttl 63, id 57232, offset 0, flags [DF], proto TCP (6), length 1500, bad cksum 0 (->4b70)!)
                              xxx.xxx.xxx.xxx.25868 > 3.95.234.235.30011: Flags [.], cksum 0x1094 (correct), seq 707216205:707217653, ack 148236916, win 166, options [nop,nop,TS val 35204921 ecr 94619178], length 1448
                          10:45:52.835654 90:ec:77:34:73:8e > 78:ba:f9:30:82:33, ethertype IPv4 (0x0800), length 1514: (tos 0x0, ttl 63, id 57233, offset 0, flags [DF], proto TCP (6), length 1500, bad cksum 0 (->4b6f)!)
                              xxx.xxx.xxx.xxx.25868 > 3.95.234.235.30011: Flags [.], cksum 0x7d8e (correct), seq 1448:2896, ack 1, win 166, options [nop,nop,TS val 35204921 ecr 94619178], length 1448
                          10:45:52.835665 90:ec:77:34:73:8e > 78:ba:f9:30:82:33, ethertype IPv4 (0x0800), length 1514: (tos 0x0, ttl 63, id 57234, offset 0, flags [DF], proto TCP (6), length 1500, bad cksum 0 (->4b6e)!)
                              xxx.xxx.xxx.xxx.25868 > 3.95.234.235.30011: Flags [.], cksum 0x77e6 (correct), seq 2896:4344, ack 1, win 166, options [nop,nop,TS val 35204921 ecr 94619178], length 1448
                          10:45:52.835779 90:ec:77:34:73:8e > 78:ba:f9:30:82:33, ethertype IPv4 (0x0800), length 1461: (tos 0x0, ttl 63, id 57235, offset 0, flags [DF], proto TCP (6), length 1447, bad cksum 0 (->4ba2)!)
                              xxx.xxx.xxx.xxx.25868 > 3.95.234.235.30011: Flags [P.], cksum 0x5acb (correct), seq 4344:5739, ack 1, win 166, options [nop,nop,TS val 35204921 ecr 94619178], length 1395
                          10:45:52.871216 90:ec:77:34:73:8e > 78:ba:f9:30:82:33, ethertype IPv4 (0x0800), length 70: (tos 0x0, ttl 127, id 47650, offset 0, flags [none], proto UDP (17), length 56, bad cksum 0 (->54b2)!)
                              xxx.xxx.xxx.xxx.7567 > 8.8.8.8.53: [udp sum ok] 57083+ A? dns.google. (28)
                          10:45:52.871224 90:ec:77:34:73:8e > 78:ba:f9:30:82:33, ethertype IPv4 (0x0800), length 70: (tos 0x0, ttl 127, id 38594, offset 0, flags [none], proto UDP (17), length 56, bad cksum 0 (->8620)!)
                              xxx.xxx.xxx.xxx.40601 > 1.1.1.1.53: [udp sum ok] 57083+ A? dns.google. (28)
                          10:45:52.918662 90:ec:77:34:73:8e > 78:ba:f9:30:82:33, ethertype IPv4 (0x0800), length 70: (tos 0x0, ttl 127, id 47651, offset 0, flags [none], proto UDP (17), length 56, bad cksum 0 (->54b1)!)
                              xxx.xxx.xxx.xxx.31362 > 8.8.8.8.53: [udp sum ok] 54725+ A? dns.google. (28)
                          10:45:52.918707 90:ec:77:34:73:8e > 78:ba:f9:30:82:33, ethertype IPv4 (0x0800), length 70: (tos 0x0, ttl 127, id 47652, offset 0, flags [none], proto UDP (17), length 56, bad cksum 0 (->54b0)!)
                              xxx.xxx.xxx.xxx.8219 > 8.8.8.8.53: [udp sum ok] 16179+ Type65? dns.google. (28)
                          10:45:52.919544 90:ec:77:34:73:8e > 78:ba:f9:30:82:33, ethertype IPv4 (0x0800), length 77: (tos 0x0, ttl 127, id 47653, offset 0, flags [none], proto UDP (17), length 63, bad cksum 0 (->54a8)!)
                          
                            stephenw10 Netgate Administrator

                            Mmm, nothing coming back from the gateway at all though.
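
One quick way to check that on a longer capture is to tally echo requests against echo replies. What follows is only a minimal sketch, assuming the capture was saved as tcpdump text output like the excerpt quoted above; the function name `icmp_tally` and the file name in the usage comment are made up for illustration.

```shell
#!/bin/sh
# Count ICMP echo requests and replies in tcpdump text output read from stdin.
# icmp_tally is a hypothetical helper name, not part of pfSense or tcpdump.
icmp_tally() {
    awk '
        /ICMP echo request/ { req++ }
        /ICMP echo reply/   { rep++ }
        END { printf "requests=%d replies=%d\n", req, rep }
    '
}

# Usage (wan-capture.txt is an assumed file name):
#   icmp_tally < wan-capture.txt
```

If replies stay at zero while requests keep climbing, the probes are leaving the WAN but nothing is answering them.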

                            The checksum errors are because hardware checksum off-loading is enabled. That's not a problem, but you can disable it in System > Advanced > Networking.
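
With off-loading on, the NIC fills in the checksum after tcpdump has already seen the outgoing packet, which is why the capture shows "bad cksum 0" on transmitted traffic. As a rough sketch of how to confirm off-loading is active, the function below looks for the RXCSUM/TXCSUM capability flags in `ifconfig` output. The name `csum_offload_state` is hypothetical, and `igc0` in the usage comment is only an assumed WAN interface name.

```shell
#!/bin/sh
# Read `ifconfig <iface>` output on stdin and report whether the options
# line lists checksum off-load capabilities (RXCSUM and/or TXCSUM).
# csum_offload_state is a hypothetical helper name for illustration.
csum_offload_state() {
    if grep -Eq 'options=.*(RXCSUM|TXCSUM)'; then
        echo "offload on"
    else
        echo "offload off"
    fi
}

# Usage on the firewall (igc0 is an assumption, substitute the real WAN NIC):
#   ifconfig igc0 | csum_offload_state
```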

                              Austin 0 @stephenw10

                              @stephenw10 Yeah nothing comes back. It is weird.

                                stephenw10 Netgate Administrator @Austin 0

                                If you can install the arping pkg you can try arping for the gateway:

                                [23.09-DEVELOPMENT][admin@4100-3.stevew.lan]/root: pkg install arping
                                Updating pfSense-core repository catalogue...
                                Fetching meta.conf:   0%
                                pfSense-core repository is up to date.
                                Updating pfSense repository catalogue...
                                Fetching meta.conf:   0%
                                pfSense repository is up to date.
                                All repositories are up to date.
                                The following 2 package(s) will be affected (of 0 checked):
                                
                                New packages to be INSTALLED:
                                	arping: 2.21_1 [pfSense]
                                	libnet: 1.2,1 [pfSense]
                                
                                Number of packages to be installed: 2
                                
                                118 KiB to be downloaded.
                                
                                Proceed with this action? [y/N]: y
                                [1/2] Fetching libnet-1.2,1.pkg: 100%   92 KiB  94.1kB/s    00:01    
                                [2/2] Fetching arping-2.21_1.pkg: 100%   26 KiB  26.5kB/s    00:01    
                                Checking integrity... done (0 conflicting)
                                [1/2] Installing libnet-1.2,1...
                                [1/2] Extracting libnet-1.2,1: 100%
                                [2/2] Installing arping-2.21_1...
                                [2/2] Extracting arping-2.21_1: 100%
                                [23.09-DEVELOPMENT][admin@4100-3.stevew.lan]/root: rehash
                                

                                Then:

                                [23.09-DEVELOPMENT][admin@4100-3.stevew.lan]/root: arping -c 3 172.21.16.1
                                ARPING 172.21.16.1
                                60 bytes from 00:08:a2:0c:c9:91 (172.21.16.1): index=0 time=767.357 usec
                                60 bytes from 00:08:a2:0c:c9:91 (172.21.16.1): index=1 time=661.690 usec
                                60 bytes from 00:08:a2:0c:c9:91 (172.21.16.1): index=2 time=682.343 usec
                                
                                --- 172.21.16.1 statistics ---
                                3 packets transmitted, 3 packets received,   0% unanswered (0 extra)
                                rtt min/avg/max/std-dev = 0.662/0.704/0.767/0.046 ms
                                

                                If the gateway doesn't respond even to ARP, there must be something low-level disconnected somehow.

                                The ARP entry in the table will only expire after ~15mins, so it may appear to be there still even if the gateway is not responding at all.
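
To see how stale a cached entry is, the remaining lifetime can be read out of `arp -an`. This is a sketch only: it assumes the usual FreeBSD line format, `? (IP) at MAC on IFACE expires in N seconds [ethernet]`, and the helper name `arp_expiry_for` is invented here.

```shell
#!/bin/sh
# Print the seconds remaining on the ARP entry for the given IP, reading
# `arp -an` output from stdin. Prints nothing if the IP is absent or the
# entry has no "expires in" field (for example a permanent entry).
# arp_expiry_for is a hypothetical helper name; the line format is assumed.
arp_expiry_for() {
    awk -v ip="($1)" '
        $2 == ip {
            for (i = 1; i <= NF; i++)
                if ($i == "expires") { print $(i + 2); exit }
        }
    '
}

# Usage, with the example gateway address from the arping run above:
#   arp -an | arp_expiry_for 172.21.16.1
```

An entry that is still counting down while arping gets no replies is just a cached leftover, not proof that the gateway is reachable.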

                                    JonathanLee

                                    What about the MTU settings? Does that matter with ONT modems? A duplex mismatch could also occur. Is the connection set to auto or full duplex on the WAN? I think it's a duplex mismatch, since the problem goes away with a switch in between: the switch could be set to auto-negotiation while the firewall is somehow set to half duplex or something similar.

                                    https://docs.netgate.com/pfsense/en/latest/troubleshooting/low-throughput.html

                                      stephenw10 Netgate Administrator

                                      There appear to be two issues here, at least. Firstly the ONT seems to be set to 100M fixed which means the interfaces on the 4100 cannot link to it directly.

                                      Secondly the ISP gateway stops responding after some time. That's unlikely to be an MTU issue because pings are tiny. As are the DHCP requests.

                                      We have seen something similar to this previously. A misbehaving ISP gateway stopped responding when its ARP entry expired, instead of sending an ARP request to renew it. IIRC we worked around it by setting the pfSense ARP expiry time low, so that pfSense sends an ARP request before the gateway expires its entry. By default it's 20mins:

                                      [23.09-DEVELOPMENT][admin@4100-3.stevew.lan]/root: sysctl net.link.ether.inet.max_age
                                      net.link.ether.inet.max_age: 1200
                                      

                                      Try setting that to 5mins and see if that allows it to continue:

                                      [23.09-DEVELOPMENT][admin@4100-3.stevew.lan]/root: sysctl net.link.ether.inet.max_age=300
                                      net.link.ether.inet.max_age: 1200 -> 300
                                      

                                      If that works you can add it as a system tunable.

                                      Running an arping against the gateway would probably also renew the remote ARP entry.

                                      Both are hacks that shouldn't be required! 😉
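
For the arping variant, a keepalive loop along these lines could be run from the shell while testing. It is an untested sketch rather than a recommended fix: the function name `arp_keepalive`, the `ARPING_CMD` override, the placeholder address 192.0.2.1 and the 240 second interval are all assumptions, and the interval simply needs to be shorter than whatever timeout the ISP gateway uses.

```shell
#!/bin/sh
# Hypothetical keepalive: send one ARP probe to the gateway per interval so
# the remote side keeps seeing ARP traffic from the firewall.
# ARPING_CMD can be overridden for dry runs; it defaults to the arping pkg.
ARPING_CMD="${ARPING_CMD:-arping}"

# arp_keepalive GATEWAY_IP INTERVAL_SECONDS [ROUNDS]
# ROUNDS of 0 (the default) means loop until interrupted.
arp_keepalive() {
    gw="$1"
    interval="$2"
    rounds="${3:-0}"
    n=0
    while [ "$rounds" -eq 0 ] || [ "$n" -lt "$rounds" ]; do
        # -c 1 sends a single probe, as in the arping example earlier.
        if ! $ARPING_CMD -c 1 "$gw" >/dev/null 2>&1; then
            echo "no ARP reply from $gw"
        fi
        n=$((n + 1))
        sleep "$interval"
    done
}

# Usage (192.0.2.1 is a placeholder, substitute the real ISP gateway):
#   arp_keepalive 192.0.2.1 240
```

If the link survives with this running but not without it, that points at the gateway's ARP handling rather than at pfSense.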

                                        Austin 0 @stephenw10

                                        @stephenw10 Thank you for your time on this. I will not have physical access to the device until Friday or Saturday. I will try it again and let you know what happens ASAP.

                                          Austin 0 @stephenw10

                                          @stephenw10 This was the result of ARPing the gateway's MAC:

                                          ce45fb44-7c26-4d53-9438-b0871b40eb8b-image.png

                                            stephenw10 Netgate Administrator

                                            I assume that's after it stops responding? Does that ARPing work initially?

                                            Did you try setting a lower max_age value?

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.