Netgate Discussion Forum

    WAN interface stops working at random times, often overnight or when the internet has not been used much (and we don't use much traffic, since we have a 4G modem from Teltonika)

    General pfSense Questions
    6 Posts 2 Posters 658 Views
    • vermium

      This problem started BEFORE I changed hardware. I first thought it was my Realtek NICs, so I moved to another server with a Dell LOM quad-port gigabit NIC and started virtualising pfSense instead, but the problem persists. I also swapped my old D-Link 4G modem for a Teltonika RUT950, and the problem is still there. I have tried with and without hardware offloading. My temporary fix is to force a different speed/duplex setting: for example, if it is on default I can change it to 1000BaseT Full-Duplex and it will work; as soon as it stops working I can change it to 1000BaseT instead, and it starts working again. Then it stops again after some time, which could be 10 minutes, 1 day, or 4 days. Literally random times. This is really frustrating since I'm not always at the location where this pfSense is hosted.

      	2022-07-28 23:41:16	notice	rtr01	CHECK_RELOAD_STATUS	Restarting OpenVPN tunnels/interfaces	notice
      	2022-07-28 23:41:16	notice	rtr01	CHECK_RELOAD_STATUS	Reloading filter	notice
      	2022-07-28 23:41:19	warning	rtr01	LLDPD	unable to send second SONMP packet on real device for igb1: Operation not permitted	warning
      	2022-07-28 23:41:20	warning	rtr01	LLDPD	unable to send second SONMP packet on real device for igb2: Operation not permitted	warning
      	2022-07-28 23:41:24	notice	rtr01	RADIUSD	(229) Login OK: [arvid] (from client AxelClosetAP01E port 0 via TLS tunnel)	notice
      	2022-07-28 23:41:24	notice	rtr01	RADIUSD	(230) Login OK: [arvid] (from client AxelClosetAP01E port 0 cli DC-A2-66-66-0A-39)	notice
      	2022-07-28 23:41:48	notice	rtr01	RADIUSD	(242) Login OK: [arvid] (from client AxelClosetAP01E port 0 via TLS tunnel)	notice
      	2022-07-28 23:41:48	notice	rtr01	RADIUSD	(243) Login OK: [arvid] (from client AxelClosetAP01E port 0 cli DC-A2-66-66-0A-39)	notice
      	2022-07-28 23:41:49	warning	rtr01	LLDPD	unable to send second SONMP packet on real device for igb1: Operation not permitted	warning
      	2022-07-28 23:41:50	warning	rtr01	LLDPD	unable to send second SONMP packet on real device for igb2: Operation not permitted	warning
      	2022-07-28 23:42:19	warning	rtr01	LLDPD	unable to send second SONMP packet on real device for igb1: Operation not permitted	warning
      	2022-07-28 23:42:21	warning	rtr01	LLDPD	unable to send second SONMP packet on real device for igb2: Operation not permitted	warning
      	2022-07-28 23:42:50	warning	rtr01	LLDPD	unable to send second SONMP packet on real device for igb1: Operation not permitted	warning
      	2022-07-28 23:42:51	warning	rtr01	LLDPD	unable to send second SONMP packet on real device for igb2: Operation not permitted	warning
      	2022-07-28 23:43:20	warning	rtr01	LLDPD	unable to send second SONMP packet on real device for igb1: Operation not permitted	warning
      	2022-07-28 23:43:21	warning	rtr01	LLDPD	unable to send second SONMP packet on real device for igb2: Operation not permitted	warning
      	2022-07-28 23:43:50	warning	rtr01	LLDPD	unable to send second SONMP packet on real device for igb1: Operation not permitted	warning
      	2022-07-28 23:43:51	warning	rtr01	LLDPD	unable to send second SONMP packet on real device for igb2: Operation not permitted	warning
      	2022-07-28 23:44:00	err	rtr01	NGINX	2022/07/28 23:44:00 [error] 78894#100710: *12076 upstream timed out (60: Operation timed out) while reading response header from upstream, client: 10.14.30.60, server: , request: "POST /widgets/widgets/system_information.widget.php HTTP/2.0", upstream: "fastcgi://unix:/var/run/php-fpm.socket", host: "rtr01.prd.se-mmx.zyner.net", referrer: "https://rtr01.prd.se-mmx.zyner.net/"	err
      	2022-07-28 23:44:06	warning	rtr01	LLDPD	unable to send second SONMP packet on real device for igb1: Operation not permitted	warning
      	2022-07-28 23:44:07	warning	rtr01	LLDPD	unable to send second SONMP packet on real device for igb2: Operation not permitted	warning
      	2022-07-28 23:44:15	warning	rtr01	LLDPD	unable to send second SONMP packet on real device for igb1: Operation not permitted	warning
      	2022-07-28 23:44:16	warning	rtr01	LLDPD	unable to send second SONMP packet on real device for igb2: Operation not permitted	warning
      	2022-07-28 23:44:19	info	rtr01	KERNEL	igb0: link state changed to DOWN	info
      	2022-07-28 23:44:19	notice	rtr01	CHECK_RELOAD_STATUS	Linkup starting igb0	notice
      	2022-07-28 23:44:19	warning	rtr01	DPINGER	WAN_DHCP 100.105.216.1: sendto error: 50	warning
      	2022-07-28 23:44:20	warning	rtr01	DPINGER	WAN_DHCP 100.105.216.1: sendto error: 50	warning
      	2022-07-28 23:44:20	err	rtr01	PHP-FPM	/rc.linkup: DEVD Ethernet detached event for wan	err
      	2022-07-28 23:44:20	warning	rtr01	DPINGER	WAN_DHCP 100.105.216.1: sendto error: 50	warning
      	2022-07-28 23:44:21	warning	rtr01	DPINGER	WAN_DHCP 100.105.216.1: sendto error: 50	warning
      	2022-07-28 23:44:21	warning	rtr01	LLDPD	unable to send second SONMP packet on real device for igb1: Operation not permitted	warning
      	2022-07-28 23:44:21	notice	rtr01	CHECK_RELOAD_STATUS	Reloading filter	notice
      	2022-07-28 23:44:21	warning	rtr01	DPINGER	send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 1 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr 2a06:a004:7010::1 bind_addr 2a06:a004:7010::2 identifier "ROUTE48_MALMOv6 "	warning
      	2022-07-28 23:44:22	warning	rtr01	LLDPD	unable to send second SONMP packet on real device for igb2: Operation not permitted	warning
      	2022-07-28 23:44:23	notice	rtr01	CHECK_RELOAD_STATUS	Linkup starting igb0	notice
      	2022-07-28 23:44:23	info	rtr01	KERNEL	igb0: link state changed to UP	info
      	2022-07-28 23:44:23	warning	rtr01	DPINGER	ROUTE48_MALMOv6 2a06:a004:7010::1: Alarm latency 0us stddev 0us loss 100%	warning
      	2022-07-28 23:44:23	info	rtr01	RC.GATEWAY_ALARM	>>> Gateway alarm: ROUTE48_MALMOv6 (Addr:2a06:a004:7010::1 Alarm:1 RTT:0.000ms RTTsd:0.000ms Loss:100%)	info
      	2022-07-28 23:44:23	notice	rtr01	CHECK_RELOAD_STATUS	updating dyndns ROUTE48_MALMOv6	notice
      	2022-07-28 23:44:23	notice	rtr01	CHECK_RELOAD_STATUS	Restarting IPsec tunnels	notice
      	2022-07-28 23:44:23	notice	rtr01	CHECK_RELOAD_STATUS	Restarting OpenVPN tunnels/interfaces	notice
      	2022-07-28 23:44:23	notice	rtr01	CHECK_RELOAD_STATUS	Reloading filter	notice
      	2022-07-28 23:44:24	err	rtr01	PHP-FPM	/rc.linkup: DEVD Ethernet attached event for wan	err
      	2022-07-28 23:44:24	err	rtr01	PHP-FPM	/rc.linkup: HOTPLUG: Configuring interface wan	err
      	2022-07-28 23:44:24	notice	rtr01	CHECK_RELOAD_STATUS	rc.newwanip starting igb0	notice
      	2022-07-28 23:44:24	err	rtr01	PHP-FPM	/rc.linkup: calling interface_dhcpv6_configure.	err
      	2022-07-28 23:44:24	err	rtr01	PHP-FPM	/rc.linkup: Accept router advertisements on interface igb0	err
      	2022-07-28 23:44:24	err	rtr01	PHP-FPM	/rc.linkup: Starting rtsold process	err
      	2022-07-28 23:44:25	err	rtr01	PHP-FPM	/rc.newwanip: rc.newwanip: Info: starting on igb0.	err
      	2022-07-28 23:44:25	err	rtr01	PHP-FPM	/rc.newwanip: rc.newwanip: on (IP address: 100.105.216.234) (interface: WAN[wan]) (real interface: igb0).
      

      These are the logs from when it happens. And no, the LLDPD messages are not the problem: I have since fixed the LLDP issue, and the WAN problem was already happening before I made that change, so the two are not related.
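
To check how often the link itself drops, as opposed to traffic stalling while the link stays up, the KERNEL link-state lines can be filtered out of a log like the one above. This is only a minimal sketch, assuming the log has been saved as plain text; the function names `link_events` and `count_link_down` are made up for illustration.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of pfSense.

# Print only the link-state transitions for one interface.
# Reads log text on stdin, e.g. an exported copy of the system log.
link_events() {
    iface="$1"
    grep -E "${iface}: link state changed to (DOWN|UP)"
}

# Count how many times the interface went DOWN in the log on stdin.
count_link_down() {
    iface="$1"
    grep -c "${iface}: link state changed to DOWN"
}
```

For example, `count_link_down igb0 < wan.log` (with `wan.log` being a hypothetical export of the log) prints the number of drops seen on igb0. If failures occur without a matching DOWN line, the link itself is probably not what is failing.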

      I have passed my Intel NIC through with PCI passthrough, disabled memory ballooning, and set a higher CPU share than the other VMs, so the VM is basically like bare metal. This happens ONLY with the WAN interface: if I plug WAN into igb0 it starts happening after some time, and if I move the WAN cable to igb3, for example, it starts working again; when it stops working I can change back to igb0. I also assign igb0 or igb3 to the WAN interface in pfSense accordingly. So I think what's needed is to take the WAN interface DOWN and then UP again, as I would with ip link set down/up (or whatever the command is) on Linux. I would really, really appreciate some help here; I am really frustrated. Thanks! I can share more if needed.

      • stephenw10 Netgate Administrator

        Yeah, it shows it actually losing link but then recovering.

        At the end of that log are you still unable to pass traffic?

        You can run ifconfig igb0 down; ifconfig igb0 up and it will likely restore it.
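
Since the box is not always reachable in person, the down/up cycle could be wrapped in a small watchdog. The following is a rough sketch, not a tested pfSense feature: the interface name and gateway address are taken from the log earlier in the thread, the function names are hypothetical, and a workaround like this hides the fault rather than explaining it.

```shell
#!/bin/sh
# Hypothetical watchdog sketch: bounce the WAN NIC when the gateway stops answering.
# IFACE and GATEWAY are assumptions taken from the posted log; adjust both.
IFACE="${IFACE:-igb0}"
GATEWAY="${GATEWAY:-100.105.216.1}"

# Succeeds if the gateway answers at least one of three pings.
gateway_alive() {
    ping -c 3 "$1" >/dev/null 2>&1
}

# Take the interface down and back up.
# With DRY_RUN=1 the commands are printed instead of executed.
bounce_iface() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "ifconfig $1 down"
        echo "ifconfig $1 up"
    else
        ifconfig "$1" down
        sleep 2
        ifconfig "$1" up
    fi
}

# One watchdog pass: bounce only when the gateway is unreachable.
check_once() {
    if gateway_alive "$GATEWAY"; then
        echo "ok"
    else
        echo "bouncing $IFACE"
        bounce_iface "$IFACE"
    fi
}
```

Calling `check_once` from a periodic job (for example cron) would give one check per run. Running it first with `DRY_RUN=1` shows what it would do without touching the interface.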

        You might try not using PCI passthrough if you have not already. If it's losing link because of something upstream that would isolate it from that.

        Steve

        • vermium

          @stephenw10 Hey! Thanks for trying to help me out :D I don't think disabling PCI passthrough will change anything, since it was having the same problem back when I ran it on bare metal. Any other ideas?

          • stephenw10 Netgate Administrator

            Mmm, well in this instance using a VM without PCI pass through would isolate the VM from link changes on the real NIC. So it might well behave differently from bare-metal.

            • vermium

              @stephenw10 Yeah but that seems like a dumb way of "solving" it, doesn't it?

              • stephenw10 Netgate Administrator

                Yes, long term, I agree. But if it makes any difference at all, then that's a clue as to what the actual cause might be.

                Otherwise wait for it to fail again and then start digging into what's actually not working.
                What does ifconfig show?
                Do you see anything in a pcap?
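
One way to have those answers ready the next time it fails is to snapshot the interface state and a short capture in one go. This is a sketch under assumptions: the helper names, the output directory, the 30-second limit and the 200-packet cap are all arbitrary choices, and it relies on the `timeout` utility being available so the capture cannot hang when no packets arrive.

```shell
#!/bin/sh
# Hypothetical diagnostics helper; names, paths and limits are assumptions.

# Print the diagnostic commands for one interface, one per line.
diag_commands() {
    iface="$1"
    pcap="$2"
    echo "ifconfig $iface"
    echo "netstat -I $iface"
    # timeout stops the capture after 30 s even if the wire is silent
    echo "timeout 30 tcpdump -ni $iface -c 200 -w $pcap"
}

# Run each command in turn and append its output to a report file.
# Prints the report path when done.
collect_wan_diag() {
    iface="$1"
    outdir="${2:-/tmp}"
    stamp="$(date +%Y%m%d-%H%M%S)"
    report="$outdir/wan-diag-$stamp.txt"
    diag_commands "$iface" "$outdir/wan-diag-$stamp.pcap" | while read -r cmd; do
        echo "### $cmd" >>"$report"
        # Unquoted on purpose so the command line is split into words.
        $cmd >>"$report" 2>&1
    done
    echo "$report"
}
```

Running `collect_wan_diag igb0` while WAN is down would leave a text report and a pcap in /tmp. Capturing the same data once while WAN is healthy gives something to compare against.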

                Steve

                Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.