if_pppoe with frequent connection losses due to ISP connection making firewall unstable
-
@stephenw10 nope, no crash report
-
Have you seen it before or is that the first time?
How frequently was the ISP dropping the line? I assume the link between pfSense and the modem remained up?
I haven't seen that in testing when manually disconnecting the link. Or when we hit drivers that flapped continually. But that was the local link....
-
@stephenw10 I have never seen this before, but the DSL line was dropping quite frequently, maybe 2 or 3 times per minute in the worst case.
Yeah, the modem-firewall link was up.
-
@Laxarus said in if_pppoe with frequent connection losses due to ISP connection making firewall unstable:
Running pfSense Plus: 25.03.b.20250409.2208
Is there any reason to be running an outdated beta version? A lot has changed since then.
-
@w0w well, that version did not give any problems until now and it has been rock solid except for this edge case. I was thinking of waiting for the stable one.
-
Ah, missed that. A bunch of fixes have gone in since then, some specifically for if_pppoe. If you can, please test the current public beta. The more edge-case scenarios we can hit with it, the better we can prove it out.
-
@stephenw10 yeah, I will try to update. But it is not likely that the circumstances leading to this will repeat again in the near future or ever since this was a very rare case.
-
I understand. Still you might have some other ISP quirk we haven't seen yet.
-
@stephenw10 alright, updated to 25.03.b.20250610.1659. Everything seems ok.
-
@stephenw10 same thing happened again with the updated beta.
ISP working on the lines >> frequent disconnects >> then everything goes down.
I still could not trace the system logs; they are filled with these messages all the way back to system.log.6.gz. And with how aggressively it is logging them (6 lines per second), I don't think I will be able to trace this unless I catch it immediately.
Jul 8 06:15:09 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:09 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:10 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:10 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:10 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:10 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:10 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:10 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:11 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:11 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:11 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:11 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:11 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:11 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:12 FIREWALL kernel: sonewconn: pcb 0xfffff80168821a80 (0.0.0.0:53 (proto 6)): Listen queue overflow: 193 already in queue awaiting acceptance (49 occurrences), euid 0, rgid 0, jail 0
Jul 8 06:15:12 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:12 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:12 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:12 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:12 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:12 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:13 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:13 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:13 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:13 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:13 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:13 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:14 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:14 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:14 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:14 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:14 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:14 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:15 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:15 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:15 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:15 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:15 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:15 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:16 FIREWALL nginx: 2025/07/08 06:15:16 [error] 5260#100353: *72974 connect() to unix:/var/run/php-fpm.socket failed (61: Connection refused) while connecting to upstream, client: 192.168.1.1, server: , request: "GET / HTTP/1.0", upstream: "fastcgi://unix:/var/run/php-fpm.socket:"
Jul 8 06:15:16 FIREWALL kernel: sonewconn: pcb 0xfffff80768185c80 (local:/var/run/php-fpm.socket): Listen queue overflow: 193 already in queue awaiting acceptance (361 occurrences), euid 0, rgid 0, jail 0
Jul 8 06:15:16 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:16 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:16 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:16 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:16 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:16 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:17 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:17 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
Jul 8 06:15:17 FIREWALL check_reload_status[614]: Could not connect to /var/run/php-fpm.socket
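Because the rotated logs are dominated by one repeated message, collapsing runs of identical lines may make the few interesting entries easier to spot. The following is a minimal Python sketch, assuming the log files have been copied off the firewall into a hypothetical `./logs` directory; the function names are illustrative and not part of pfSense.

```python
import gzip
import re
from pathlib import Path

# Matches the "Jul 8 06:15:09 FIREWALL <message>" layout seen in the thread.
PREFIX = re.compile(r"^(\w{3}\s+\d+\s[\d:]{8})\s+(\S+)\s+(.*)$")

def collapse_repeats(lines):
    """Yield (first_timestamp, count, message) for each run of identical messages."""
    run_ts, run_msg, count = None, None, 0
    for line in lines:
        m = PREFIX.match(line.strip())
        if not m:
            continue  # skip lines that do not look like syslog entries
        ts, _host, msg = m.groups()
        if msg == run_msg:
            count += 1
            continue
        if run_msg is not None:
            yield run_ts, count, run_msg
        run_ts, run_msg, count = ts, msg, 1
    if run_msg is not None:
        yield run_ts, count, run_msg

def read_log(path):
    """Read a plain or gzip-compressed log file as text lines."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt", errors="replace") as fh:
        yield from fh

if __name__ == "__main__":
    # Hypothetical directory; files are visited in lexical order,
    # which is not necessarily chronological.
    for path in sorted(Path("./logs").glob("system.log*")):
        for ts, count, msg in collapse_repeats(read_log(path)):
            print(f"{ts}  x{count:<5} {msg}")
```

With the repeats folded into a single counted line, the `sonewconn` and `nginx` entries stand out instead of being buried between php-fpm socket errors.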
ppp.log
Jul 7 23:16:11 FIREWALL kernel: if_pppoe: pppoe0: LCP keepalive timeout
Jul 7 23:19:53 FIREWALL kernel: if_pppoe: pppoe0: LCP keepalive timeout
Jul 7 23:22:12 FIREWALL kernel: if_pppoe: pppoe0: LCP keepalive timeout
Jul 7 23:22:12 FIREWALL kernel: if_pppoe: pppoe0: failed to clear IP address: 49
Jul 7 23:35:17 FIREWALL kernel: if_pppoe: pppoe0: LCP keepalive timeout
Jul 7 23:41:19 FIREWALL kernel: if_pppoe: pppoe0: LCP keepalive timeout
Jul 7 23:48:59 FIREWALL kernel: if_pppoe: pppoe0: LCP keepalive timeout
Jul 8 00:02:59 FIREWALL kernel: if_pppoe: pppoe0: LCP keepalive timeout
Jul 8 00:11:19 FIREWALL kernel: if_pppoe: pppoe0: LCP keepalive timeout
Jul 8 00:15:59 FIREWALL kernel: if_pppoe: pppoe0: LCP keepalive timeout
Jul 8 00:57:59 FIREWALL kernel: if_pppoe: pppoe0: LCP keepalive timeout
Jul 8 01:51:49 FIREWALL kernel: if_pppoe: pppoe0: LCP keepalive timeout
Jul 8 02:15:29 FIREWALL kernel: if_pppoe: pppoe0: LCP keepalive timeout
Jul 8 02:20:59 FIREWALL kernel: if_pppoe: pppoe0: LCP keepalive timeout
Jul 8 02:39:09 FIREWALL kernel: if_pppoe: pppoe0: LCP keepalive timeout
Jul 8 03:01:39 FIREWALL kernel: if_pppoe: pppoe0: LCP keepalive timeout
Jul 8 03:14:39 FIREWALL kernel: if_pppoe: pppoe0: LCP keepalive timeout
Jul 8 04:49:39 FIREWALL kernel: if_pppoe: pppoe0: LCP keepalive timeout
Jul 8 05:46:19 FIREWALL kernel: if_pppoe: pppoe0: LCP keepalive timeout
Jul 8 05:56:39 FIREWALL kernel: if_pppoe: pppoe0: LCP keepalive timeout
Jul 8 06:30:19 FIREWALL kernel: if_pppoe: pppoe0: LCP keepalive timeout
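To judge how often the line actually dropped, the gaps between consecutive keepalive timeouts can be computed from this log. A small Python sketch follows; the year is an assumption, since syslog timestamps in this format do not carry one, and the function name is illustrative.

```python
from datetime import datetime

def timeout_gaps(lines, year=2025):
    """Return minutes between consecutive 'LCP keepalive timeout' entries."""
    stamps = []
    for line in lines:
        if "LCP keepalive timeout" not in line:
            continue
        # The first three fields are month, day and clock time; the year is assumed.
        month, day, clock = line.split()[:3]
        stamps.append(
            datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")
        )
    return [(b - a).total_seconds() / 60 for a, b in zip(stamps, stamps[1:])]

sample = [
    "Jul 7 23:16:11 FIREWALL kernel: if_pppoe: pppoe0: LCP keepalive timeout",
    "Jul 7 23:19:53 FIREWALL kernel: if_pppoe: pppoe0: LCP keepalive timeout",
    "Jul 7 23:22:12 FIREWALL kernel: if_pppoe: pppoe0: LCP keepalive timeout",
]
print([round(gap, 1) for gap in timeout_gaps(sample)])
```

Feeding the whole ppp.log through the same function gives one gap per reconnect, which makes it easier to compare this incident against the earlier one.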
-
Do you see anything churning in the output of
top -HaSP
or
ps -auxwwd
?
-
@stephenw10 I had to reboot to get it back online, so I missed that. I will check this for the next time. Or if you want me to check any additional things when this happens again, please tell me in advance (:.
But from what I can tell:
Unbound was not working
UI was inaccessible
for some weird reason haproxy was working
internet and routing were working, but without DNS
-
Hmm, hard to know what would cause this without more data of some sort. Unclear why if_pppoe doesn't disconnect and retry with those logs. But also why that would cause services to bog down since that implies the interface is not bouncing.
-
The same thing is happening again since my ISP is still working on the lines. I left it as is to troubleshoot, which is kind of hard since I am connecting remotely via cellular. If you are interested in troubleshooting this and getting to the bottom of it, could you PM me your Discord? I have to bring this back online and cannot leave it like this for long. Please let me know.
console output I see
top -HaSP
ps -auxwwd
[25.03-BETA][root@<<>>]/root: ps -auxwwd
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
root 0 58.9 0.0 0 2480 - DLs Tue08 498:25.09 [kernel]
root 11 527.6 0.0 0 128 - RNL Tue08 37708:03.62 - [idle]
root 2 22.0 0.0 0 128 - WL Tue08 287:39.83 - [clock]
root 1 0.0 0.0 12376 1232 - SLs Tue08 0:01.18 - /sbin/init
ntopng 82894 10.3 2.3 1002512 773128 - Ss 19:55 23:24.75 |-- /usr/local/bin/ntopng /usr/local/etc/ntopng.conf
root 78380 6.9 0.0 14692 3072 - Ss 23:28 0:00.00 |-- /usr/local/bin/dpinger -S -r 0 -i IPSEC_S2S_VTI_VTIV4 -B 172.28.15.1 -p /var/run/dpinger_IPSEC_S2S_VTI_VTIV4~172.28.15.1~172.28.15.2.pid -u /var/run/dpinger_IPSEC_S2S_VTI_VTIV4~172.28.15.1~172.28.15.2.sock -d 1 -s 500 -l 2000 -t 60000 -A 1000 -D 500 -L 20 172.28.15.2
root 73869 6.8 0.0 14692 3076 - Ss 23:28 0:00.00 |-- /usr/local/bin/dpinger -S -r 0 -i VPNAC_WG -B 10.11.7.113 -p /var/run/dpinger_VPNAC_WG~10.11.7.113~10.11.0.1.pid -u /var/run/dpinger_VPNAC_WG~10.11.7.113~10.11.0.1.sock -C /etc/rc.gateway_alarm -d 1 -s 500 -l 2000 -t 60000 -A 1000 -D 500 -L 20 10.11.0.1
root 75484 6.8 0.0 14692 3080 - Ss 23:28 0:00.00 |-- /usr/local/bin/dpinger -S -r 0 -i MNG_DHCP -B 192.168.2.3 -p /var/run/dpinger_MNG_DHCP~192.168.2.3~192.168.2.1.pid -u /var/run/dpinger_MNG_DHCP~192.168.2.3~192.168.2.1.sock -C /etc/rc.gateway_alarm -d 1 -s 500 -l 2000 -t 60000 -A 1000 -D 500 -L 20 192.168.2.1
root 70976 6.7 0.0 14692 3084 - Ss 23:28 0:00.00 |-- /usr/local/bin/dpinger -S -r 0 -i WAN_PPPOE -B <<ip redacted>> -p /var/run/dpinger_WAN_PPPOE~<<ip redacted>>~10.98.238.224.pid -u /var/run/dpinger_WAN_PPPOE~<<ip redacted>>~10.98.238.224.sock -C /etc/rc.gateway_alarm -d 1 -s 500 -l 2000 -t 60000 -A 1000 -D 500 -L 20 10.98.238.224
root 72461 6.7 0.0 14692 3084 - Ss 23:28 0:00.00 |-- /usr/local/bin/dpinger -S -r 0 -i OVPN_S2S_VPNV4 -B 172.31.254.1 -p /var/run/dpinger_OVPN_S2S_VPNV4~172.31.254.1~172.31.254.2.pid -u /var/run/dpinger_OVPN_S2S_VPNV4~172.31.254.1~172.31.254.2.sock -C /etc/rc.gateway_alarm -d 1 -s 500 -l 3500 -t 60000 -A 3000 -D 2000 -L 60 172.31.254.2
www 53603 5.5 0.1 139244 38444 - Rs 23:25 0:13.90 |-- /usr/local/sbin/haproxy -f /var/etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -D -x /tmp/haproxy.socket -st 46347
root 50665 1.0 0.0 13988 2604 - Ss 23:28 0:00.00 |-- /usr/local/sbin/dhcpleases -l /var/dhcpd/var/db/dhcpd.leases -d domain.org -p /var/run/unbound.pid -u /var/unbound/dhcpleases_entries.conf -h /etc/hosts
root 35734 0.3 0.0 13980 2396 - SNC 23:28 0:00.00 |-- sleep 60
dhcpd 337 0.0 0.0 28564 15492 - Ss 23:27 0:00.02 |-- /usr/local/sbin/dhcpd -user dhcpd -group _dhcp -chroot /var/dhcpd -cf /etc/dhcpd.conf -pf /var/run/dhcpd.pid lagg0 lagg0.25 lagg0.88 lagg0.200 lagg0.105 lagg0.33 lagg0.135 bridge0 lagg0.45 lagg0.40 lagg0.30 lagg0.55 lagg0.150 lagg0.95
root 591 0.0 0.1 112460 33192 - Ss Tue08 0:05.59 |-- php-fpm: master process (/usr/local/lib/php-fpm.conf) (php-fpm)
root 76974 1.5 0.2 123852 60440 - S 17:07 0:33.73 | |-- php-fpm: pool nginx (php-fpm)
root 79050 1.1 0.2 123852 62756 - S 16:51 0:44.38 | |-- php-fpm: pool nginx (php-fpm)
root 12418 0.3 0.2 125900 62416 - S 19:19 0:53.11 | |-- php-fpm: pool nginx (php-fpm)
root 92272 0.2 0.2 121740 55568 - S 20:04 0:01.17 | |-- php-fpm: pool nginx (php-fpm)
root 9368 0.1 0.2 123788 60220 - S 19:44 0:05.10 | |-- php-fpm: pool nginx (php-fpm)
root 36889 0.0 0.2 121804 59252 - I 16:46 0:52.99 | |-- php-fpm: pool nginx (php-fpm)
root 58688 0.0 0.2 123788 61944 - I 19:38 0:05.65 | |-- php-fpm: pool nginx (php-fpm)
root 90499 0.0 0.2 125900 62592 - S 17:40 0:28.68 | `-- php-fpm: pool nginx (php-fpm)
root 636 0.0 0.0 14552 3268 - SNs Tue08 0:02.16 |-- /usr/local/sbin/check_reload_status
root 638 0.0 0.0 14552 2968 - IN Tue08 0:00.00 | `-- check_reload_status: Monitoring daemon of check_reload_status (check_reload_status)
dhcpd 1456 0.0 0.0 25620 11992 - Ss 23:27 0:00.56 |-- /usr/local/sbin/dhcpd -6 -user dhcpd -group _dhcp -chroot /var/dhcpd -cf /etc/dhcpdv6.conf -pf /var/run/dhcpdv6.pid lagg0 lagg0.135 lagg0.30 lagg0.40 lagg0.45 lagg0.55
root 2077 0.0 0.0 13984 2392 - Ss 23:27 0:00.00 |-- /usr/local/sbin/dhcpleases6 -c /usr/local/bin/php-cgi -f /usr/local/sbin/prefixes.php -l /var/dhcpd/var/db/dhcpd6.leases
root 2926 0.0 0.0 13980 2476 - Is Tue08 0:00.00 |-- /usr/local/bin/minicron 60 /var/run/cp_prunedb_guest.pid /etc/rc.prunecaptiveportal guest
root 3149 0.0 0.0 13980 2496 - S Tue08 0:00.25 | `-- minicron: helper /etc/rc.prunecaptiveportal guest (minicron)
root 2965 0.0 0.2 118484 55504 - I 23:27 0:00.33 |-- /usr/local/bin/php-cgi -q /usr/local/bin/notify_monitor.php
root 6380 0.0 0.0 15672 5116 - Ss Tue08 0:03.24 |-- /sbin/devd -q -f /etc/pfSense-devd.conf
root 11221 0.0 0.0 23960 10588 - Is Tue08 0:00.00 |-- sshd: /usr/sbin/sshd [listener] 0 of 10-100 startups (sshd)
root 15803 0.0 0.0 24116 11904 - Ss 23:25 0:01.77 | `-- sshd: root@pts/0 (sshd)
root 16114 0.0 0.0 14644 3580 0 Is 23:25 0:00.01 | `-- -sh (sh)
root 16649 0.0 0.0 14644 3352 0 I 23:25 0:00.00 | `-- /bin/sh /etc/rc.initial
root 25429 0.0 0.0 14980 4544 0 S 23:25 0:00.02 | `-- /bin/tcsh
root 86753 0.0 0.0 14712 3572 0 R+ 23:28 0:00.00 | `-- ps -auxwwd
root 17632 0.0 0.0 21404 10496 - Ss 19:55 0:01.47 |-- /usr/local/libexec/nut/snmp-ups -a UPS
root 21176 0.0 0.0 32960 10744 - Is Wed03 0:00.00 |-- nginx: master process /usr/local/sbin/nginx -c /var/etc/nginx-webConfigurator.conf (nginx)
root 21458 0.0 0.0 35520 13212 - I Wed03 0:01.91 | |-- nginx: worker process (nginx)
root 21500 0.0 0.0 35520 13804 - I Wed03 0:04.82 | `-- nginx: worker process (nginx)
root 22186 0.0 0.0 32960 10652 - Is Wed03 0:00.00 |-- nginx: master process /usr/local/sbin/nginx -c /var/etc/nginx-guest-CaptivePortal.conf (nginx)
root 22521 0.0 0.0 32960 11220 - I Wed03 0:00.00 | |-- nginx: worker process (nginx)
root 22624 0.0 0.0 32960 11220 - I Wed03 0:00.00 | |-- nginx: worker process (nginx)
root 22911 0.0 0.0 32960 11220 - I Wed03 0:00.00 | |-- nginx: worker process (nginx)
root 23076 0.0 0.0 32960 11220 - I Wed03 0:00.00 | |-- nginx: worker process (nginx)
root 23281 0.0 0.0 32960 11220 - I Wed03 0:00.00 | |-- nginx: worker process (nginx)
root 23562 0.0 0.0 32960 11220 - I Wed03 0:00.00 | `-- nginx: worker process (nginx)
root 22617 0.0 0.0 37708 13040 - Ss 19:55 0:04.41 |-- redis-server: /usr/local/bin/redis-server 127.0.0.1:6379 (redis-server)
root 23838 0.0 0.2 108592 66952 - Is 19:55 0:00.06 |-- /usr/local/sbin/radiusd
root 23873 0.0 0.0 43200 10816 - Is Wed03 0:00.00 |-- nginx: master process /usr/local/sbin/nginx -c /var/etc/nginx-guest-CaptivePortal-SSL.conf (nginx)
root 24020 0.0 0.0 43200 11380 - I Wed03 0:00.00 | |-- nginx: worker process (nginx)
root 24031 0.0 0.0 43200 11380 - I Wed03 0:00.00 | |-- nginx: worker process (nginx)
root 24229 0.0 0.0 43200 11380 - I Wed03 0:00.00 | |-- nginx: worker process (nginx)
root 24322 0.0 0.0 43200 11380 - I Wed03 0:00.00 | |-- nginx: worker process (nginx)
root 24624 0.0 0.0 43200 11380 - I Wed03 0:00.00 | |-- nginx: worker process (nginx)
root 24734 0.0 0.0 43200 11380 - I Wed03 0:00.00 | `-- nginx: worker process (nginx)
root 24110 0.0 0.0 14068 2820 - Ss Tue08 1:13.35 |-- /usr/local/sbin/radvd -p /var/run/radvd.pid -C /var/etc/radvd.conf -m syslog
root 24458 0.0 0.0 14404 2976 - Is Tue08 0:00.00 |-- dhclient: system.syslog (dhclient)
root 26929 0.0 0.0 19640 7792 - Is 19:55 0:00.00 |-- /usr/local/sbin/upsmon
root 27267 0.0 0.0 19776 8252 - S 19:55 0:07.61 | `-- /usr/local/sbin/upsmon
root 27908 0.0 0.0 14236 2916 - SC 19:55 0:00.96 |-- /usr/bin/tail_pfb -n0 -F /var/log/filter.log
root 27989 0.0 0.2 129220 75424 - S 19:55 0:21.29 |-- /usr/local/bin/php_pfb -f /usr/local/pkg/pfblockerng/pfblockerng.inc filterlog
root 28570 0.0 0.0 14236 2756 - S 19:55 0:00.74 |-- tail_pfb: system.fileargs (tail_pfb)
root 28910 0.0 0.0 21392 10276 - S 19:55 1:59.67 |-- /usr/local/sbin/lighttpd_pfb -f /var/unbound/pfb_dnsbl_lighty.conf
root 29866 0.0 0.0 23496 4044 - S Tue08 1:26.75 |-- /usr/local/sbin/pcscd
root 32057 0.0 0.0 13980 2484 - Is Tue08 0:00.00 |-- /usr/local/bin/minicron 240 /var/run/ping_hosts.pid /usr/local/bin/ping_hosts.sh
root 32129 0.0 0.0 13980 2504 - I Tue08 0:00.06 | `-- minicron: helper /usr/local/bin/ping_hosts.sh (minicron)
root 32216 0.0 0.0 13980 2480 - Is Tue08 0:00.00 |-- /usr/local/bin/minicron 300 /var/run/ipsec_keepalive.pid /usr/local/bin/ipsec_keepalive.php
root 32490 0.0 0.0 13980 2500 - I Tue08 0:00.05 | `-- minicron: helper /usr/local/bin/ipsec_keepalive.php (minicron)
root 32401 0.0 0.0 22772 3272 - Is 19:55 0:01.49 |-- /usr/local/bin/mdns-bridge -s -c /usr/local/etc/mdns-bridge.conf -p /var/run/mdns-bridge.pid
root 32784 0.0 0.0 13980 2488 - Is Tue08 0:00.00 |-- /usr/local/bin/minicron 3600 /var/run/expire_accounts.pid /usr/local/sbin/fcgicli -f /etc/rc.expireaccounts
root 32871 0.0 0.0 13980 2512 - I Tue08 0:00.00 | `-- minicron: helper /usr/local/sbin/fcgicli -f /etc/rc.expireaccounts (minicron)
root 33209 0.0 0.0 13980 2476 - Is Tue08 0:00.00 |-- /usr/local/bin/minicron 86400 /var/run/update_alias_url_data.pid /usr/local/sbin/fcgicli -f /etc/rc.update_alias_url_data
root 33844 0.0 0.0 13980 2500 - I Tue08 0:00.00 | `-- minicron: helper /usr/local/sbin/fcgicli -f /etc/rc.update_alias_url_data (minicron)
root 36542 0.0 0.0 14084 2536 - Is Tue08 0:00.00 |-- daemon: /usr/local/libexec/ipsec/charon[36875] (daemon)
root 36875 0.0 0.1 132100 42200 - I Tue08 1:54.15 | `-- /usr/local/libexec/ipsec/charon --use-syslog
root 37591 0.0 0.0 14084 2528 - Ss Tue08 0:00.51 |-- daemon: /usr/local/bin/tailscaled[37709] (daemon)
root 37709 9.5 0.2 1291732 71888 - I Tue08 4:34.43 | `-- /usr/local/bin/tailscaled -port 41641 -tun tailscale0 -statedir /usr/local/pkg/tailscale/state
root 38624 0.0 0.0 14404 3120 - Is Tue08 0:00.50 |-- dhclient: igb0 [priv] (dhclient)
root 48936 0.0 0.0 25340 10020 - Ss Tue08 2:14.56 |-- /usr/local/sbin/ntpd -g -c /var/etc/ntpd.conf -p /var/run/ntpd.pid
root 49891 0.0 0.0 21264 9732 - Ss 23:28 0:00.02 |-- /usr/local/sbin/openvpn --config /var/etc/openvpn/server1/config.ovpn
root 38742 0.0 0.0 14644 3112 - S 23:28 0:00.00 | `-- /bin/sh /usr/local/sbin/ovpn_auth_verify tls guzelbahce-ovpn 1 1 CN=guzelbahce-ovpn, C=TR, L=Izmir
root 38939 0.0 0.0 14008 2568 - S 23:28 0:00.00 | `-- /usr/local/sbin/fcgicli -f /etc/inc/openvpn.tls-verify.php -d servercn=guzelbahce-ovpn&depth=1&certdepth=1&certsubject=CN=guzelbahce-ovpn, C=TR, L=Izmir&serial=6643693052861348272&config=/var/etc/openvpn/server1/config.ovpn
root 52242 0.0 0.0 14644 3248 - SN 23:28 0:00.02 |-- /bin/sh /var/db/rrd/updaterrd.sh
root 86534 0.0 0.1 48768 28736 - RN 23:28 0:00.01 | `-- /usr/local/bin/php-cgi -q /usr/local/bin/dhcpd_gather_stats.php opt16
root 55732 0.0 0.0 14740 3840 - Ss Tue08 0:26.36 |-- /usr/local/sbin/filterlog -i pflog0 -p /var/run/filterlog.pid
root 56926 0.0 0.2 79700 56712 - Ss Tue08 0:49.65 |-- php_wg: WireGuard service (php_wg)
_dhcp 64283 0.0 0.0 14408 3252 - SCs Tue08 0:01.63 |-- dhclient: igb0 (dhclient)
root 65028 0.0 0.0 14308 3468 - SCs Tue08 1:32.68 |-- /usr/sbin/syslogd -O rfc3164 -s -c -c -l /var/dhcpd/var/run/log -l /tmp/haproxy_chroot/var/run/log -P /var/run/syslog.pid -f /etc/syslog.conf
root 65550 0.0 0.0 14084 2532 - Is Tue08 0:00.00 |-- daemon: sshguard[65790] (daemon)
root 65790 0.0 0.0 14644 3064 - I Tue08 0:00.00 | `-- /bin/sh /usr/local/sbin/sshguard -i /var/run/sshguard.pid
root 66213 0.0 0.0 14236 2916 - SC Tue08 0:25.49 | |-- tail -F -n 0 /var/log/auth.log
root 66518 0.0 0.0 20636 6144 - SC Tue08 0:00.23 | |-- /usr/local/libexec/sshg-parser
root 66686 0.0 0.0 14612 3332 - IC Tue08 0:00.15 | |-- /usr/local/libexec/sshg-blocker -a 10 -p 120 -s 1800 -w /usr/local/etc/sshguard.whitelist
root 66891 0.0 0.0 14644 3072 - I Tue08 0:00.00 | `-- /bin/sh /usr/local/sbin/sshguard -i /var/run/sshguard.pid
root 69514 0.0 0.0 14644 3060 - I Tue08 0:00.00 | `-- /bin/sh /usr/local/libexec/sshg-fw-pf
root 67970 0.0 0.0 14308 3288 - I Tue08 0:00.52 |-- syslogd: syslogd.casper (syslogd)
root 68384 0.0 0.0 20636 5412 - Is Tue08 0:00.00 |-- sshg-parser: system.net (sshg-parser)
root 68818 0.0 0.0 14308 3136 - Is Tue08 0:00.00 |-- syslogd: system.net (syslogd)
root 69296 0.0 0.0 14236 2756 - S Tue08 0:22.16 |-- tail: system.fileargs (tail)
root 69437 0.0 0.0 14612 3324 - Is Tue08 0:00.00 |-- sshg-blocker: system.net (sshg-blocker)
root 70460 0.0 0.0 14180 2920 - Is Tue08 0:02.33 |-- /usr/sbin/cron -s
root 70620 0.0 0.0 20004 9232 - Ss Tue08 0:38.97 |-- /usr/local/sbin/miniupnpd -f /var/etc/miniupnpd.conf -P /var/run/miniupnpd.pid
root 98727 0.0 0.2 167900 49912 - Ss Tue08 4:50.83 |-- /usr/local/sbin/filterdns -p /var/run/filterdns.pid -i 300 -c /var/etc/filterdns.conf -d 1
root 4412 0.0 0.0 14440 3212 v0 Is 23:24 0:00.01 |-- login [pam] (login)
root 4751 0.0 0.0 14644 3564 v0 I 23:24 0:00.01 | `-- -sh (sh)
root 7006 0.0 0.0 14644 3332 v0 I 23:24 0:00.00 | `-- /bin/sh /etc/rc.initial
root 99487 0.0 0.0 14980 4624 v0 I+ 23:24 0:00.01 | `-- /bin/tcsh
root 31400 0.0 0.0 14104 2592 v1 Is+ Tue08 0:00.00 |-- /usr/libexec/getty Pc ttyv1
root 31490 0.0 0.0 14104 2580 v2 Is+ Tue08 0:00.00 |-- /usr/libexec/getty Pc ttyv2
root 31825 0.0 0.0 14104 2588 v3 Is+ Tue08 0:00.00 |-- /usr/libexec/getty Pc ttyv3
root 31828 0.0 0.0 14104 2584 v4 Is+ Tue08 0:00.00 |-- /usr/libexec/getty Pc ttyv4
root 32153 0.0 0.0 14104 2584 v5 Is+ Tue08 0:00.00 |-- /usr/libexec/getty Pc ttyv5
root 32423 0.0 0.0 14104 2580 v6 Is+ Tue08 0:00.00 |-- /usr/libexec/getty Pc ttyv6
root 32475 0.0 0.0 14104 2592 v7 Is+ Tue08 0:00.00 `-- /usr/libexec/getty Pc ttyv7
root 3 0.0 0.0 0 144 - DL Tue08 0:00.00 - [crypto]
root 4 0.0 0.0 0 64 - DL Tue08 0:00.10 - [cam]
root 5 0.0 0.0 0 16 - DL Tue08 0:00.00 - [busdma]
root 6 0.0 0.0 0 1264 - DL Tue08 5:46.36 - [zfskern]
root 7 0.0 0.0 0 16 - DL Tue08 9:15.63 - [pf purge]
root 8 0.0 0.0 0 16 - DL Tue08 1:13.79 - [rand_harvestq]
root 9 0.0 0.0 0 48 - DL Tue08 1:31.07 - [pagedaemon]
root 10 0.0 0.0 0 16 - DL Tue08 0:00.00 - [audit]
root 12 0.0 0.0 0 384 - WL Tue08 32:00.80 - [intr]
root 13 0.0 0.0 0 128 - DL Tue08 0:00.00 - [ng_queue]
root 14 0.0 0.0 0 48 - DL Tue08 0:00.00 - [geom]
root 15 0.0 0.0 0 16 - DL Tue08 0:00.00 - [sequencer 00]
root 16 0.0 0.0 0 80 - DL Tue08 0:13.19 - [usb]
root 17 0.0 0.0 0 16 - DL Tue08 0:00.00 - [vmdaemon]
root 18 0.0 0.0 0 128 - DL Tue08 0:47.75 - [bufdaemon]
root 19 0.0 0.0 0 16 - DL Tue08 0:04.34 - [vnlru]
root 20 0.0 0.0 0 16 - DL Tue08 0:03.78 - [syncer]
root 21 0.0 0.0 0 16 - DL Tue08 0:00.00 - [ALQ Daemon]
root 22 0.0 0.0 0 16 - DL Tue08 0:00.21 - [enc_daemon0]
root 23 0.0 0.0 0 16 - DL Tue08 0:00.22 - [enc_daemon1]
root 6433 0.0 0.0 0 16 - DL Tue08 0:06.48 - [iimb0]
root 6434 0.0 0.0 0 16 - DL Tue08 0:07.36 - [iimb1]
root 6435 0.0 0.0 0 16 - DL Tue08 0:08.64 - [iimb2]
root 6436 0.0 0.0 0 16 - DL Tue08 0:04.86 - [iimb3]
root 12231 0.0 0.0 0 144 - DL 21:25 83:31.82 - [KTLS]
Edit: NVM, I don't know what happened, but after I SSHed into the firewall it recovered for some odd reason.
I am attaching the logs when the recovery happened.
starting from here
Jul 11 23:24:32 FIREWALL kernel: sonewconn: pcb 0xfffff80507168000 (<<ip redacted>>:443 (proto 6)): Listen queue overflow: 193 already in queue awaiting acceptance (98 occurrences), euid 0, rgid 0, jail 0
Jul 11 23:24:33 FIREWALL check_reload_status[636]: Could not connect to /var/run/php-fpm.socket
Jul 11 23:24:33 FIREWALL check_reload_status[636]: Could not connect to /var/run/php-fpm.socket
Jul 11 23:24:33 FIREWALL check_reload_status[636]: Could not connect to /var/run/php-fpm.socket
Jul 11 23:24:33 FIREWALL check_reload_status[636]: Could not connect to /var/run/php-fpm.socket
Jul 11 23:24:33 FIREWALL check_reload_status[636]: Could not connect to /var/run/php-fpm.socket
Jul 11 23:24:33 FIREWALL check_reload_status[636]: Could not connect to /var/run/php-fpm.socket
Jul 11 23:24:33 FIREWALL check_reload_status[636]: Could not connect to /var/run/php-fpm.socket
Jul 11 23:24:33 FIREWALL check_reload_status[636]: Could not connect to /var/run/php-fpm.socket
Jul 11 23:24:33 FIREWALL check_reload_status[636]: Could not connect to /var/run/php-fpm.socket
Jul 11 23:24:33 FIREWALL check_reload_status[636]: Could not connect to /var/run/php-fpm.socket
Jul 11 23:24:33 FIREWALL upsmon[27267]: UPS [UPS]: connect failed: Connection failure: Connection refused
Jul 11 23:24:34 FIREWALL check_reload_status[636]: Could not connect to /var/run/php-fpm.socket
Jul 11 23:24:34 FIREWALL check_reload_status[636]: Could not connect to /var/run/php-fpm.socket
Jul 11 23:24:34 FIREWALL check_reload_status[636]: Could not connect to /var/run/php-fpm.socket
Jul 11 23:24:34 FIREWALL check_reload_status[636]: Could not connect to /var/run/php-fpm.socket
Jul 11 23:24:34 FIREWALL check_reload_status[636]: Could not connect to /var/run/php-fpm.socket
Jul 11 23:24:34 FIREWALL check_reload_status[636]: Could not connect to /var/run/php-fpm.socket
Jul 11 23:24:34 FIREWALL check_reload_status[636]: Could not connect to /var/run/php-fpm.socket
Jul 11 23:24:34 FIREWALL check_reload_status[636]: Could not connect to /var/run/php-fpm.socket
Jul 11 23:24:34 FIREWALL check_reload_status[636]: Could not connect to /var/run/php-fpm.socket
Jul 11 23:24:34 FIREWALL check_reload_status[636]: Could not connect to /var/run/php-fpm.socket
Jul 11 23:24:35 FIREWALL check_reload_status[636]: Could not connect to /var/run/php-fpm.socket
Jul 11 23:24:35 FIREWALL check_reload_status[636]: Could not connect to /var/run/php-fpm.socket
Jul 11 23:24:35 FIREWALL check_reload_status[636]: Could not connect to /var/run/php-fpm.socket
Jul 11 23:24:35 FIREWALL check_reload_status[636]: Could not connect to /var/run/php-fpm.socket
Jul 11 23:24:35 FIREWALL check_reload_status[636]: Could not connect to /var/run/php-fpm.socket
Jul 11 23:24:35 FIREWALL check_reload_status[636]: Could not connect to /var/run/php-fpm.socket
Jul 11 23:24:35 FIREWALL check_reload_status[636]: Could not connect to /var/run/php-fpm.socket
Jul 11 23:24:35 FIREWALL check_reload_status[636]: Could not connect to /var/run/php-fpm.socket
Jul 11 23:24:35 FIREWALL check_reload_status[636]: Could not connect to /var/run/php-fpm.socket
Jul 11 23:24:35 FIREWALL check_reload_status[636]: Could not connect to /var/run/php-fpm.socket
Jul 11 23:24:35 FIREWALL php-cgi[72032]: haproxy: reload old pid:10877
Jul 11 23:24:35 FIREWALL php-cgi[72032]:
Jul 11 23:24:35 FIREWALL check_reload_status[636]: rc.newwanip starting pppoe0
Jul 11 23:24:35 FIREWALL kernel: gif0: link state changed to DOWN
Jul 11 23:24:35 FIREWALL login[4412]: login on ttyv0 as root
Jul 11 23:24:35 FIREWALL php-fpm[9368]: /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed IP addresses. Reloading endpoints that may use OVPN_S2S_VPNV4.
Jul 11 23:24:36 FIREWALL php-fpm[79050]: /rc.ipsec: Default gateway setting Interface WANV6_TUNNELV6 Gateway as default.
Jul 11 23:24:36 FIREWALL kernel: gif0: link state changed to UP
Jul 11 23:24:36 FIREWALL php-cgi[47214]: HAProxy OCSP socket update successful for frontend External..result: \x09Next Update: Jul 15 02:00:00 2025 GMT
Jul 11 23:24:36 FIREWALL php-cgi[93670]: HAProxy OCSP socket update successful for frontend External..result: \x09Next Update: Jul 12 03:00:00 2025 GMT
Jul 11 23:24:36 FIREWALL php-cgi[45969]: HAProxy OCSP socket update successful for frontend External..result: \x09Next Update: Jul 12 03:00:00 2025 GMT
Jul 11 23:24:36 FIREWALL php-fpm[12418]: /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed IP addresses. Reloading endpoints that may use VPNAC_WG.
The only thing I did in the console
8
CTRL+C
top -HaSP
ps -auxwwd
-
Hmm, nothing obviously a problem individually. But there's a lot going on there. There is so much happening that it's still restarting services minutes after the issue is cleared.
I think I would at least try disabling the gateway monitoring action on anything that doesn't need it. Most of those tunnels and VPNs likely don't need it for example. Currently each of them is triggering a restart of all services.
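One way to see which gateways currently have the monitoring action attached is to look at the dpinger command lines in the `ps -auxwwd` output above. The sketch below assumes that dpinger's `-i` argument carries the gateway name and that `-C` names the alarm command, which matches how the processes appear in this thread; the function name is illustrative.

```python
DPINGER = "/usr/local/bin/dpinger"

def dpinger_alarms(ps_lines):
    """Map each dpinger identifier (-i) to its alarm command (-C), or None."""
    result = {}
    for line in ps_lines:
        if DPINGER not in line:
            continue
        # Keep only the command part of the ps line and split on whitespace.
        args = line[line.index(DPINGER):].split()
        name = args[args.index("-i") + 1] if "-i" in args else "?"
        alarm = args[args.index("-C") + 1] if "-C" in args else None
        result[name] = alarm
    return result

sample = [
    "root 78380 6.9 0.0 14692 3072 - Ss 23:28 0:00.00 |-- /usr/local/bin/dpinger "
    "-S -r 0 -i IPSEC_S2S_VTI_VTIV4 -B 172.28.15.1 -d 1 -s 500 172.28.15.2",
    "root 73869 6.8 0.0 14692 3076 - Ss 23:28 0:00.00 |-- /usr/local/bin/dpinger "
    "-S -r 0 -i VPNAC_WG -B 10.11.7.113 -C /etc/rc.gateway_alarm -d 1 -s 500 10.11.0.1",
]
for gateway, alarm in dpinger_alarms(sample).items():
    print(gateway, "->", alarm or "no alarm command")
```

In the output posted above, only the IPSEC_S2S_VTI_VTIV4 instance runs without `-C /etc/rc.gateway_alarm`; the other four gateways still have it.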
-
@stephenw10
So basically, the frequent connection losses and restores create a race condition while services are being restarted?
-
Potentially.
Try doing one manual reconnect and see what is logged until it's stable again. How long does it spend reloading stuff?
-
@stephenw10 I cannot reliably do this test right now. It is losing connection almost every 20 seconds. Did the new version get tested for such extreme cases?
-
It should be fine. Or at least no worse than mpd5/netgraph. Bouncing the WAN every 20s is going to be pretty disruptive on a default install. With all the services and additional sub-interfaces you have, it's going to be a lot for the firewall to do. If it takes longer than 20s to reload all the tunnels and services then it could be in a continuous churn with high load.
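The churn argument can be illustrated with a toy model. It assumes every WAN bounce queues one full reload and that reloads run one after another rather than being coalesced, which may not match what pfSense actually does; the 20 s interval comes from the thread, while the reload durations are hypothetical.

```python
def backlog_after(flaps, flap_interval, reload_time):
    """Seconds of queued reload work right after the last of `flaps` bounces."""
    backlog = 0.0
    for _ in range(flaps):
        # Pending work drains during the gap, then the new bounce queues a reload.
        backlog = max(0.0, backlog - flap_interval) + reload_time
    return backlog

# Reload finishes inside the 20 s gap: the backlog never grows.
print(backlog_after(flaps=10, flap_interval=20, reload_time=15))
# Reload takes longer than the gap: each bounce adds to the backlog.
print(backlog_after(flaps=10, flap_interval=20, reload_time=30))
```

Under these assumptions, a reload time above the flap interval adds the difference to the backlog on every bounce, which is one way the box could stay busy long after the line settles.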
-
@stephenw10 I have disabled most of the services and some gateway monitors but this still did not help.
It appears that after getting this screen, a simple CTRL+C brings the firewall back online, which indicates the system is getting stuck on some process.
Unless I catch this immediately, I don't think it is possible to trace the logs.
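Since catching the stall by hand is hard, a small watcher could save a process listing the moment the php-fpm socket stops accepting connections. This is only a sketch: it assumes a Python 3 interpreter is available on the firewall, the output directory is hypothetical, and the only command it runs by default is the `ps -auxwwd` already used in this thread.

```python
import socket
import subprocess
import time
from datetime import datetime
from pathlib import Path

def socket_accepts(path, timeout=2.0):
    """Return True if a Unix socket at `path` accepts a connection."""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect(path)
        return True
    except OSError:
        return False
    finally:
        s.close()

def snapshot(out_dir, commands):
    """Run each command and save its output to one timestamped file."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = Path(out_dir) / f"snapshot-{stamp}.txt"
    with target.open("w") as fh:
        for cmd in commands:
            try:
                proc = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
                output = proc.stdout + proc.stderr
            except (OSError, subprocess.TimeoutExpired) as exc:
                output = f"failed: {exc}\n"
            fh.write(f"$ {' '.join(cmd)}\n{output}\n")
    return target

def watch(sock="/var/run/php-fpm.socket", out_dir="/root", interval=10):
    """Poll the socket forever; dump a snapshot whenever it refuses connections."""
    commands = [["ps", "-auxwwd"]]  # command taken from the thread
    while True:
        if not socket_accepts(sock):
            snapshot(out_dir, commands)
        time.sleep(interval)
```

Calling `watch()` from a shell that stays open would leave one file per failed poll, so the process state at the start of the stall survives even if the system log has already rotated. Note that each probe is itself one more connection attempt against the socket.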
Is there a way to increase the log size or log rotation so that the next time this happens we can trace it? At this point, I am pissed at my ISP :D.
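For sizing the logs, a rough estimate of how much space the flood consumes may help. The sketch below uses the 6 lines per second reported earlier in the thread and assumes about 100 bytes per line, which is roughly the length of the repeated php-fpm socket message; the retention window is a hypothetical value.

```python
def log_bytes_needed(lines_per_sec, avg_line_bytes, hours):
    """Estimate uncompressed log volume for a given retention window."""
    return int(lines_per_sec * avg_line_bytes * hours * 3600)

per_hour = log_bytes_needed(lines_per_sec=6, avg_line_bytes=100, hours=1)
overnight = log_bytes_needed(lines_per_sec=6, avg_line_bytes=100, hours=8)
print(f"{per_hour / 1e6:.2f} MB per hour, {overnight / 1e6:.2f} MB for 8 hours")
```

Whatever rotation size and retention count are configured, their product would need to exceed this figure for the start of an incident to still be on disk the next morning.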