Netgate Discussion Forum

    Problems With WAN Loss Connection

    General pfSense Questions
    • dcuadrados

      Good morning everyone,

      I’m going crazy because I have several pfSense systems running on Amazon MiniPCs like these:

      • https://www.amazon.es/dp/B07DHBHQYR?ref_=ppx_hzsearch_conn_dt_b_fed_asin_title_1
      • https://www.amazon.es/dp/B0B6J12LGJ/?coliid=I2DS2H9ADNN6YL&colid=36J85P61Z63IW&ref_=list_c_wl_lv_ov_lig_dp_it
      • https://www.amazon.es/FakestarPC-generaci%C3%B3n-Firewall-Micro-Device/dp/B0DQBTBZ63/ref=sr_1_3?__mk_es_ES=%C3%85M%C3%85%C5%BD%C3%95%C3%91&s=electronics&sr=1-3

      The problem is that they randomly hang with no internet and the WAN goes down. I’ve disabled ASPM, unchecked “Monitoring Action” on the gateway, and disabled TSO/Checksum/LRO offloading.

      Sometimes they’re remotely accessible, but clients can’t browse anything. In the logs I sometimes see this on two different firewalls:

      Jun  6 00:24:12  dpinger 53814 exiting on signal 15
      Jun  6 00:23:24  dpinger 53814 WANGW 192.168.1.1: Clear latency 359us stddev 169us loss 0%
      Jun  6 00:23:13  dpinger 53814 WANGW 192.168.1.1: Alarm latency 276us stddev 33us loss 33%
      Jun  7 13:09:51  dpinger 16274 WANGW 192.168.1.1: sendto error: 64
      Jun  7 13:09:50  dpinger 16274 WANGW 192.168.1.1: sendto error: 50
      Jun  7 13:09:49  dpinger 16274 WANGW 192.168.1.1: sendto error: 50
      Jun  7 13:09:49  dpinger 16274 WANGW 192.168.1.1: Alarm latency 0us stddev 0us loss 100%
      Jun 12 14:50:05  dpinger 36722 send_interval 500ms loss_interval 2000ms time_period 60000ms … dest_addr 192.168.2.1 bind_addr 192.168.2.1 identifier "IPSEC_Gateway "
      Jun 12 14:50:05  dpinger 28600 exiting on signal 15
      Jun 12 14:50:05  dpinger 28963 exiting on signal 15
      Jun 12 14:50:05  dpinger 28600 WAN_DHCP 192.168.0.1: sendto error: 50
      Jun 12 14:50:04  dpinger 28600 WAN_DHCP 192.168.0.1: sendto error: 50
      Jun 12 14:50:04  dpinger 28600 WAN_DHCP 192.168.0.1: sendto error: 50
      Jun 12 14:50:03  dpinger 28600 WAN_DHCP 192.168.0.1: sendto error: 50
      Jun 12 14:50:03  dpinger 28600 WAN_DHCP 192.168.0.1: sendto error: 50
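The numbers after "sendto error:" in the dpinger lines above are raw errno values. pfSense runs on FreeBSD, whose errno numbering differs from Linux, so this minimal sketch hard-codes a small subset of the FreeBSD table rather than relying on the host's errno module. The function name and the regular expression are illustrative assumptions, not part of dpinger.

```python
import re

# Subset of FreeBSD errno values (sys/errno.h). These differ from Linux,
# so the table is hard-coded instead of using Python's errno module.
FREEBSD_ERRNO = {
    50: ("ENETDOWN", "Network is down"),
    51: ("ENETUNREACH", "Network is unreachable"),
    55: ("ENOBUFS", "No buffer space available"),
    64: ("EHOSTDOWN", "Host is down"),
    65: ("EHOSTUNREACH", "No route to host"),
}

# Hypothetical pattern for the log format shown in this thread.
SENDTO_RE = re.compile(r"dpinger\s+\d+\s+(\S+)\s+(\S+): sendto error: (\d+)")


def decode_sendto_error(line):
    """Return (gateway_name, monitor_ip, errno, symbol, text), or None if no match."""
    m = SENDTO_RE.search(line)
    if not m:
        return None
    name, ip, code = m.group(1), m.group(2), int(m.group(3))
    symbol, text = FREEBSD_ERRNO.get(code, ("UNKNOWN", "not in this table"))
    return name, ip, code, symbol, text
```

Read this way, error 50 means the kernel considered the network down at the moment dpinger tried to send, and error 64 means it considered the monitored host down.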
      

      The gateway looks like this:

      GW_Movistar
      192.168.1.1  0.0ms  0.0ms  100%  Danger, Packetloss
      

      When I ping 8.8.8.8, for example, I get this:

      PING 8.8.8.8 (8.8.8.8) from 192.168.1.254: 56 data bytes
      64 bytes ... time=3.423 ms
      64 bytes ... time=3.485 ms
      64 bytes ... time=4.217 ms
      
      --- 8.8.8.8 ping statistics ---
      3 packets transmitted, 3 received, 0.0% packet loss
      round-trip min/avg/max/stddev = 3.423/3.708/4.217/0.360 ms
      

      But pinging the gateway fails completely:

      PING 192.168.1.1 (192.168.1.1) from 192.168.1.254: 56 data bytes
      
      --- 192.168.1.1 ping statistics ---
      3 packets transmitted, 0 packets received, 100.0% packet loss
      

      The IPsec tunnels are down and don’t come back up. Here are the IPsec logs:

      Jun 21 20:44:59 charon 67161 01[CFG] vici client 19790 disconnected
      ...
      Jun 21 20:44:58 charon 67161 01[IKE] <con1|321> establishing IKE_SA failed, peer not responding
      Jun 21 20:44:58 charon 67161 01[IKE] <con1|321> giving up after 5 retransmits
      ...
      Jun 21 20:44:03 charon 67161 12[IKE] <con1|321> sending keep alive to PUBLIC_IP[4500]
      

      My routing table looks like this:

      Internet:
      Destination        Gateway      Flags  Netif Expire
      default            192.168.1.1  UGS    igc0
      PUBLIC_IP          192.168.1.1  UGHS   igc0
      127.0.0.1          link#8       UH     lo0
      172.16.0.0/24      link#2       U      igc1
      172.16.0.1         link#8       UHS    lo0
      172.16.2.0/24      link#11      U      igc1.2
      172.16.2.1         link#8       UHS    lo0
      172.16.10.0/24     link#12      U      ovpns1
      172.16.10.1        link#8       UHS    lo0
      192.168.1.0/24     link#1       U      igc0
      192.168.1.254      link#8       UHS    lo0
      192.168.255.254    link#8       UH     lo0
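A quick way to cross-check a monitor address against this routing table is to test which directly connected network contains it. The sketch below uses Python's standard ipaddress module. The CONNECTED mapping is copied by hand from the table above and the function name is hypothetical.

```python
import ipaddress

# Directly connected networks copied from the routing table above.
CONNECTED = {
    "172.16.0.0/24": "igc1",
    "172.16.2.0/24": "igc1.2",
    "172.16.10.0/24": "ovpns1",
    "192.168.1.0/24": "igc0",
}


def interface_for(gateway_ip, connected):
    """Return the interface whose directly connected network contains gateway_ip."""
    addr = ipaddress.ip_address(gateway_ip)
    for network, netif in connected.items():
        if addr in ipaddress.ip_network(network):
            return netif
    return None
```

With this table, 192.168.1.1 resolves to igc0, while 192.168.0.1 (seen in the WAN_DHCP dpinger lines) matches no connected network, which suggests those log lines come from a different appliance.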
      

      All this happens while the WAN is marked as down, and it recovers only after rebooting the firewall. This is happening on multiple units. I have the following packages installed: ACME, Snort, pfBlocker, ntopng, Telegraf, Cron, Filer, OpenVPN client export, watchdog service, and system patches.

      Firewall rules are all floating; I only allow outbound Ping and essential ports for browsing and email, with VLANs for mobile devices.

      One thing I observe is that when this occurs, Grafana shows a spike in CPU usage and load, as seen in the attached screenshots.

      I’m desperate. I’ve tried versions from 2.7.2 and 2.8.0 up to Plus 24.11. I don’t know what else to provide – I hope someone can shed some light on this.

      [Attached screenshots: Captura de pantalla 2025-06-22 004417.png, Captura de pantalla 2025-06-22 004520.png, Captura de pantalla 2025-06-22 004503.png]

      • stephenw10 Netgate Administrator

        Anything in the main system logs when this happens?

        I assume 192.168.1.1 is the local ISP router? Does that still have access?

        • dcuadrados

          Apart from the gateway and IPsec logs I’ve already shared, there’s nothing else unusual. And yes, as you assume, 192.168.1.1 is the provider’s router.

          The logs I’m seeing are as follows, nothing interesting:

          Jun  8 12:30:01 firewall php-cgi[66704]: rc.update_urltables: /etc/rc.update_urltables: Starting up.
          Jun  8 12:30:01 firewall php-cgi[66704]: rc.update_urltables: /etc/rc.update_urltables: Sleeping for 38 seconds.
          Jun  8 12:30:39 firewall php-cgi[66704]: rc.update_urltables: /etc/rc.update_urltables: Starting URL table alias updates
          Jun  8 12:30:39 firewall php-cgi[66704]: rc.update_urltables: /etc/rc.update_urltables: pfB_Top_v4 does not need updating.
          Jun  8 12:30:39 firewall php-cgi[66704]: rc.update_urltables: /etc/rc.update_urltables: pfB_Top_v6 does not need updating.
          Jun  8 12:30:39 firewall php-cgi[66704]: rc.update_urltables: /etc/rc.update_urltables: pfB_PS_v4 does not need updating.
          Jun  8 12:30:39 firewall php-cgi[66704]: rc.update_urltables: /etc/rc.update_urltables: pfB_PS_v6 does not need updating.
          Jun  8 12:30:39 firewall php-cgi[66704]: rc.update_urltables: /etc/rc.update_urltables: pfB_Nivel1_v4 does not need updating.
          Jun  8 12:30:39 firewall php-cgi[66704]: rc.update_urltables: /etc/rc.update_urltables: pfB_Nivel2_v4 does not need updating.
          Jun  8 12:30:39 firewall php-cgi[66704]: rc.update_urltables: /etc/rc.update_urltables: pfB_Nivel3_v4 does not need updating.
          Jun  8 12:30:39 firewall php-cgi[66704]: rc.update_urltables: /etc/rc.update_urltables: pfB_BlockListDE_v4 does not need updating.
          Jun  8 12:30:39 firewall php-cgi[66704]: rc.update_urltables: /etc/rc.update_urltables: pfB_GeoIP_ES_v4_v4 does not need updating.
          Jun  8 12:30:39 firewall php-cgi[66704]: rc.update_urltables: /etc/rc.update_urltables: pfB_GeoIP_USA_v4 does not need updating.
          Jun  8 12:30:39 firewall php-cgi[66704]: rc.update_urltables: /etc/rc.update_urltables: pfB_Nivel4_v4 does not need updating.
          Jun  8 12:30:39 firewall php-cgi[66704]: rc.update_urltables: /etc/rc.update_urltables: pfB_MAIL_v4 does not need updating.
          Jun  8 12:30:39 firewall php-cgi[66704]: rc.update_urltables: /etc/rc.update_urltables: pfB_DNSBLIP_v4 does not need updating.
          Jun  8 12:35:00 firewall sshguard[36792]: Exiting on signal.
          Jun  8 12:35:00 firewall sshguard[3803]: Now monitoring attacks.
          Jun  8 13:54:00 firewall sshguard[3803]: Exiting on signal.
          Jun  8 13:54:00 firewall sshguard[22319]: Now monitoring attacks.
          Jun  8 14:43:21 firewall kernel: swp_pager_getswapspace(1): failed
          Jun  8 14:45:26 firewall kernel: swp_pager_getswapspace(3): failed
          Jun  8 15:13:00 firewall sshguard[22319]: Exiting on signal.
          Jun  8 15:13:00 firewall sshguard[71185]: Now monitoring attacks.
          Jun  8 16:32:00 firewall sshguard[71185]: Exiting on signal.
          Jun  8 16:32:00 firewall sshguard[13608]: Now monitoring attacks.
          Jun  8 17:52:00 firewall sshguard[13608]: Exiting on signal.
          Jun  8 17:52:00 firewall sshguard[97710]: Now monitoring attacks.
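In a long system log, the two kernel lines about swap are easy to miss among the routine php-cgi and sshguard entries. A minimal sketch for pulling them out, assuming the log is available as a list of text lines. The function name and pattern are illustrative only.

```python
import re

# Matches the kernel message format seen in the log above.
SWAP_RE = re.compile(r"kernel: swp_pager_getswapspace\((\d+)\): failed")


def swap_failures(log_lines):
    """Return the log lines where the kernel failed to allocate swap space."""
    return [line.strip() for line in log_lines if SWAP_RE.search(line)]
```

Applied to the excerpt above, this returns the two entries at 14:43:21 and 14:45:26.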
          

          In Uptime Kuma I have this:

          dcseguridad, [21/06/2025 19:16]
          [FW ] [🔴 Down] timeout of 48000ms exceeded
          
          dcseguridad, [21/06/2025 19:17]
          [FW ] [✅ Up] 200 - OK
          
          dcseguridad, [21/06/2025 19:20]
          [FW ] [🔴 Down] timeout of 48000ms exceeded
          
          dcseguridad, [21/06/2025 19:21]
          [FW ] [✅ Up] 200 - OK
          
          dcseguridad, [21/06/2025 19:25]
          [FW ] [🔴 Down] timeout of 48000ms exceeded
          
          dcseguridad, [21/06/2025 19:26]
          [FW ] [✅ Up] 200 - OK
          
          dcseguridad, [21/06/2025 19:30]
          [FW ] [🔴 Down] timeout of 48000ms exceeded
          
          dcseguridad, [21/06/2025 19:31]
          [FW ] [✅ Up] 200 - OK
          
          dcseguridad, [21/06/2025 19:38]
          [FW ] [🔴 Down] timeout of 48000ms exceeded
          
          dcseguridad, [21/06/2025 19:48]
          [FW ] [✅ Up] 200 - OK
          
          dcseguridad, [21/06/2025 19:51]
          [FW ] [🔴 Down] timeout of 48000ms exceeded
          
          dcseguridad, [21/06/2025 19:54]
          [FW ] [✅ Up] 200 - OK
          
          dcseguridad, [21/06/2025 19:57]
          [FW ] [🔴 Down] timeout of 48000ms exceeded
          
          dcseguridad, [21/06/2025 20:14]
          [FW ] [✅ Up] 200 - OK
          
          dcseguridad, [21/06/2025 20:17]
          [FW ] [🔴 Down] timeout of 48000ms exceeded
          
          dcseguridad, [21/06/2025 20:18]
          [FW ] [✅ Up] 200 - OK
          
          dcseguridad, [21/06/2025 20:21]
          [FW ] [🔴 Down] timeout of 48000ms exceeded
          
          dcseguridad, [21/06/2025 20:22]
          [FW ] [✅ Up] 200 - OK
          
          dcseguridad, [21/06/2025 20:25]
          [FW ] [🔴 Down] timeout of 48000ms exceeded
          
          dcseguridad, [21/06/2025 20:26]
          [FW ] [✅ Up] 200 - OK
          
          dcseguridad, [21/06/2025 20:33]
          [FW ] [🔴 Down] timeout of 48000ms exceeded
          
          dcseguridad, [21/06/2025 20:34]
          [FW ] [✅ Up] 200 - OK
          
          dcseguridad, [21/06/2025 20:37]
          [FW ] [🔴 Down] timeout of 48000ms exceeded
          
          dcseguridad, [21/06/2025 20:40]
          [FW ] [✅ Up] 200 - OK
          
          dcseguridad, [21/06/2025 20:49]
          [FW ] [🔴 Down] timeout of 48000ms exceeded
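To see how long each outage lasted, the Down and Up notifications above can be paired up. This is a rough sketch that assumes the exact message layout shown here (a header line with a `[dd/mm/yyyy hh:mm]` stamp followed by a status line). The function name is hypothetical, and the result is only as precise as the one-minute timestamps.

```python
import re
from datetime import datetime

STAMP_RE = re.compile(r"\[(\d{2}/\d{2}/\d{4} \d{2}:\d{2})\]")


def outage_minutes(lines):
    """Pair each Down notification with the next Up; return durations in minutes."""
    durations = []
    stamp = None       # timestamp of the most recent header line
    down_since = None  # start of the current outage, if any
    for line in lines:
        m = STAMP_RE.search(line)
        if m:
            stamp = datetime.strptime(m.group(1), "%d/%m/%Y %H:%M")
        elif "Down]" in line and down_since is None:
            down_since = stamp
        elif "Up]" in line and down_since is not None:
            durations.append(int((stamp - down_since).total_seconds() // 60))
            down_since = None
    return durations
```

Most of the outages above last about a minute, but the ones starting at 19:38 and 19:57 last 10 and 17 minutes respectively.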
          
          

          From what I can tell, the igc driver is causing issues. I’ve made changes in the System Tunables and changed the WAN negotiation… let’s see how it behaves.


          • Gertjan @dcuadrados

            You've listed :

            @dcuadrados said in Problems With WAN Loss Connection:

            Jun 7 13:09:49 dpinger 16274 WANGW 192.168.1.1: sendto error: 50

            and

            @dcuadrados said in Problems With WAN Loss Connection:

            Jun 12 14:50:05 dpinger 28600 WAN_DHCP 192.168.0.1: sendto error: 50

            So, dual WAN ?
            Where does this "192.168.0.1" come from ?
            It's not here :

            @dcuadrados said in Problems With WAN Loss Connection:

            My routing table looks like this:

            Internet:
            Destination        Gateway      Flags  Netif Expire
            default            192.168.1.1  UGS    igc0
            PUBLIC_IP          192.168.1.1  UGHS   igc0
            127.0.0.1          link#8       UH     lo0
            172.16.0.0/24      link#2       U      igc1
            172.16.0.1         link#8       UHS    lo0
            172.16.2.0/24      link#11      U      igc1.2
            172.16.2.1         link#8       UHS    lo0
            172.16.10.0/24     link#12      U      ovpns1
            172.16.10.1        link#8       UHS    lo0
            192.168.1.0/24     link#1       U      igc0
            192.168.1.254      link#8       UHS    lo0
            192.168.255.254    link#8       UH     lo0

            Not related :

            @dcuadrados said in Problems With WAN Loss Connection:

            on.es/FakestarPC-generaci%C3%
            And it gets better :

            dbbc3a3d-853e-4282-a5f9-101d6a9f8133-image.png

            Wow ... 😧
            "That's a no-go, even if I got one for free".

            No "help me" PM's please. Use the forum, the community will thank you.
            Edit : and where are the logs ??

            • dcuadrados @Gertjan

              @Gertjan Two different appliances at two different clients, but the same hardware and the same problem.

              • dcuadrados @Gertjan

                @Gertjan said in Problems With WAN Loss Connection:

                on.es/FakestarPC-generaci%C3%
                And it gets better :

                Wow ...
                "That's a no-go, even if I got one for free".

                The ones that are failing for me are these:

                https://www.amazon.es/dp/B0CG18TT9K/?coliid=I2EAQ4NZLEJ0PO&colid=36J85P61Z63IW&ref_=list_c_wl_lv_ov_lig_dp_it

                I don't have the other one; I thought I had added another one like the one I just shared but with i225 cards.

                • Gertjan @dcuadrados

                  @dcuadrados

                  You've deactivated the monitoring 'action', so even when pings don't come back anymore, pfSense won't pull the interface (WAN) down.
                  The FreeBSD (intel) igc NIC driver is one of the most stable drivers out there. You and I share the same code - as I'm using it also. It's - should be - rock solid.
                  That said, IF the NIC is actually an Intel NIC ...

                  If pings stop coming back, start looking upstream.

                  Btw :

                  @dcuadrados said in Problems With WAN Loss Connection:

                  kernel: swp_pager_getswapspace(1): failed

                  Without looking it up, I tend to say : pfSense is preparing to use the swap.
                  That's a sign you're running out of free RAM.
                  That's a major issue.
                  A native pfSense installation does have a swap space. When it starts to be used, drop the load on your system.

                  edit : wait : I looked it up. It's worse. You've run out of swap space.
                  And that's bad. Consider that a mayday situation.
                  You're doing huge (?) things with your firewall. That, by itself, could make the system unstable.

                  @dcuadrados said in Problems With WAN Loss Connection:

                  watchdog service

                  And you're not using it, I hope. The Service Watchdog is a whole problem by itself.

                  • dcuadrados @Gertjan

                    I'm looking into why the SWAP is being used — maybe too many lists in pfBlocker, which causes a heavy load, although generally the RAM usage is always below 30%, and some systems have 8 GB and others 16 GB.

                    Regarding the network card:

                    igc0@pci0:1:0:0:	class=0x020000 rev=0x03 hdr=0x00 vendor=0x8086 device=0x15f3 subvendor=0x8086 subdevice=0x0000
                        vendor     = 'Intel Corporation'
                        device     = 'Ethernet Controller I225-V'
                        class      = network
                        subclass   = ethernet
                    igc1@pci0:2:0:0:	class=0x020000 rev=0x03 hdr=0x00 vendor=0x8086 device=0x15f3 subvendor=0x8086 subdevice=0x0000
                        vendor     = 'Intel Corporation'
                        device     = 'Ethernet Controller I225-V'
                        class      = network
                        subclass   = ethernet
                    igc2@pci0:3:0:0:	class=0x020000 rev=0x03 hdr=0x00 vendor=0x8086 device=0x15f3 subvendor=0x8086 subdevice=0x0000
                        vendor     = 'Intel Corporation'
                        device     = 'Ethernet Controller I225-V'
                        class      = network
                        subclass   = ethernet
                    igc3@pci0:4:0:0:	class=0x020000 rev=0x03 hdr=0x00 vendor=0x8086 device=0x15f3 subvendor=0x8086 subdevice=0x0000
                        vendor     = 'Intel Corporation'
                        device     = 'Ethernet Controller I225-V'
                        class      = network
                        subclass   = ethernet
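The question of whether the NICs are genuinely Intel parts can be checked from the PCI IDs in this output: vendor ID 0x8086 is Intel's, and the device ID 0x15f3 is the one reported here as I225-V. Below is a small sketch that extracts both IDs per interface. The function name and regular expression are assumptions written for the format shown above.

```python
import re

INTEL_VENDOR_ID = "0x8086"

# Matches the first line of each device entry in the output above.
PCI_RE = re.compile(
    r"^(\w+)@pci\S+\s+.*vendor=(0x[0-9a-f]{4}) device=(0x[0-9a-f]{4})"
)


def nic_ids(pciconf_output):
    """Map each device name (e.g. igc0) to its (vendor_id, device_id) pair."""
    ids = {}
    for line in pciconf_output.splitlines():
        m = PCI_RE.match(line.strip())
        if m:
            ids[m.group(1)] = (m.group(2), m.group(3))
    return ids
```

Note that matching IDs only show what the device reports about itself. They say nothing about board layout or firmware quality.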
                    

                    This is what I have. I'm going to look into the SWAP and memory usage and try to reduce the load.

                    Regarding the watchdog service — why do you say it's a problem?

                    • stephenw10 Netgate Administrator

                      Generally the Service Watchdog should only ever be used for troubleshooting. It can end up restarting things unnecessarily. In the worst case it can get stuck in a loop restarting services if the system is too busy to get them restarted before it triggers again. It should never be used on Snort or Suricata.

                      But, yes, exhausting the SWAP implies something is using a huge amount of RAM or you have a very large number of crash reports. Both are bad!
                      And, I'd also guess it's pfBlocker reloading the lists. But that would be a lot of large lists.

                      • dcuadrados @stephenw10

                        @stephenw10 OK, I'm going to review everything to see if the errors go away, and I’ll monitor how everything behaves.

                        • dcuadrados

                          Good evening:

                          The same thing just happened to me: the WAN is marked as offline. Monitoring was set to ping the router itself, 192.168.0.1. I have just changed it to 8.8.8.8.

                          The system starts getting overloaded at 20:44, and at 20:45 it reports this:

                          2025-06-25 21:25:00.750831+02:00	dpinger	18527	send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 1 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% alarm_hold 10000ms dest_addr 10.10.11.1 bind_addr 10.10.11.1 identifier "OPENVPN_NET_VPNV4 "
                          2025-06-25 21:25:00.714661+02:00	dpinger	17993	send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 1 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% alarm_hold 10000ms dest_addr 8.8.8.8 bind_addr 192.168.0.254 identifier "WANGW_nueva "
                          2025-06-25 21:25:00.699415+02:00	dpinger	60730	exiting on signal 15
                          2025-06-25 21:25:00.656404+02:00	dpinger	61082	exiting on signal 15
                          2025-06-25 20:45:19.129992+02:00	dpinger	60730	WANGW_nueva 192.168.0.1: Alarm latency 200us stddev 527us loss 21%
                          

                          In the general logs I only see this:

                          2025-06-25 20:45:19.249573+02:00	snort	70534	[1:2403366:101155] ET CINS Active Threat Intelligence Poor Reputation IP TCP group 34 [Classification: Misc Attack] [Priority: 2] {TCP} 20.65.177.212:50259 -> 192.168.0.254:11740
                          2025-06-25 20:44:52.902718+02:00	snort	70534	[1:2403344:101155] ET CINS Active Threat Intelligence Poor Reputation IP TCP group 23 [Classification: Misc Attack] [Priority: 2] {TCP} 20.168.0.84:49415 -> 192.168.0.254:9529
                          2025-06-25 20:44:25.727669+02:00	snort	17463	[1:2029509:2] ET POLICY Observed DNS Query for Suspicious TLD (.management) [Classification: Potential Corporate Privacy Violation] [Priority: 1] {UDP} 10.10.10.2:53452 -> 10.10.10.1:53
                          2025-06-25 20:44:12.110295+02:00	snort	70534	[1:2010937:3] ET SCAN Suspicious inbound to mySQL port 3306 [Classification: Potentially Bad Traffic] [Priority: 2] {TCP} 64.62.197.53:42048 -> 192.168.0.254:3306
                          2025-06-25 20:44:12.051880+02:00	snort	70534	[1:2402000:7407] ET DROP Dshield Block Listed Source group 1 [Classification: Misc Attack] [Priority: 2] {TCP} 64.62.197.53:42048 -> 192.168.0.254:3306
                          2025-06-25 20:43:45.293063+02:00	snort	70534	[1:2402000:7407] ET DROP Dshield Block Listed Source group 1 [Classification: Misc Attack] [Priority: 2] {TCP} 91.196.152.221:9960 -> 192.168.0.254:21295
                          2025-06-25 20:43:26.798973+02:00	snort	70534	[1:2403486:101155] ET CINS Active Threat Intelligence Poor Reputation IP TCP group 94 [Classification: Misc Attack] [Priority: 2] {TCP} 57.129.64.10:33405 -> 192.168.0.254:8451
                          2025-06-25 20:42:08.685717+02:00	snort	70534	[1:2402000:7407] ET DROP Dshield Block Listed Source group 1 [Classification: Misc Attack] [Priority: 2] {TCP} 65.49.1.169:49230 -> 192.168.0.254:143
                          2025-06-25 20:41:37.879948+02:00	snort	70534	[1:2403330:101155] ET CINS Active Threat Intelligence Poor Reputation IP TCP group 16 [Classification: Misc Attack] [Priority: 2] {TCP} 20.106.196.31:41654 -> 192.168.0.254:1080
                          2025-06-25 20:41:31.596387+02:00	snort	70534	[1:4000000:1] Intento SSH [Classification: Misc activity] [Priority: 3] {TCP} 204.76.203.83:51406 -> 192.168.0.254:22
                          2025-06-25 20:41:25.329334+02:00	snort	17463	[1:71074:1] microsoft [Classification: Misc activity] [Priority: 3] {TCP} 10.10.10.2:55470 -> 13.85.23.206:443
                          2025-06-25 20:41:15.101913+02:00	snort	70534	[1:2402000:7407] ET DROP Dshield Block Listed Source group 1 [Classification: Misc Attack] [Priority: 2] {TCP} 198.235.24.101:56931 -> 192.168.0.254:5985
                          2025-06-25 20:41:09.858778+02:00	snort	17463	[1:71074:1] microsoft [Classification: Misc activity] [Priority: 3] {TCP} 10.10.10.130:61457 -> 52.167.222.13:443
                          2025-06-25 20:40:55.527774+02:00	snort	17463	[1:71074:1] microsoft [Classification: Misc activity] [Priority: 3] {TCP} 10.10.10.2:55462 -> 20.73.194.208:443
                          2025-06-25 20:40:06.266021+02:00	snort	70534	[1:2402000:7407] ET DROP Dshield Block Listed Source group 1 [Classification: Misc Attack] [Priority: 2] {TCP} 204.76.203.220:36665 -> 192.168.0.254:17000
                          2025-06-25 20:40:03.159113+02:00	snort	70534	[1:2403466:101155] ET CINS Active Threat Intelligence Poor Reputation IP TCP group 84 [Classification: Misc Attack] [Priority: 2] {TCP} 47.251.68.250:12393 -> 192.168.0.254:12112
                          ...
                          

                          So basically just Snort blocks.

                          In the DNS Resolver logs I see:

                          2025-06-25 20:45:23.427610+02:00	unbound	51421	[51421:2] info: 10.10.10.130 client.wns.windows.com. A IN
                          2025-06-25 20:45:23.366522+02:00	unbound	51421	[51421:3] info: 10.10.10.130 client.wns.windows.com. A IN
                          2025-06-25 20:45:22.211546+02:00	unbound	51421	[51421:3] info: 10.10.10.130 geo.prod.do.dsp.mp.microsoft.com. A IN
                          2025-06-25 20:45:21.934996+02:00	unbound	51421	[51421:2] info: 10.10.10.130 settings-win.data.microsoft.com. A IN
                          2025-06-25 20:45:21.839643+02:00	unbound	51421	[51421:2] info: 10.10.10.130 licensing.mp.microsoft.com. A IN
                          2025-06-25 20:45:21.170553+02:00	unbound	51421	[51421:3] info: 10.10.10.130 _ldap._tcp.dc._msdcs.topalia.es. SRV IN
                          2025-06-25 20:45:20.926628+02:00	unbound	51421	[51421:2] info: 10.10.10.130 settings-win.data.microsoft.com. A IN
                          2025-06-25 20:45:20.860774+02:00	unbound	51421	[51421:2] info: 10.10.10.130 settings-win.data.microsoft.com. A IN
                          2025-06-25 20:45:20.830911+02:00	unbound	51421	[51421:3] info: 10.10.10.130 licensing.mp.microsoft.com. A IN
                          2025-06-25 20:45:20.764018+02:00	unbound	51421	[51421:3] info: 10.10.10.130 licensing.mp.microsoft.com. A IN
                          2025-06-25 20:45:20.547495+02:00	unbound	51421	[51421:2] info: 10.10.10.130 _ldap._tcp.dc._msdcs.topalia.es. SRV IN
                          2025-06-25 20:45:20.201110+02:00	unbound	51421	[51421:2] info: 10.10.10.130 geo.prod.do.dsp.mp.microsoft.com. A IN
                          2025-06-25 20:45:20.138644+02:00	unbound	51421	[51421:0] info: 10.10.10.130 _ldap._tcp.dc._msdcs.topalia.es. SRV IN
                          2025-06-25 20:45:19.583488+02:00	unbound	51421	[51421:2] info: 10.10.10.130 _ldap._tcp.dc._msdcs.topalia.es. SRV IN
                          ...
                          

                          Honestly, I don’t know where else to look — I’m lost at this point...

                          • stephenw10 Netgate Administrator

                            You're running Snort on WAN? In blocking mode?

                            Can you see what the per-core CPU usage is when this happens? For example, is one core stuck at 100%? Try running this at the CLI: top -HaSP

                            • dcuadrados @stephenw10

                              @stephenw10 Yes, I use Snort on WAN in blocking mode.

                              • stephenw10 Netgate Administrator

                                Are you hosting services behind the firewall? You have port forwards or routed traffic?

                                Otherwise running Snort on WAN is generally pretty useless. You just see alerts for all the drive-by traffic hitting the WAN and it's all blocked by the firewall anyway.

                                • dcuadrados @stephenw10

                                  @stephenw10 I have OpenVPN and some services (ACME) published, although I try to limit access using GeoIP. Could Snort be the cause of the blocks?

                                  • Gertjan @dcuadrados

                                    @dcuadrados said in Problems With WAN Loss Connection:

                                    The system starts getting overloaded at 20:44, and at 20:45 it reports this:

                                    The first lines (== the most recent events) are dpinger restarting, as a WAN event happened.
                                    The two events before that show dpinger being killed (signal 15). The most common reason is : the WAN went down.
                                    The initial event (the line at the bottom) : dpinger pings the monitor destination every 500 ms (192.168.0.1 at the time of the alarm; you have since picked "8.8.8.8"), and 21 % of the packets didn't come back anymore.
                                    WAN uplink saturated ? Remember : ping, or rather the ICMP protocol, has a low priority, so any upstream (ISP or further along) router can decide to drop these packets. The result will be : your ISP connection gets marked as down.
                                    That's why picking "8.8.8.8" is a quick, easy and very dirty solution.
                                    Way better would be to pick a nearby, closer to you, 'main' ISP router. Find one that is not many hops (a hop is a router) away, and that replies to ping.
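To put the 21 % figure in perspective, the dpinger parameters from the quoted log (send_interval 500ms, time_period 60000ms, loss_alarm 20%) can be turned into probe counts. This is a simplified model only: it ignores how dpinger treats probes that are still within loss_interval, and it does not settle whether the alarm comparison is strict. The function name is hypothetical.

```python
def dpinger_window(send_interval_ms, time_period_ms, loss_alarm_pct):
    """Approximate probes per averaging window and the loss count at the alarm threshold."""
    probes = time_period_ms // send_interval_ms
    # Smallest number of lost probes whose percentage reaches the threshold.
    lost_at_threshold = -(-probes * loss_alarm_pct // 100)  # ceiling division
    return probes, lost_at_threshold
```

With the values from the log, the window holds about 120 probes and the 20 % threshold corresponds to about 24 of them, so roughly 12 seconds' worth of unanswered pings within a minute is enough to raise the alarm.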

                                    @dcuadrados said in Problems With WAN Loss Connection:

                                    Yes, I use Snort on WAN in blocking mode.

                                    That's like driving on the highway during the night, and you cut the headlights to see how far you can go ....
                                    Or standing in the middle of the Florence supermax prison in Colorado, and starting to insult everybody. You'll be a headline within minutes.

                                    Consider this: we, the small players in the Internet world with our pfSense boxes, should not pay attention to or 'scan' incoming WAN traffic that wasn't directed to our LAN devices, nor filter traffic that wasn't a reply to a request coming from a LAN device. In short: don't touch, look at, scan, or do anything with useless random traffic on your WAN interface.

                                    Imagine: I know your WAN IP. I, just me, start sending many packets to your WAN IP with content that is known to trigger, for example, Snort. Every packet that Snort detects will consume 'millions' of extra CPU cycles: your pfSense goes into 100 % overdrive mode. Every detected packet also gets a line in the log, one per packet (see your example).
                                    I'll send you loads of small packets with a payload that makes Snort trigger: I'll saturate your disk.

                                    So I alone, with my 5 Gbit/s upload, can easily saturate your pfSense.
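
                                    A back-of-the-envelope calculation shows how fast "one log line per packet" adds up. Every figure below is an assumption picked for illustration (a 1 Gbit/s downlink on the receiving side, 100-byte packets, 200-byte alert lines); framing overhead and any alert thresholding are ignored.

```python
def packets_per_second(link_bits_per_s, packet_bytes):
    """Upper bound on packets/s if the link carries only packets of this size."""
    return link_bits_per_s / (packet_bytes * 8)


def log_bytes_per_second(link_bits_per_s, packet_bytes, log_line_bytes):
    """Log growth if every packet triggers exactly one alert line."""
    return packets_per_second(link_bits_per_s, packet_bytes) * log_line_bytes


# Assumed figures, chosen only for illustration:
VICTIM_DOWNLINK = 1_000_000_000   # 1 Gbit/s; the receiving link caps what arrives
PACKET_BYTES = 100                # small packet, framing overhead ignored
LOG_LINE_BYTES = 200              # rough size of one alert line

pps = packets_per_second(VICTIM_DOWNLINK, PACKET_BYTES)
rate = log_bytes_per_second(VICTIM_DOWNLINK, PACKET_BYTES, LOG_LINE_BYTES)
print(f"{pps:,.0f} packets/s -> {rate / 1e6:,.0f} MB/s of alert log")
```

                                    With these assumed numbers the log grows faster than the traffic itself, because each alert line is longer than the packet that caused it.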

                                    Now you start to understand why you shouldn't use Snort on WAN.
                                    The default firewall behavior is to black-hole the traffic: this is fast, needs no setup (it's the default method), and you're not at risk.
                                    True, you can no longer see for yourself that the Internet is a dark place. But guess what: we already knew that.

                                    That said, no one is forbidding you to do whatever you want with your pfSense.
                                    I hope you get the humor now: you post on the forum that you have WAN issues ...
                                    No sh*t 😊 .....

                                    @dcuadrados said in Problems With WAN Loss Cobnection:

                                    although I try to limit access using GeoIP

                                    That's the way to go: it's fast and clean, and you use pf as it was intended to be used.
                                    Not 100 % foolproof of course: your neighbor has nearly the same WAN as you, so your GeoIP rules will accept his traffic.
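
                                    Conceptually, a GeoIP rule is just a membership test of the source address against a list of networks. pf does this with its own tables, not with Python, so take the following as a sketch of the idea only. The two networks are documentation ranges standing in for real country data, and is_allowed is a made-up name.

```python
import ipaddress

# Hypothetical allowlist; these are documentation ranges (RFC 5737),
# not real country data.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_allowed(source_ip, networks=ALLOWED_NETWORKS):
    """True if source_ip falls inside any allowed network."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in networks)


print(is_allowed("192.0.2.77"))    # inside the first range
print(is_allowed("203.0.113.5"))   # outside both ranges
```

                                    The check is cheap and says nothing about intent: anyone whose address sits inside an allowed range passes.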

                                    @dcuadrados said in Problems With WAN Loss Cobnection:

                                    services (ACME) published

                                    ? How so? Are you using the built-in HTTP mode? That method, and the rock-bottom manual one, are 'last resort' solutions. Any DNS API method is to be preferred, by far.
                                    Normally, you should pick a registrar (the one from whom you rent your domain name) that offers a DNS API method. Way easier. No open ports. No risk.
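
                                    For the curious, here is why the DNS method needs no open port: the proof is a TXT record, not an inbound HTTP request. The sketch below follows the DNS-01 challenge as described in RFC 8555. The function name and the placeholder inputs are mine; the ACME package on pfSense handles all of this for you through the registrar's API.

```python
import base64
import hashlib


def dns01_txt_record(domain, token, account_thumbprint):
    """Return (record_name, record_value) for an ACME DNS-01 challenge."""
    # Key authorization = challenge token + "." + account key thumbprint.
    key_authorization = f"{token}.{account_thumbprint}"
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    # TXT value = unpadded base64url encoding of the SHA-256 digest.
    value = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"_acme-challenge.{domain}", value


# Placeholder inputs; a real token comes from the CA and the thumbprint
# is derived from the ACME account key.
name, value = dns01_txt_record("example.com", "placeholder-token", "placeholder-thumbprint")
print(name, "TXT", value)
```

                                    The CA only has to query DNS for that record, so nothing on the firewall has to listen on port 80.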

                                    @dcuadrados said in Problems With WAN Loss Cobnection:

                                    I have OpenVPN

                                    OpenVPN server, right ?
                                    That's an exception. The OpenVPN server port (UDP 1194) doesn't need to be protected. It's meant to be used like that, it can handle the incoming rubbish.
                                    That said, like you, I have a WAN GeoIP filter rule that only accepts connections from 'Europe', so as long as I stay in Europe, I can connect. The rest of the planet: nope.

                                    @dcuadrados said in Problems With WAN Loss Cobnection:

                                    Could Snort be the cause of the blocks?

                                    Possibly.
                                    Imagine this situation: Snort detects a bad packet, so it puts the source IP into its 'snort' alias table and asks pf to reload the rules (and tables), which reloads the firewall rule set.
                                    Now, you receive 'many' such bad packets.
                                    The firewall will get reloaded just as often. It will actually spend its time reloading, not filtering.
                                    What do you think will happen to the quality of your uplink connectivity?

                                    So, you can use Snort, of course.
                                    But first: get a big uplink, like a 5+ Gbit/s connection. I say "5" so you'll know your connection is bigger than that of the vast majority of other Internet users.
                                    Get a good NIC, with the same or better speed.
                                    Get a good processor, go Xeon, and accept the electricity bill. Get an air conditioner to keep things cool.
                                    Get a big classic spinning-platter drive. Not an SSD or whatever.
                                    Go 'iron mode' (bare metal). So no VM ...
                                    Now you can use Snort and detect the bad ones 😊

                                    Btw: your system will still go down when you get DDoSed ....
                                    Don't do this : The Man Who Angered Anonymous And Lived To Regret It.

                                    No "help me" PM's please. Use the forum, the community will thank you.
                                    Edit : and where are the logs ??

                                    • D
                                      dcuadrados @Gertjan
                                      last edited by

                                      @Gertjan said in Problems With WAN Loss Cobnection:

                                      DNS API

                                      Thank you very much for your response. I had been meaning to set up ACME for a while, but honestly, I hadn’t done it out of laziness. I’ve now configured it, although I had the port open for only 30 minutes and restricted to IPs from the USA. But this way is definitely better—and it also lets me remove a pfBlocker list.

                                      Regarding SNORT, I’m going to remove it from the WAN. To be honest, I set it up back in the day, but it really doesn’t make much sense—especially since all incoming WAN traffic is blocked by default, and what is open is only allowed for IPs from Spain (and if someone travels, we open access for that country).

                                      In pfBlocker, under IP > Inbound Firewall Rules, it might also be a good idea to remove WAN.

                                      As for the monitoring IPs, I’ll look for one that’s closer, although 8.8.8.8 replies in 5ms. I’ll try with another one. For now, I’m going to test all these possible solutions you’ve suggested. Thanks again for everything!

                                      • stephenw10S
                                        stephenw10 Netgate Administrator
                                        last edited by

                                        Yeah, I would at least try disabling Snort on WAN and see what happens. If OpenVPN is the only service externally available, that's not a big risk.

                                        Running Snort like that means the alerts are pretty much useless because you will be seeing them continually. But more importantly it also increases the CPU loading significantly in the event of a flood of traffic. That makes it much more likely to start dropping packets if a CPU core is maxed out.

                                        • D
                                          dcuadrados @stephenw10
                                          last edited by

                                          @stephenw10 Perfect, I just deleted the WAN interface from Snort and now use it only on LAN.


                                          • GertjanG
                                            Gertjan @dcuadrados
                                            last edited by

                                            @dcuadrados

                                            Now you're behaving like a responsible Internet user! 👍

                                            As soon as Snort sees one of your devices sending out suspicious traffic, you can deal with it locally.
                                            If some device is requesting suspicious traffic from the Internet, same thing: go visit the user.

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.