[SOLVED] Internet through pfsense keeps dropping



  • I don't think this is a pfSense issue, but I'm hoping to get some help narrowing it down or with configuration.

    My setup is a fiber line into my BellAliant Fiberop HomeHub 3000 (hh3k), with an hh3k LAN port connected to the pfSense WAN. On the hh3k, I have Advanced DMZ set to the MAC address of pfSense, and pfSense is getting an external IP address.

    About once a day I lose internet access, although the external IP still shows in pfSense. A release and renew gets me the same IP, but I can't route out.

    If I connect directly to my hh3k I get internet access, so the problem is likely something with the DMZ or pfSense (I think).

    Tonight it dropped around 12:30 AM, and neither rebooting pfSense nor a release/renew helped. To fix it I need to release the IP and restart my hh3k, after which pfSense gets a new IP. Looking at the monitor, I went to 100% packet loss.

    I've been playing around with the gateway settings in pfSense, such as the data payload and the monitor address (the external IP and even the hh3k internal IP, 192.168.2.1), but it does not seem to help.

    Attaching a pastebin of my General, Gateway and Routing logs:
    https://pastebin.com/ebFNr1Qq



  • Hi,

    Many
    igb0: link state changed to DOWN
    and
    igb0: link state changed to UP
    in there.

    Try:
    Give "dpinger" ** more time: change the monitor IP (not one close to the local network, but one further upstream), or even disable it for a while during testing.

    ** On the System > Routing tab, these options:
    [screenshot]

    Link UP/DOWN issues could also be caused by a bad connector, cable or NIC, so try swapping the NIC or the cable.

    Edit: also read this: https://forum.netgate.com/topic/57419/kernel-arpresolve-can-t-allocate-llinfo-for-192-168-100-1-cable-modem



  • Thanks for the reply. I changed my monitor IP from the internal router back to the external gateway IP and also enabled "Disable Gateway Monitoring Action" for now, since this time I can send zero-payload ICMP packets (ping -l 0 gateway_IP) to the actual gateway.

    [screenshot]

    You advised giving it more time. Do you mean the default 10/20% values? I've set them to 80/99% today, but I assume that might be too high, right? For now I guess it does not really matter, as I've enabled "Disable Gateway Monitoring Action".

    [screenshot]

    [screenshot]

    [screenshot: 8 hour]



  • This morning I found that at the exact same time (12:30) I started receiving alerts of 100% packet loss again. This time I had disabled the gateway actions and increased the log buffer. I did not notice any outage as I was in bed, but it was working again this morning. It also looks like I got a new WAN IP at 01:04 and 01:50.

    Because I see "sendto error: 65" and "sendto error: 64", I assume I was offline for that time?

    So what exactly does the gateway monitor do when its action is enabled vs. disabled, given that the connection recovered on its own?

    Mar 13 01:50:47	dpinger		send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 99% dest_addr 142.167.248.1 bind_addr 142.167.249.175 identifier "WAN_DHCP "
    Mar 13 01:50:43	dpinger		WAN_DHCP 142.167.248.1: sendto error: 64
    ... continues until
    Mar 13 01:50:14	dpinger		WAN_DHCP 142.167.248.1: sendto error: 65
    ... continues until
    Mar 13 01:48:20	dpinger		WAN_DHCP 142.167.248.1: sendto error: 65
    Mar 13 00:32:16	dpinger		WAN_DHCP 142.167.248.1: Alarm latency 0us stddev 0us loss 100%
    ... continues until
    Mar 13 00:30:58	dpinger		WAN_DHCP 142.167.248.1: sendto error: 65
    Mar 12 00:37:28	dpinger		WAN_DHCP 192.168.2.1: Alarm latency 0us stddev 0us loss 100%
    Mar 12 00:36:54	dpinger		WAN_DHCP 192.168.2.1: Alarm latency 0us stddev 0us loss 100%
    Mar 12 00:32:23	dpinger		WAN_DHCP 192.168.2.1: Alarm latency 0us stddev 0us loss 100%
    Mar 12 00:31:21	dpinger		WAN_DHCP 192.168.2.1: Alarm latency 0us stddev 0us loss 100%
    Mar 12 00:31:13	dpinger		WAN_DHCP 192.168.2.1: Alarm latency 249us stddev 43us loss 21%
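
On the "sendto error" question: the numbers in the dpinger lines above are errno values, and pfSense is FreeBSD-based, where 64 is EHOSTDOWN and 65 is EHOSTUNREACH (these numbers differ on Linux). Both mean the probe could not be sent at all, which is consistent with the WAN being unusable in that window. A minimal sketch that decodes such lines follows; the helper name `explain_sendto` and the small lookup table are illustrative and not part of dpinger or pfSense.

```python
import re

# FreeBSD errno values, hard-coded because Python's errno module reflects
# the host OS and Linux uses different numbers for these two names.
FREEBSD_ERRNO = {
    64: ("EHOSTDOWN", "Host is down"),
    65: ("EHOSTUNREACH", "No route to host"),
}

SENDTO_RE = re.compile(r"sendto error: (\d+)")


def explain_sendto(line):
    """Return (code, name, text) for a dpinger 'sendto error' line, else None."""
    match = SENDTO_RE.search(line)
    if match is None:
        return None
    code = int(match.group(1))
    name, text = FREEBSD_ERRNO.get(code, ("UNKNOWN", "not in this table"))
    return code, name, text


if __name__ == "__main__":
    sample = "Mar 13 01:50:14 dpinger WAN_DHCP 142.167.248.1: sendto error: 65"
    print(explain_sendto(sample))
```

Lines that are not sendto errors (for example the "Alarm latency" lines) return None, so the helper can be run over a whole log export.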
    

    An hour later, I get a new WAN IP:

    Mar 13 01:50:50	php-fpm	353	/rc.newwanip: IP Address has changed, killing states on former IP Address 142.167.251.43.
    Mar 13 01:50:44	php-fpm	23907	/rc.newwanip: IP Address has changed, killing states on former IP Address 142.167.251.43.
    Mar 12 01:05:03	php-fpm	23907	/rc.newwanip: IP Address has changed, killing states on former IP Address 142.134.90.210.
    Mar 12 01:04:58	php-fpm	3054	/rc.newwanip: IP Address has changed, killing states on former IP Address 142.134.90.210.
    

    [screenshot]

    Maybe this is a coincidence, but both times, just before everything went down, Suricata started to update:

    Mar 12 00:31:02	kernel		igb0: link state changed to DOWN
    Mar 12 00:31:02	check_reload_status		Linkup starting igb0
    Mar 12 00:31:02	php-fpm	353	/rc.linkup: HOTPLUG: Configuring interface wan
    Mar 12 00:31:02	kernel		arpresolve: can't allocate llinfo for 142.134.88.1 on igb0
    Mar 12 00:31:02	php-fpm	353	/rc.linkup: DEVD Ethernet attached event for wan
    Mar 12 00:31:01	kernel		arpresolve: can't allocate llinfo for 142.134.88.1 on igb0
    Mar 12 00:31:01	kernel		igb0: link state changed to UP
    Mar 12 00:31:01	kernel		arpresolve: can't allocate llinfo for 142.134.88.1 on igb0
    Mar 12 00:31:01	check_reload_status		Linkup starting igb0
    Mar 12 00:31:01	kernel		arpresolve: can't allocate llinfo for 142.134.88.1 on igb0
    Mar 12 00:31:00	kernel		arpresolve: can't allocate llinfo for 142.134.88.1 on igb0
    Mar 12 00:31:00	check_reload_status		Reloading filter
    Mar 12 00:30:59	php-fpm	69306	/rc.linkup: DEVD Ethernet detached event for wan
    Mar 12 00:30:58	kernel		igb0: promiscuous mode enabled
    Mar 12 00:30:58	kernel		igb0: link state changed to DOWN
    Mar 12 00:30:58	check_reload_status		Linkup starting igb0
    Mar 12 00:30:42	check_reload_status		Syncing firewall
    Mar 12 00:30:42	SuricataStartup	2233	Suricata START for WAN(16916_igb0)...
    Mar 12 00:30:42	php-cgi		suricata_check_for_rule_updates.php: [Suricata] The Rules update has finished.
    Mar 12 00:30:42	php-cgi		suricata_check_for_rule_updates.php: [Suricata] Suricata has restarted with your new set of rules...
    Mar 12 00:30:41	kernel		igb0: promiscuous mode disabled
    Mar 12 00:30:40	SuricataStartup	98817	Suricata STOP for WAN(16916_igb0)...
    Mar 12 00:30:40	php-cgi		suricata_check_for_rule_updates.php: [Suricata] Building new sid-msg.map file for WAN...
    Mar 12 00:30:40	php-cgi		suricata_check_for_rule_updates.php: [Suricata] Enabling any flowbit-required rules for: WAN...
    Mar 12 00:30:39	php-cgi		suricata_check_for_rule_updates.php: [Suricata] Updating rules configuration for: WAN ...
    Mar 12 00:30:36	php-cgi		suricata_check_for_rule_updates.php: [Suricata] Snort GPLv2 Community Rules file update downloaded successfully.
    Mar 12 00:30:35	php-cgi		suricata_check_for_rule_updates.php: [Suricata] There is a new set of Snort GPLv2 Community Rules posted. Downloading community-rules.tar.gz...
    Mar 12 00:30:34	php-cgi		suricata_check_for_rule_updates.php: [Suricata] Emerging Threats Open rules file update downloaded successfully.
    Mar 12 00:30:07	php-cgi		suricata_check_for_rule_updates.php: [Suricata] There is a new set of Emerging Threats Open rules posted. Downloading emerging.rules.tar.gz...
    Mar 12 00:00:09	php		[pfBlockerNG] No changes to Firewall rules, skipping Filter Reload
    Mar 12 00:00:08	php		[pfBlockerNG] Starting cron process.
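
The suspected pattern in the log above (a Suricata start followed shortly by igb0 going down) can be checked mechanically rather than by eye. The following is a rough sketch, assuming the log is exported as text with the same "Mon DD HH:MM:SS" prefix; the function names, the 60-second window and the year 2019 are assumptions made for illustration, since syslog lines carry no year.

```python
from datetime import datetime, timedelta


def parse_ts(line, year=2019):
    """Parse the 15-character syslog prefix, e.g. 'Mar 12 00:30:42'."""
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def link_drops_after_suricata(lines, window=timedelta(seconds=60)):
    """Pair each 'Suricata START' with link-DOWN events that follow within window."""
    starts = [parse_ts(l) for l in lines if "Suricata START" in l]
    downs = [parse_ts(l) for l in lines if "link state changed to DOWN" in l]
    return [
        (start, down)
        for start in starts
        for down in downs
        if timedelta(0) <= down - start <= window
    ]


if __name__ == "__main__":
    log = [
        "Mar 12 00:31:02\tkernel\t\tigb0: link state changed to DOWN",
        "Mar 12 00:30:58\tkernel\t\tigb0: link state changed to DOWN",
        "Mar 12 00:30:42\tSuricataStartup\t2233\tSuricata START for WAN(16916_igb0)...",
    ]
    for start, down in link_drops_after_suricata(log):
        print(f"Suricata start {start.time()} -> link down {down.time()}")
```

The order of the lines does not matter, so the newest-first export from the pfSense log viewer can be fed in unchanged.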
    


  • So I finally found the cause of this issue: it's Suricata, or my hardware and Suricata not playing nicely together. Suricata was set to update at 00:30, which is what caught my eye. When I changed the time to something different, the issue moved with it, give or take a few minutes. Next, I went into Suricata and made some updates, changes and saves, and that also caused the network to drop.

    The only workaround when it drops is to restart the Bell Home Hub 3000 (hh3k).

    I've since uninstalled Suricata and installed Snort, and the issue is gone. Any ideas here? The plan is still to replace the dual E5520s with one 6000 series to get crypto support.

    CPU Type Intel(R) Xeon(R) CPU E5520 @ 2.27GHz
    16 CPUs: 2 package(s) x 4 core(s) x 2 hardware threads
    AES-NI CPU Crypto: No

    Memory usage
    4% of 18377 MiB

    MBUF Usage
    2%



  • If Snort works, then just use it instead of Suricata. There is no meaningful security difference between the two packages.

    Were you running Suricata with Inline IPS Mode? If so, then netmap is probably the issue as it will restart an interface when netmap mode is activated. So each time Suricata stopped and started it would activate netmap which in turn will cycle the interface. The Inline IPS Mode of blocking in Suricata uses Netmap. The Legacy Blocking Mode in Suricata works the same as Snort and uses libpcap instead of netmap.



  • @bmeeks said in Internet through pfsense keeps dropping:

    Were you running Suricata with Inline IPS Mode?

    Yes, I was.



  • Is there a way to restart or cycle the interface to see whether that alone also causes issues? I no longer have Suricata installed at this point.



  • @rcmpayne said in [SOLVED] Internet through pfsense keeps dropping:

    Is there a way to restart or cycle the interface to see if that alone will also cause issues?

    Sure, you can disable and then re-enable the interface on the INTERFACES menu in pfSense. That will not use netmap, though. That will simply cycle the interface down and back up.

