Diagnostics ARP Table 504 Nginx Gateway Timeout

manicmoose

I'm seeing the same thing. I have 1074 entries currently in my arp table.

I don't recall there being so many arp entries on my WAN interface before - this is a new development.

The only change I've made recently is I added a second WAN and have them in a failover group....maybe that makes a difference??

johnpoz

@manicmoose said in Diagnostics ARP Table 504 Nginx Gateway Timeout:

maybe that makes a difference??

No, you really shouldn't be seeing that many arp entries on your public wan.. You should have no need to talk to these other IPs, and there is no reason for them to arp for you, etc.

You could be on a network with 1000s of other IPs - but that doesn't mean everyone has every other IP in their arp table. An entry only shows up if they arp for your IP, or you arp for theirs for some reason. Or they are sending out gratuitous arps. Or there is some other sort of traffic where you would then need to arp..

Might want to sniff on your wan and see where the arps are coming from: is it some sort of traffic that makes your machine arp for the IP, or are devices arping for yours?

Since they should expire, the traffic would have to be something quite regular if you're seeing that many entries.. FreeBSD/pfSense does use a long arp cache time of 20 minutes. But still, to have that many listed there has to be some sort of regular traffic causing it.

You're not monitoring, say, a broadcast IP for your wan network, are you, vs just the gateway IP.. That could cause it, as you would get a bunch of devices replying to the arp for a broadcast address.

edit

I have 1074 entries currently in my arp table.

Do they all point to the same mac? Here I just pinged a couple of IPs on my wan network.. And they show up in arp table now since there was an answer - but they point to same mac (my cable modem)
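
One way to answer the "same MAC" question without scrolling through a thousand rows is to group the table by MAC address. Below is a minimal sketch in Python, assuming the usual FreeBSD `arp -an` line layout (`? (ip) at mac on iface ...`). The function name `group_by_mac` and the sample addresses are made up for illustration; paste real `arp -an` output into `sample` to try it.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   ? (203.0.113.1) at 00:11:22:33:44:55 on igb0 expires in 1187 seconds [ethernet]
ARP_LINE = re.compile(
    r"\((?P<ip>[0-9.]+)\) at (?P<mac>[0-9a-fA-F:]{17}) on (?P<iface>\S+)"
)

def group_by_mac(arp_output, iface=None):
    """Return {mac: [ip, ...]} for resolved entries, optionally for one interface."""
    groups = defaultdict(list)
    for line in arp_output.splitlines():
        m = ARP_LINE.search(line)
        if m is None:  # blank lines and "(incomplete)" entries do not match
            continue
        if iface is not None and m.group("iface") != iface:
            continue
        groups[m.group("mac").lower()].append(m.group("ip"))
    return dict(groups)

# Hypothetical sample data, not taken from the thread.
sample = """\
? (203.0.113.1) at 00:11:22:33:44:55 on igb0 expires in 1187 seconds [ethernet]
? (203.0.113.7) at 00:11:22:33:44:55 on igb0 expires in 902 seconds [ethernet]
? (203.0.113.9) at (incomplete) on igb0 expired [ethernet]
? (192.168.1.10) at aa:bb:cc:dd:ee:ff on igb1 expires in 640 seconds [ethernet]
"""

for mac, ips in group_by_mac(sample, iface="igb0").items():
    print(mac, len(ips))
```

If nearly every WAN entry lands under one MAC, the answers are probably coming from a single upstream device (a modem or ISP router) rather than from a thousand separate hosts.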

manicmoose

@johnpoz Thanks John.

For both of the WANs, the 'gateway' and 'monitor' IPs are the same, so that should be okay...

This is what I'm seeing via a tcpdump on the WAN interface:

22:16:06.191949 ARP, Request who-has xxx.yyy.85.245 tell xxx.yyy.85.21, length 28
22:16:06.192019 IP xxx.yyy.85.21.41070 > xxx.yyy.85.42.5353: 54784 PTR (QM)? _services._dns-sd._udp.local. (46)
22:16:06.193097 ARP, Request who-has xxx.yyy.85.246 tell xxx.yyy.85.21, length 28
22:16:06.194178 ARP, Request who-has xxx.yyy.85.247 tell xxx.yyy.85.21, length 28
22:16:06.195259 ARP, Request who-has xxx.yyy.85.248 tell xxx.yyy.85.21, length 28
22:16:06.196338 ARP, Request who-has xxx.yyy.85.249 tell xxx.yyy.85.21, length 28
22:16:06.197414 ARP, Request who-has xxx.yyy.85.250 tell xxx.yyy.85.21, length 28
22:16:06.198492 ARP, Request who-has xxx.yyy.85.251 tell xxx.yyy.85.21, length 28
22:16:06.199571 ARP, Request who-has xxx.yyy.85.252 tell xxx.yyy.85.21, length 28
22:16:06.200648 ARP, Request who-has xxx.yyy.85.253 tell xxx.yyy.85.21, length 28

Is this the kind of thing you were talking about/looking for?

NB: My WAN IP is xxx.yyy.85.21
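
In a capture like this, the address after `tell` is the host sending the request, so counting those addresses shows who is generating the ARP traffic. The following is a small Python sketch for tallying them from saved tcpdump text. The helper name `count_requesters` is hypothetical, and the sample reuses a few of the (masked) lines from the capture above.

```python
import re
from collections import Counter

# Matches tcpdump lines such as:
#   22:16:06.191949 ARP, Request who-has 192.0.2.245 tell 192.0.2.21, length 28
# The optional group allows for a "(mac)" hint that tcpdump sometimes prints.
ARP_REQUEST = re.compile(
    r"ARP, Request who-has (\S+)(?: \([^)]*\))? tell ([^,\s]+)"
)

def count_requesters(capture_text):
    """Count ARP requests per sender (the address after 'tell')."""
    senders = Counter()
    for line in capture_text.splitlines():
        m = ARP_REQUEST.search(line)
        if m:
            senders[m.group(2)] += 1
    return senders

sample = """\
22:16:06.191949 ARP, Request who-has xxx.yyy.85.245 tell xxx.yyy.85.21, length 28
22:16:06.192019 IP xxx.yyy.85.21.41070 > xxx.yyy.85.42.5353: 54784 PTR (QM)? _services._dns-sd._udp.local. (46)
22:16:06.193097 ARP, Request who-has xxx.yyy.85.246 tell xxx.yyy.85.21, length 28
22:16:06.194178 ARP, Request who-has xxx.yyy.85.247 tell xxx.yyy.85.21, length 28
"""

for sender, count in count_requesters(sample).most_common(5):
    print(sender, count)
```

Here every request comes from xxx.yyy.85.21, the firewall's own WAN address, which points at something running on the firewall rather than at neighbours on the segment.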

johnpoz

Yeah, for some reason you're arping for the whole range? That makes no sense..

But yeah if you are arping for them - and you get an answer, then yeah they would be in your arp table..

I have no idea what in pfsense would be doing that.. Mine sure isn't

@stephenw10 do you know of something in pfsense that could do that, or a package maybe? Arpwatch enabled on the wan interface?? That would be a guess, but I thought it just listens for arps - it doesn't actually arp for stuff on its own.

manicmoose

@johnpoz Yep, I've never seen it before either.

Short of a reboot - is there some subsystem I could restart to try and 'reset' its brain, or are there multiple involved, making it difficult to tell which one(s) would need to be restarted?

Like the OP, I'm on 2.5.1-release as well.

Edit: I have arpwatch installed/enabled, but neither of the WANs is selected to be monitored.

johnpoz

Let's see if @stephenw10 has some idea - off the top of my head I have no idea what would cause that.. But it does for sure explain why there are so many entries.

manicmoose

@johnpoz Sounds good.

I just checked the arpwatch 'database' entries, and none of the MACs on the WANs are in there - only local ones.

johnpoz

Yeah arpwatch shouldn't do it - but it was just an off-the-cuff guess at something that "could" maybe do it..

Drawing a blank - maybe after some more coffee something will come to me.. But at least we have a piece of the puzzle to work with.. What would cause pfsense to arp for every IP in a range that is on the wan.. Not thinking of any packages you could add that could be set up to scan??

The nmap package "could" do it for sure - but you would have to kick that off..

manicmoose

@johnpoz Hmm....good point - I have 'arping' installed, and that could have been at the same time I added the 2nd WAN (can't really recall).

Maybe it's the culprit as that's kind of what its job is. I might uninstall it and see if that helps.

johnpoz

Yeah arping could prob do it too - but was not aware it had any sort of scanner option or that you could do it on a schedule sort of thing.. I currently don't have that installed.

manicmoose

@johnpoz Nope that's not it.

Uninstalled 'arping', then deleted all MACs on both WANs, leaving only the standard 2 on each.

My failover WAN is already back up to 127 again.
My primary WAN is up to 1021.

stephenw10

Nope, I'm not sure what could cause that either. Weird.

I would check the system processes. See if you have anything running that is obviously an arp process.

johnpoz

Checking what processes are running would be a good start for sure - but you could also just do process of elimination - how many packages exactly do you have installed? Nmap could do it for sure - but then that would have to be triggered.

You could look to see if you have any crons scheduled.

manicmoose

I already checked for 'arp' processes and there's nothing over and above what 'arpwatch' is doing:

/usr/local/sbin/arpwatch -z -Z -f /usr/local/arpwatch/arp_igb2.5.dat -i igb2.5
/usr/local/sbin/arpwatch -z -Z -f /usr/local/arpwatch/arp_igb2.99.dat -i igb2.99
/usr/local/sbin/arpwatch -z -Z -f /usr/local/arpwatch/arp_igb2.999.dat -i igb2.999
/usr/local/sbin/arpwatch -z -Z -f /usr/local/arpwatch/arp_igb2.10.dat -i igb2.10
/usr/local/sbin/arpwatch -z -Z -f /usr/local/arpwatch/arp_igb2.20.dat -i igb2.20
/usr/local/sbin/arpwatch -z -Z -f /usr/local/arpwatch/arp_igb2.50.dat -i igb2.50

My WANs are on different physical NICs, so their 'igb' numbers differ from the ones above - presumably arpwatch is behaving itself.

As for cron, it all appears kosher to me:

1,31	0-5	*	*	*	root	/usr/bin/nice -n20 adjkerntz -a
1	3	1	*	*	root	/usr/bin/nice -n20 /etc/rc.update_bogons.sh
1	1	*	*	*	root	/usr/bin/nice -n20 /etc/rc.dyndns.update
*/60	*	*	*	*	root	/usr/bin/nice -n20 /usr/local/sbin/expiretable -v -t 3600 virusprot
30	12	*	*	*	root	/usr/bin/nice -n20 /etc/rc.update_urltables
1	0	*	*	*	root	/usr/bin/nice -n20 /etc/rc.update_pkg_metadata
0,15,30,45	*	*	*	*	root	/etc/rc.filter_configure_sync
0	11	4-10	*	*	root	/usr/local/bin/php /usr/local/www/pfblockerng/pfblockerng.php dcc >> /var/log/pfblockerng/extras.log 2>&1
*/1	*	*	*	*	root	/usr/sbin/newsyslog
1	3	*	*	*	root	/etc/rc.periodic daily
15	4	*	*	6	root	/etc/rc.periodic weekly
30	5	1	*	*	root	/etc/rc.periodic monthly
*/1	*	*	*	*	root	/usr/local/pkg/servicewatchdog_cron.php
*/5	*	*	*	*	root	/usr/bin/nice -n20 /usr/local/bin/php -f /usr/local/pkg/snort/snort_check_cron_misc.inc
*/2	*	*	*	*	root	/usr/bin/nice -n20 /sbin/pfctl -q -t snort2c -T expire 900
5	1,7,13,19	*	*	*	root	/usr/bin/nice -n20 /usr/local/bin/php -f /usr/local/pkg/snort/snort_check_for_rule_updates.php
15	*	*	*	*	root	/usr/local/bin/php /usr/local/www/pfblockerng/pfblockerng.php cron >> /var/log/pfblockerng/pfblockerng.log 2>&1
16	3	*	*	*	root	/usr/local/pkg/acme/acme_command.sh "renewall" | /usr/bin/logger -t ACME 2>&1

"/etc/cron.d/at" only has 'atrun' in it.
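
For a longer crontab, a quick keyword scan can stand in for reading every line by eye. This is a rough Python sketch only: the keyword list is an assumption (tools commonly able to generate ARP sweeps), the function name `suspicious_cron_lines` is hypothetical, and the last sample line is invented to show what a hit looks like. A clean result does not rule out scans started by other means.

```python
# Assumed list of tool names worth flagging; extend as needed.
SCAN_TOOLS = ("nmap", "arping", "arp-scan")

def suspicious_cron_lines(crontab_text, keywords=SCAN_TOOLS):
    """Return crontab lines that mention any of the given tool names."""
    hits = []
    for line in crontab_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blanks and comments
        lowered = stripped.lower()
        if any(keyword in lowered for keyword in keywords):
            hits.append(stripped)
    return hits

sample = """\
1 3 1 * * root /usr/bin/nice -n20 /etc/rc.update_bogons.sh
*/1 * * * * root /usr/sbin/newsyslog
16 3 * * * root /usr/local/pkg/acme/acme_command.sh "renewall" | /usr/bin/logger -t ACME 2>&1
# the next line is a made-up example of what a hit would look like
*/5 * * * * root /usr/local/bin/nmap -sn 203.0.113.0/24
"""

for hit in suspicious_cron_lines(sample):
    print(hit)
```

None of the real entries listed above contain these names, which matches the "all appears kosher" reading.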

I do have nmap installed, but I've had that for as long as I remember without issues - and like you said - it does nothing unless triggered/called explicitly.

Tried restarting 'arpwatch' and deleting all MAC entries but they soon returned.

Uninstalled 'arpwatch' and nmap - no difference
Had HAproxy installed (not currently being used) so uninstalled that. No change.
Rebooted - no difference.

The only 'arp' process I've seen pop up periodically is:

/usr/sbin/arp --libxo json -an

I'm not sure what it is related to.

At this point I'm a bit stumped.
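
For what it's worth, `arp -an` only prints the existing ARP table, and `--libxo json` is FreeBSD's switch for structured output, so that process reads the table rather than filling it. The same JSON can be used to watch how fast each interface's entry count grows. The Python sketch below assumes the output is shaped like `{"arp": {"arp-cache": [{"interface": ...}, ...]}}`; that layout, the helper name `entries_per_interface`, and the sample data are assumptions to check against real output.

```python
import json
from collections import Counter

def entries_per_interface(arp_json):
    """Count ARP cache entries per interface from `arp --libxo json -an` text."""
    data = json.loads(arp_json)
    # Assumed layout: top-level "arp" object holding an "arp-cache" list.
    cache = data.get("arp", {}).get("arp-cache", [])
    return Counter(entry.get("interface", "unknown") for entry in cache)

# Hypothetical sample; real output carries more fields per entry.
sample = json.dumps({
    "arp": {
        "arp-cache": [
            {"ip-address": "203.0.113.1", "mac-address": "00:11:22:33:44:55", "interface": "igb0"},
            {"ip-address": "203.0.113.7", "mac-address": "00:11:22:33:44:55", "interface": "igb0"},
            {"ip-address": "192.168.1.10", "mac-address": "aa:bb:cc:dd:ee:ff", "interface": "igb1"},
        ]
    }
})

for iface, count in entries_per_interface(sample).most_common():
    print(iface, count)
```

Running this a few minutes apart, before and after stopping a suspect service, gives a per-interface count to compare instead of eyeballing the table.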

manicmoose

Think I've "solved" it.

ntopng seems to be the culprit - I disabled it and all the ARPs have stopped. I haven't modified ntopng in ages, so maybe the config has become corrupted and it's doing something it's not supposed to be. Odd one.

I'll wait awhile before re-enabling it, but I will delete the data/config and re-install it fresh to ensure it's happier than it was.

Barring another recurrence, thanks for the help folks!

johnpoz

@manicmoose said in Diagnostics ARP Table 504 Nginx Gateway Timeout:

ntopng seems to be the culprit

While I don't profess to be an ntop guru by any means.. I don't get why it would be arping like that. I get what you're saying, that disabling it stopped what you're seeing - but I'm not sure why it would do that.. Ntop out of the box sure shouldn't be arping every IP on the networks it's attached to..

manicmoose

@johnpoz Agreed - and it never used to, which is why I think it's become a little confused...

Time to reset its brain when I re-enable it (later).

stephenw10

It will if you have enabled this:

Active Network Discovery

Toggle the periodic discovery of network devices using multiple techniques that include ARP scan, MDNS and SSDP.

That's not enabled by default.

johnpoz

Nor should it be enabled on your wan interface ;)

manicmoose

Yep, I get all that.

As mentioned, I hadn't altered the ntopng config for ages, so I don't think that was it - but I've blown away the config now so I can't check.

Nevertheless, it's solved so that's all I care about.