SG-3100 switch weird behavior (resolved)
-
@stephenw10 I believe that is the issue..
My cron is smashing pfSense with pings, let me change that cron to 1 minute.
This seems to be enough..
* * * * * /usr/bin/ping 192.168.255.249 -c 2
-
This solved the problem, 13 minutes of cron job running, no more problems..
Really thanks for the help @stephenw10 and @johnpoz :)
-
Cool. I think I prefer reducing the ARP timeout as a solution. You might try setting that to 1 min and see if that also solves it. That's just a system tunable in pfSense, all in the config.
But either will work fine.
Steve
-
@stephenw10 said in SG-3100 switch weird behavior (resolved):
Cool. I think I prefer reducing the ARP timeout as a solution.
Wouldn't that change the behavior for everything? Like a global setting?
Would this ARP timeout only be triggered when a host is not "alive", like the Raspberry Pi 4B we just observed?
-
It would be global but 1min is not that unusual. I believe Windows uses 30s.
What would happen is that every minute the RasPi4 entry in the pfSense ARP table would time out. So in order to send syslog traffic to it, pfSense will ARP for the IP address and the RasPi4 will respond to that, refreshing the MAC table in the switch.
It feels like a cleaner solution to me, but if the ping is working for you then there's no need to change it. It would be interesting to know if that works, if you're able to test it.
Steve
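For reference, a minimal sketch of checking the ARP timeout from a pfSense (FreeBSD) shell. The OID `net.link.ether.inet.max_age` is the one used later in this thread; the helper name `arp_max_age` is only illustrative, and the script simply reports "unavailable" on systems that do not have that OID. A value set with `sysctl` at runtime does not survive a reboot, so a permanent change would go in as a system tunable.

```sh
#!/bin/sh
# Sketch only: read the ARP cache lifetime on a FreeBSD-based system such as pfSense.
OID="net.link.ether.inet.max_age"

# arp_max_age is a hypothetical helper name, not a pfSense command.
# It prints the current value in seconds, or "unavailable" if the OID
# cannot be read (for example on a non-FreeBSD system).
arp_max_age() {
    if command -v sysctl >/dev/null 2>&1 && sysctl -n "$OID" >/dev/null 2>&1; then
        sysctl -n "$OID"
    else
        echo "unavailable"
    fi
}

echo "ARP max_age: $(arp_max_age)"

# To lower it to 60 seconds (run as root on pfSense; deliberately not executed here):
#   sysctl net.link.ether.inet.max_age=60
```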
-
@stephenw10 said in SG-3100 switch weird behavior (resolved):
It would be interesting to know if that works, if you're able to test it.
Sure, I'll disable the cron job and test it right now, wireshark is already running, one sec
-
@stephenw10 said in SG-3100 switch weird behavior (resolved):
I believe Windows uses 30s.
Windows has a weird way of doing it: it uses 30 seconds as a base and then applies a random multiplier to it.. But you can adjust it if you want.
Wouldn't setting a static ARP for the rpi4 also solve it? Or is that different from the switch ARP cache?
I wish arp in Windows actually showed you how much time was left on a cache entry, like FreeBSD does; Linux should do that too. At least in Linux you can use ip -statistics neigh
I don't know of any way to actually view how much time is left on a MAC that is cached.
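As a rough sketch of the commands involved: on FreeBSD (and so pfSense) `arp -an` prints an "expires in N seconds" field for dynamic entries, while on Linux `ip -statistics neigh show` prints used/confirmed/updated ages rather than a countdown. The helper name `neigh_cmd` is made up for illustration, and it only prints the command to run for a given OS name.

```sh
#!/bin/sh
# neigh_cmd is a hypothetical helper: it prints (does not run) the command
# that lists ARP/neighbor entries with timing details for a given OS name.
# With no argument it uses the OS reported by `uname -s`.
neigh_cmd() {
    os="${1:-$(uname -s)}"
    case "$os" in
        FreeBSD) echo "arp -an" ;;                    # shows "expires in N seconds"
        Linux)   echo "ip -statistics neigh show" ;;  # shows used/confirmed/updated ages
        *)       echo "arp -a" ;;                     # generic fallback, no timing info
    esac
}

echo "On this host: $(neigh_cmd)"
```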
-
@johnpoz said in SG-3100 switch weird behavior (resolved):
Wouldn't setting a static ARP for the rpi4 also solve it? Or is that different from the switch ARP cache?
I tried that, actually the static ARP is set right now..
-
@mcury so what is the better solution? More arps going out for everything ;) Or just pinging pfsense from the rpi4 every minute or 2 minutes..
Weird one for sure..
-
@johnpoz said in SG-3100 switch weird behavior (resolved):
so what is the better solution? More arps going out for everything ;) Or just pinging pfsense from the rpi4 every minute or 2 minutes..
Weird one for sure..
ehhe, that is weird indeed.
I'm not sure how it works, but it seems that every packet that goes through the switch resets that ARP timer, so the firewall wouldn't need to broadcast as often.
-
@johnpoz said in SG-3100 switch weird behavior (resolved):
Wouldn't setting a static ARP for the rpi4 also solve it? Or is that different from the switch ARP cache?
Yeah, nothing to do with the switch MAC table. That exists only in the switch IC.
So in fact I would expect setting a static ARP to make this worse, because the entry will never expire: pfSense will never ARP for the IP, so no responses will be generated.
So if it was still set static I'm surprised that max_age value made any difference.
-
@stephenw10 said in SG-3100 switch weird behavior (resolved):
So if it was still set static I'm surprised that max_age value made any difference.
It has been set like that since the beginning, changed yesterday, and the problem happened several times today..
-
Did you set that as a static ARP value to try to solve this or has it always been static?
Because static ARP entries are almost always the wrong choice. That might be the cause here if it was always set static.
Steve
-
@stephenw10 said in SG-3100 switch weird behavior (resolved):
Did you set that as a static ARP value to try to solve this or has it always been static?
I always use static, the problem happened, I changed a few IP addresses here and removed the static ARP, it happened a few times after that so I reverted to static ARP.
If you want, I can remove that entry to test too
-
Yeah, I would test just without the static ARP entry. If you still see the issue, test with max_age at 120s again. Anything less than 3mins should prevent it.
Steve
-
@stephenw10 Done, disabled static ARP, left static DHCP mapping, reverted to sysctl net.link.ether.inet.max_age=1200 to test
If the problem happens, I'll change it to 120 to test.
-
Oh, just happened..
Changing to 120 to test.
Edit:
The change didn't take effect, I had to ping 192.168.255.253 from pfsense, let's see how it goes now.
Edit2: Problem is happening again, I suppose 120 is too much, let me try 60.
-
Static ARP or not didn't change anything.
ping -c 2 cron job with an interval of 60s works.
Setting the sysctl net.link.ether.inet.max_age=60 also works.
Now it depends on what the user prefers. It seems to me that the cron job would be the better approach, since it would only affect a single host.
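For anyone landing here later, a small sketch of the cron workaround. It only prints a crontab line that pings the firewall once a minute; the line still has to be added on the affected host with `crontab -e`. The address 192.168.255.249 and the count of 2 come from this thread, while the function name `keepalive_cron_line` and the redirect to /dev/null (to keep cron from mailing the ping output) are my own additions.

```sh
#!/bin/sh
# keepalive_cron_line is a hypothetical helper: it prints a crontab entry
# that sends a few pings to the given target every minute.
#   $1 = target IP (the pfSense address in this thread)
#   $2 = number of pings per run (defaults to 2)
keepalive_cron_line() {
    target="$1"
    count="${2:-2}"
    # Output is discarded so cron does not mail the ping results every minute.
    printf '* * * * * /usr/bin/ping -c %s %s >/dev/null 2>&1\n' "$count" "$target"
}

keepalive_cron_line 192.168.255.249 2
```

The other option, lowering net.link.ether.inet.max_age to 60 on pfSense, needs nothing on the host but applies to every ARP entry on the firewall.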