SG-3100 switch weird behavior (resolved)
-
@stephenw10 hm, I thought that it would be better to use redmine, let me restore this topic
-
@mcury so you have these interfaces bridged on pfsense?
You show the same vlans going over 2 different trunks on 2 different interfaces on pfsense lan3 and lan4 - this is not possible without a bridge setup.
Or wait, this is a 3100, so switch ports. If I had to guess, your mini is leaking. This is a flex mini?
So you send traffic to 255.253, and your PC is seeing it.. which, you are correct, it never should. Even in promisc mode, those packets should only be sent out the port that 255.253 is connected to - not all of them.
You have some sort of leak if you ask me.. Could be pfsense sending it out all the lan interfaces on the switch.. Hmmm
-
@johnpoz said in SG-3100 switch weird behavior:
You show the same vlans going over 2 different trunks on 2 different interfaces on pfsense lan3 and lan4 - this is not possible without a bridge setup.
It's not a bridge, one trunk goes to the switch (unifi mini) and the other trunk goes to an access point (nanohd)
-
Here is how the switch is configured:
-
Found something interesting since yesterday..
Packet capture in host 192.168.255.251: pinging from pfsense (192.168.255.249).
6645 117.957087565 192.168.255.249 192.168.255.253 ICMP 98 Echo (ping) request id=0x7640, seq=0/0, ttl=64 ( no response found !)
On the first ping I get "no response found", as shown in packet 6645; the following ICMP requests then go to the correct host (192.168.255.253).
The problem gets solved for a few minutes and then it starts again.
Header:
Frame 510: 98 bytes on wire (784 bits), 98 bytes captured (784 bits) on interface enp7s0, id 0
Ethernet II, Src: ADIEngin_0c:c4:1c (00:08:a2:0c:c4:1c), Dst: Raspberr_a5:47:19 (dc:a6:32:a5:47:19)
    Destination: Raspberr_a5:47:19 (dc:a6:32:a5:47:19)
    Source: ADIEngin_0c:c4:1c (00:08:a2:0c:c4:1c)
    Type: IPv4 (0x0800)
Internet Protocol Version 4, Src: 192.168.255.249, Dst: 192.168.255.253
Internet Control Message Protocol
-
It looks like you changed the IP address of the desktop from .254 to .251. I assume that made no difference?
In the redmine you show ping traffic to .253 going to the wrong port. But above you said:
The problem stops for a few minutes after pinging 192.168.255.253 from pfsense, and it starts again.
That's confusing. I can only imagine one of those to be true.
You also said that removing ramdisks appeared to resolve it. Did that turn out to be incorrect? When you add or remove ramdisks the firewall has to reboot. Does rebooting normally also correct it for some time?
The only thing I can imagine creating this issue is if the MAC table in the switch in the 3100 is somehow incorrect or being loaded with a bad value. pfSense has no control over that directly though. I've never seen that before.
Steve
-
@mcury said in Removed by user.:
On the first ping I get "no response found", as shown in packet 6645; the following ICMP requests then go to the correct host (192.168.255.253).
The problem gets solved for a few minutes and then it starts again.
Could be something redirecting it. What does a pcap on the RasPi show when that's happening?
Almost feels like a bad subnet mask except pfSense is sending to correct MAC.
-
@stephenw10 said in SG-3100 switch weird behavior:
It looks like you changed the IP address of the desktop from .254 to .251. I assume that made no difference?
correct, same behavior
@stephenw10 said in SG-3100 switch weird behavior:
You also said that removing ramdisks appeared to resolve it. Did that turn out to be incorrect?
Unfortunately it worked for a while, but the problem started again, same as before.
@stephenw10 said in SG-3100 switch weird behavior:
When you add or remove ramdisks the firewall has to reboot. Does rebooting normally also correct it for some time?
Yes, it rebooted normally and it was fine for a while, but then the problem happened again.
-
@stephenw10 said in SG-3100 switch weird behavior:
Could be something redirecting it. What does a pcap on the RasPi show when that's happening?
I'll tcpdump it, let me wait for the problem to happen again
Almost feels like a bad subnet mask except pfSense is sending to correct MAC.
raspberry pi 4b 192.168.255.253 eth0:
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether dc:a6:32:a5:47:19 brd ff:ff:ff:ff:ff:ff
    inet 192.168.255.253/29 brd 192.168.255.255 scope global dynamic eth0
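The "bad subnet mask" idea can be checked quickly against the addresses quoted in this thread. The following is a minimal sketch using Python's standard `ipaddress` module; the `hosts` dictionary and the `same_subnet` helper are illustrative names, not something from the original posts.

```python
import ipaddress

# Address and prefix length as reported by `ip addr` on the Raspberry Pi.
pi_iface = ipaddress.ip_interface("192.168.255.253/29")
network = pi_iface.network

# Other hosts mentioned in the thread (labels are illustrative).
hosts = {
    "pfSense": ipaddress.ip_address("192.168.255.249"),
    "desktop": ipaddress.ip_address("192.168.255.251"),
}

def same_subnet(addr, net=network):
    """Return True if addr falls inside the Pi's configured network."""
    return addr in net

print(network)                    # network the /29 mask puts the Pi in
print(network.broadcast_address)  # should match the "brd" value above
for name, addr in hosts.items():
    print(name, same_subnet(addr))
```

If the mask were wrong on the Pi, one of these hosts would fall outside the computed network or the broadcast address would not match the `brd` field. This only checks the Pi's configuration as posted; it says nothing about what the switch does with the frames.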
-
@johnpoz said in SG-3100 switch weird behavior:
Or wait, this is a 3100, so switch ports. If I had to guess, your mini is leaking. This is a flex mini?
So you send traffic to 255.253, and your PC is seeing it.. which, you are correct, it never should. Even in promisc mode, those packets should only be sent out the port that 255.253 is connected to - not all of them.
You have some sort of leak if you ask me.. Could be pfsense sending it out all the lan interfaces on the switch.. Hmmm
Sorry, I missed that part
Yes, it's an SG-3100. The problem started, let me tcpdump it from the Raspberry Pi 4B, one sec
-
yes, hehe, it's kind of a mess
Packet capture in host 192.168.255.251
-
I'll reinstall my pfsense from scratch to test, but I can't do it right now.
Then, if the problem happens again, I'll replace this switch.
-
So weird man, a ping fixes it temporarily..
-
@mcury yeah you shouldn't be seeing those. Hmmmm Even if your nic was in promiscuous mode, that mac shouldn't be sent down the port where the mac is not listed.
If you had some sort of leak or bridge where the mac was being learned on multiple interfaces that could happen..
So the proper path to that destination mac is down the trunk (lan4) to the flex mini. But pfsense is also sending it out lan1? The only place the mac of that pi4 should be seen by pfsense is the lan4 interface; it should never send that mac out lan1, unless there was a bridge setup.
hmmmm strange...
This is a good question for @stephenw10, he would know way more than me on the inner workings of the switch in the 3100. But typically a switch would only send traffic down the interface that the mac is on.
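For readers following along, here is a toy model of how a generic MAC-learning switch decides where to send a frame. It is only a sketch of textbook behavior, not the actual switch chip in the 3100: the port names (`cpu`, `lan1`, `lan4`), the `LearningSwitch` class, and the 300-second aging time are all assumptions made for illustration.

```python
class LearningSwitch:
    """Toy model of a generic MAC-learning switch (not the 3100's actual chip)."""

    def __init__(self, ports, max_age=300):
        self.ports = list(ports)
        self.max_age = max_age  # seconds before an idle entry is forgotten (assumed)
        self.table = {}         # mac -> (port, last_seen)

    def receive(self, in_port, src_mac, dst_mac, now):
        """Process one frame; return the list of ports it is sent out of."""
        # Learn (or refresh) the source address on the ingress port.
        self.table[src_mac] = (in_port, now)

        entry = self.table.get(dst_mac)
        if entry is not None and now - entry[1] <= self.max_age:
            port = entry[0]
            return [port] if port != in_port else []
        # Unknown or aged-out destination: flood to every other port.
        self.table.pop(dst_mac, None)
        return [p for p in self.ports if p != in_port]


PFSENSE = "00:08:a2:0c:c4:1c"
PI = "dc:a6:32:a5:47:19"

sw = LearningSwitch(ports=["cpu", "lan1", "lan4"])
flooded = sw.receive("cpu", PFSENSE, PI, now=0)  # Pi not learned yet
sw.receive("lan4", PI, PFSENSE, now=1)           # Pi sends a frame, learned on lan4
unicast = sw.receive("cpu", PFSENSE, PI, now=2)  # forwarded to lan4 only
aged = sw.receive("cpu", PFSENSE, PI, now=1000)  # entry expired, flooded again
print(flooded, unicast, aged)
```

In this model a frame reaches lan1 only while the Pi's MAC is missing from the table, and any frame sent by the Pi puts it back. Whether that is what is happening on this 3100 has not been established in the thread.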
-
@johnpoz Exactly, it's so weird, that packet should never go to pfsense's LAN1 ..
I'll try to fix it tonight by reinstalling my pfsense from scratch..
Then, if the problem happens again, I'll replace this switch..
-
Yeah, it's a pretty basic switch and there's no control over things like the MAC table. That's the only thing I could imagine causing that though.
If you haven't already, try power cycling the 3100 entirely. That should completely reset the switch if it's somehow managed to toggle some flag.
Steve
-
@stephenw10 hm, I'll try it now: shut down, then remove the power cable. One sec, let me see who is here using the Internet
-
Done, the problem persists..
- Halted the system and, once the shutdown process ended, removed the power cable for a few seconds.
-
Hmm, the only other thing I could imagine causing this is if something is feeding bad data into the switch MAC table. That would have to be the desktop machine.
If you run a continuous ping from the RasPi to somewhere that has to be accessed through the 3100 switch, does that prevent the issue?
If it does I'd try to find something sending the RasPi MAC from the desktop. Hard to say what that might be.... something reflected perhaps?
Try running a pcap on the desktop, filtered by the RasPi MAC address, starting whilst the problem is not happening, and wait for it to start. The first thing that happens there might be the offending packet.
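As a rough illustration of what "filter by the RasPi MAC address" means at the frame level, the sketch below parses the 14-byte Ethernet II header and checks the source address. The helper names `parse_ethernet` and `sent_by` are made up for this example, and the sample frame is hand-built from the MACs in the capture above. In practice a capture filter such as `ether host` in tcpdump or Wireshark does this job.

```python
import struct

RASPI_MAC = "dc:a6:32:a5:47:19"  # RasPi MAC from the capture earlier in the thread

def parse_ethernet(frame: bytes):
    """Return (dst_mac, src_mac, ethertype) from a raw Ethernet II frame."""
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    fmt = lambda raw: ":".join(f"{b:02x}" for b in raw)
    return fmt(dst), fmt(src), ethertype

def sent_by(frame: bytes, mac: str = RASPI_MAC) -> bool:
    """True if the frame's source MAC matches the given address."""
    return parse_ethernet(frame)[1] == mac.lower()

# Hand-built header only: pfSense -> RasPi, EtherType IPv4 (0x0800).
sample = bytes.fromhex("dca632a54719" "0008a20cc41c" "0800")
print(parse_ethernet(sample))
print(sent_by(sample))
```

A frame for which `sent_by` is true, but which left the desktop's NIC rather than the Pi's, would be the kind of packet that could teach the switch the wrong port for that MAC.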
Steve
-
@stephenw10 said in SG-3100 switch weird behavior:
If you run a continuous ping from the RasPi to somewhere that has to be accessed through the 3100 switch, does that prevent the issue?
Testing now, ping is running from RPI4 to pfsense.
It seems to have stopped, but it may start again soon, so I'll wait a little longer this time.
Packet capture set:
Edit:
This is my ARP table (desktop)
$ cat /proc/net/arp
IP address       HW type     Flags       HW address            Mask     Device
192.168.255.252  0x1         0x2         00:11:32:9f:ee:93     *        enp7s0
192.168.255.249  0x1         0x2         00:08:a2:0c:c4:1c     *        enp7s0
192.168.255.250  0x1         0x2         b8:27:eb:ea:f8:65     *        enp7s0
192.168.255.253  0x1         0x2         dc:a6:32:a5:47:19     *        enp7s0
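One quick sanity check on a table like this is whether any MAC address shows up for more than one IP, which would point at the ARP level rather than the switch. The sketch below parses the output pasted above; `parse_arp` and `duplicate_macs` are hypothetical helper names, and the table is embedded as a string so the example does not depend on the machine it runs on.

```python
from collections import defaultdict

# Desktop ARP table as posted above.
ARP_TABLE = """\
IP address       HW type     Flags       HW address            Mask     Device
192.168.255.252  0x1         0x2         00:11:32:9f:ee:93     *        enp7s0
192.168.255.249  0x1         0x2         00:08:a2:0c:c4:1c     *        enp7s0
192.168.255.250  0x1         0x2         b8:27:eb:ea:f8:65     *        enp7s0
192.168.255.253  0x1         0x2         dc:a6:32:a5:47:19     *        enp7s0
"""

def parse_arp(text):
    """Parse /proc/net/arp-style text into (ip, mac, device) tuples."""
    entries = []
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) == 6:
            ip, _hw_type, _flags, mac, _mask, dev = fields
            entries.append((ip, mac.lower(), dev))
    return entries

def duplicate_macs(entries):
    """Return {mac: [ips]} for any MAC mapped to more than one IP."""
    by_mac = defaultdict(list)
    for ip, mac, _dev in entries:
        by_mac[mac].append(ip)
    return {mac: ips for mac, ips in by_mac.items() if len(ips) > 1}

entries = parse_arp(ARP_TABLE)
print(entries)
print(duplicate_macs(entries))
```

On a live Linux host the same functions could be fed `open("/proc/net/arp").read()` instead of the embedded string.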