Strange behaviour for ICMP (ping) rule on WAN interface
-
You could probably confirm that by running a pcap on the VM.
-
You don't have em6 in the bridge also do you? That would break the VLAN.
Bridging to VLANs has always been somewhat shaky!
Yes, I confirm that only em6.90 is in the bridge :(
I'm sorry...
Can you move that to a separate interface/vswitch? Or do the VLAN tagging in VMWare instead?
I have to check if there is a free interface on the hypervisor, but, at this moment, I'm at home, not at the data center.
I have to think about it. I will let you know.
-
@stephenw10 running pcap on the VM I can see only the Rapid STP lines and the LLDP with the switch hostname. I think I can confirm that VLAN is broken as you said. (the VM interface is in promiscuous mode)
-
You should be able to bridge a VLAN to another interface like you are doing. What you definitely can't do is bridge the parent interface (em6) and then also use a VLAN on that. But when you throw in virtual NICs you could be hitting some offloading bug that would not normally happen on em.
If you can use a discrete interface for 'Public LAN' and tag in VMWare instead I think there's a much better chance of making this work. Or perhaps you don't need VLANs at all if it's all virtual and in the same host.
Steve
-
You should be able to bridge a VLAN to another interface like you are doing. What you definitely can't do is bridge the parent interface (em6) and then also use a VLAN on that.
OK, fortunately I'm not doing that :)
But when you throw in virtual NICs you could be hitting some offloading bug that would not normally happen on em.
OK, now I understand your concern about the virtual NICs.
If you can use a discrete interface for 'Public LAN' and tag in VMWare instead I think there's a much better chance of making this work. Or perhaps you don't need VLANs at all if it's all virtual and in the same host.
I will return to work on Wednesday (tomorrow is a holiday) and I will check if there is a free interface left to devote to the "public LAN."
I don't know if I will be able to avoid using VLANs because all "public" traffic would travel on the switches' native LAN (and that is not recommended).
Many thanks again for all the time you spent helping me. I really appreciate it.
See you soon.
Mauro
-
Ah, is it not just a vswitch in VMWare between pfSense and the .5 VM then?
Even so you can probably just do the tagging in VMWare so pfSense just sees two separate NICs.
-
@stephenw10 Good morning, Stephen.
This is to inform you that I just checked the number of free interfaces on the VMware hypervisor. Unfortunately, there is no free interface to be dedicated to the "public LAN".
Anyway, before giving up, I searched the pfSense forum for "pfsense vlan bridge" and found this interesting topic:
https://forum.netgate.com/topic/136900/help-with-vlans-in-bridge/23?_=1667375382000
In particular, at the end of the discussion, the user "broonu" says:
With tcpdump i see the ARP Request but pfsense dont send the ARP Reply.
Im going to clear everything and reconfigure from scratch.
It seems he solved his issue with the following action:
@delerict thank you for your time and help!
it was a vmware misconfiguration, e1000 nic instead of vmxnet3.
After reading that, I checked the type of virtual NIC assigned to the WAN and "public LAN" interfaces of my pfSense instance and noticed that the interface type is E1000. So, I decided to reply to the user "broonu" with the following message:
@broonu Hello, sorry if I'm replying to this old topic, but I'm experiencing the same problem trying to bridge the WAN interface with a VLAN created on a LAN interface.
The behavior is almost the same: no reply to ARP requests from pfsense + I cant ping the pfsense upstream gateway.
Before giving up, I noticed that the WAN and LAN interfaces are E1000 (not VMXNET3).
I would like to change the nic type as last attempt.
Anyway, before doing that, I would like to know if there is a particular relation between bridge and vmxnet3.
Could you please help me?
Thanks
In addition, I would like to answer your question:
Ah, is it not just a vswitch in VMWare between pfSense and the .5 VM then?
Correct, it is not just a vswitch in VMware between pfSense and the .5 VM. The virtual router and pfSense are on the same hypervisor, but the "client" machines (virtual and physical ones) are connected to pfSense through a switch.
Even so you can probably just do the tagging in VMWare so pfSense just sees two separate NICs.
This is the current configuration:
- physical pfSense interface em0 (WAN) is connected to the router;
- physical pfSense interface em6 is connected to a vswitch (named "LAN") that has been configured to manage/accept all VLANs (trunk mode);
- on vmware side, the vswitch mentioned above is logically connected to a TRUNK port on the physical switch;
- on pfsense side, several VLANs have been created adding tags (for example, "VLAN em6.90", "VLAN em6.10" and so on);
- the VM is connected to an ACCESS port (with VLAN ID 90) on the same physical switch.
If you think that the main cause is still the VLAN over the bridge (and not the E1000 nic type), could you help me to better understand how to apply your last suggestion?
Thanks in advance,
Mauro
-
I think I understand your suggestion:
On the VMware side, on top of the vswitch "LAN" (the one that accepts every VLAN), I created a portgroup named "Public LAN" with tag 90.
I also added a virtual NIC to the pfSense instance (let's say, em8) and connected this NIC to the "Public LAN" portgroup.
So, in this new situation, em8 will be a new pfSense interface without any VLAN on top of it.
As soon as possible, I will reboot pfSense and test the bridge again.
If it fails, I will also try changing E1000 to VMXNET3.
Crossing fingers,
Mauro
-
Yes, that's what I meant. So pfSense sees only the virtual NIC without a VLAN but VMWare tags the traffic before it leaves.
If changing the virtual NIC type makes any difference there I would expect it to be because the hardware off-loading options between them are different.
Steve
-
Dear @stephenw10 ,
no luck: my first attempt (using the em8 interface without any VLAN) failed.
The behaviour is always the same. Running a pcap on the pfSense interfaces involved, I can see the requests from .5, but no replies.
If I remove the bridge and assign the IP 10.10.10.1 to the "Public LAN" interface, I can ping it successfully from the VM (10.10.10.2).
So, I decided to go ahead and try changing the NIC type from E1000 to VMXNET3.
After powering off the VM, I made the needed changes to the WAN and "Public LAN" interfaces (at least). But this action changed the MAC addresses of the interfaces and, after rebooting the VM, pfSense asked me to reassign the interfaces and recreate the VLANs.
This is due to the new interface names: vmx1, vmx2, vmx3 instead of em1, em2 and so on...
Unfortunately I can't assign the VLAN em6.X to the vmx2 interface since the VLAN for vmx2 has not been created yet.
For now, pfSense has been restored from a backup.
Is there a particular procedure to change the NIC type without reconfiguring the entire pfSense?
Thanks in advance,
Mauro
-
You can rename the interfaces in the config file manually and then restore it.
It's possible to create the new VLANs at the CLI (so vmx6.90) and then assign them to the interfaces but it can get a bit complex if you have a lot.
If you only have that one VLAN you could reassign it to some other interface temporarily (if you have any spare) and then create the new VLAN on vmx after swapping it in and assign it back.
Steve
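For anyone taking the config-file route, here is a minimal sketch in Python of the renaming step, not an official procedure. It assumes the backup is plain XML text in which interface names appear as whole tokens such as em6 or em6.90; the old-to-new mapping and the sample fragment are purely illustrative, so check which emN becomes which vmxN on your own system before restoring anything.

```python
import re

# Hypothetical old -> new names; the real mapping depends on your VM.
RENAMES = {"em0": "vmx0", "em6": "vmx2"}


def rename_interfaces(config_text, renames):
    """Rename whole interface tokens, keeping any VLAN suffix (em6.90 -> vmx2.90)."""
    # Longest names first so a longer name is preferred over its prefix.
    names = sorted(renames, key=len, reverse=True)
    pattern = re.compile(
        r"(?<![\w.])(" + "|".join(re.escape(n) for n in names) + r")(?!\w)"
    )
    return pattern.sub(lambda m: renames[m.group(1)], config_text)


# Illustrative fragment only, not a complete pfSense config.
sample = "<if>em6</if>\n<vlanif>em6.90</vlanif>\n<if>em60</if>"
print(rename_interfaces(sample, RENAMES))
```

The lookaround guards are there so that em60 or a word that merely contains "em6" is left alone. Work on a copy of the backup and diff the result before restoring it.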
-
Hello Stephen,
something changed after moving from the E1000 to the VMXNET3 NIC type.
The problem still seems to be here, but I think that your professional experience could help to resolve the issue.
In order to do some tests without involving the pfSense instance that is in production, I preferred to perform the following steps:
- clone of the production pfSense instance;
- change of the nic type from E1000 to VMXNET3 (only for the WAN and "Public LAN" interfaces);
- update of the pfsense interface names (vmx0 for the WAN, vmx2 for the public LAN);
- assignment of a new public IP (belonging to the same public subnet mentioned in the messages above) to the WAN interface (y.y.y.15/25);
- disconnection of every virtual interface defined on the pfsense instance (except for WAN and public LAN).
After that, I started doing some tests.
Ping from VM (IP .5) to pfSense (IP .15):
output on VM:
ping y.y.y.15 (ping seems to be waiting for the answer)
output on PFSENSE:
[2.5.2-RELEASE][admin@pfSense_LAN_CMCC.home.arpa]/root: tcpdump -i vmx2 host y.y.y.5
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on vmx2, link-type EN10MB (Ethernet), capture size 262144 bytes
15:08:35.989761 ARP, Request who-has y.y.y.5 tell y.y.y.1, length 46
15:08:36.780956 IP y.y.y.5 > y.y.y.15: ICMP echo request, id 19822, seq 65, length 64
15:08:36.780982 ARP, Request who-has y.y.y.5 tell y.y.y.15, length 28
15:08:37.004806 ARP, Request who-has y.y.y.5 tell y.y.y.1, length 46
15:08:37.780931 IP y.y.y.5 > y.y.y.15: ICMP echo request, id 19822, seq 66, length 64
15:08:37.780952 ARP, Request who-has y.y.y.5 tell y.y.y.15, length 28
15:08:38.028828 ARP, Request who-has y.y.y.5 tell y.y.y.1, length 46
15:08:38.780927 IP y.y.y.5 > y.y.y.15: ICMP echo request, id 19822, seq 67, length 64
15:08:38.780965 ARP, Request who-has y.y.y.5 tell y.y.y.15, length 28
15:08:39.780917 IP y.y.y.5 > y.y.y.15: ICMP echo request, id 19822, seq 68, length 64
15:08:39.780966 ARP, Request who-has y.y.y.5 tell y.y.y.15, length 28
15:08:40.780948 IP y.y.y.5 > y.y.y.15: ICMP echo request, id 19822, seq 69, length 64
15:08:40.781007 ARP, Request who-has y.y.y.5 tell y.y.y.15, length 28
Now, the ICMP request can reach PFSENSE, but PFSENSE didn't reply to this request.
It seems that something is blocking this kind of traffic.
Ping from pfSense (IP .15) to VM (IP .5):
[2.5.2-RELEASE][admin@pfSense_LAN_CMCC.home.arpa]/root: ping y.y.y.5
PING y.y.y.5 (y.y.y.5): 56 data bytes
ping: sendto: Host is down
ping: sendto: Host is down
ARP TABLE on PFSENSE:
[2.5.2-RELEASE][admin@pfSense_LAN_CMCC.home.arpa]/root: arp -a
? (192.168.120.1) at 00:0c:29:db:d8:06 on vmx3.192 permanent [vlan]
? (10.0.0.1) at 00:0c:29:db:d8:06 on vmx3.10 permanent [vlan]
? (192.168.44.1) at 00:0c:29:db:d8:d4 on lagg0.44 permanent [vlan]
? (192.168.43.1) at 00:0c:29:db:d8:d4 on lagg0.43 permanent [vlan]
? (192.168.40.1) at 00:0c:29:db:d8:d4 on lagg0.40 permanent [vlan]
? (192.168.34.1) at 00:0c:29:db:d8:d4 on lagg0.34 permanent [vlan]
? (192.168.33.1) at 00:0c:29:db:d8:d4 on lagg0.33 permanent [vlan]
? (192.168.32.1) at 00:0c:29:db:d8:d4 on lagg0.32 permanent [vlan]
? (192.168.31.1) at 00:0c:29:db:d8:d4 on lagg0.46 permanent [vlan]
? (192.168.30.1) at 00:0c:29:db:d8:d4 on lagg0.30 permanent [vlan]
? (y.y.y.5) at 00:0c:29:2b:ee:a6 on vmx2 expires in 1109 seconds [ethernet]
? (y.y.y.15) at 00:0c:29:db:d8:c0 on vmx0 permanent [ethernet]
? (y.y.y.1) at 00:0c:29:02:f1:99 on vmx0 expires in 1200 seconds [ethernet]
? (192.168.100.1) at 00:0c:29:db:d8:f2 on em4 permanent [ethernet]
pfSense_LAN_CMCC.home.arpa (192.168.240.2) at 00:0c:29:db:d8:de on em2 permanent [ethernet]
ARP TABLE on VM:
gateway (y.y.y.1) at <incomplete> on ens192
? (y.y.y.15) at 00:0c:29:db:d8:10 [ether] on ens912
It seems that the bridge is allowing the traffic from VM to pfsense and is blocking the traffic from pfsense to VM.
Do you have some other idea about the cause of this issue?
Many thanks in advance,
Mauro
-
It seems that pfsense is trying to reach the .5 using a different route.
Maybe it is trying to do it via the upstream gateway .1. But VM .5 is not "on internet", it is behind the firewall itself.
-
pfSense there is sending an ARP request for .5 every second. So probably each time it tries to reply to a ping. That implies it never sees a response. And in fact the pcap never shows the VM responding to those requests.
The odd thing there is that the pfSense ARP table appears to have an entry for .5 even though it's still sending queries for it. Was that taken after the pings stopped?
Steve
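To make the "requests but no replies" pattern easier to see in a longer capture, here is a small Python sketch that tallies ARP requests per target and drops any target that also appears in a reply. It assumes the default one-line tcpdump text format shown in this thread (`ARP, Request who-has X tell Y` and `ARP, Reply X is-at MAC`); the function name is made up for illustration.

```python
import re
from collections import Counter

REQUEST = re.compile(r"ARP, Request who-has (\S+) tell (\S+),")
REPLY = re.compile(r"ARP, Reply (\S+) is-at (\S+)")


def unanswered_arp(lines):
    """Return {(target, asker): count} for targets that never appear in a reply."""
    requests = Counter()
    replied = set()
    for line in lines:
        m = REQUEST.search(line)
        if m:
            requests[(m.group(1), m.group(2))] += 1
            continue
        m = REPLY.search(line)
        if m:
            replied.add(m.group(1))
    return {key: n for key, n in requests.items() if key[0] not in replied}


# First lines of the capture posted above.
capture = [
    "15:08:35.989761 ARP, Request who-has y.y.y.5 tell y.y.y.1, length 46",
    "15:08:36.780956 IP y.y.y.5 > y.y.y.15: ICMP echo request, id 19822, seq 65, length 64",
    "15:08:36.780982 ARP, Request who-has y.y.y.5 tell y.y.y.15, length 28",
    "15:08:37.004806 ARP, Request who-has y.y.y.5 tell y.y.y.1, length 46",
]
for (target, asker), count in unanswered_arp(capture).items():
    print(f"{asker} asked for {target} {count} time(s), no reply seen")
```

It only reports what the capture on that one interface contains; a reply that is sent but never reaches the capture point looks the same as a reply that was never sent.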
-
@stephenw10
No it was taken during the ping.
This is what I see when the ping is stopped:
[2.5.2-RELEASE][admin@pfSense_LAN_CMCC.home.arpa]/root: tcpdump -i vmx2 host y.y.y.5
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on vmx2, link-type EN10MB (Ethernet), capture size 262144 bytes
15:31:05.356375 ARP, Request who-has y.y.y.5 tell y.y.y.1, length 46
15:31:06.382707 ARP, Request who-has y.y.y.5 tell y.y.y.1, length 46
15:31:07.406663 ARP, Request who-has y.y.y.5 tell y.y.y.1, length 46
15:31:13.187718 ARP, Request who-has y.y.y.5 tell y.y.y.1, length 46
15:31:14.190695 ARP, Request who-has y.y.y.5 tell y.y.y.1, length 46
15:31:15.214728 ARP, Request who-has y.y.y.5 tell y.y.y.1, length 46
15:31:22.649162 ARP, Request who-has y.y.y.5 tell y.y.y.1, length 46
15:31:23.662695 ARP, Request who-has y.y.y.5 tell y.y.y.1, length 46
15:31:24.686714 ARP, Request who-has y.y.y.5 tell y.y.y.1, length 46
-
Right but you still see an entry in the ARP table in pfSense for .5?
-
@stephenw10 I just checked, I don't see any entry for .5 :-(
Do you think that I should give up?
Is my opinion wrong?
It seems that pfsense is trying to reach the .5 using a different route.
Maybe it is trying to do it via the upstream gateway .1. But VM .5 is not "on internet", it is behind the firewall itself.
-
This looks like a layer 2 issue. pfSense is sending ARP requests and the VM never replies, so it's probably not seeing them. A pcap on the VM would confirm that.
-
I still don't understand why everything works if I assign a static IP to the "public LAN" interface... connectivity stops working as soon as the bridge is enabled.
Why does the VM stop sending ARP replies as soon as I change the IP address?
I don't want to disturb you again, sorry.
I'm asking myself these questions :)
-
Is the VM actually seeing the ARP requests? More likely the virtual interface is not sending them from pfSense because to do so it has to use the wrong MAC address. That's why it needs to be in promiscuous mode. Likely something has to be set in VMWare to allow that for the hypervisor side.
https://kb.vmware.com/s/article/1004099
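Since the MAC address in use matters here, one quick check is to compare which MAC each side has learned for the same IP. The Python sketch below does that with the `arp -a` lines posted earlier in the thread; it assumes entries of the form `(ip) at aa:bb:cc:dd:ee:ff` and skips incomplete ones. It only flags differences, it does not say which side is right.

```python
import re

ENTRY = re.compile(r"\((\S+?)\) at ((?:[0-9a-f]{2}:){5}[0-9a-f]{2})", re.IGNORECASE)


def mac_by_ip(arp_output):
    """Map IP -> MAC for every complete entry in 'arp -a' style text."""
    return {ip: mac.lower() for ip, mac in ENTRY.findall(arp_output)}


def mismatches(table_a, table_b):
    """Return {ip: (mac_in_a, mac_in_b)} for IPs both tables know with different MACs."""
    a, b = mac_by_ip(table_a), mac_by_ip(table_b)
    return {ip: (a[ip], b[ip]) for ip in a.keys() & b.keys() if a[ip] != b[ip]}


# Lines copied from the two ARP tables posted above.
pfsense_table = """\
? (y.y.y.5) at 00:0c:29:2b:ee:a6 on vmx2 expires in 1109 seconds [ethernet]
? (y.y.y.15) at 00:0c:29:db:d8:c0 on vmx0 permanent [ethernet]
? (y.y.y.1) at 00:0c:29:02:f1:99 on vmx0 expires in 1200 seconds [ethernet]
"""
vm_table = """\
gateway (y.y.y.1) at <incomplete> on ens192
? (y.y.y.15) at 00:0c:29:db:d8:10 [ether] on ens912
"""
for ip, (mac_pf, mac_vm) in mismatches(pfsense_table, vm_table).items():
    print(f"{ip}: pfSense has {mac_pf}, VM has {mac_vm}")
```

With a bridge in place a difference may simply reflect which member interface answered, so treat the output as a hint about where to look rather than as a diagnosis.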