HAProxy TCP/use client ip and carp cluster problem

brlamnr

@piba said in HAProxy TCP/use client ip and carp cluster problem:

@brlamnr said in HAProxy TCP/use client ip and carp cluster problem:

The handshake sequence looks OK (as seen in the tcpdump captures from both the LAN and WAN interfaces). The problem starts after the client sends the "Client Hello": the server answers with "Server Hello, Change Cipher Spec", but this packet doesn't show up on the WAN interface and, of course, never reaches the client.

If you take 1 of the 2 pfSense systems offline, does that 'fix' the issue? Even if you still use CARP on that then-single box?

No, it didn't fix the issue. I started with only 1 pfSense in the CARP pair (I hadn't built the 2nd one yet), and the problem started there. I then added the 2nd one, thinking HAProxy would be expecting it. No luck, same behavior.

I disabled CARP manually on the primary and the backup became master, but nothing changed; it still didn't work.

Do you have pfSync state synchronization enabled? Are the real interfaces ordered the same on both boxes (wan=em1 lan=em2 and wan=em1 lan=em2, or something similar)? Does the firewall log show anything? Can you disable pfSync for testing? As for the tcpdump, please do check the MAC addresses in the traffic, not only the IP addresses; the -e parameter of tcpdump shows them. Though I guess if the first TCP handshake succeeds, those are not the actual issue.

pfSync synchronization is enabled, and it also works for synchronizing HAProxy's settings. When I started, I only had 1 firewall up, and pfSync wasn't active then. I will try disabling it and test later, and will report back.

The order of the interfaces is correct. In my case, I am using VLANs, wan=cxl0.78, LAN=cxl0.79

There is nothing in the logs (checked using cli clog).

The tcpdump sessions were written to a file, and later reviewed with Wireshark. Now that I took a closer look at the MAC addresses, I just noticed the following (capture on the pfsense interface facing the server):

Incoming packets generated by the client show the SRC MAC address as the adapter's physical MAC (in my case, the OUI is from Chelsio); the DST MAC is the server's MAC (OUI is from VMware).
Outbound packets (answers from the server) show the SRC MAC as per above, but the DST MAC address is now the VRRP/CARP MAC address.
This pattern is the same during the handshake (which works) and up to the point where it stops (the reply to the Client Hello).
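
When reviewing the MAC addresses from a capture like this, a small helper can tell a CARP/VRRP virtual MAC apart from an adapter MAC. The sketch below is illustrative only: the function name `classify_mac` is made up here, and it assumes IPv4 CARP/VRRP virtual MACs use the prefix 00:00:5e:00:01 with the VHID/VRID in the last octet.

```shell
# Hypothetical helper (not from the thread): classify a MAC address seen in a
# capture as a CARP/VRRP virtual MAC or something else.
# Assumption: IPv4 CARP/VRRP virtual MACs look like 00:00:5e:00:01:XX,
# where XX is the VHID/VRID in hex.
# The MACs themselves can be shown at capture time with tcpdump's -e flag,
# e.g. (interface name taken from the thread):
#   tcpdump -e -n -i cxl0.79 tcp port 443
classify_mac() {
    # Normalise to lower case so the pattern match is case-insensitive.
    mac=$(printf '%s' "$1" | tr 'A-F' 'a-f')
    case "$mac" in
        00:00:5e:00:01:[0-9a-f][0-9a-f])
            # Convert the last octet from hex to decimal to get the VHID.
            echo "carp-virtual vhid=$((0x${mac##*:}))"
            ;;
        *)
            echo "physical-or-other"
            ;;
    esac
}
```

Calling `classify_mac` on the DST MAC of the server's replies would report the VHID the server is answering to, which can then be compared with the VHID configured on the LAN-side CARP VIP.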

Since it worked when no CARP was in place, it appears the problem would be in the reply to the VRRP/CARP MAC (?), but then why does the reply to the SYN get back fine?

Thanks.

PiBa

@brlamnr
You do have the vSwitch of ESX configured to allow spoofing and promiscuous mode? I'm not even sure packets should ever be using the MAC of the VMware host itself. Or do you have the real hardware NICs passed through to the pfSense VM? I've never done that.

brlamnr

That's a good point. While preparing to deploy the virtual pfSense that will eventually replace the ones above, I read about the vSwitch requirements. I didn't think about it, since the ones with the problem are physical appliances connected to a physical switch, but you never know. I will ask to have those settings enabled on the vSwitch (or port group) that the servers connect to. Will report back.

Thanks.

PiBa

@brlamnr
Only the pfSense VM that is using the CARP IP would need to have such special vSwitch 'permissions'. The webservers should not need it. If the pfSense boxes are currently still on hardware, that should not be required. I'm kinda running low on ideas though.

brlamnr

The servers are actually virtual servers; the pfSense boxes are physical appliances. I'll give it a try tomorrow anyway, nothing to lose.

brlamnr

The changes on the vSwitch didn't make any difference. As soon as client-ip is turned on, the client stops seeing the server.

PiBa

The 'real' clients connect to HAProxy, and that connection is likely still working properly, as nothing changes on that side when enabling use-client-ip on HAProxy for the backend connection.
HAProxy, however, probably no longer sees the reply from the server, which is strange, as you see the replies in the packet captures.

Do you have any plugins like Suricata/Snort running? Or do you use the captive portal, which also uses ipfw for some 'low level' firewall tasks?

brlamnr

No, there are no plugins or captive portals. The appliance was configured to do load balancing only.

PiBa

@brlamnr
Can you check the output of the command 'ipfw show'?

brlamnr

This is the output after activating client-ip:

00010 0 0 fwd ::1 tcp from 10.3.128.10 443 to any in recv cxl0.79
00011 108 20412 fwd ::1 tcp from 10.3.128.11 443 to any in recv cxl0.79
65535 48732381 4651172490 allow ip from any to any

PiBa

@brlamnr
Hmm, it looks like that has IPv6 and IPv4 mixed together.

Mine currently looks like:

00012           0              0 fwd 127.0.0.1 tcp from 192.168.8.15 444 to any in recv em1

Maybe that is where the root cause lies.
Can you try adding a rule manually?

ipfw add 50 fwd 127.0.0.1 tcp from 10.3.128.11 443 to any in recv cxl0.79

Try this even though the existing rule is counting traffic and so 'seems' to work.
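
To spot this kind of mixed-family rule in a longer rule list, the 'ipfw show' output can be filtered with a few lines of awk. This is only a rough sketch: the function name `find_mixed_fwd_rules` is made up for illustration, it assumes the rule layout shown in this thread (a `fwd <target>` followed later by `from <address>`), and it treats any address containing a colon as IPv6. It only reports a mismatch; it does not establish whether the mismatch is the cause of the problem.

```shell
# Hypothetical helper (not part of pfSense or ipfw): read `ipfw show` output
# on stdin and report fwd rules whose forward target and source address are
# of different address families (e.g. fwd ::1 with an IPv4 source).
find_mixed_fwd_rules() {
    awk '{
        target = ""; src = ""
        # Pick up the word after "fwd" (target) and after "from" (source).
        for (i = 1; i <= NF; i++) {
            if ($i == "fwd")  target = $(i + 1)
            if ($i == "from") src = $(i + 1)
        }
        # Skip rules that are not fwd rules or have no concrete source.
        if (target == "" || src == "" || src == "any") next
        # Assumption: a colon in the address means IPv6.
        t6 = (index(target, ":") > 0)
        s6 = (index(src, ":") > 0)
        if (t6 != s6) print "mixed: rule " $1 " fwd " target " from " src
    }'
}
```

On the firewall it could be used as `ipfw show | find_mixed_fwd_rules`; no output means no mismatched fwd rule was found under the assumptions above.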

brlamnr

It didn't work. Same behavior. Thanks.