Is 'lan only' load balancer/relayd possible?
So either I'm missing something dead obvious or I've hit a bug in the load balancer/relayd.
I have a service that runs on several nodes on the lan. The service is not http, but just an unencrypted, single session, vanilla tcp service. I can 'telnet <lan node> <port>' and have the correct conversation from each of the nodes.
My project is to have the load balancer create a virtual ip on the lan, then distribute/redirect calls made by other lan nodes to that virtual ip to whichever of the server nodes on the lan is up.
Seems simple enough. But try as I might, I can't telnet to the virtual IP on the lan that I configured in the load balancer. I even created a virtual CARP ip on the lan with that address and restarted relayd; that didn't work. I tried setting the load balancer IP to localhost/127.0.0.1 and then NATing requests on the lan to that: no telnet connection.
Tried both DNS and TCP in the load balancer settings: same result.
So, I did a netstat and also an fstat -p on relayd: there's some serious bug in the pfsense load balancer. My choice in the load balancer is 'tcp', yet the os reports that relayd has only opened udp datagram sockets. Something is badly wrong. Latest stable pfsense build.
Result: When the load to be balanced goes in and out the same interface, relayd has to be used in 'relay' mode and not 'redirect' mode. Moreover, 'load balancing' only really works in 'relay' mode; as implemented, what's going on is more 'round robining' than 'load balancing'.
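To illustrate the difference, here is a minimal sketch of what 'relay' mode amounts to. This is not relayd itself and not anything pfsense ships, just a toy TCP proxy in Python, and the function names (serve, handle, connect_first_up, pipe) are made up for the example. The point is that the relay terminates the client's connection and opens its own connection to a server, so the server replies to the relay rather than straight to the client. Servers are tried in round-robin order and ones that refuse the connection are skipped.

```python
import itertools
import socket
import threading

_rotation_lock = threading.Lock()  # protects the shared round-robin iterator


def pipe(src, dst):
    """Copy bytes one way until src closes, then half-close dst."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass


def connect_first_up(backends, rotation, timeout=1.0):
    """Try each backend once, in round-robin order; return the first that accepts."""
    for _ in range(len(backends)):
        with _rotation_lock:
            addr = next(rotation)
        try:
            return socket.create_connection(addr, timeout=timeout)
        except OSError:
            continue  # backend is down or unreachable, try the next one
    return None


def handle(client, backends, rotation):
    """Relay one client session to one backend, in both directions."""
    upstream = connect_first_up(backends, rotation)
    if upstream is None:
        client.close()
        return
    upstream.settimeout(None)  # back to blocking mode for the session
    back = threading.Thread(target=pipe, args=(upstream, client), daemon=True)
    back.start()
    pipe(client, upstream)
    back.join()
    client.close()
    upstream.close()


def serve(listener, backends):
    """Accept clients on an already-listening socket and relay each session."""
    rotation = itertools.cycle(backends)
    while True:
        client, _ = listener.accept()
        threading.Thread(
            target=handle, args=(client, backends, rotation), daemon=True
        ).start()
```

To try it, something like serve(socket.create_server(("0.0.0.0", 5000)), [("192.168.1.11", 5000), ("192.168.1.12", 5000)]) would do, with those addresses and the port being placeholders for your own nodes. Because the servers see the relay's address as the source, the replies come back through the relay even when everything sits on the same subnet.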
pfsense does not appear to support the 'relay' mode of relayd.
To make that work, the clients and servers must be in different subnets, or you need outbound NAT to translate the traffic so it appears to come from the firewall itself.
The problem is this:
Client A -> request -> VIP -> relayd -> Server B
Server B sees Client A's original IP, and shortcuts the response:
Server B -> respond -> Client A
Client A drops the traffic because the response does not match the request: it was sent to the VIP, but the reply comes from Server B's IP (Server B's IP != the VIP).
Switch to manual outbound NAT, add a rule to NAT from the LAN subnet to your server IPs, translating to the Interface address, and then save/apply. It should work that way. The NAT will cause the traffic to flow back via the firewall.
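The flow above, with and without the outbound NAT rule, can be sketched as a toy simulation. This is only a model of the address rewriting, not how pf or relayd is implemented, and all the addresses and function names are made up for the example.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    src: str
    dst: str


# Hypothetical addresses, all on the same LAN subnet.
VIP = "192.168.1.100"      # virtual IP the load balancer listens on
FIREWALL = "192.168.1.1"   # firewall's LAN interface address
CLIENT_A = "192.168.1.50"
SERVER_B = "192.168.1.11"


def redirect(pkt: Packet, source_nat: bool) -> Packet:
    """Firewall rewrites the destination to Server B; with the outbound
    NAT rule it also rewrites the source to its own interface address."""
    src = FIREWALL if source_nat else pkt.src
    return Packet(src=src, dst=SERVER_B)


def server_reply(pkt: Packet) -> Packet:
    """Server B answers whatever source address it saw."""
    return Packet(src=pkt.dst, dst=pkt.src)


def firewall_untranslate(reply: Packet) -> Packet:
    """A reply that passes back through the firewall has both
    translations reversed before it reaches the client."""
    return Packet(src=VIP, dst=CLIENT_A)


def client_accepts(request: Packet, reply: Packet) -> bool:
    """The client only accepts a reply from the address it talked to."""
    return reply.src == request.dst and reply.dst == request.src


def simulate(source_nat: bool) -> bool:
    request = Packet(src=CLIENT_A, dst=VIP)
    forwarded = redirect(request, source_nat)
    reply = server_reply(forwarded)
    if reply.dst == FIREWALL:
        # Only replies addressed to the firewall go back through it.
        reply = firewall_untranslate(reply)
    return client_accepts(request, reply)


if __name__ == "__main__":
    print("without outbound NAT:", simulate(source_nat=False))
    print("with outbound NAT:   ", simulate(source_nat=True))
```

Without the NAT rule the reply goes from Server B straight to Client A and is rejected; with it, Server B answers the firewall, which translates the reply back so that it appears to come from the VIP.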
On the one hand, 9 / 10 on the hack-o-meter. On the other hand, never argue with success. Thanks!
It would be good to add "check script …" to the possible monitors.
It would be good to warn about the problem described above should the 'forward to' and the 'monitor address' be on the same subnet.
It would be good to warn if the virtual address isn't among the addresses this system arps for. I'm assuming that if the address is a carp vip in backup mode the system won't advertise via the load balancer facility that it owns the virtual ip.