Very interesting comments.
wallabybob, I don't think that iperf is the bottleneck. We ran several tests with iperf as both the client and the server on the same machine and got very high throughput. I don't have the numbers from the workstations we were testing on, but I was just able to get 2.43 Gbps running both the iperf server and client on an old Atom 230 server that I have handy. We also ran the test with the workstations on the same subnet/VLAN, taking the routing performance of the pfSense box out of the equation, and got much better numbers.
idmud, thanks for the suggestion. I'll give that a try when we have a spare moment.
cmb, I understand the issue with a single source-destination pair and LACP. In all cases we were using at least three pairs of machines, with several tests using eight pairs. Your point about the switches is well taken and probably spot-on. I may PM you regarding suggestions for better switches. (I have some money left in a budget for experimentation, only a month to make use of it, and switches had my attention anyway.)
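
For anyone following along on the LACP point, here is a rough toy sketch of why the number of source-destination pairs matters. It assumes the LAG picks a member link by hashing the source and destination address, which is only one possible policy; real switches use their own (often configurable) hash, so the CRC32 hash, the function names, and the IP addresses below are all made up for illustration.

```python
import zlib
from collections import Counter


def pick_link(src_ip: str, dst_ip: str, num_links: int) -> int:
    """Toy stand-in for a LAG hash: map a src/dst pair to one member link.

    Assumption: CRC32 of the address pair, modulo the link count. Actual
    switch hashing differs by vendor and configuration.
    """
    key = f"{src_ip}->{dst_ip}".encode()
    return zlib.crc32(key) % num_links


def distribute(pairs, num_links: int) -> Counter:
    """Count how many src/dst pairs land on each member link."""
    return Counter(pick_link(src, dst, num_links) for src, dst in pairs)


if __name__ == "__main__":
    links = 4  # hypothetical 4-port LAG

    # One pair always hashes to the same link, however many streams it runs.
    print("single pair:", distribute([("10.0.0.1", "10.0.1.1")] * 5, links))

    # Several pairs may spread out, but nothing guarantees an even split.
    for n in (3, 8):
        pairs = [(f"10.0.0.{i}", f"10.0.1.{i}") for i in range(1, n + 1)]
        print(f"{n} pairs:", dict(distribute(pairs, links)))
```

With only a few pairs, two of them can easily end up on the same link while another link sits idle, so more pairs helps but does not guarantee that the aggregate is filled.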