Yes, the ring setup is to simplify and to test any possible solutions. The real system has 8 nodes, so 28 tunnels according to your equation - so some connections will go over double hops, especially those which do not see a lot of traffic. Obviously, finding a cost distribution which fixes the issue is not that easy either with 8 nodes.
If it were possible to have ACK traffic be forced to go the same way SYN traffic came, problem solved. Jimp hinted at something in the other thread, with a solution in 2.5.0, but it doesn't seem to work yet. (See here).