I've replaced the switches, no change.
I also tried a different computer, with totally different hardware, same problem.
I'm using the latest version, 2.0 BETA 3.
Now, I'm wondering if this is a load problem. I'm testing at between 100 and 300 Megabits/sec.
The only thing I've noticed is that when transferring at 100 Mbit (the max of the new test machine), "top" shows the interrupts at between 45% and 60%. On the other systems it was the same story. At that load level, I was getting a dropout about once every 5 minutes. My test is simply an scp of about 100 gigabytes of data between two computers.
Now, when I limited the speed on the transmitting computer to about 20 Mbit, the dropouts were much fewer; I would guess that I saw the first dropout after about 8 minutes. The interrupts were bouncing between 7% and 15%, but were usually below 10%.
Next test, same hardware, only transferring at 10 Mbit. The interrupts were mostly less than 5%. After 15 minutes, no dropouts.
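For a rough sense of how much data moved before each dropout, here is a quick back-of-the-envelope sketch in Python. It assumes the link actually ran at the nominal rate for the whole interval, which I did not measure, and the function name is just something made up for illustration.

```python
# Rough estimate of how much data moved in each test run.
# Assumption: the transfer ran at the nominal rate the whole time.

def megabytes_moved(rate_mbit_per_s: float, minutes: float) -> float:
    """Return decimal megabytes transferred at a constant rate."""
    seconds = minutes * 60
    return rate_mbit_per_s * seconds / 8  # 8 bits per byte

# (rate in Mbit/s, minutes observed, whether a dropout was seen)
observations = [
    (100, 5, True),
    (20, 8, True),
    (10, 15, False),
]

for rate, minutes, dropped in observations:
    moved = megabytes_moved(rate, minutes)
    outcome = "dropout" if dropped else "no dropout"
    print(f"{rate} Mbit/s for {minutes} min: about {moved:.0f} MB, {outcome}")
```

By this estimate the 20 Mbit run and the 10 Mbit run moved a similar amount of data, yet only one of them dropped. These are single observations, though, so I wouldn't read much into it.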
I tried turning off the hardware checksumming. I found that I had to reboot the system for the setting to take effect. Unfortunately, no change.
I did get an interrupted connection with scp at the 20 Mbit level. Here is the error:
read from remote host 192.168.230.59: Connection reset by peer
lost connection
Finally, I tried it with polling enabled. At 100 Mbit, the CPU was at 98+%, but the interrupts were at 0% (as expected). Unfortunately, at about 5-6 minutes, it dropped again.
At about 20 Mbit, it dropped again at 10 minutes.
I'm going to keep monitoring this, but for now will have to go with an alternative solution.
Bummer.
JBB