CIFS: Pathetic performance across pfSense
-
I am experiencing severe performance problems with CIFS traffic traversing pfSense. iSCSI traffic is unaffected, as is CIFS traffic that stays on the same subnet.
This is a CIFS performance example on the same subnet:
(root@vm4srvp01:/mnt/win/Images)# dd bs=64k count=1000 if=/dev/zero of=test conv=fdatasync
1000+0 records in
1000+0 records out
65536000 bytes (66 MB) copied, 1.11347 seconds, 58.9 MB/s
(root@vm4srvp01:/mnt/win/Images)# dd bs=64k count=1000 if=/dev/zero of=test conv=fdatasync
1000+0 records in
1000+0 records out
65536000 bytes (66 MB) copied, 1.11088 seconds, 59.0 MB/s
(root@vm4srvp01:/mnt/win/Images)# dd bs=64k count=1000 if=/dev/zero of=test conv=fdatasync
1000+0 records in
1000+0 records out
65536000 bytes (66 MB) copied, 1.13448 seconds, 57.8 MB/s
This is an iSCSI performance example on the same subnet:
(root@vm4srvp01:/vmware)# dd bs=64k count=1000 if=/dev/zero of=test conv=fdatasync
1000+0 records in
1000+0 records out
65536000 bytes (66 MB) copied, 0.937938 seconds, 69.9 MB/s
(root@vm4srvp01:/vmware)# dd bs=64k count=1000 if=/dev/zero of=test conv=fdatasync
1000+0 records in
1000+0 records out
65536000 bytes (66 MB) copied, 0.929954 seconds, 70.5 MB/s
(root@vm4srvp01:/vmware)# dd bs=64k count=1000 if=/dev/zero of=test conv=fdatasync
1000+0 records in
1000+0 records out
65536000 bytes (66 MB) copied, 0.931392 seconds, 70.4 MB/s
This is an iSCSI performance example traversing pfSense:
(root@my1mdbp01:/mnt/db)$ dd bs=64k count=1000 if=/dev/zero of=test conv=fdatasync
1000+0 records in
1000+0 records out
65536000 bytes (66 MB) copied, 0.863001 s, 75.9 MB/s
(root@my1mdbp01:/mnt/db)$ dd bs=64k count=1000 if=/dev/zero of=test conv=fdatasync
1000+0 records in
1000+0 records out
65536000 bytes (66 MB) copied, 0.752081 s, 87.1 MB/s
(root@my1mdbp01:/mnt/db)$ dd bs=64k count=1000 if=/dev/zero of=test conv=fdatasync
1000+0 records in
1000+0 records out
65536000 bytes (66 MB) copied, 0.720176 s, 91.0 MB/s
This is an example of the CIFS issue when traversing pfSense:
(root@my1mdbp01:/mnt/win/Images)$ dd bs=64k count=1000 if=/dev/zero of=test conv=fdatasync
1000+0 records in
1000+0 records out
65536000 bytes (66 MB) copied, 55.074 s, 1.2 MB/s
(root@my1mdbp01:/mnt/win/Images)$ dd bs=64k count=1000 if=/dev/zero of=test conv=fdatasync
1000+0 records in
1000+0 records out
65536000 bytes (66 MB) copied, 55.0722 s, 1.2 MB/s
(root@my1mdbp01:/mnt/win/Images)$ dd bs=64k count=1000 if=/dev/zero of=test conv=fdatasync
1000+0 records in
1000+0 records out
65536000 bytes (66 MB) copied, 55.0715 s, 1.2 MB/s
So this appears to be a protocol problem rather than an infrastructure issue, since there is no bandwidth limitation. The issue was first noticed on two different Windows 7 clients before being tested on the Linux box, so it is not specific to a particular OS or flavor. About the only thing these clients have in common is the pfSense box.
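For what it's worth, a raw TCP throughput test between the two hosts rules out a link-level bottleneck independently of any storage protocol. A minimal sketch using iperf (assuming it's installed on both boxes; hostnames are just mine):

# On the far side of pfSense (server):
iperf -s

# On the near side (client), run for 30 seconds:
iperf -c vm4srvp01 -t 30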
Any clue what might be causing this?
-
So how exactly are you mounting it via CIFS? And is it really CIFS, or SMB, or SMB2?
I would like to duplicate your testing.
-
This is the /etc/fstab entry:
\\MyServer\MyShare /mnt/win cifs user,uid=500,rw,suid,username=MyUser,password=MyPasswd 0 0
I believe it's actually SMB (over port 445). Of course, Windows is far less transparent regarding the protocol it's negotiating.
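One way to confirm the negotiated dialect on the Linux side rather than guessing: the cifs kernel module exposes its active sessions, and a recent enough module also accepts an explicit vers= mount option. A sketch (vers= support depends on your kernel; 1.0 is classic CIFS/SMB1, 2.0 and 2.1 are SMB2 dialects):

# Show what the cifs module negotiated for the current mounts:
cat /proc/fs/cifs/DebugData

# Or pin a dialect at mount time and compare throughput:
mount -t cifs //MyServer/MyShare /mnt/win -o username=MyUser,vers=1.0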
Wireshark reports the following SMB Service Response Time Statistics:
SMB Commands
Index  Procedure        Calls  Min SRT   Max SRT   Avg SRT
47     Write AndX        1000  0.007591  0.058489  0.016606
50     Trans2               4  0.000328  0.025732  0.006738
4      Close                1  0.000463  0.000463  0.000463
5      Flush                1  0.000230  0.000230  0.000230

Transaction 2 Sub-Commands
Index  Procedure        Calls  Min SRT   Max SRT   Avg SRT
8      SET_FILE_INFO        2  0.000328  0.000410  0.000369
5      QUERY_PATH_INFO      1  0.000481  0.000481  0.000481
6      SET_PATH_INFO        1  0.025732  0.025732  0.025732
It also gives the following overall statistics:
Avg. packets/sec   1201.671
Avg. packet size   995 bytes
Bytes              70666546
Avg. bytes/sec     1195960.346
Avg. Mbit/sec      9.5868
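Those tables came from Wireshark's SMB SRT statistics; the same summary can be pulled from a saved capture on the command line with tshark's equivalent tap, if that's easier for anyone reproducing this (capture.pcap is just a placeholder filename):

# SMB Service Response Time statistics from a capture file:
tshark -q -z smb,srt -r capture.pcap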
-
It just occurred to me that I had a traffic shaper enabled, specifically CODELQ. I tried to delete those queues, but after applying the changes I lost all connectivity to the box. I used the console to restore a configuration from before the deletion and then restarted the box. Once I had control back I deleted the queues again, and this time they are gone and the box is still running.
I repeated the CIFS test and the performance problem appears to be resolved. But now the question becomes: why would the traffic shaper cause that?
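In case it helps anyone debugging something similar: with ALTQ shaping active (CODELQ is an ALTQ discipline on pfSense), the queue counters are visible from a shell on the firewall, and the dropped-packet counters show whether the shaper is discarding traffic. A quick check, run while the slow transfer is in progress:

# Verbose ALTQ queue statistics; rising "dropped" counters point at the shaper:
pfctl -v -s queue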