Jetway JNF9D-2550 + Jetway AD3RTLANG 3 x Gigabit LAN Port Daughter Board

n2qcn

the pciconf output confirms wallabybob's theory that the two internal (re0, re1) are each on their own pci express lane and the three others (re2, re3, re4) share one pci express lane (to a shared PCI rev.2.3, 32-bit, 33/66MHz) and hence have much lower potential throughput. funny how the slower interfaces aren't an issue :-)

differences between 8111E and 8111EVL chips

the 8111EVL is older than the 8111C, and uses pci express 1.0a rather than the 1.1a interface used by RTL8111C, RTL8111E and Intel 82574L. Not that it should make any difference, just pointing out it's a very old design.

elgo

@wallabybob:

Reports so far suggest the problem might be related to heavy receive traffic. PERHAPS the device experiences receive overflow and the driver doesn't really know how to recover.

I haven't found any way of setting "flow control" in FreeBSD. (Receiver sends XOFF when it wants transmitter to stop, sends XON when transmitter can resume.) Maybe it needs to be set on these devices because they have minimal buffering and the driver doesn't set it. Maybe it can't be set. Maybe it is set and the other end is ignoring it.

@wallabybob:

It could be interesting to see what numbers come out of the following test:
While iperf (or similar) test is running and before it terminates, give shell commands

vmstat -i ; sleep 10 ; vmstat -i

and

netstat -i ; sleep 10 ; netstat -i
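
A rough sketch of how the two `vmstat -i` snapshots could be turned into a per-source interrupt rate. The function name `irq_delta` is hypothetical, and the parsing assumes each data line ends with a total and a rate column, with the interrupt source name in front of them; adjust if your output looks different.

```sh
# Sketch only: interrupts per second for each source between two saved
# "vmstat -i" snapshots. Assumes data lines look like "<name> <total> <rate>".
# Arguments: first snapshot file, second snapshot file, seconds between them.
irq_delta() {
    awk -v secs="$3" '
        $1 == "interrupt" || $1 == "Total" { next }   # skip header and summary
        NF >= 3 {
            total = $(NF - 1)
            name = $0
            # strip the two trailing numeric columns to leave the source name
            sub(/[ \t]+[0-9]+[ \t]+[0-9]+[ \t]*$/, "", name)
            if (FILENAME == ARGV[1])
                first[name] = total
            else if (name in first)
                printf "%s %d\n", name, (total - first[name]) / secs
        }
    ' "$1" "$2"
}

# Possible use on the pfsense box while the test is running:
#   vmstat -i > /tmp/a ; sleep 10 ; vmstat -i > /tmp/b
#   irq_delta /tmp/a /tmp/b 10
```

The rate column printed by `vmstat -i` itself is an average since boot, which is why the difference between two snapshots is more telling during a short test.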

Following this idea, I changed the test case slightly: indeed, I noticed that when the pfsense box and the server were directly connected (without any switch), the server reported "Flow control is off for TX and off for RX" (Broadcom BCM5723 with the tigon3 Linux driver). That's when the issue was easy to trigger.

pfsense
  |		  
8111EVL	
  |	
  |		  
BCM5723	 
  |		 
server

I now have a "switch" (some 450G cheap crap from mikrotik, but working (slowly but) perfectly now that it runs openwrt) inserted between the pfsense box and the server. The server now proudly reports "Flow control is on for TX and on for RX". Is there a way to see the flow control state from pfsense? I can't see it from the switch device right now. I'd like to confirm whether it's the Realtek 8111-EVL that doesn't support flow control or whether it's some sort of failed negotiation/capability advertisement.

pfsense
  |		  
8111EVL	
  |
  |
switch	
  |
  |		  
BCM5723	 
  |		 
server

And guess what? I can't reproduce the issue with this setup. I can get 13k-18k interrupts/s (measured with vmstat -w 5) with UDP traffic induced by the same nc commands as before. Got 80+ MB/s with UDP in both directions without a hiccup.

So… Is there some flow control guru around? ;)

elgo

Found this: Flow control in FreeBSD

Running the command when I can't reproduce the issue (that is, with the switch):

ifconfig -m re1
re1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=2098<VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC>
	capabilities=39db<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,POLLING,VLAN_HWCSUM,TSO4,WOL_UCAST,WOL_MCAST,WOL_MAGIC>
	ether 00:30:18:a3:2b:b3
	inet 192.168.10.200 netmask 0xffffff00 broadcast 192.168.10.255
	inet6 fe80::230:18ff:fea3:2bb3%re1 prefixlen 64 scopeid 0x2
	nd6 options=1<PERFORMNUD>
	media: Ethernet autoselect (1000baseT <full-duplex,master>)
	status: active
	supported media:
		media autoselect mediaopt flowcontrol
		media autoselect
		media 1000baseT mediaopt full-duplex,flowcontrol,master
		media 1000baseT mediaopt full-duplex,flowcontrol
		media 1000baseT mediaopt full-duplex,master
		media 1000baseT mediaopt full-duplex
		media 1000baseT mediaopt master
		media 1000baseT
		media 100baseTX mediaopt full-duplex,flowcontrol
		media 100baseTX mediaopt full-duplex
		media 100baseTX
		media 10baseT/UTP mediaopt full-duplex,flowcontrol
		media 10baseT/UTP mediaopt full-duplex
		media 10baseT/UTP
		media none

Errr, so no flow control?

–
edit:
indeed, no flow control.

ifconfig re1 media autoselect mediaopt flowcontrol
ifconfig -m re1
re1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
[...]
	media: Ethernet autoselect <flowcontrol> (1000baseT <full-duplex,flowcontrol,rxpause,txpause>)

Now I have flow control.
I'll run some more tests without the switch in a couple of days, playing with this flow control feature.
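
For anyone repeating this, here is a small sketch that reads `ifconfig -m` output and reports whether flow control is active and whether the driver lists it as supported. The function name `flowcontrol_state` is made up for illustration, and the parsing only assumes the layout shown in the two outputs above (a `media:` line for the active media, then a `supported media:` list).

```sh
# Sketch only: summarise flow control from "ifconfig -m <if>" output on stdin.
# Assumes the active media is on a line containing "media: " and that the
# supported modes follow a "supported media:" line.
flowcontrol_state() {
    awk '
        /^[ \t]*supported media:/ { in_list = 1; next }
        !in_list && /media: / && /flowcontrol/ { active = 1 }
        in_list && /mediaopt/ && /flowcontrol/ { supported = 1 }
        END {
            printf "active=%s supported=%s\n", (active ? "yes" : "no"), (supported ? "yes" : "no")
        }
    '
}

# Possible use:
#   ifconfig -m re1 | flowcontrol_state
```

With the first output above this would report active=no supported=yes, which is the "no flow control by default" situation.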

wallabybob

Good work discovering the flow control mediaopt. It is not documented in either the ifconfig man page or the re man page.

Guest

If you'd like me to run any of your tests let me know.
All 5 ports on my system are the Realtek ones as described in the first post.

I played with the flow control setting; the only difference I saw was that this was posted in the system log when flow control was enabled using your command line:
'kernel: interrupt storm detected on "irq256:"; throttling interrupt source'

This is when running the iperf server on the motherboard's interface. It seems to max out at about 300 Mbps.

What is the advantage of enabling flow control in this case? I had always assumed that it was always used.

elgo

@JoeMcJoe:

If you'd like me to run any of your tests let me know.
All 5 ports on my system are the Realtek ones as described in the first post.

Sure, if you can run the very same nc test, which involves only one other box and the pfsense one, on the 2 EVIL (onboard) NICs, that would be really valuable :)
Being sure that my issue is not due to a failing motherboard sample would be priceless.

To describe the method I use:
On the box generating traffic, for example under Linux:

cat /dev/zero | nc <ip_pfsense_box> <a_reachable_port>

On the pfsense box to receive traffic:

nc -l <same_ip_pfsense_box_previous_command> <same_port_previous_command> > /dev/null

To generate more throughput, add "-u" to both nc commands to use UDP mode instead of TCP mode. Of course, you can do the opposite and generate traffic from the pfsense box with similar commands (or do both if you are in a nasty mood).
To see the current interrupt rate (int/s averaged over 5 sec), look at the "in" column in the output of the following command:

vmstat -w 5

Whether the two boxes are directly connected (no switch) or not seems to matter too. So if you can, please specify your exact hardware test setup.
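
To make results easier to compare between testers, the interrupt column can be pulled out of saved `vmstat -w 5` output. This is only a sketch: `peak_interrupts` is a made-up name, and it assumes the usual two header lines, with the second one starting with `r b w` and containing an `in` field.

```sh
# Sketch only: print the highest "in" (interrupts/s) value seen in
# "vmstat -w 5" output read from stdin. The column is located from the
# header row because its position shifts with the number of disks listed.
peak_interrupts() {
    awk '
        $1 == "r" && $2 == "b" {
            for (i = 1; i <= NF; i++) if ($i == "in") col = i
            next
        }
        col && $col ~ /^[0-9]+$/ {
            if ($col + 0 > peak) peak = $col + 0
        }
        END { print peak + 0 }
    '
}

# Possible use: save the output during a test run, then
#   peak_interrupts < /tmp/vmstat.log
```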

@JoeMcJoe:

I played with the flow control setting; the only difference I saw was that this was posted in the system log when flow control was enabled using your command line:
'kernel: interrupt storm detected on "irq256:"; throttling interrupt source'

This is when running the iperf server on the motherboard's interface. It seems to max out at about 300 Mbps.

What is the advantage of enabling flow control in this case? I had always assumed that it was always used.

That new message is useful too. It means that the last driver tunable I haven't tested is worth trying in your case. According to the re documentation:
@re:

dev.re.%d.int_rx_mod
Maximum amount of time to delay receive interrupt processing in
units of 1us. The accepted range is 0 to 65, the default is
65(65us). Value 0 completely disables the interrupt moderation.
The interface need to be brought down and up again before a
change takes effect.

Of course, I would try setting it to 0 to see if it breaks something else in a way that could be meaningful :)
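
As a sketch of what that change involves, the helper below only prints the commands instead of running them, so they can be reviewed first. `plan_rx_mod` is a hypothetical name; the 0 to 65 range and the down/up step come from the man page excerpt above, and the interface is assumed to be named `re` followed by the unit number.

```sh
# Sketch only: print (not run) the commands for changing receive interrupt
# moderation on an re interface. Arguments: unit number, delay in microseconds.
plan_rx_mod() {
    unit=$1
    usec=$2
    case $usec in
        ''|*[!0-9]*) echo "int_rx_mod must be an integer" >&2; return 1 ;;
    esac
    if [ "$usec" -gt 65 ]; then
        echo "int_rx_mod must be between 0 and 65" >&2
        return 1
    fi
    echo "sysctl dev.re.$unit.int_rx_mod=$usec"
    # Per the man page, the interface must go down and up for the change to apply.
    echo "ifconfig re$unit down"
    echo "ifconfig re$unit up"
}

# Possible use: review the output of "plan_rx_mod 1 0", then run it with
#   plan_rx_mod 1 0 | sh
```

Taking the interface down will drop any session that goes through it, so this is best done from the console or over another NIC.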

I always assumed flow control was negotiated and likely to be on by default too, but my case proves that it may not be :) I'll probably work on that on Monday.

Guest

Will do this weekend. I have to set up a client Linux system, unless there is a command in Windows I can use. Will let you know.

n2qcn

I can turn on and off flowcontrol on my switch and don't see any media changes with seedrs's amd64 re driver for 2.0.2.

Using the

one % nc -l 5555 > /dev/null &
one % ssh two
two % cat /dev/zero | nc one 5555

test, my pfsense box can generate traffic and still have enough cpu remaining to let pings pass through it.

procs      memory      page                    disks     faults         cpu
 r b w     avm    fre   flt  re  pi  po    fr  sr ad4 da0   in   sy   cs us sy id
 0 0 0   2187M  5806M   485   0   0   1   489   0   0   0  212   16  880 15  6 79
 0 0 0   2187M  5806M     1   0   0   1     1   0   3   0   75  311 2489  0  0 100
 0 0 0   2187M  5806M   310   0   0   0   319   0   3   0   75  844 2426  0  0 100
 0 0 0   2187M  5806M     2   0   0   0     1   0   0   0   67  255 2382  0  0 100
 0 0 0   2187M  5806M     0   0   0   0     0   0   0   0   69  297 2387  0  0 100
 1 0 0   2201M  5805M    40   0   0   0     9   0   0   0 20248 334214 81340  6 34 60
 1 0 0   2201M  5805M  2347   0   0   0  2405   0   3   0 22065 364382 88527  6 38 56
 1 0 0   2201M  5805M     3   0   0   0     1   0   0   0 22521 363193 88808  5 37 58
 1 0 0   2201M  5805M     1   0   0   3     1   0  11   0 21971 362757 87897  6 38 56
 1 0 0   2201M  5805M     1   0   0   0     1   0   0   0 22463 362896 88939  6 37 57
 0 0 0   2201M  5805M     5   0   0   0     0   0   0   0 22081 362817 88033  6 38 56
 0 0 0   2201M  5805M     5   0   0   0     1   0   0   0 21558 362825 86785  7 38 55
 0 0 0   2187M  5806M     2   0   0   0    27   0   2   0 1602 25553 8316  0  3 97
 0 0 0   2187M  5806M   309   0   0   0   318   0   0   0   70  552 2419  0  0 100
 0 0 0   2187M  5806M     2   0   0   1     2   0   8   0  120  383 2630  0  0 100
 0 0 0   2187M  5806M     1   0   0   0     1   0   0   0   62  124 2350  0  0 100
 0 0 0   2187M  5806M     2   0   0   0     0   0   0   0   77  315 2424  0  0 100

            input        (Total)           output
   packets  errs idrops      bytes    packets  errs      bytes colls
      6.3K     0     0       609K       166K     0        72M     0
      6.5K     0     0       749K       167K     0        72M     0
      6.4K     0     0       693K       166K     0        72M     0
      6.5K     0     0       782K       166K     0        72M     0
      6.6K     0     0       791K       166K     0        72M     0
      6.5K     0     0       729K       166K     0        72M     0
      6.4K     0     0       733K       166K     0        72M     0
      6.5K     0     0       813K       166K     0        72M     0
      6.6K     0     0       909K       166K     0        72M     0
      6.1K     0     0       696K       156K     0        68M     0
       356     0     0       342K        757     0       349K     0
       488     0     0       474K       1.0K     0       486K     0
       525     0     0       539K       1.1K     0       551K     0
       395     0     0       378K        780     0       384K     0

but when my iMac generates the traffic, one core of my pfsense i3-3220 is completely saturated and it drops pings passing through pfsense.


procs      memory      page                    disks     faults         cpu
 r b w     avm    fre   flt  re  pi  po    fr  sr ad4 da0   in   sy   cs us sy id
 0 0 0   2187M  5806M   485   0   0   1   489   0   0   0  212    4  878 15  6 79
 0 0 0   2187M  5806M    69   0   0   0     1   0   0   0   73  584 2401  0  0 100
 0 0 0   2187M  5806M     1   0   0  24     0   0  43   0  138  287 2716  0  0 100
 0 0 0   2187M  5806M   312   0   0   0   319   0   0   0  105  657 2539  0  0 100
 0 0 0   2194M  5805M    18   0   0   0     4   0   0   0   78  292 2415  0  0 100
 0 0 0   2194M  5805M     1   0   0   0     1   0   0   0   39  242 2281  0  0 100
 2 0 0   2194M  5805M     1   0   0   0     1   0   6   0  302 619572 151937 15 44 41
 2 0 0   2194M  5805M     3   0   0   0     1   0   0   0  933 757011 129951 13 53 34
 0 0 0   2194M  5805M  2345   0   0   1  2383   0   6   0  976 708221 142489 15 51 33
 2 0 0   2194M  5805M     2   0   0   0     1   0   0   0  398 738376 145782 13 51 35
 2 0 0   2194M  5805M     2   0   0   0     0   0   0   0 1101 700710 142078 16 50 34
 2 0 0   2194M  5805M     3   0   0   0     1   0   0   0  284 696816 163463 16 49 35
 2 0 0   2194M  5805M     2   0   0   0     1   0   5   0  889 700871 148974 16 50 35
 2 0 0   2194M  5805M     2   0   0   0     1   0   0   0  364 723701 154982 13 50 37
 2 0 0   2194M  5805M   309   0   0   3   314   0   7   0 1772 729066 118540 15 53 33
 0 0 0   2194M  5805M     3   0   0   0     1   0   0   0  497 439803 92152  9 31 60
 0 0 0   2194M  5805M     1   0   0   0     1   0   3   0  102  485 2505  0  0 100
 0 0 0   2194M  5805M     2   0   0   0     1   0   0   0   47  247 2320  0  0 100
 0 0 0   2194M  5805M     0   0   0   0     2   0   6   0   38  193 2277  0  0 100

            input        (Total)           output
   packets  errs idrops      bytes    packets  errs      bytes colls
       285     0     0       258K        531     0       262K     0
       212     0     0       169K        349     0       166K     0
       408     0     0       395K        864     0       403K     0
       372     0     0       365K        828     0       368K     0
       390     0     0       369K        806     0       378K     0
        80     0     0        25K         50     0        22K     0
       51K     0     0        72M        44K     0       3.0M     0
       79K     0     0        95M        51K     0       3.3M     0
       79K     0     0        99M        57K     0       3.6M     0
       79K     0     0       110M        71K     0       4.8M     0
       79K     0     0        95M        50K     0       3.3M     0
       94K     0     0       111M        59K     0       3.8M     0
       91K     0     0       110M        61K     0       3.9M     0

I don't see any re errors in dmesg. I'm not sure what this shows other than my mac is faster than my pfsense box. :-)

elgo

Enabling flow control has no impact on my issue :'(
(what follows has been tested with and without flow control enabled on the 8111 EVL. I verified that the Linux server on the other end of the link reported symmetric flow control when it was enabled)

Precise test case: pfsense box directly connected to a Linux server, generating UDP traffic toward the pfsense box.

linux server: cat /dev/zero | nc pfsense 10000 -u
pfsense: nc -u -l pfsense 10000 > /dev/null

This alone doesn't suffice to trigger the watchdog timeout; it reaches 15k int/s.

I then add some TCP traffic from the pfsense box:

linux server: nc -l -p 10000 > /dev/null
pfsense: cat /dev/zero | nc linux_server 10000

And bam, interrupts skyrocket to 26k int/s, and "watchdog timeout"

re1: watchdog timeout
re1: link state changed to DOWN
re1: link state changed to UP

Ok, now to check whether it's an int/s related issue, I disable interrupt rate limiting (flow control is on, even if we don't care now…)

sysctl dev.re.1.int_rx_mod=0
dev.re.1.int_rx_mod: 65 -> 0

Generating only UDP toward the pfsense box now gives 55k int/s… without the watchdog barking. Damn.
Generating TCP from pfsense to the Linux server... brings int/s down?? I stop, retry, get 176k int/s... and watchdog. Ok, so nothing interrupt-related at first sight.
But this time, the link won't come back after terminating the frame-generating processes. I try from the pfsense box:

ping linux_server
PING linux_server (192.168.10.1): 56 data bytes
ping: sendto: No buffer space available
ping: sendto: No buffer space available
^C

Woot! Something new when I finally got nothing left to look at.

Soooo, this time I don't know a thing about network-related buffers on FreeBSD (I'll take a look though). Someone to the rescue? :)

–
edit:
leaving the pfsense box "as is" for a couple of minutes, without rebooting. The link is still "lost", despite no network activity at all:

re1: watchdog timeout
re1: link state changed to DOWN
re1: link state changed to UP
re1: watchdog timeout
re1: link state changed to DOWN
re1: link state changed to UP
...

And now no buffer related message anymore:

ping linux_server
PING linux_server (192.168.10.1): 56 data bytes
^C
--- linux_server ping statistics ---
47 packets transmitted, 0 packets received, 100.0% packet loss
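
To compare runs, the watchdog messages can be tallied per interface. A minimal sketch, assuming the log lines look like the `re1: watchdog timeout` lines above, possibly with a timestamp or `kernel:` prefix in front; `watchdog_counts` is a made-up name.

```sh
# Sketch only: count "watchdog timeout" lines per interface from log text
# read on stdin.
watchdog_counts() {
    awk '
        /: watchdog timeout/ {
            ifname = $0
            sub(/: watchdog timeout.*/, "", ifname)   # keep what precedes the message
            sub(/^.*[ \t]/, "", ifname)               # drop any timestamp or host prefix
            count[ifname]++
        }
        END { for (n in count) print n, count[n] }
    '
}

# Possible use:
#   dmesg | watchdog_counts
```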

elgo

Ok, I give up.
I gave OpenWRT AA (so Linux 3.3.8) a shot, and FYI, I had to include additional firmware (rtl8168e-3.fw) that I didn't need before owning this RTL chip. It is related to these 8111EVL chips, as they are seen as RTL8168evl/8111evl whereas older NICs are seen as 8169/8110. I don't know if it is required for some offloading, or if it is a "fix" for hardware issues.
Used the exact same commands without reproducing the issue.
Btw, the Linux driver seems to have had a problem with this chip before being fixed: https://bugzilla.kernel.org/show_bug.cgi?id=14962

I definitely think the BSD re driver has an issue with this chip (more likely the other way around, but you get the idea :)).