Bug: rt2500 driver causes Ethernet down until reset.



  • After testing pfSense v1.0.1-embedded for a few hours on wrap.2c v1.11, I hit a bug where the Ethernet interface spontaneously went down and stayed down.  I checked the Ethernet cable on another unit and it was good.  I wasn't doing anything other than wirelessly downloading a 3GB file for stress testing.  Oddly, the GUI page Status-Interfaces still showed the status as "up".  The Ethernet interface stayed down (I couldn't ping out or in) until I logged into the GUI wirelessly, went to the Ethernet interface page, and clicked Save; then it came back up.  Here are the possibly relevant system log entries; there was nothing unusual above or below them:

    Nov 1 18:43:52 kernel: sis0: link state changed to DOWN
    Nov 1 18:43:52 check_reload_status: rc.linkup starting
    Nov 1 18:43:57 kernel: sis0: link state changed to UP
    Nov 1 18:43:57 php: : Hotplug event detected for sis0 but ignoring since interface is not set for DHCP

    And, it just happened again while downloading a large file; here are the only extra lines in the system log:

    Nov 1 20:06:45 check_reload_status: reloading filter
    Nov 1 20:06:46 dnsmasq[785]: exiting on receipt of SIGTERM
    Nov 1 20:06:48 dnsmasq[35367]: started, version 2.22 cachesize 150
    Nov 1 20:06:48 dnsmasq[35367]: read /etc/hosts - 4 addresses
    Nov 1 20:06:48 dnsmasq[35367]: reading /etc/resolv.conf
    Nov 1 20:06:48 dnsmasq[35367]: using nameserver 10.0.0.1#53
    Nov 1 20:06:52 php: : FTP proxy disabled for interface LAN - ignoring.
    Nov 1 20:06:52 php: : FTP proxy disabled for interface opt1 - ignoring.
    Nov 1 20:07:01 check_reload_status: reloading filter
    Nov 1 20:07:07 php: : FTP proxy disabled for interface LAN - ignoring.
    Nov 1 20:07:07 php: : FTP proxy disabled for interface opt1 - ignoring.

    The only differences between this unit and my other working units are: the older working units run 1.0 snapshots and release candidates, and the mini-PCI wireless cards in this wrap unit are one Atheros-based and one Ralink-rt2500-based; all of my older units have two Atheros wireless cards.  I just tested this exact same config on embedded snapshot 09-27-06 and the same thing happened, but there was nothing in the logs around the time it went down.  I also took out the secondary Atheros card, leaving just the primary Ralink card, and still had the same result.
    So, it may have something to do with the Ralink driver (confirmed - see post below).

    Thank you,
    -Pete
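
A rough way to pull the link flaps out of a system log like the one above is sketched below in Python. This is only an illustration: the function name `link_flaps`, the regular expression, and the default year of 2006 (syslog lines carry no year) are assumptions, not anything that ships with pfSense.

```python
import re
from datetime import datetime

# Matches kernel link-state lines such as:
#   Nov 1 18:43:52 kernel: sis0: link state changed to DOWN
LINK_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+kernel:\s+"
    r"(?P<iface>\w+): link state changed to (?P<state>UP|DOWN)"
)


def link_flaps(lines, year=2006):
    """Return (iface, down_time, up_time, seconds_down) for each DOWN -> UP pair.

    The year is an assumption because the log lines do not include one.
    """
    down_since = {}  # iface -> datetime of the last unmatched DOWN event
    flaps = []
    for line in lines:
        match = LINK_RE.match(line.strip())
        if not match:
            continue  # not a link-state line (dnsmasq, php, ...)
        when = datetime.strptime(f"{year} {match['stamp']}", "%Y %b %d %H:%M:%S")
        iface = match["iface"]
        if match["state"] == "DOWN":
            down_since[iface] = when
        elif iface in down_since:
            started = down_since.pop(iface)
            flaps.append((iface, started, when, (when - started).total_seconds()))
    return flaps


if __name__ == "__main__":
    sample = [
        "Nov 1 18:43:52 kernel: sis0: link state changed to DOWN",
        "Nov 1 18:43:52 check_reload_status: rc.linkup starting",
        "Nov 1 18:43:57 kernel: sis0: link state changed to UP",
    ]
    for iface, down, up, seconds in link_flaps(sample):
        print(f"{iface}: down {down:%H:%M:%S}, up {up:%H:%M:%S}, {seconds:.0f}s")
```

Note that this only catches flaps the kernel actually logs. The second incident above produced no link-state lines at all, so a hang where the interface still reports "up" would not show up in this output.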



  • How long is the cable?



  • "How long is the cable? "
    Ethernet Switch with 10ft to PoE then 6ft to wrap.
    I changed both cables but same result.



  • I tested the Ralink card over a dozen times, and I couldn't download much more than 1/2 GB before an interface stopped working (yet still showed as "up").  Usually the Ethernet interface stopped working, but sometimes it happened to the wireless ral0 interface.  After switching from Ralink/rt2500 to Atheros/CM9 I've had zero trouble so far, after many attempts and many GB downloaded.  So I conclude it was the Ralink/rt2500 driver causing this bug.  One caveat: I only tested "Ad-Hoc" mode (because I planned to use it in a mesh).  Also, sometimes the bug would happen right when I powered up another unit nearby (in ad-hoc mode on the same channel for the mesh).  I am curious why a wireless driver bug would often cause the Ethernet port to stop functioning while the wireless kept working.
    Thanks,
    -Pete



  • Maybe the problem is power related (does the Ralink card need more power than the Atheros)? Did you try with the PSU directly connected instead of PoE? Can you try with a more powerful PSU?



  • The PSU is 18V 0.825A
    Ralink specs:
    http://www.nvtechusa.com/pdf/922W_spec.pdf

    I'll try this again with a higher-current PSU like the one from this post:
    http://forum.pfsense.org/index.php/topic,2647.msg15534.html#msg15534
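
For the power question, a back-of-the-envelope budget check can be sketched as below. Only the PSU rating (18V, 0.825A) comes from this thread; the function names and the load figures in the example are hypothetical placeholders, not measured or datasheet values, and the sketch ignores PoE cable loss and regulator efficiency.

```python
def supply_watts(volts, amps):
    """Nominal power a DC supply can deliver: P = V * I."""
    return volts * amps


def headroom_watts(volts, amps, loads_watts):
    """Nominal supply power minus the sum of the given loads (all in watts)."""
    return supply_watts(volts, amps) - sum(loads_watts)


if __name__ == "__main__":
    # PSU rating quoted in the thread.
    psu_volts, psu_amps = 18.0, 0.825
    print(f"PSU nominal power: {supply_watts(psu_volts, psu_amps):.2f} W")

    # Placeholder loads (board, two radios). These numbers are made up for
    # illustration; substitute figures from the actual datasheets.
    example_loads = [5.0, 2.0, 2.0]
    print(f"Headroom: {headroom_watts(psu_volts, psu_amps, example_loads):.2f} W")
```

If the headroom comes out small or negative once real figures are plugged in, that would support trying a bigger PSU; a large positive margin would point back toward the driver.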



  • I'm using an rt2500 PCI without problems. Normal power supply.

