Packet Loss



  • Hardware is a DL385 G1 with two 1.8 GHz dual-core Opterons and 4 GB of memory, using one of the onboard NICs with VLANs, connected to an HP 5412 switch.

    pfSense is 2010 02 25 (installed from CD) with SMP kernel

    The onboard NIC is a Broadcom (bge) updated to latest HP firmware.

    dmesg fragment as below

    bge0: <HP NC7782 Gigabit Server Adapter, ASIC rev. 0x002100> mem 0xf7ff0000-0xf7ffffff irq 28 at device 6.0 on pci3
    miibus0: <MII bus> on bge0
    brgphy0: <BCM5704 10/100/1000baseTX PHY> PHY 1 on miibus0
    brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
    bge0: [ITHREAD]

    All of the interfaces are VLANs on a single port (bge0) connected to a gigabit port on the HP switch.

    It is connected to three ADSL2+ lines in bridge mode for "WAN" access; the ADSL2+ lines have an average sync of 15 Mb down, 1.5 Mb up. The HP switch is the LAN gateway (we have multiple internal subnets).

    When testing (single client using pfSense), CPU and memory usage is minimal, but in the "Gateways" dashboard applet I am seeing frequent packet loss (of up to 20%) on the LAN and WAN gateways. I can accept packet loss on the ADSL2+ connections as a possible ISP issue, but I am at a loss to explain why I am seeing ANY packet loss on the LAN gateway.

    The HP switch is reporting that the port is up at 1000baseT, as is pfSense.

    The HP switch is not reporting any errors on the port.

    We installed Windows on the server and ran iperf to another similar Windows server on the same switch with zero packet loss.

    I have also set up an HP DC5100MT P4 HT with a similar Broadcom NIC.

    dmesg fragment as below

    bge0: <Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x004001> mem 0xf0400000-0xf040ffff irq 17 at device 0.0 on pci64
    miibus0: <MII bus> on bge0
    brgphy0: <BCM5750 10/100/1000baseTX PHY> PHY 1 on miibus0
    brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
    bge0: [ITHREAD]

    and I see zero packet loss on any connection with this.

    Any constructive suggestions would be appreciated.



  • Running out of states maybe? Though the state table is sized dynamically based on RAM in 2.0, so with 4 GB RAM that would leave you with a huge state table, so that's highly unlikely.

    Leave a constant ping going in an SSH session and see if you're seeing loss there as well.
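
    A minimal sketch of that check, assuming it is run from the firewall's own shell over SSH. The gateway address and the `parse_loss` helper are made up for illustration; the helper only pulls the percentage out of the summary line that ping prints when it finishes.

    ```shell
    #!/bin/sh
    # Hypothetical helper: read ping output on stdin and print the loss
    # percentage from the summary line ("... 20.0% packet loss").
    parse_loss() {
        sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
    }

    # Placeholder address (assumption); substitute the LAN gateway being monitored.
    GATEWAY="${GATEWAY:-192.0.2.1}"

    # Usage on the firewall: send 100 pings, then print only the loss figure.
    # ping -c 100 "$GATEWAY" | parse_loss
    ```

    Running plain `ping` without `-c` and stopping it with Ctrl-C prints the same summary line, which is enough to compare against what the dashboard reports.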



  • Hi,

    With no traffic (the only connections to it were from my laptop) going over it, we were seeing packet loss to the LAN switch on the gateway "applet" from the front page. This was confirmed by pinging it from my laptop which was also connected to the same switch and also seeing ICMP packet loss, as well as experiencing the web front end become unresponsive while the local console (via iLo) was still fine.

    With 4GB of memory, the automatically sized state table was quite big :-) however as there was no traffic going over it, just my laptop connecting to it, the number of states was <1% of the available.

    When I swapped it out for the DC5100MT desktop PC, I backed up and restored the config file, connected it to the same switch port, and had no issues. When I tested the DL385 with Windows (2k8 x86) I was able to run iperf and saturate the 1 Gb link while pinging it and still have no packet loss. I also ran a full set of HP diagnostics under Windows, and it passed with no errors.

    The bge(4) man page (http://www.freebsd.org/cgi/man.cgi?query=bge&sektion=4&manpath=FreeBSD+8.0-RELEASE) lists the BCM570x but doesn't list the HP NC7782, which is leading me to think that there may be a 'generic' FreeBSD/NC7782 problem here. I will try to eliminate that next week by doing a clean FreeBSD install and seeing if I get any packet loss.

    I will also see if I can find another 10/100/1000 NIC to put into the DL385, to test whether something else in the DL385 is causing the problem, like running 4-way SMP or the PCI-X bridge or…

    Cheers



  • Hi,

    I installed FreeBSD 8.0, installed iperf and ran it against another server, was able to saturate the link, and had no dropped packets while pinging it on the bge0 interface.

    I'm having trouble understanding how to set up VLANs in rc.conf, but managed it manually on the bge0 interface, and also ran iperf with no dropped packets across VLAN-tagged interfaces.

    
    ifconfig bge0.205 create
    ifconfig bge0.205 inet 10.205.1.253 netmask 255.255.255.0
    
    

    I'm now at a loss on what I can try next…
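
    For the rc.conf part, a sketch of what the two manual commands above might look like as a persistent config on a plain FreeBSD install (pfSense does not use rc.conf for this). It assumes the `vlans_<interface>` variable described in rc.conf(5) is available on the release in use, and that the dot in `bge0.205` is written as an underscore in the variable name.

    ```shell
    # /etc/rc.conf fragment (assumption: rc.conf(5) vlans_<interface> support)
    ifconfig_bge0="up"      # bring up the parent interface without an address
    vlans_bge0="205"        # create bge0.205 carrying VLAN tag 205
    ifconfig_bge0_205="inet 10.205.1.253 netmask 255.255.255.0"
    ```

    After a reboot, `ifconfig bge0.205` should show the same address as the manual setup if the variables were picked up.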



  • Disable rxcsum/txcsum, which have only recently been fixed in the bge(4) driver on FreeBSD.
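
    From a shell this can be tried by hand with ifconfig's `-rxcsum` and `-txcsum` flags. A rough sketch, assuming root on the firewall and the bge0 interface from the dmesg output; the two function names are made up for illustration, and the change does not survive a reboot.

    ```shell
    #!/bin/sh
    # Hypothetical wrapper: turn off receive and transmit checksum offload
    # on the interface given as the first argument.
    disable_csum() {
        ifconfig "$1" -rxcsum -txcsum
    }

    # Hypothetical wrapper: print the interface's options line so you can
    # check whether RXCSUM/TXCSUM are still listed.
    show_csum() {
        ifconfig "$1" | grep -i 'options='
    }

    # Usage on the firewall console:
    # disable_csum bge0 && show_csum bge0
    ```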



  • @ermal:

    disable rxcsum/txcsum which only lately have been fixed in bge(4) driver on FreeBSD.

    Do you mean putting a tick in the box for "Hardware Checksum Offloading"?

    Cheers



  • I'm hoping that you didn't mean putting a tick in the box for "Hardware Checksum Offloading", as the Dashboard Gateways display is still showing loss (currently 2%) on both WAN and LAN interfaces.

    I'm wondering if it's something in gateway monitoring that for some reason is having trouble?



  • I take it back, it looks as if putting a tick in the box has made a difference.

    Cheers

    Arne



  • I spoke too soon, it has started reporting high packet loss again :-(



  • I had the same problem. I turned off the traffic shaper and the problem's gone.



  • I'm not using the packet shaper :-(



  • I've just seen that the AMD64 builds are back again :-)

    I'll try and give the latest snapshot a go tomorrow.

    Cheers

    Arne



  • Tried the latest AMD64 livecd, and although it boots, it gets as far as

    IPsec: Initialized Security Association Processing.

    but no further :-(

