[Solved] CentOS 7 + VirtualBox Bridged Networking + pfSense 2.4.0 problems.



  • After getting a decent multi-point packet capture set up, I tracked this down to generic receive offload (GRO) not working correctly on my host / guest.  On the host I could see packets with a payload larger than the MTU, and those same packets never showed up in the guest.
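
For anyone wanting to run the same kind of check, here is a rough sketch of how oversized packets could be picked out of a text capture.  It assumes tcpdump's default one-line TCP summary (where each line ends in "length N"), a 1500-byte MTU, and 40 bytes of IPv4 + TCP headers.  The function name and the sample lines are made up for illustration and are not taken from the original capture.

```python
import re

# Assumed values for illustration: a standard 1500-byte Ethernet MTU and
# 40 bytes of IPv4 + TCP headers (no options).
MTU = 1500
HEADER_BYTES = 40

# tcpdump's default TCP summary line ends with "length <payload bytes>".
LENGTH_RE = re.compile(r"length (\d+)\s*$")


def oversized_payloads(tcpdump_lines, mtu=MTU, header_bytes=HEADER_BYTES):
    """Return (line, payload_length) pairs whose payload cannot fit in one frame."""
    max_payload = mtu - header_bytes
    hits = []
    for line in tcpdump_lines:
        match = LENGTH_RE.search(line)
        if match and int(match.group(1)) > max_payload:
            hits.append((line, int(match.group(1))))
    return hits


# Hypothetical sample lines, not from the real capture.
sample = [
    "12:00:00.000001 IP 192.0.2.1.443 > 192.0.2.50.50000: Flags [.], seq 1:1449, ack 1, win 229, length 1448",
    "12:00:00.000002 IP 192.0.2.1.443 > 192.0.2.50.50000: Flags [P.], seq 1449:4345, ack 1, win 229, length 2896",
]

for line, size in oversized_payloads(sample):
    print(size, "bytes:", line)
```

Lines flagged on the host capture that have no counterpart in the guest capture would match the symptom described above.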

    Both networks I was having issues with see very little LAN activity from VM guests, so it's not surprising I didn't notice the issue until the pfSense update surfaced it.

    There's an old VirtualBox bug that deals with fixing similar GRO issues.  I also found this Arch post that indicates newer kernels warn:

    Driver has suspect GRO implementation, TCP performance may be compromised.

    …for the same driver + version I have:

    driver: e1000e
    version: 3.2.6-k
    

    Disabling GRO fixed my issue instantly.  My original post is below…
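
For reference, a minimal sketch of checking and disabling GRO with ethtool, wrapped in Python.  It relies on `ethtool -k <iface>` listing a "generic-receive-offload" line and on `ethtool -K <iface> gro off` to turn it off.  The helper names and the interface name in the usage note are placeholders, not something from my setup.

```python
import subprocess


def gro_enabled(features_text):
    """Parse `ethtool -k` output; return True/False, or None if GRO is not listed."""
    for line in features_text.splitlines():
        name, _, value = line.partition(":")
        if name.strip() == "generic-receive-offload":
            words = value.split()  # e.g. ["on"] or ["off", "[fixed]"]
            return words[0] == "on" if words else None
    return None


def disable_gro_command(interface):
    """Build the ethtool command that switches GRO off for one interface."""
    return ["ethtool", "-K", interface, "gro", "off"]


def disable_gro(interface):
    """Turn GRO off if it is currently on.  Needs root and ethtool on the host."""
    features = subprocess.run(
        ["ethtool", "-k", interface],
        capture_output=True, text=True, check=True,
    ).stdout
    if gro_enabled(features):
        subprocess.run(disable_gro_command(interface), check=True)
```

Calling `disable_gro("enp0s25")` as root (with the real name of the bridged interface substituted) would apply the change.  A setting made this way is not persistent by itself, so it would have to be reapplied after a reboot.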


    Hi,

    After updating to 2.4.0 I'm having a bit of trouble, but only with a very specific configuration.  I can reproduce it consistently when I try to access the webGUI.  I've also been able to reproduce it on two completely separate networks, so it's not likely I'm seeing a localized problem.

    I can reproduce the issue consistently with the following:

    • A CentOS 7 (3.10.0-693.2.2.el7.x86_64) host.
    • VirtualBox 5.1.30 (and 5.1.28 and 5.1.26).
    • Accessing the webGUI from a Win10 virtual machine that uses bridged networking.

    I've tested using a PCEngines APU2C2 box on one network and an APU2C4 on another network.  If I switch to a Windows host for VirtualBox it works.  If I switch to VirtualBox's NAT networking it works.  If I put pfSense in a VM using bridged networking it works.

    So, my initial instinct says something's broken with VirtualBox's bridge network driver on CentOS (or Linux), but:

    • I don't have problems with pfSense 2.3.x.
    • If I connect to the webGUI on a firewall that's NOT on the local network it seems to work ok.

    I can't see anything unusual in the firewall logs.  I briefly watched the traffic with tcpdump, but I didn't see anything obviously broken.  I need to do some more diagnosing, but figured I'd post here just in case anyone has any ideas or suggestions that could help.

    I'll attach a diagram of the connections I've tested.  Anecdotally, the broken connections seem to fail when fetching fonts, Bootstrap, and jQuery.