Netgate Discussion Forum

VLANs seems to be mostly broken with Intel SR-IOV VF

L2/Switching/VLANs
  • N
    nazar-pc
    last edited by May 26, 2024, 8:39 AM

    I have an Intel 82599ES NIC and created several virtual functions for it with SR-IOV. One of them is passed through to the VM where pfSense is running.

    On this ixv0 device I then created a VLAN and assigned it to WAN.

    The problem is that most of the time the pfSense boot hangs on "Configuring VLAN interfaces..." for a very long time and then hangs on WAN as well. When it finally boots, WAN doesn't work at all no matter what I try.

    However, I noticed that occasionally it does work. When that happens, I see the following line in the OS boot logs right next to "Configuring VLAN interfaces...":

    [fib_algo] inet.0 (bsearch4#20) rebuild_fd_flm: switching algo to radix4_lockless

    When VLANs hang it doesn't show up until around the end of the boot or even after that.

    Not sure what, but something seems broken. Anyone experienced anything like this before?
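The ordering described above can be checked from a saved copy of the boot console output by printing the line numbers of the two messages. This is only a sketch: `boot_order` is a made-up helper name, and where the console output is saved is an assumption (the rc "Configuring VLAN interfaces..." message may only be visible on a serial or VM console capture, not in a log file).

```shell
#!/bin/sh
# boot_order FILE: print the line numbers on which the VLAN configuration
# message and any [fib_algo] message appear in a saved boot log.
# FILE is a copy of the console output; its location is an assumption.
boot_order() {
    grep -n -E 'Configuring VLAN interfaces|\[fib_algo\]' "$1"
}
```

Called as `boot_order /path/to/console.log`, a working boot should show the `[fib_algo]` line right after the VLAN line, and a hanging boot should show it much later or not at all.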

    • H
      HLPPC Galactic Empire @nazar-pc
      last edited by HLPPC Jul 13, 2024, 10:11 PM Jul 13, 2024, 9:47 PM

      @nazar-pc I used to get errors like that a lot with my bare metal static block of IPs. I was able to create a VM for each Static IPv4 address and had it all working.

      It eventually crashed but I think each VM may have needed a different localhost IP. Or I had dns errors.

      Recently when getting radix lockless errors I would try forcing the routing algorithm to bnet(?) via a sysctl.

      The Proxmox VM with public IPs was pretty neat and I wish I knew now what I did then. I was able to pause my virtual routers individually. BUT, don't recommend that. Maybe in the far future.

      You may need to remove vlan1 from your network entirely. Changing localhost IPs may not be a bad way to go either; the range is 127.0.0.1-127.255.255.255, but I cannot say if that is changeable in pfSense.

      It is also possible that PCI hardware address and msix mapping are creating internal routing conflicts. Like, if your whitebox mobo has a built in WiFi chip or USB interfaces or a PS2 interface, everything is autodetected normally by the kernel.

      You may want to set up firewall rules in the parent hypervisor too. Proxmox lets you segregate VMs.

      • H
        HLPPC Galactic Empire @nazar-pc
        last edited by HLPPC Jul 13, 2024, 10:14 PM Jul 13, 2024, 9:49 PM

        @nazar-pc also maybe vlans on the wan should only be a type-2 hypervisor and you may need to tag the vlans in the hypervisor itself too. The number of virtual cores on the VM and how they are mapped to the NIC matter too.

        https://forums.freebsd.org/threads/fib_algo-inet-0-bsearch4-20-rebuild_fd_flm-switching-algo-to-radix4_lockless.91474/

        Also, if this helps. IOMMU groups get complicated.

        https://pve.proxmox.com/wiki/PCI_Passthrough

        • H
          HLPPC Galactic Empire @nazar-pc
          last edited by HLPPC Jul 13, 2024, 10:50 PM Jul 13, 2024, 10:17 PM

          @nazar-pc the VLANs KLD module has dependencies too, and, at least in OPNsense, you can delete it and have OPNsense kernel errors tell you what those dependencies are on reboot (I haven't tried that in pfSense)

          Also, maybe try disabling IPv6 and making sure no IPv6 multicast reaches the machine. In FreeBSD and OPNsense the command would be

          ifconfig ixv0 inet6 ifdisabled

          pfSense doesn't seem to work with the ifdisabled command in my system on bare metal, but in your VM, maybe.

          • H
            HLPPC Galactic Empire @nazar-pc
            last edited by HLPPC Jul 13, 2024, 11:14 PM Jul 13, 2024, 10:53 PM

            @nazar-pc If your ixv0 interface is InfiniBand-based, maybe try

            kldload infiniband.ko

            as a boot shell or early shell command. There are other modules for SFP+ which may load automagically in pfSense official hardware but not in a VM; you may have to load them manually BUT as to what they are, I don't know.

            https://shop.netgate.com/products/sfp-10gbase-sr-transceiver

            If you have pfSense+ in a VM, L2 firewall rules may help too.

            • H
              HLPPC Galactic Empire @nazar-pc
              last edited by Jul 13, 2024, 11:59 PM

              @nazar-pc if you have these vlan tunables

              dev.xxx.X.vlan_only
                  Require that incoming frames must have a VLAN tag on them that
                  matches one that is configured for the NIC. Normally, both
                  frames that have a matching VLAN tag and frames that have no
                  VLAN tag are accepted. Defaults to 0.

              dev.xxx.X.vlan_strip
                  When non-zero the NIC strips VLAN tags on receive. Defaults to
                  0.
              
              • N
                nazar-pc
                last edited by Jul 14, 2024, 4:35 AM

                Honestly I'm not following what you're trying to say, @HLPPC, these messages look like AI-generated hallucination to me

                • H
                  HLPPC Galactic Empire @nazar-pc
                  last edited by HLPPC Jul 14, 2024, 3:45 PM Jul 14, 2024, 2:58 PM

                  @nazar-pc I am suggesting features available in BSD systems that might help your custom SR-IOV setup work with routing VLANs. Pre-coded, without having to code anything yourself. And Linux VM features.

                  There are different routing lookup algorithms that pfSense can switch to, including ones based on radix trees. They affect hardware/software. The sysctl is net.route.algo. It is exposed in C code. It may be available both as a system call for compiled programs, and an administrator command for interactive use and scripting.
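As a rough illustration of the `net.route.algo` sysctl tree mentioned above, the sketch below extracts the active IPv4 lookup algorithm from `sysctl` output. The helper name `fib_algo_of` is made up, and the node names `net.route.algo.inet.algo` and `net.route.algo.inet.algo_list` are what FreeBSD 13 and later expose as far as I know; whether a given pfSense build exposes them the same way is an assumption.

```shell
#!/bin/sh
# fib_algo_of: read `sysctl net.route.algo` output on stdin and print the
# value of net.route.algo.inet.algo (the active IPv4 lookup algorithm).
fib_algo_of() {
    awk -F': ' '$1 == "net.route.algo.inet.algo" { print $2 }'
}

# Intended use on the FreeBSD-based guest (not run here):
#   sysctl net.route.algo | fib_algo_of
# Pinning an algorithm for a test; the value must be one of the names in
# net.route.algo.inet.algo_list (radix4 is assumed to be present):
#   sysctl net.route.algo.inet.algo=radix4
```

Nothing in the thread shows that pinning the algorithm changes the VLAN hang; the `[fib_algo]` line may simply mark when the first routes get installed.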

                  BUT, for everyone else who doesn't care about the infinite permutations of routing configuration, we lock it down with everyone's official settings and hope it generates, forwards, rejects, drops, denies, NATs and accepts packets correctly. And we firewall it. Good luck with netmap though.

                  • H
                    HLPPC Galactic Empire @nazar-pc
                    last edited by HLPPC Jul 14, 2024, 3:32 PM Jul 14, 2024, 3:13 PM

                    @nazar-pc https://man.freebsd.org/cgi/man.cgi?route

                    And none of that is AI generated. I search for crap until my wifi works. And my lan runs smoothly. Usually. VMs were great for switching IPv4 public addresses on the fly. If I used AI to summarize all of that maybe it'd make more sense.

                    You asked a lunatic question involving SFP+s in an unknown VM with an unknown CPU and mobo and whether or not it is official hardware. I gave you lunatic answers trying to make it work.

                    I definitely disable dhcp in these setups, and having it enabled even in the VM may cause issues too.

                    • H
                      HLPPC Galactic Empire @nazar-pc
                      last edited by HLPPC Jul 14, 2024, 3:56 PM Jul 14, 2024, 3:48 PM

                      @nazar-pc also, the VMs may need to do some encryption with the cloud, and auto-configure your interface drivers. And maybe each VM with CPU encryption keys is a little off depending on the setup, or on whether there is TPM passthrough with other VMs.

                      Like, if you've only purchased one instance of pfSense+, can you clone it with 5 public IPs? I have and it worked for a bit pre-pfsense monetization, but maybe it caused issues.

                      I eventually used VIPs on bare metal instead.

                        • H
                          HLPPC Galactic Empire @nazar-pc
                          last edited by HLPPC Jul 14, 2024, 6:00 PM Jul 14, 2024, 5:48 PM

                          @nazar-pc There are also NTP files that sync the kernel and BIOS time, which affects interfaces. Disabling NTP, and maybe not sending kiss-of-death packets to the VM, could help. Wake-on-LAN/magic packets may need to be blocked too. Killing syslog PIDs could also reduce interference.

                          • H
                            HLPPC Galactic Empire @nazar-pc
                            last edited by HLPPC Jul 14, 2024, 6:31 PM Jul 14, 2024, 6:30 PM

                            @nazar-pc

                            Here are proxmox and ntp instructions too

                            pve

                            https://tinyurl.com/ru7jn2c8

                            ntp burst issues
                            https://tinyurl.com/6fwfuezx

                            • H
                              HLPPC Galactic Empire @nazar-pc
                              last edited by HLPPC Jul 14, 2024, 7:08 PM Jul 14, 2024, 7:05 PM

                              @nazar-pc

                              https://docs.netgate.com/pfsense/en/latest/network/broadcast-domains.html

                              The fewer broadcast domains, the better I think at least from the VM's perspective. Or the hypervisor.

                              • H
                                HLPPC Galactic Empire @nazar-pc
                                last edited by Jul 14, 2024, 7:42 PM

                                @nazar-pc

                                Could be a checksum offloading issue too. Disable Hardware Checksums with Proxmox VE VirtIO.

                                • N
                                  nazar-pc
                                  last edited by Jul 15, 2024, 12:16 AM

                                  @HLPPC said in VLANs seems to be mostly broken with Intel SR-IOV VF:

                                  You asked a lunatic question involving SFP+s in an unknown VM with an unknown CPU and mobo and whether or not it is official hardware. I gave you lunatic answers trying to make it work.

                                  I appreciate the effort, but you can always ask clarification questions about setup if I missed something important, I'm happy to clarify.

                                  I have an Intel NIC with two SFP+ ports that supports SR-IOV, as mentioned in the very first post. The VM is just a simple KVM-based one on a Linux host, with a virtual function device assigned to the VM running pfSense. I don't have InfiniBand, Wi-Fi, IPv6, cloud, TPM or the other seemingly random things you have mentioned. I have no idea what NTP and WoL have to do with any of this either.

                                  The interface works fine without VLANs and also works with VLANs until reboot, but when VLANs are added it hangs on boot and interfaces are not working after that.

                                  So as far as I'm concerned there are no hardware issues here, no driver issues either, it is just something pfSense-specific (or maybe FreeBSD in general) that is problematic when it comes to VLANs specifically and specifically at boot time. Maybe ordering of stuff at boot is off or something.

                                  • G
                                    Gblenn @nazar-pc
                                    last edited by Gblenn Jul 15, 2024, 11:52 AM Jul 15, 2024, 11:51 AM

                                    @nazar-pc said in VLANs seems to be mostly broken with Intel SR-IOV VF:

                                    Honestly I'm not following what you're trying to say, @HLPPC, these messages look like AI-generated hallucination to me

                                    +1 on that... I'm seeing similar in other threads unfortunately.

                                    But regarding your problem, you mention pfsense running as a VM. So you create these "virtual functions" of your NIC in the hypervisor? What hypervisor are you running and how is your setup exactly?
                                    Are you saying that you are using the physical port for more VM's than pfsense, and for other things than WAN?

                                    • N
                                      nazar-pc @Gblenn
                                      last edited by Jul 15, 2024, 12:05 PM

                                      @Gblenn said in VLANs seems to be mostly broken with Intel SR-IOV VF:

                                      But regarding your problem, you mention pfsense running as a VM. So you create these "virtual functions" of your NIC in the hypervisor? What hypervisor are you running and how is your setup exactly?

                                      I'm creating the virtual functions with udev on the Linux host like this:

                                      ACTION=="add", SUBSYSTEM=="net", KERNEL=="intel-ocp-0", ATTR{device/sriov_numvfs}="2"
                                      

                                      By the time VM starts they already exist like "normal" PCIe devices. VM is created with libvirt and I just take such PCIe device and assign it to the VM. pfSense mostly treats them as normal-ish Intel NICs as far as I can see.
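To confirm on the Linux host that the udev rule above actually created the virtual functions before the VM starts, the standard SR-IOV sysfs attributes can be read back. This is a minimal sketch: `vf_status` is a made-up helper, and the optional second argument exists only so the helper can be pointed at a directory other than `/sys`.

```shell
#!/bin/sh
# vf_status IFNAME [SYSFS_ROOT]: print "enabled/total" VF counts for the
# physical function behind network interface IFNAME.
# SYSFS_ROOT defaults to /sys; overriding it is only meant for testing.
vf_status() {
    dev="${2:-/sys}/class/net/$1/device"
    printf '%s/%s\n' "$(cat "$dev/sriov_numvfs")" "$(cat "$dev/sriov_totalvfs")"
}
```

With the rule from this post, `vf_status intel-ocp-0` would be expected to report 2 as the first number; the second number depends on the NIC and driver.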

                                      @Gblenn said in VLANs seems to be mostly broken with Intel SR-IOV VF:

                                      Are you saying that you are using the physical port for more VM's than pfsense, and for other things than WAN?

                                      The physical port I have is connected to a switch on the other end. Switch wraps 2 WANs into VLANs and I want to extract both WAN and WAN2 from virtual function in pfSense. In this particular case physical function may or may not be used on the host, but it is mostly irrelevant to what is happening to a specific virtual function I'm assigning to the VM to the best of my knowledge.

                                      As mentioned, this whole setup works. I boot the VM, create VLANs, assign them to WAN and WAN2 and everything works as expected. It's just that when I reboot, it hangs, times out and the VLANs are "dead" in pfSense.

                                      • G
                                        Gblenn @nazar-pc
                                        last edited by Gblenn Jul 15, 2024, 4:14 PM Jul 15, 2024, 12:46 PM

                                        @nazar-pc Aha, but do you really need pfsense to be "involved" with the VLAN's on the switch? In fact, do you even need VLAN on the switch at all??

                                        I guess it depends on your ISP and what type of connection you have of course. I have two public IP's from the same ISP and in my case it's the MAC on each respective WAN that determines which IP is offered to which port. But even if that doesn't work for you, which it doesn't if it is two different ISP's, couldn't you limit the VLAN to just be something between the switch and libvirt?

                                        I run Proxmox and set ID's on some ports to "tunnel" some traffic between individual ports in my network. So that VLAN ID is not used or even known by pfsense at all, it's only for the switches and e.g. one single VM...
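For an SR-IOV setup on a plain Linux host, one way to keep the VLAN between the switch and the host, as suggested above, is to let the physical function tag and untag traffic per virtual function with iproute2, so each VF sees untagged frames. The sketch below only prints the commands for review. The VLAN IDs 10 and 20 and the VF-to-WAN mapping are hypothetical, `vf_vlan_cmd` is a made-up helper, and it is untested whether this avoids the boot hang in pfSense.

```shell
#!/bin/sh
# vf_vlan_cmd PF VF_INDEX VLAN_ID: print (do not run) the iproute2 command
# that assigns a port VLAN to one VF of the physical function PF.
vf_vlan_cmd() {
    printf 'ip link set dev %s vf %s vlan %s\n' "$1" "$2" "$3"
}

# Hypothetical mapping: VF 0 carries WAN on VLAN 10, VF 1 carries WAN2 on VLAN 20.
vf_vlan_cmd intel-ocp-0 0 10
vf_vlan_cmd intel-ocp-0 1 20
```

In this variant both VFs would be passed to the pfSense VM, one per WAN, and no VLAN interfaces would be needed inside pfSense at all.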

                                        • H
                                          HLPPC Galactic Empire @nazar-pc
                                          last edited by HLPPC Jul 15, 2024, 4:02 PM Jul 15, 2024, 3:33 PM

                                          @nazar-pc

                                          as a system tunable consider

                                          hw.ix.unsupported_sfp=1 (or whatever other hw.intel card you have options)

                                          maybe try

                                          sysctl -a | grep (your intel driver)
                                          pciconf -lvvv
                                          ifconfig -vvv

                                          and then consider disabling MSI-X in the VM if it is on. By the way, ifdisabled disables duplicate address detection with IPv6, and others have had success in FreeBSD VMs by disabling it; it isn't a robot suggestion. Dual stack sucks sometimes and pfSense HAS to be dual stack compliant to partner with AWS; hence it is forced to be enabled.

                                          I haven't actually seen an Intel card driver show up by itself in a VM, or tried passthrough.

                                          https://man.freebsd.org/cgi/man.cgi?query=iovctl&sektion=8&manpath=freebsd-release-ports

                                          There might be a setup where bridging the WAN helps it out in the VM

                                          I am just throwing bsd at you to see if it helps. Because you know, it is the reason it isn't :) there are certainly more efficient ways of doing things.

                                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.