Interfaces not showing in Dataplane KVM
-
Hi there,
I've been trying to explore and test TNSR in Proxmox (KVM); however, I've been struggling to get the interfaces to show up in the dataplane.
Initially, I had 2 interfaces assigned to the VM, validated that both were disabled in the host OS, and named the interfaces with
dataplane dpdk dev 0000:00:14.1 network name WAN
followed by
service dataplane restart
and so on, but still, when I run show interface, nothing is returned.
After this, I added a 3rd interface, reinstalled TNSR, and had the new interface configured during setup, since I was wondering whether TNSR strictly requires a dedicated management interface.
After the installation was complete, I had the same situation: no matter what I do, the interfaces won't show up to be used in TNSR.
I've also tried changing the NIC type on the Proxmox side from VirtIO (paravirtualized) to Intel E1000, but with no success.
From reading around, including the DPDK docs, my understanding is that both the VirtIO and Intel E1000 types should work.
The plan is not to use TNSR in a VM permanently; eventually it would go onto dedicated hardware if I like it, but so far, for testing, I can't justify that.
Any ideas?
Thanks.
Edit: The interfaces on Proxmox are Linux bridges to the NIC. Each bridge I'm using on the TNSR VM currently has a dedicated NIC port.
-
@ralms-0 What kind of NICs are you setting in the proxmox VM? You should be using virtio.
dataplane cpu workers 1
dataplane ethernet default-mtu 1500
dataplane dpdk dev 0000:00:12.0 name outside
dataplane dpdk dev 0000:00:13.0 name transit
dataplane buffers buffers-per-numa 32768
dataplane statseg heap-size 256M
Be sure the host does not "own" them.
https://docs.netgate.com/tnsr/en/latest/setup/setup-host-interfaces.html#disable-host-os-nics-for-tnsr
tnsr-a1 tnsr# host shell ip add
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
4: ens20: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 36:fb:7a:13:c5:05 brd ff:ff:ff:ff:ff:ff
    inet 172.21.17.231/26 brd 172.21.17.255 scope global dynamic ens20
       valid_lft 7218sec preferred_lft 7218sec
    inet6 fe80::34fb:7aff:fe13:c505/64 scope link
       valid_lft forever preferred_lft forever
Regarding having an out-of-band host interface, I will always configure one. Otherwise, if you somehow get the dataplane into a state where it will not start, you must resort to the console to get it corrected.
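As a rough way to check the "host does not own them" condition, the sketch below parses `ip link` or `ip addr` output and lists every interface the host still sees apart from loopback and the management interface. The function names and the sample output (including the placeholder MAC addresses) are made up for illustration; only the general shape of the `ip` output is taken from the paste above.

```python
import re

# Matches the header line of each interface block in `ip link` / `ip addr`
# output, e.g. "4: ens20: <BROADCAST,...> mtu 1500 ...".
_IFACE_HEADER = re.compile(r"^\d+:\s+([^:@\s]+)[:@]", re.MULTILINE)


def host_visible_interfaces(ip_output):
    """Return the interface names the host OS lists, excluding loopback."""
    return [name for name in _IFACE_HEADER.findall(ip_output) if name != "lo"]


def still_owned_by_host(ip_output, management):
    """Return interfaces the host sees that are not meant for management."""
    return [n for n in host_visible_interfaces(ip_output) if n not in management]


# Hypothetical output from a VM where ens18 and ens19 have not been released.
SAMPLE_IP_LINK = """\
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: ens18: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 52:54:00:00:00:01 brd ff:ff:ff:ff:ff:ff
3: ens19: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 52:54:00:00:00:02 brd ff:ff:ff:ff:ff:ff
4: ens20: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
    link/ether 52:54:00:00:00:03 brd ff:ff:ff:ff:ff:ff
"""

if __name__ == "__main__":
    print(still_owned_by_host(SAMPLE_IP_LINK, {"ens20"}))
```

An empty result would match the output pasted above, where only lo and ens20 remain in the host namespace.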
-
@derelict Thank you for reaching out.
I initially was using VirtIO, it's what I normally use, only tried the E1000 yesterday since I'm not getting it to work.
Here is my Proxmox host:
Here is the TNSR VM:
In the host, the 3rd interface (ens20) is being used right now.
The others show up in the host also as "DOWN" state.
Triple checking the article regarding disabling Host OS Nics,
I've noticed that "NM_CONTROLLED" wasn't set.
I've set it on both intended interfaces, rebooted the VM and still no success :(
Responding to the topic of having an out-of-band host interface: the reason I didn't feel the need for one, either in VM form or on future hardware, is that in a VM I can use the console, and on hardware I will have HP iLO.
So TNSR can work fine without an out-of-band host interface, correct?
Thank you.
-
To add to my previous reply.
After I added NM_CONTROLLED=no, I tried to set the interface names again and restart the dataplane as mentioned here: https://docs.netgate.com/tnsr/en/latest/ztp/index.html#dataplane-interfaces
-
Do ens18 and ens19 still show in the host? tnsr/vpp is not going to be able to use them until that is not the case.
So TNSR can work fine without an out-of-band host interface correct?
I wouldn't do it but it is technically a possible configuration scheme. In that case the host namespace will not be able to access anything. You will have to protect things like ssh and snmp, etc using dataplane ACLs, and probably other things I am not covering that are simply not a concern if you have a proper management network with a host interface.
-
@derelict Yes, they do. They show up as down, but they still show up.
You mean in the host ip addr output, correct?
Should NetworkManager be disabled?
I was reading about it here:
https://www.thegeekdiary.com/centos-rhel-7-how-to-disable-networkmanager/amp/
-
@ralms-0 Not sure what you have going on.
ONBOOT=no and NM_CONTROLLED=no is all I have ever seen as being necessary, as outlined in the docs.
-
@derelict I didn't even do anything special, was just a standard installation from the ISO :(
-
@derelict
So, running
nmcli device status
I can confirm that those 2 interfaces are not being managed:
Now, how to make them literally not show up, I haven't figured out.
-
@ralms-0 I get this:
[root@tnsr-a1 etc]# nmcli device status
Error: NetworkManager is not running.
Wonder what the difference between your KVM install and mine is.
I have not yet installed 21.03-2 from ISO here and have only upgraded existing installations. Let me see if I can do that soon and if there is any difference.
-
@ralms-0 No idea what the difference is. After reinstalling fresh with three vtnet adapters and configuring the third one in the installer as a host interface, NetworkManager is still not running and the first two interfaces are available to the dataplane with no action by me.
-
@derelict hm ok, I can reinstall again.
There must be something different.
What did you configure regarding networking during the wizard?
-
@ralms-0 I just enabled the third interface and enabled DHCP on it. I set it to connect automatically after reboot and to apply the configuration in the installer. When I quit that screen, it showed it had received a DHCP address on ens20.
-
@derelict hm ok, that is the same thing I did, with the difference that I also set DNS on ens20, but that shouldn't make any difference.
I will reinstall it and report back all the steps I took :)
-
@derelict To split my replies, here is the installation wizard phase:
VM:
The network devices were added to the VM with Firewall off:
VM starting point:
Installation Wizard base:
Wizard starting point:
Changed timezone to Lisbon:
Setting storage:
Installation Wizard Network:
Starting point:
Set hostname to tnsr:
Configure ens20:
Set IPv4 to dhcp and IPv6 to ignore:
Tick options 7 and 8 for Connect automatically after reboot and Apply Configuration in installer:
Go back to the wizard home and do not touch the other 2 interfaces:
Installation Wizard finish:
Confirm the Software Selection:
Processing:
All set:
Begin installation:
-
Installation Complete:
Automatic Reboot.
First Login:
Interfaces and Network Manager:
So it shows as "Not running", so I might have enabled it by accident when I set the NM_CONTROLLED.
show interface
returns nothing.
Set interfaces down as mentioned in the docs:
Default config from ENS18:
Set NM_CONTROLLED to no:
Without Reboot, still nothing:
Rebooted the system.
Interfaces still not showing up.
Set network name:
So yeah, I have no idea what is going on, what I'm doing wrong.
This is a Proxmox 6.3-6 running on an HP DL360p Gen8.
-
I am also struggling with this. This is my first time attempting to install, registered earlier this week. I am not using Proxmox, but KVM libvirt natively. I installed latest version from ISO, serial setup, with 8 virtio network bridge interfaces. I am not looking to pass anything through or use SR-IOV, as eventually I'd like this to migrate between my hosts. Upon completed installation, only my configured host interface is active. I tried both static and/or dhcp, this always works.
tnsr# show version
Version: tnsr-v21.03-2
Build timestamp: Thu Mar 4 10:29:54 2021 CST
I dropped to the host shell; the NetworkManager service is not running.
systemctl start NetworkManager
systemctl enable NetworkManager
This gets NetworkManager going again. I am not sure why it wasn't running and enabled. I am then able to issue "nmcli device status" to get results for all my interfaces. I then apply "NM_CONTROLLED=no" to the respective ifcfg-enp1s0, etc. interface files so they are unmanaged. Reboot for good measure.
And that's where I'm at... It's still not working after reboot.
tnsr(config)# dataplane dpdk dev
0000:01:00.0 Ethernet controller: Red Hat, Inc. Virtio network device (rev 01)
0000:02:00.0 Ethernet controller: Red Hat, Inc. Virtio network device (rev 01)
0000:03:00.0 Ethernet controller: Red Hat, Inc. Virtio network device (rev 01)
0000:04:00.0 Ethernet controller: Red Hat, Inc. Virtio network device (rev 01)
0000:05:00.0 Ethernet controller: Red Hat, Inc. Virtio network device (rev 01)
0000:06:00.0 Ethernet controller: Red Hat, Inc. Virtio network device (rev 01)
0000:07:00.0 Ethernet controller: Red Hat, Inc. Virtio network device (rev 01)
0000:08:00.0 Ethernet controller: Red Hat, Inc. Virtio network device (rev 01)
default
Trying to configure names for the interfaces.
tnsr(config)# dataplane dpdk dev 0000:01:00.0 network name WAN1
Changes to dataplane startup settings require a dataplane restart to take effect.
tnsr(config)# dataplane dpdk dev 0000:02:00.0 network name WAN2
Changes to dataplane startup settings require a dataplane restart to take effect.
tnsr(config)# service dataplane restart
Show interface results in nothing.
I've read through the known issues in the release notes, and I am not sure whether this is an unsupported configuration or a known issue. My KVM hosts run Ubuntu 20.10, with netplan interfaces configured as bridges with VLANs on bonded 10 Gbit interfaces.
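For anyone naming many virtio devices at once, here is a small sketch that builds the naming commands shown above from a PCI-address-to-name mapping. It only generates text to paste into the TNSR CLI; the helper name and the validation rules are my own assumptions, and the command syntax is copied from the session above rather than from the documentation.

```python
import re

# Full PCI address in domain:bus:device.function form, e.g. 0000:01:00.0
_PCI_ADDR = re.compile(r"^[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]$")


def naming_commands(names_by_pci):
    """Build TNSR CLI lines that name each PCI device, then restart the dataplane.

    names_by_pci is a hypothetical mapping such as {"0000:01:00.0": "WAN1"}.
    """
    names = list(names_by_pci.values())
    if len(set(names)) != len(names):
        raise ValueError("interface names must be unique")

    commands = []
    for pci, name in names_by_pci.items():
        if not _PCI_ADDR.match(pci):
            raise ValueError(f"not a full PCI address: {pci!r}")
        commands.append(f"dataplane dpdk dev {pci} network name {name}")

    # Startup settings only take effect after the dataplane restarts.
    commands.append("service dataplane restart")
    return commands


if __name__ == "__main__":
    for line in naming_commands({"0000:01:00.0": "WAN1", "0000:02:00.0": "WAN2"}):
        print(line)
```

The generated lines are meant for config mode, as in the session above. They will not help if the dataplane itself fails to start.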
-
Some more troubleshooting
[root@esg01 admin]# sudo dmesg | grep virtio_net
[ 4.078543] virtio_net virtio1 enp2s0: renamed from eth1
[ 4.084586] virtio_net virtio2 enp3s0: renamed from eth2
[ 4.091946] virtio_net virtio3 enp4s0: renamed from eth3
[ 4.101909] virtio_net virtio4 enp5s0: renamed from eth4
[ 4.136131] virtio_net virtio0 enp1s0: renamed from eth0
[ 4.146091] virtio_net virtio5 enp6s0: renamed from eth5
[ 4.171685] virtio_net virtio7 enp8s0: renamed from eth7
[ 4.203872] virtio_net virtio6 enp7s0: renamed from eth6
[root@esg01 admin]# sudo tnsrctl status
vpp.service: activating
clixon-backend.service: activating
clixon-restconf.service: activating
tnsr-boot.service: active
tnsr-dataplane-netns.service: active
frr-dataplane.service: inactive
strongswan-dataplane.service: inactive
nginx-dataplane.service: inactive
ntpd-dataplane.service: inactive
unbound-dataplane.service: inactive
sshd-dataplane.service: inactive
snmp-subagent-dataplane.service: inactive
snmpd-dataplane.service: inactive
nginx.service: inactive
ntpd.service: inactive
snmp-subagent.service: inactive
snmpd.service: inactive
DHCPv4 server: inactive
[root@esg01 admin]# sudo systemctl status vpp
● vpp.service - Vector Packet Processing Process
   Loaded: loaded (/usr/lib/systemd/system/vpp.service; enabled; vendor preset: enabled)
  Drop-In: /usr/lib/systemd/system/vpp.service.d
           └─intentional-restart.conf, nm-wait-online.conf, on-failure.conf, requires-dataplane.conf
   Active: activating (auto-restart) (Result: exit-code) since Fri 2021-03-19 16:24:36 PDT; 4s ago
  Process: 2591 ExecStopPost=/bin/cp /etc/tnsr/tnsr-running.xml /etc/tnsr.xml (code=exited, status=0/SUCCESS)
  Process: 2590 ExecStopPost=/bin/echo TNSR startup mode switch : using running DB (code=exited, status=0/SUCCESS)
  Process: 2588 ExecStopPost=/bin/echo VPP stopped, modifying TNSR startup mode (code=exited, status=0/SUCCESS)
  Process: 2586 ExecStart=/usr/bin/vpp -c /etc/vpp/startup.conf (code=exited, status=1/FAILURE)
  Process: 2584 ExecStartPre=/sbin/modprobe uio_pci_generic (code=exited, status=0/SUCCESS)
  Process: 2582 ExecStartPre=/bin/rm -f /dev/shm/db /dev/shm/global_vm /dev/shm/vpe-api (code=exited, status=0/SUCCESS)
 Main PID: 2586 (code=exited, status=1/FAILURE)
    Tasks: 0 (limit: 49476)
   Memory: 0B
   CGroup: /system.slice/vpp.service

[root@esg01 admin]# sudo systemctl status clixon-backend
● clixon-backend.service - Clixon backend
   Loaded: loaded (/usr/lib/systemd/system/clixon-backend.service; enabled; vendor preset: enabled)
   Active: activating (start-post) since Fri 2021-03-19 16:24:21 PDT; 46s ago
  Process: 2404 ExecStartPost=/usr/bin/echo TNSR startup mode switch : using none (code=exited, status=0/SUCCESS)
  Process: 2403 ExecStartPost=/usr/bin/cp -f /etc/tnsr/tnsr-none.xml /etc/tnsr.xml (code=exited, status=0/SUCCESS)
  Process: 2402 ExecStartPost=/usr/bin/echo clixon_backend started successfully, modifying TNSR startup mode (code=exited, status=0/SUCCESS)
  Process: 2341 ExecStart=/usr/sbin/clixon_backend (code=exited, status=0/SUCCESS)
    Tasks: 1 (limit: 49476)
   Memory: 23.8M
   CGroup: /system.slice/clixon-backend.service
           └─2401 /usr/sbin/clixon_backend

Mar 19 16:24:21 esg01 clixon_backend[2341]: os_priv_change: changing uid from 0 to 0
Mar 19 16:24:21 esg01 clixon_backend[2341]: Mar 19 16:24:21: os_priv_change: changing uid from 0 to 0
Mar 19 16:24:21 esg01 clixon_backend[2341]: Startup successful, no backup needed
Mar 19 16:24:21 esg01 clixon_backend[2341]: Mar 19 16:24:21: Startup successful, no backup needed
Mar 19 16:24:21 esg01 clixon_backend[2401]: clixon_backend: 2401 Started
Mar 19 16:24:21 esg01 systemd[1]: clixon-backend.service: Can't convert PID files /var/tnsr/tnsr.pidfile O_PATH file descriptor to proper file descriptor: Permission denied
Mar 19 16:24:21 esg01 echo[2402]: clixon_backend started successfully, modifying TNSR startup mode
Mar 19 16:24:21 esg01 echo[2404]: TNSR startup mode switch : using none
Mar 19 16:24:21 esg01 systemd[1]: clixon-backend.service: Can't convert PID files /var/tnsr/tnsr.pidfile O_PATH file descriptor to proper file descriptor: Permission denied
Mar 19 16:24:21 esg01 systemd[1]: clixon-backend.service: Can't convert PID files /var/tnsr/tnsr.pidfile O_PATH file descriptor to proper file descriptor: Permission denied

[root@esg01 admin]# sudo systemctl status clixon-restconf
● clixon-restconf.service - Clixon restconf
   Loaded: loaded (/usr/lib/systemd/system/clixon-restconf.service; enabled; vendor preset: enabled)
   Active: activating (auto-restart) (Result: exit-code) since Fri 2021-03-19 16:25:19 PDT; 2s ago
  Process: 2711 ExecStart=/www-data/clixon_restconf (code=exited, status=203/EXEC)
 Main PID: 2711 (code=exited, status=203/EXEC)
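To make the tnsrctl status output above easier to scan, the sketch below pulls out the services that are stuck in "activating", which in this case points at vpp and the two clixon services rather than at the NICs. It assumes every line has the "name: state" shape seen in the paste; the function names are mine, not part of any TNSR tooling.

```python
def parse_status(text):
    """Map each 'name: state' line of tnsrctl-style status output to its state."""
    states = {}
    for line in text.splitlines():
        # Split on the last colon so names containing spaces stay intact.
        name, sep, state = line.rpartition(":")
        if sep and name.strip():
            states[name.strip()] = state.strip()
    return states


def stuck_services(text):
    """Return services reported as 'activating', i.e. never reaching 'active'."""
    return sorted(n for n, s in parse_status(text).items() if s == "activating")


# A subset of the lines from the status output in this post.
SAMPLE_STATUS = """\
vpp.service: activating
clixon-backend.service: activating
clixon-restconf.service: activating
tnsr-boot.service: active
tnsr-dataplane-netns.service: active
frr-dataplane.service: inactive
DHCPv4 server: inactive
"""

if __name__ == "__main__":
    print(stuck_services(SAMPLE_STATUS))
```

Services listed as "inactive" are ignored on purpose, since optional daemons that were never configured are expected to be in that state.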
-
@lastactionhero Does the host see the interfaces when running something like
sudo ip link
in the host namespace?
If so, then you must get the host to relinquish the interfaces before you can add them to the dataplane.
I do not know off-hand the particular recipe for doing so in your case. All I know is I have zero problems installing tnsr in proxmox-wrapped KVM virtual machines using virtio NICs. If I ever do find myself in that predicament I just add ONBOOT and NM_CONTROLLED set to no and reboot.
tnsr-b1 tnsr# host shell ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
4: ens20: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
    link/ether 2a:d4:de:20:c8:db brd ff:ff:ff:ff:ff:ff

tnsr-b1 tnsr(config)# dataplane dpdk dev
0000:00:12.0 Ethernet controller: Red Hat, Inc. Virtio network device
0000:00:13.0 Ethernet controller: Red Hat, Inc. Virtio network device
0000:00:14.0 Ethernet controller: Red Hat, Inc. Virtio network device
0000:00:15.0 Ethernet controller: Red Hat, Inc. Virtio network device
default

tnsr-b1 tnsr# show config run json
snip
"dpdk": {
    "dev": [
        { "id": "0000:00:12.0", "name": "outside" },
        { "id": "0000:00:13.0", "name": "inside" },
        { "id": "0000:00:15.0", "name": "opt1" }
    ]
},

tnsr-b1 tnsr# show interface ip
Interface: inside
    IPv4 MTU: 0 bytes
    IPv4 Route Table: ipv4-VRF:0
    IPv4 addresses:
        172.29.101.1/29
Interface: loop0
    IPv4 MTU: 0 bytes
    IPv4 Route Table: ipv4-VRF:0
Interface: opt1
    IPv4 MTU: 0 bytes
    IPv4 Route Table: ipv4-VRF:0
    IPv4 addresses:
        172.29.105.1/24
Interface: outside
    IPv4 MTU: 0 bytes
    IPv4 Route Table: ipv4-VRF:0
    IPv4 addresses:
        172.25.228.57/24
I wish I could be more help right now but that's really all I have.
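For reference, the ONBOOT / NM_CONTROLLED change can be sketched as a small text transformation on an ifcfg file. This is only an illustration: the helper name and the sample file contents are made up, and it assumes the usual CentOS layout of one KEY=value per line under /etc/sysconfig/network-scripts/.

```python
def set_ifcfg_keys(text, updates):
    """Return ifcfg-style text with the given KEY=value pairs set.

    Existing keys are rewritten in place; missing keys are appended.
    Comments and unrelated lines are left untouched.
    """
    seen = set()
    out = []
    for line in text.splitlines():
        stripped = line.strip()
        key = stripped.split("=", 1)[0].strip()
        if "=" in stripped and not stripped.startswith("#") and key in updates:
            out.append(f"{key}={updates[key]}")
            seen.add(key)
        else:
            out.append(line)
    out.extend(f"{k}={v}" for k, v in updates.items() if k not in seen)
    return "\n".join(out) + "\n"


# Hypothetical contents of ifcfg-ens18 after a default install.
SAMPLE_IFCFG = "TYPE=Ethernet\nBOOTPROTO=dhcp\nNAME=ens18\nDEVICE=ens18\nONBOOT=yes\n"

if __name__ == "__main__":
    print(set_ifcfg_keys(SAMPLE_IFCFG, {"ONBOOT": "no", "NM_CONTROLLED": "no"}), end="")
```

The function works on strings only, so reading and writing the real file (and the reboot afterwards) is left to whoever runs it.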
-
Some more troubleshooting. This looks to be a permissions-related issue.
@Derelict When you install, are you enabling the root account with a password? Or are you creating a user account and making it a member of administrators (root)? I'm doing the latter: making an admin user account that is a member of administrators (root). The root account remains disabled.
I was able to get this working, but only for troubleshooting more of the issue, not resolving it. I think I would need support at this point to chime in, fix a bug here, which I'm not going to pay for at this time. I really want this to just work because I've heard good things regarding performance and I really want this. I'd like to switch from my current VyOS which just works.
My steps above with NetworkManager work to sort out the interfaces making them unmanaged. As the journalctl -xe logs indicated, vpp.service is the issue, it's not starting. I can manually start this if I drop to host shell and execute vpp with the config.
[admin@esg01 ~]$ sudo /usr/bin/vpp -c /etc/vpp/startup.conf
[sudo] password for admin:
/usr/bin/vpp[7433]: perfmon: skipping source 'intel-uncore' - intel_uncore_init: no uncore units found
/usr/bin/vpp[7433]: tls_init_ca_chain:609: Could not initialize TLS CA certificates
/usr/bin/vpp[7433]: tls_mbedtls_init:644: failed to initialize TLS CA chain
/usr/bin/vpp[7433]: tls_init_ca_chain:710: Could not initialize TLS CA certificates
/usr/bin/vpp[7433]: tls_openssl_init:784: failed to initialize TLS CA chain
Once this is running, I am then able to see my interfaces in the TNSR CLI under dataplane dpdk dev. Everything lines up at this point with what you describe in your replies. No more errors in the CLI either.
esg01 tnsr# show interface
Interface: Guest
    Admin status: down
    Link down, link-speed 10 Gbps, unknown duplex
    Link MTU: 1500 bytes
    MAC address: 52:54:00:7b:f7:69
    IPv4 MTU: 0 bytes
    IPv4 Route Table: ipv4-VRF:0
    IPv6 MTU: 0 bytes
    IPv6 Route Table: ipv6-VRF:0
    VLAN tag rewrite: disable
    Rx-queues
        queue-id 0 : cpu-id 1
    counters:
      received: 0 bytes, 0 packets, 0 errors
      transmitted: 0 bytes, 0 packets, 0 errors
      protocols: 0 IPv4, 0 IPv6
      0 drops, 0 punts, 0 rx miss, 0 rx no buffer
Interface: Management
    Admin status: down
    Link down, link-speed 10 Gbps, unknown duplex
    Link MTU: 1500 bytes
    MAC address: 52:54:00:d4:64:23
    IPv4 MTU: 0 bytes
    IPv4 Route Table: ipv4-VRF:0
    IPv6 MTU: 0 bytes
    IPv6 Route Table: ipv6-VRF:0
    VLAN tag rewrite: disable
    Rx-queues
        queue-id 0 : cpu-id 1
    counters:
      received: 0 bytes, 0 packets, 0 errors
      transmitted: 0 bytes, 0 packets, 1 errors
      protocols: 0 IPv4, 0 IPv6
      0 drops, 0 punts, 0 rx miss, 0 rx no buffer
Interface: WAN1
    Admin status: down
    Link down, link-speed 10 Gbps, unknown duplex
    Link MTU: 1500 bytes
    MAC address: 52:54:00:a5:5c:04
    IPv4 MTU: 0 bytes
    IPv4 Route Table: ipv4-VRF:0
    IPv6 MTU: 0 bytes
    IPv6 Route Table: ipv6-VRF:0
    VLAN tag rewrite: disable
    Rx-queues
        queue-id 0 : cpu-id 1
    counters:
      received: 0 bytes, 0 packets, 0 errors
      transmitted: 0 bytes, 0 packets, 0 errors
      protocols: 0 IPv4, 0 IPv6
      0 drops, 0 punts, 0 rx miss, 0 rx no buffer
Interface: WAN2
    Admin status: down
    Link down, link-speed 10 Gbps, unknown duplex
    Link MTU: 1500 bytes
    MAC address: 52:54:00:f1:5a:97
    IPv4 MTU: 0 bytes
    IPv4 Route Table: ipv4-VRF:0
    IPv6 MTU: 0 bytes
    IPv6 Route Table: ipv6-VRF:0
    VLAN tag rewrite: disable
    Rx-queues
        queue-id 0 : cpu-id 1
    counters:
      received: 0 bytes, 0 packets, 0 errors
      transmitted: 0 bytes, 0 packets, 0 errors
      protocols: 0 IPv4, 0 IPv6
      0 drops, 0 punts, 0 rx miss, 0 rx no buffer
Interface: WiFi
    Admin status: down
    Link down, link-speed 10 Gbps, unknown duplex
    Link MTU: 1500 bytes
    MAC address: 52:54:00:22:9e:9b
    IPv4 MTU: 0 bytes
    IPv4 Route Table: ipv4-VRF:0
    IPv6 MTU: 0 bytes
    IPv6 Route Table: ipv6-VRF:0
    VLAN tag rewrite: disable
    Rx-queues
        queue-id 0 : cpu-id 1
    counters:
      received: 0 bytes, 0 packets, 0 errors
      transmitted: 0 bytes, 0 packets, 0 errors
      protocols: 0 IPv4, 0 IPv6
      0 drops, 0 punts, 0 rx miss, 0 rx no buffer
Interface: Workstation
    Admin status: down
    Link down, link-speed 10 Gbps, unknown duplex
    Link MTU: 1500 bytes
    MAC address: 52:54:00:45:31:92
    IPv4 MTU: 0 bytes
    IPv4 Route Table: ipv4-VRF:0
    IPv6 MTU: 0 bytes
    IPv6 Route Table: ipv6-VRF:0
    VLAN tag rewrite: disable
    Rx-queues
        queue-id 0 : cpu-id 1
    counters:
      received: 0 bytes, 0 packets, 0 errors
      transmitted: 0 bytes, 0 packets, 0 errors
      protocols: 0 IPv4, 0 IPv6
      0 drops, 0 punts, 0 rx miss, 0 rx no buffer
esg01 tnsr#
I will reinstall again tomorrow, this time with the root account enabled. I don't know what else to do beyond this.