2.3.1-p1 Unstable on Hyper-V (packet loss)
-
I just found another case of this yesterday where I had to revert this. Completely different network, WAN, building, server, etc. Exact same behavior we are describing. Reverting to 2.2.6 again resolved the problem.
Please let this forum post serve as a warning to Hyper-V users: do not upgrade to 2.3.x until this serious issue has been diagnosed and resolved. Stay on 2.2.6, which appears to be extremely stable on Hyper-V.
Phil
-
Well, I have been on 2.3 on Hyper-V 2012 R2 since its release, and everything has worked perfectly.
So the issue certainly is not universal. It could depend on the packages installed and the VM configuration, I suppose.
-
May I ask, is your traffic substantial? We did not notice it at our first upgrade location as traffic was casual. We just had some drops but no one noticed until we ramped up traffic.
Phil
-
Substantial is all relative, of course.
I would not call mine substantial, though. The link is 300 Mbit down, 20 Mbit up.
I regularly do 250 Mbit down sustained, but only for short periods (10-20 minutes), and my total number of simultaneous users is low (maybe 50).
The pfSense box is also doing inter-VLAN routing, but again, only ~50 nodes.
-
Those who are having issues, what Windows version?
It's certainly not a universal problem with Hyper-V, but from the sounds of it there must be something to it in some edge case.
-
Both of my cases are Hyper-V on Windows 2012 R2. They are both managed under System Center 2012 (SCVMM). They both use Dell hardware. One is using NIC trunking, but the other is not. Both have IPsec tunnels. One of my locations is a branch office, so I can clone the 2.2.6 VM and upgrade the clone for parallel testing if you want to look at this further. The other unit is in a data center handling very critical traffic. But if we find the cause on one, then no doubt the fix will apply to both.
Phil C
-
My case is Hyper-V on Windows 2012 R2 (Datacenter), using HP hardware (ProLiant ML350 G6).
1x NIC "HP NC382T PCIe DP" (2 ports: port 1 = NIC Team#1 Hyper-V Host, port 2 = NIC Team#2 Hyper-V VMs)
1x NIC "HP NC326i PCIe Dual Port" (2 ports: port 1 = NIC Team#1 Hyper-V Host, port 2 = NIC Team#2 Hyper-V VMs)
1x NIC "Intel(R) PRO/1000 PT" (2 ports: port 1 = WAN1, port 2 = OPT1)
The pfSense VM uses Team#2 for its LAN interface, Intel port 1 for WAN1, and Intel port 2 for OPT1.
VMQ is disabled on all VMs/interfaces.
-
Same problem here after upgrading to 2.3.1.
Running Server 2012 (not R2) with 3 network cards.
Watching video streams is a mess: they constantly get interrupted, and remote sessions break too.
Updating to 2.3.1-p5 made no difference.
-
No movement here. I tried some dev releases, no change so far.
Is there a way to get back to 2.2.6?
I couldn't find the download. I have a 2.2.4 image; can it be upgraded to 2.2.6 rather than to the latest release?
Can I restore a 2.3.1 backup to 2.2.6?
Thanks for your support.
-
I had the same issues with PCI passthrough on ESXi 5.1 and a dual-port Intel PCI-E NIC (82575EB): awful latency and packet loss.
I removed the PCI passthrough, added the NICs to a virtual switch and used virtual NICs instead, and everything is back to normal.
I had the same issue with Hyper-V Server 2012 R2 on a Supermicro with 2x 10 Gb onboard NICs and thought it was a port negotiation problem. I switched to virtual NICs and the problem was gone.
But it might not be related to PCI passthrough for all of you.
Are you guys using PCI passthrough?
-
You can update to or reinstall 2.2.6 and restore your config. I ran into this problem when I tried to upgrade from 2.2.2 to 2.2.6 and could not find the update, as 2.3.1 was the only one available. So I updated to 2.3.1 and the firewall would not even boot. They must have made some major changes, as I was always able to upgrade versions before. I also do not think they tested in Hyper-V to check compatibility.
Luckily I took a snapshot before upgrading, so I was able to roll back.
2.2.6 update: https://atxfiles.pfsense.org/mirror/updates/old/pfSense-Full-Update-2.2.6-RELEASE-amd64.tgz
2.2.6 full: https://portal.pfsense.org/firmware/2.2.6/
-
I also do not think they tested in Hyper-V to check compatibility.
Not true or even close to it. We fully verified Hyper-V and Azure. Microsoft themselves even tested 2.3 as well to approve it for Azure.
If it didn't boot, it's probably because of the drive type change from old versions that made the fstab invalid, so it needed updating.
-
Sorry, I may not have been clear. I meant that the upgrade process may not have been tested. If it was, is there any documentation that explains what needs to be done when upgrading from 2.2.x to 2.3 in Hyper-V so that you do not get the mount error?
-
https://doc.pfsense.org/index.php/Upgrade_Guide#Disk_Driver_Changes
You should be fine just running ufslabels.sh prior to the upgrade. Otherwise, manually specify the appropriate drive at the mountroot prompt, e.g. ufs:/dev/da0s1a, replacing da0 as needed.
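To see ahead of time whether a box is likely to hit the mountroot prompt, one option is to check whether /etc/fstab still points at raw disk names rather than labels. The snippet below is only a rough sketch of that check: the function name, the regex, and the sample fstab contents are made up for illustration and are not taken from ufslabels.sh or the upgrade guide.

```python
import re

# Raw disk device names such as /dev/ad0s1a or /dev/da0s1a.
# Label-style paths like /dev/ufs/... or /dev/label/... do not match.
RAW_DEVICE = re.compile(r"^/dev/(ad|ada|da)\d+\S*$")


def raw_device_entries(fstab_text):
    """Return (device, mountpoint) pairs that reference a raw disk name."""
    hits = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) >= 2 and RAW_DEVICE.match(fields[0]):
            hits.append((fields[0], fields[1]))
    return hits


# Hypothetical fstab contents, for illustration only.
SAMPLE_FSTAB = """\
# Device          Mountpoint  FStype  Options  Dump  Pass#
/dev/ad0s1a       /           ufs     rw       1     1
/dev/label/swap0  none        swap    sw       0     0
"""

if __name__ == "__main__":
    for device, mountpoint in raw_device_entries(SAMPLE_FSTAB):
        print(f"{mountpoint} is mounted from raw device {device}")
```

Any entry this reports is one that could stop resolving if the disk driver name changes during the upgrade.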
-
Thank you! ;D
-
I have the same situation with pfSense 2.3.2 on KVM (Proxmox PVE) with virtio NIC drivers. I use two WANs with routing groups. Both have significant packet loss. One of these WAN interfaces sometimes switches to offline and stays in that state. I have a second pfSense on an APU board with CARP, with the same issue.
I use the following services:
- Dual WAN with three routing groups
- OpenVPN
- CARP
- Captive Portal
- Free Radius
- Watch Dog
I did some investigating and found the following differences in behaviour compared to 2.2.6:
- In syslog I find "check_reload_status" entries with "reloading filter". This interrupts the traffic and causes packet loss. This reload is absolutely unnecessary.
- Every few minutes the "xinetd" process logs "readjusting service 6969-udp", even though TFTP Proxy isn't enabled. This never stops.
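To put a number on how often the filter reloads happen, the matching syslog lines can be counted per minute from an exported log. This is a minimal sketch only: the function name and the sample lines are hypothetical, and it assumes the classic "Mon DD HH:MM:SS" syslog timestamp prefix.

```python
from collections import Counter


def count_filter_reloads(log_lines):
    """Count check_reload_status 'reloading filter' messages per minute."""
    per_minute = Counter()
    for line in log_lines:
        lowered = line.lower()
        if "check_reload_status" in lowered and "reloading filter" in lowered:
            # Assumed prefix "Mon DD HH:MM:SS": the first 12 characters
            # identify the minute.
            per_minute[line[:12]] += 1
    return per_minute


# Hypothetical log lines, for illustration only.
SAMPLE_LOG = [
    "Jul 21 10:15:02 pfSense check_reload_status: Reloading filter",
    "Jul 21 10:15:40 pfSense check_reload_status: Reloading filter",
    "Jul 21 10:16:05 pfSense xinetd[1234]: readjusting service 6969-udp",
    "Jul 21 10:17:11 pfSense check_reload_status: Reloading filter",
]

if __name__ == "__main__":
    for minute, count in sorted(count_filter_reloads(SAMPLE_LOG).items()):
        print(minute, count)
```

Comparing these counts against the times when users report drops would show whether the reloads and the packet loss actually line up.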
I tried switching off "Flush all states when a gateway goes down" to avoid states being killed when an interface is briefly marked offline. But if the interface doesn't come up again, users lose internet access, because the switch from tier 1 to tier 2 happens but the existing routing states aren't killed.
So it's really unusable, and I also have to go back to 2.2.6 for the moment. But how can we find out whether 2.3.x will be OK?
-
Does anyone know what the underlying cause of this was, or whether it is going to be resolved? I had 2.3.1 in Hyper-V on 2012 R2 experiencing HEAVY packet loss when approaching 5 Mbps on our MPLS. Once I downgraded to 2.2.6 everything was fine again. I couldn't find any bug referencing this issue. I'm glad to find this problem is more widespread than just me.
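For anyone comparing 2.2.6 and 2.3.x side by side, it may help to record loss as a number rather than "heavy". The sketch below pulls the loss percentage out of a saved ping summary line. The function name and the sample lines are illustrative assumptions; the regex is meant to cover the usual FreeBSD-style and Linux-style summary wording.

```python
import re

# Matches "10 packets transmitted, 8 packets received" (FreeBSD style)
# and "10 packets transmitted, 8 received" (Linux style).
SUMMARY = re.compile(r"(\d+) packets transmitted, (\d+) (?:packets )?received")


def packet_loss_percent(ping_output):
    """Return packet loss in percent, or None if no summary line is found."""
    match = SUMMARY.search(ping_output)
    if match is None:
        return None
    sent, received = int(match.group(1)), int(match.group(2))
    if sent == 0:
        return None  # nothing was sent, so loss is undefined
    return 100.0 * (sent - received) / sent


# Hypothetical summary lines, for illustration only.
FREEBSD_SAMPLE = "10 packets transmitted, 8 packets received, 20.0% packet loss"
LINUX_SAMPLE = "10 packets transmitted, 8 received, 20% packet loss, time 9012ms"

if __name__ == "__main__":
    print(packet_loss_percent(FREEBSD_SAMPLE))
    print(packet_loss_percent(LINUX_SAMPLE))
```

Running the same ping test at idle and under load on both versions would give figures that can be attached to a bug report.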