Netgate Discussion Forum

    2.3.1-p1 Unstable on Hyper-V (packet loss)

    Virtualization
    21 Posts 9 Posters 6.9k Views
      pciccone

      We run a very stable 2.2.6 release on Hyper-V. When we upgraded to 2.3.0 it would freeze hard periodically. We understand that was fixed in 2.3.1. Great! We upgraded to 2.3.1-p1 but now see extreme packet loss that comes in spurts on all interfaces (it appears, then goes away for a few minutes). The loss is significant enough to break RDP or SSH sessions across the firewall. We can, however, keep a clean ping running from the WAN to the pfSense appliance itself. We quickly had to roll back to the 2.2.6 snapshot, as the firewall was in production use.

      I am opening this forum post in case others have this problem, maybe we can find a pattern / cause.
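To compare reports like this one across sites, it may help to record how long each loss spurt lasts. The following is a minimal sketch, not anything from the original posts: it assumes you already have one boolean per ping probe sent through the firewall (True for a reply, False for a loss), and the function name `find_loss_bursts` and the `min_len` threshold are hypothetical.

```python
from typing import List, Tuple


def find_loss_bursts(replies: List[bool], min_len: int = 3) -> List[Tuple[int, int]]:
    """Return (start_index, length) for every run of at least
    min_len consecutive lost probes in a sequence of ping results."""
    bursts = []
    run_start = None  # index where the current run of losses began
    for i, ok in enumerate(replies):
        if not ok and run_start is None:
            run_start = i  # a new run of losses starts here
        elif ok and run_start is not None:
            if i - run_start >= min_len:
                bursts.append((run_start, i - run_start))
            run_start = None  # the run ended with this reply
    # Handle a run of losses that continues to the end of the capture.
    if run_start is not None and len(replies) - run_start >= min_len:
        bursts.append((run_start, len(replies) - run_start))
    return bursts
```

With one probe per second, the start index and length translate directly into the time and duration of each spurt, which can then be lined up against traffic levels or log entries.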

      Our pfSense environment uses a lot of features. We do VLAN pass-through/trunking via single NIC on Hyper-V, we use IPSec, OpenVPN and Snort.

        InAr

        We switched from PFS 2.2.6 to 2.3.1-p1 on Hyper-V, hoping the change from apinger to dpinger would solve some problems we had when switching gateways in gateway groups.

        With 2.3.1-p1 we are seeing the same extreme packet loss now.

        Our pfSense environment is quite simple: 2 Gateway groups, 1 OpenVPN connection.
        It has 2 dedicated NICs (1 for each WAN connection) and 1 NIC shared with other VMs for LAN.
        1 gateway group for openvpn traffic, the other gateway group with inverted tiers for internet access.

        It looks like when I put some traffic on one of the WAN connections, both tend to see a lot of packet loss, and Remote Desktop access via the OpenVPN connection (and via the other connection too) gets very laggy.

        I'll roll back to 2.2.6 today and check whether this fixes the problem. Maybe I just didn't notice it under 2.2.6.

          pciccone

          Please do let me know if you become stable after your 2.2.6 rollback. I have a feeling things will go back to normal. Also check (at least for testing) that VMQ is disabled on Hyper-V for all vNICs on the pfSense VM. With NICs from some manufacturers, VMQ causes packet loss. This was a known issue for us on certain DELL servers that use Broadcom (I think?) chipsets.

            InAr

            We have/had VMQ disabled on all interfaces with 2.2.6 and with 2.3.1-p1.

            With 2.3.1-p1 the packet loss started every morning when users began logging in to the MS Remote Desktop server via the OpenVPN connection, or whenever a bit more traffic occurred.

            Now, back on 2.2.6, everything seems to work fine.
            This morning the connection remained stable without any packet loss during the usual rdp login time.
            And putting some heavy load onto the wan connections isn't causing any packet loss.

              pciccone

              I just found another case of this yesterday where I had to revert the upgrade. Completely different network, WAN, building, server, etc. Exactly the same behavior we are describing. Reverting to 2.2.6 again resolved the problem.

              Please let this forum post serve as warning to Hyper-V users. Do not upgrade to 2.3.x until this serious issue can be diagnosed and resolved. Stay on 2.2.6 which appears to be extremely stable on Hyper-V.

              Phil

                JasonJoel

                Well, I have been on 2.3 since its release on Hyper-V 2012 R2... and everything has worked perfectly.

                So the issue certainly is not universal. It could be dependent on packages installed, and VM configuration I suppose.

                  pciccone

                  May I ask, is your traffic substantial? We did not notice it at our first upgrade location as traffic was casual. We just had some drops but no one noticed until we ramped up traffic.

                  Phil

                    JasonJoel

                    Substantial is all relative, of course.

                    I would call mine not substantial though. The link is 300 Mbit down, 20 Mbit up.

                    I regularly do 250 Mbit down sustained, but only for short times (10-20 minutes), and my total number of simultaneous users is low (50 maybe).

                    The pfSense box is also doing inter-VLAN routing, but again, only ~50 nodes.

                      cmb

                      Those who are having issues, what Windows version?

                      It's certainly not a universal problem with Hyper-V, but from the sounds of it there must be something to it in some edge case.

                        pciccone

                        Both of my cases are Hyper-V on Windows 2012 R2. Both are managed under System Center 2012 (SCVMM). Both use DELL hardware. One is using NIC trunking, but the other is not. Both have IPsec tunnels. One of my locations is a branch office; I can clone the 2.2.6 VM there and upgrade the clone to do parallel testing if you want to look at this further. The other unit is in a data center handling very critical traffic. But if we find the cause on one, then no doubt the fix will apply to all our sites.

                        Phil C

                          InAr

                          My case is Hyper-V on Windows 2012 R2 (Datacenter), using HP hardware (ProLiant ML350 G6).

                          1xNIC "HP NC382T PCIe DP" (2 Ports - 1.Port NIC Team#1 Hyper-V Host, 2.Port NIC Team#2 Hyper-V VMs)
                          1xNIC "HP NC326i PCIe Dual Port" (2 Ports - 1.Port NIC Team#1 Hyper-V Host, 2.Port NIC Team#2 Hyper-V VMs)
                          1xNIC "Intel(R) PRO/1000 PT"  (2 Ports - 1. Port = WAN1, 2.Port = OPT1)

                          The PFSense VM uses Team#2 for its LAN interface, Intel Port 1 for WAN1, Intel Port 2 for OPT1.

                          VMQ is disabled on all VMs/interfaces.

                            tsolp2001

                            Same problem here after upgrading to 2.3.1

                            Running Server 2012 (not R2) with 3 network cards.

                            Watching video streams is a mess: playback keeps getting interrupted, and remote sessions break too.

                            Updating to 2.3.1p5 made no difference.

                              tsolp2001

                              No movement here. Tried some dev releases no change so far.

                              Is there a way to get back to 2.2.6
                              Didn't find the download, have a 2.2.4 image, can it be upgraded to 2.2.6 and not to the latest release?
                              Can I restore a 2.3.1 backup to 2.2.6?

                              Thx for your support

                                headhunter_unit23

                                I had the same issues with pci-passthrough on esxi 5.1 and a DUAL NIC Intel PCI-E card (82575EB); awful latency and packet loss.

                                I removed the pci-passthrough, added the NICs to a virtual switch and used virtual nics instead and everything is back to normal.

                                I had the same issue with Hyper-V Server 2012 R2 on a Supermicro with 2x 10GB onboard NICs and thought it was a port negotiation problem. I switched to virtual NICs and the problem was gone.

                                But it might not be related to pci-passthrough for all of you.

                                Are you guys using pci-passthrough?

                                  kapara

                                  @tsolp2001:

                                  No movement here. Tried some dev releases no change so far.

                                  Is there a way to get back to 2.2.6
                                  Didn't find the download, have a 2.2.4 image, can it be upgraded to 2.2.6 and not to the latest release?
                                  Can I restore a 2.3.1 backup to 2.2.6?

                                  Thx for your support

                                  You can update to or reinstall 2.2.6 and restore your config.  I ran into this problem when I tried to upgrade from 2.2.2 to 2.2.6 and could not find the update, as 2.3.1 was the only one available.  So I updated to 2.3.1 and the firewall would not even boot. They must have made some major changes, as I used to always be able to upgrade versions.  I also do not think they tested in Hyper-V to check compatibility.

                                  Luckily I did a snapshot before upgrading so I was able to restore back.

                                  2.2.6 update: https://atxfiles.pfsense.org/mirror/updates/old/pfSense-Full-Update-2.2.6-RELEASE-amd64.tgz

                                  2.2.6 full:  https://portal.pfsense.org/firmware/2.2.6/

                                  Skype ID:  Marinhd

                                    cmb

                                    @kapara:

                                    I also do not think they tested in Hyper-V to check compatibility.

                                    Not true or even close to it. We fully verified Hyper-V and Azure. Microsoft themselves even tested 2.3 as well to approve it for Azure.

                                    If it didn't boot, it's probably because of the drive type change from old versions that made the fstab invalid, so it needed updating.

                                      kapara

                                      @CMB

                                      Sorry, I may not have been clear.  I meant that the upgrade process may not have been tested.  If it was, is there any documentation that explains what needs to be done when upgrading from 2.2.x to 2.3 in Hyper-V so that you do not get the mount error?

                                      Skype ID:  Marinhd

                                        cmb

                                        https://doc.pfsense.org/index.php/Upgrade_Guide#Disk_Driver_Changes

                                        You should be fine just running ufslabels.sh prior to the upgrade. Otherwise, manually specify the appropriate drive at the mountroot prompt, e.g. ufs:/dev/da0s1a, replacing da0 as needed.
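Before upgrading, one way to see whether an fstab is exposed to the drive type change is to look for entries that name a raw disk device instead of a label. This is a rough sketch only: the function name `unlabeled_fstab_devices` is hypothetical, and the assumption that label-based entries live under `/dev/ufs/` or `/dev/label/` follows general FreeBSD conventions rather than anything stated in this thread about what ufslabels.sh writes.

```python
from typing import List


def unlabeled_fstab_devices(fstab_text: str) -> List[str]:
    """Return device fields from fstab text that point at a raw
    /dev/ node rather than a /dev/ufs/ or /dev/label/ label."""
    flagged = []
    for line in fstab_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments such as the "# Device ..." header.
        if not line or line.startswith("#"):
            continue
        device = line.split()[0]  # first column is the device
        is_label = device.startswith(("/dev/ufs/", "/dev/label/"))
        if device.startswith("/dev/") and not is_label:
            flagged.append(device)
    return flagged
```

Any device it returns (for example a hypothetical `/dev/ad0s1a`) would be a candidate for the invalid-fstab problem described above; an empty list suggests the entries are already label-based.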

                                          kapara

                                          Thank you!  ;D

                                          Skype ID:  Marinhd

                                            Enrica_CH

                                            I have the same situation with pfSense 2.3.2 on KVM (Proxmox PVE) with virtio NIC drivers. I use two WANs with routing groups. Both have significant packet loss. One of these WAN interfaces sometimes switches to offline and stays in that state. I have a second pfSense on an APU board with CARP that has the same issue.

                                            I use following services:

                                            • Dual WAN with three routing groups
                                            • OpenVPN
                                            • CARP
                                            • Captive Portal
                                            • Free Radius
                                            • Watch Dog

                                            I did some investigation and found the following behaviours that differ from 2.2.6:

                                            • In syslog I find "check_reload_status" with "reloading filter". This interrupts the traffic and causes packet loss. This reload is absolutely unnecessary.
                                            • Every few minutes there is an "xinetd" process with "readjusting service 6969-udp", even though TFTP-Proxy isn't activated. This service doesn't stop.
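To check whether the filter reloads line up with the loss periods, the syslog entries can be counted per minute. The sketch below is illustrative only: the helper name `reloads_per_minute` is hypothetical, and it assumes a classic syslog line shape such as "Jul 12 08:15:02 fw check_reload_status: Reloading filter"; the exact format on a given system may differ.

```python
import re
from collections import Counter
from typing import Iterable

# Assumed line shape: "<Mon> <day> <HH:MM:SS> ... check_reload_status ... reloading filter"
# Group 1 captures the timestamp truncated to the minute.
_RELOAD = re.compile(
    r"^(\w{3}\s+\d+\s+\d{2}:\d{2}):\d{2}\s.*check_reload_status.*reloading filter",
    re.IGNORECASE,
)


def reloads_per_minute(lines: Iterable[str]) -> Counter:
    """Count filter reload log entries per minute."""
    counts = Counter()
    for line in lines:
        match = _RELOAD.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Minutes with unusually many reloads can then be compared with the times at which users report broken sessions.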

                                            I tried switching off "Flush all states when a gateway goes down" to avoid states being killed when an interface is briefly marked offline. But if the interface doesn't come up again, users are cut off from internet access, because the switch from tier 1 to tier 2 happens but the routing state isn't killed.

                                            So it's really unusable and I have to go back to 2.2.6 also for the moment. But how can we find out if 2.3.x will be ok?

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.