Netgate Discussion Forum

    Need Help Resolving "Asymmetric Routing" Issue in a Network with pfSense and Netgear Managed Switch (GS724Tv4)

    L2/Switching/VLANs
      oliverus000

      Hello everyone,

      I'm currently facing a challenging issue with asymmetric routing in my network setup and would appreciate any insights or suggestions you might have. Here's a brief overview of my network configuration:

      Internet Connection: DSL connected to a FritzBox router (IP: 192.168.10.1) using PPPoE.

      Firewall: pfSense firewall, running on a Proxmox server with 4 NICs. NIC1 is the Proxmox management port, connected to VLAN 100 (switch). NIC2 and NIC3 are bridged in pfSense, serving as WAN and LAN, and connected to a trunk port on the switch (g24). NIC4 is bridged to pfSense and connected to VLAN 100 (switch) to manage pfSense. The pfSense WAN interface has the IP 192.168.10.2 and is connected to the FritzBox. VLAN 100 and VLAN 10 are both bridged to NIC3 in pfSense.
      cf6fb61a-242b-40be-843e-f92df86297b5-image.png

      Managed Switch: Netgear Managed Switch handling two VLANs - Trusted (VLAN 10) and Management (VLAN 100). The switch is connected to pfSense firewall via a trunk port. Other ports on the switch are untagged, connecting devices belonging to the respective VLANs.
      0290d0b8-e4c2-4936-ba02-6dbf79359c23-image.png

      Network Layout: VLAN 100 is configured with the IP range 10.76.100.0/24, and the pfSense interface for this VLAN is 10.76.100.1. VLAN 10 has the IP range 10.76.28.0/24. My pfSense is set up to handle all inter-VLAN routing, while the Netgear switch is responsible for VLAN creation and port assignments.

      Issue:
      I'm experiencing asymmetric routing issues, particularly in VLAN 100. Devices in VLAN 10 trying to communicate with those in VLAN 100 often encounter dropped packets or connection instability, which I suspect is due to asymmetric routing. This is especially noticeable when I connect from a VLAN 10 device to Proxmox (via its VLAN 100 IP address) in a browser and then use noVNC to access one of the VMs. After 5-10 seconds of using the VM, it consistently disconnects with a Code 1006 error.

      This is the firewall log
      4f6bc6e1-181f-4ad2-b4df-7db1db16fa56-image.png

      Troubleshooting Steps Taken:
      Ensured all VLAN configurations are correct on both the pfSense and Netgear switch.
      Simplified pfSense firewall rules to a bare minimum for troubleshooting purposes. Currently, there are no specific rules in the WAN interface, and all traffic is allowed to pass freely in both VLAN 10 and VLAN 100.
      Checked firewall rules in pfSense to ensure they aren't inadvertently blocking or rerouting traffic.

      I also followed https://docs.netgate.com/pfsense/en/latest/routing/static.html but that did not work.

      Specific Questions:
      How can I ensure that all return traffic in VLAN 100 (or other VLANs) consistently routes through the pfSense firewall?
      Are there specific settings on the Netgear Managed Switch that I should look at which might be causing or contributing to this asymmetric routing issue?
      Could this issue be related to the way pfSense handles VLANs or firewall rules that I might have overlooked?
      Any advice or diagnostic tips you could offer would be greatly appreciated. Thank you in advance for your time and assistance!

      Best regards,

        johnpoz LAYER 8 Global Moderator @oliverus000

        @oliverus000 are you sure it's not just a state that expired or was deleted?

        Now if those were SA, yeah, it would scream asymmetrical flow, but just A being blocked could simply mean a state is gone. There are a few different reasons why that could happen. PA is just an ACK with the PSH flag set.
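The rule of thumb above (blocked SA suggests asymmetric flow, blocked A/PA/FA alone suggests a missing state) can be sketched in a few lines of Python. This is only an illustration: the function name and the input format (a list of flag strings such as "SA" or "PA", copied by hand from the firewall log) are assumptions, not something pfSense provides.

```python
from collections import Counter


def classify_blocked_flags(flags):
    """Summarize what a list of blocked TCP flag strings suggests.

    `flags` is a hypothetical input: flag strings copied from the
    firewall log, e.g. "S", "SA", "A", "PA", "FA".
    """
    counts = Counter(f.upper() for f in flags)
    if counts.get("SA", 0) > 0:
        # A blocked SYN,ACK means the firewall never saw the matching SYN.
        return "blocked SYN,ACK seen: consistent with asymmetric flow"
    if any(f in counts for f in ("A", "PA", "FA", "RA")):
        # Only mid-connection packets blocked: the state is simply gone.
        return "only ACK-type packets blocked: consistent with a missing or expired state"
    return "no TCP reply packets blocked: inconclusive"


print(classify_blocked_flags(["PA", "A", "FA", "A"]))
print(classify_blocked_flags(["SA", "A"]))
```

The result is a hint, not proof; a capture on both pfSense interfaces is still the way to confirm which path the replies take.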

        An intelligent man is sometimes forced to be drunk to spend time with his fools
        If you get confused: Listen to the Music Play
        Please don't Chat/PM me for help, unless mod related
        SG-4860 24.11 | Lab VMs 2.7.2, 24.11

          oliverus000 @johnpoz

          @johnpoz Here is an extract of the firewall log from one session that reproduces the error (only blocks displayed):
          f9856bec-b547-48bc-9fe1-7505e7878e98-image.png

          My client x.28.20 (in VLAN 10) connects via browser to Proxmox at an address in VLAN 100 (x.100.221), and then I open a noVNC session. After 5-10 seconds of being able to interact with the VM's bash shell, it suddenly disconnects (with code 1006).

          As I said, I am only assuming something asymmetric, but as of now I can't prove it (apart from the firewall logs).

          There is no other obvious log, so I am really stuck at the moment...

            heper @oliverus000

            @oliverus000
            I've never used Netgear switches, but something in the screenshot has me wondering.

            I'm assuming g24 is your uplink (VLAN trunk to pfSense)?
            - Your PVID is set to VLAN 1, but that port's membership doesn't include VLAN 1. It shouldn't really matter, but who knows.
            - How do "VLAN member" and "VLAN tag" relate on Netgear hardware? Are you certain the switch isn't sending untagged frames towards pfSense? (You could try changing "Acceptable Frame Types".)
            - What's the reason for the port priority settings?

            I've seen some buggy switches (or firmware) in the past, so it wouldn't surprise me if something wonky is happening with tagged/untagged frames.

            Also re-verify your Proxmox configuration. If your virtual switches somehow strip or change VLAN tags in either direction, that could also cause a heap of trouble.

              oliverus000 @heper

              @heper
              Great points. Let me try to tackle them one by one:

              Yes, g24 is the trunk and is connected to pfSense on port vtnet2 (as in the screenshot above). No other devices are connected to any other port except the untagged ones.

              • g24 has PVID 1 because I just didn't touch it and it was the default (Netgear uses VLAN 1 as the default management VLAN). If you have any ideas what it should be, I will try it out.
              • Relationship of VLAN member and VLAN tag: the device I am running the browser on is connected to g4 on the switch and sends untagged frames, which should be tagged once they enter the switch and then be forwarded to g24 with the tag attached. I've changed the config for g24 to Acceptable Frame Types "VLAN Only", since this port should only handle VLAN traffic. Unfortunately, the error persists.
              • Port priority was just a desperate attempt to improve the flow (I thought it might help, but it didn't). I have set every priority back to 0, but the error still persists.
              • The Netgear firmware has been updated to the newest version.
              • Proxmox config: all ports are VLAN aware except the one connected to the FritzBox router.

              I am becoming more and more desperate 🦌

              Attached is a Wireshark capture taken during the failure of such a session:
              1cd6746b-b4bf-43c5-af16-a66c995b3016-image.png

                johnpoz LAYER 8 Global Moderator @oliverus000

                @oliverus000 still not seeing any SA blocks. I see an FA there, which could close the states.

                If your flow was asymmetrical, your SA (SYN,ACK) would be seen on pfSense on the wrong interface for where the state was created. Not saying you don't have asymmetrical flow, but I do not see a smoking gun that shows that.

                If your flow was truly asymmetrical, you wouldn't be able to actually make a connection, because the SYN,ACK wouldn't be allowed due to the lack of state.

                It's quite possible your states are just being reset. I believe out of the box pfSense will reset states on a WAN IP change, loss of WAN, etc.

                NIC2 and NIC3 are bridged in pfSense

                You have a bridge setup in pfsense? Why? Or do you have a "bridge" in your VM software to the physical nic on the box?

                  oliverus000 @johnpoz

                  @johnpoz
                  I really wanted to dig deeper into the state situation and tried to look at the states in pfSense, but when I filter for states that include my destination 10.76.100.221, the result shows no matches (while my browser was running the use case described above):
                  c21e7f77-4e78-4ed5-b663-f964bd025ec7-image.png

                  Why is there no state at all? Am I missing something? A state should be there until it really gets reset, right?

                  To the bridge topic: pfsense runs as a VM on proxmox and therefore I need to bridge the NICs to make them available to the VM.

                    johnpoz LAYER 8 Global Moderator @oliverus000

                    @oliverus000 if you see no states for an IP that is part of an active, working conversation, then that screams the traffic is not going through pfSense at all. It's not possible for traffic to flow through pfSense without a state.

                    So either the state just got flushed and you haven't noticed that the conversation isn't working, or your traffic is not flowing through pfSense like you want it to.

                    If I had to guess, your issue is related to your VM and bridging setup, where you don't actually have things isolated like you think you do.

                      oliverus000 @johnpoz

                      @johnpoz
                      I have now tested this intensively and could reproduce the behaviour: when I open a plain noVNC shell window with nothing else in the background, it creates a couple of states (different ports to the same IP; don't ask me why Proxmox or noVNC does that)
                      57d6e25d-271f-49bd-b1c3-ee06124e2321-image.png

                      Then suddenly when something bad happens - ALL of them are gone and the table is empty
                      acd63759-e47e-47a1-ae64-c98368a4adf6-image.png

                      Is there any way to find out why all states are suddenly getting flushed, even the closed ones?

                        johnpoz LAYER 8 Global Moderator @oliverus000

                        @oliverus000 is your wan IP changing?

                        There is this setting..

                        states.jpg

                        Under advanced / misc

                        Then there is this setting on specific gateway under routing

                        specificgate.jpg

                        I would look in your logs. Do you see anything around the time the states go away?

                        But yeah, if your states go away, that would for sure explain why you're seeing the blocks in your firewall.

                        But also keep in mind that closed states will drop off the state list after their specific timeout.

                          oliverus000 @johnpoz

                          @johnpoz
                          Both of these settings are set to not kill states...

                          To the logs:
                          I cleared all logs before running my test. Performed the test and checked ALL logs:
                          System/General: nothing
                          System/Gateway: nothing
                          System/Routing: nothing
                          System/GUI Service: nothing special

                          Firewall: the same as already posted, a lot of blocks

                          DHCP: nothing special

                          Rest is empty.

                          :-(

                            coxhaus @oliverus000

                            @oliverus000
                            If you are running L3 switching then look at your gateways.
                            If you are not running L3 switching then it is not asymmetric routing as routing is layer 3.
                            By the way I am doing asymmetrical routing and it works on my current setup. I use Cisco for my L3 switching.

                              oliverus000 @coxhaus

                              @coxhaus
                              Yes, my switch is running on L3. Can you elaborate a little on what you mean by gateways? As of now I have not assigned any gateway information on the switch itself. I have only created the VLANs and the port assignments (PVID and untagged/tagged) on the switch. I have not touched any routing configuration on the switch, since routing should be handled by pfSense. The connected clients get their gateway from DHCP, which tells them to use the respective VLAN gateway, x.x.100.1 or x.x.28.1.

                              @johnpoz I have checked the proxmox network and bridge config and I am clueless what can be improved :-(

                                johnpoz LAYER 8 Global Moderator @coxhaus

                                @coxhaus said in Need Help Resolving "Asymmetric Routing" Issue in a Network with pfSense and Netgear Managed Switch (GS724Tv4):

                                By the way I am doing asymmetrical routing and it works on my current setup

                                Why would anyone do asymmetrical routing on purpose? Please explain..

                                I have done it when there is no other way; you can do host routing to work around it. But why would anyone design a network to be asymmetrical? My answer to that would be that you're doing it wrong.

                                My switch is in L3 mode, but I am currently not routing anything on the switch. I could if I wanted to, but routing on the switch does not mean you're doing asymmetrical routing. You would use a transit network.

                                A switch with a trunkport and then ports in access mode doesn't say asymmetrical routing - Do you have svi set on the switch for these vlans and then pointing to them as gateways on the devices in these vlans vs the IPs on pfsense in those vlans?

                                  oliverus000 @johnpoz

                                  @johnpoz said in Need Help Resolving "Asymmetric Routing" Issue in a Network with pfSense and Netgear Managed Switch (GS724Tv4):

                                  A switch with a trunkport and then ports in access mode doesn't say asymmetrical routing - Do you have svi set on the switch for these vlans and then pointing to them as gateways on the devices in these vlans vs the IPs on pfsense in those vlans?

                                  If this question was for me, here is what I have set up (no routing for the VLANs at all):
                                  8916c028-1928-4574-851b-8724bdd99aca-image.png

                                  038b56a0-14b4-47a8-88e2-6d18ca647ee3-image.png

                                  0452c397-c821-42ef-8d80-f07c83a4bac5-image.png

                                  Routing in pfsense defined as follows:
                                  ba5d5216-8a7d-49a0-8bf3-5cd3ab00b53b-image.png

                                  Same for the other vlan10.

                                  Firewall for 10 and 100, Pass all traffic:
                                  226cb920-e280-49dd-b583-00b49f06f888-image.png

                                  DHCPs for 10 and 100:
                                  86686e28-498d-4d2d-8119-6fade621cf4c-image.png

                                    johnpoz LAYER 8 Global Moderator @oliverus000

                                    @oliverus000 So you have no other IPs on the switch other than its management IP, and you're not pointing the gateway on any clients to these IPs on the switch.

                                    If you're not doing anything like that, then you wouldn't have asymmetrical routing. You say the states just go away? That would be problematic.

                                    Asymmetrical routing on a firewall causes issues when return traffic hits the firewall but there is no state to allow the traffic.

                                    This is a typical scenario where you would have asymmetrical traffic...

                                    ass.jpg

                                    The client sends the SYN to the destination via some other router that has a connection to it. But the destination sends the SYN,ACK back via a different router. When that router is also a firewall, it never saw the SYN, so it has no state to allow the returning SYN,ACK and blocks the traffic.

                                    Normally you would see this..

                                    acks.jpg

                                    So when you send a SYN and the firewall allows it, it creates a state and sends the traffic on. The SYN,ACK coming back is allowed by the state. Now you have traffic flowing in both directions, just normal ACKs. If the state goes away, traffic in either direction would be blocked.

                                    Until a new state is created via a syn..

                                    If you were seeing SA blocked in your firewall log, that would scream that there is an asymmetrical flow the firewall is not going to allow. When you see just ACKs blocked, this points to the removal of a state.

                                    Either the states timed out because there was no traffic keeping them open, or they were deleted/killed. If devices are talking to each other and no traffic is being sent, the state will time out and close. Now if one of the clients says "hey, I wasn't done talking, here is some data" and sends an ACK, that ACK will be blocked because there is no state. It doesn't matter which end is sending the ACK.

                                    edit: once a handshake has been completed, i.e. the SYN / SYN,ACK / ACK, all traffic between these devices will have the ACK flag set.

                                    ack.jpg

                                    If there is no existing state, this traffic will be blocked in either direction. Seeing blocks only for ACKs, where a connection was working before, points to a loss of state. You can see this quite often with phones or Wi-Fi devices that wake up out of standby and try to continue a conversation they had going before. By that time the state has expired on the firewall, and the traffic is blocked.
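The behaviour described above can be modelled with a toy stateful filter. This is a deliberately simplified sketch and not pf's real logic: the class and method names are made up, rules are assumed to pass every SYN, and the port numbers in the usage example are hypothetical.

```python
class ToyStatefulFirewall:
    """Tiny model of stateful TCP filtering (illustration only, not pf)."""

    def __init__(self):
        # Each state is the unordered pair of endpoints of one connection.
        self.states = set()

    def handle(self, src, dst, flags):
        key = frozenset((src, dst))
        if flags == "S":
            # A bare SYN is checked against the rules (assumed to pass here)
            # and creates the state.
            self.states.add(key)
            return "pass (state created)"
        if key in self.states:
            # SYN,ACK and every later packet need an existing state.
            return "pass (matches state)"
        return "block (no state)"

    def expire(self, a, b):
        """Remove the state, as a timeout or a flush would."""
        self.states.discard(frozenset((a, b)))


fw = ToyStatefulFirewall()
client, server = "10.76.28.20:50000", "10.76.100.221:8006"  # ports are hypothetical

print(fw.handle(client, server, "S"))    # pass (state created)
print(fw.handle(server, client, "SA"))   # pass (matches state)
print(fw.handle(client, server, "PA"))   # pass (matches state)
fw.expire(client, server)                # state times out or is flushed
print(fw.handle(client, server, "PA"))   # block (no state)
print(fw.handle(server, client, "A"))    # block (no state)

# Asymmetric case: a second firewall that never saw the SYN blocks the SYN,ACK.
other_fw = ToyStatefulFirewall()
print(other_fw.handle(server, client, "SA"))  # block (no state)
```

The last call is the asymmetric case: the SYN,ACK itself is blocked. In the expired-state case only the later ACK-type packets are blocked, which is the pattern in the firewall log from this thread.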

                                    edit
                                    You can use pftop to see the age of states, when they will expire, etc. You can filter this for specific IPs, etc.

                                    viewpftop.jpg

                                      oliverus000 @johnpoz

                                      @johnpoz said in Need Help Resolving "Asymmetric Routing" Issue in a Network with pfSense and Netgear Managed Switch (GS724Tv4):

                                      You say the states just go away? That would be problematic..

                                      I would love to record it and its a very weird behaviour, let me describe EXACTLY what happens:

                                      1. I wait until there are no states available any more for any connection to the server x.x.100.221 on pfsense
                                        b9225737-f00b-4446-83b2-b6f875f04555-image.png

                                      2. I refresh the window with a connection to x.x.100.221 which has a shell opened to the server via novnc.
                                        c77efda8-bd5e-4e8b-a04b-633a7fb77ae8-image.png

                                      3. I have around 20 new states on different ports:
                                        4600bc81-8129-4406-a081-160ba031656e-image.png

                                      4. I type stuff into my shell (really just interacting with the server, nothing fancy, just typing text or even doing nothing but looking at the shell) and then out of nowhere, BAM:
                                        41d185da-79d6-409e-a82f-009db478b82c-image.png

                                      5. ALL STATES ARE GONE just at the time when I got kicked out of my connection to the shell. ALL OF THEM

                                      Now two things come to mind:

                                      • Is pfSense detecting something and then flushing all the states, and that is what really disconnects me? (pfSense is the enemy)
                                      • Is my connection breaking somewhere else because something is wrong, and that leads to the flush of all states? (Proxmox is doing some stupid noVNC stuff that pfSense does not like)

                                      The reason I can't let it go is that my IT head does not like the fact that this could also happen to any other connection from VLAN 10 to VLAN 100 (not only to me using a noVNC shell).

                                      WHY is pfSense flushing all states without telling me the reason? I can't imagine this is happening because they all expired at the same time, especially while I have a window open connected to the shell via noVNC.

                                      What I did now is run hping3 -S 10.76.100.221 -p 80 -c 1000 from a client in 10.76.28.x, which should send 1000 TCP SYN packets to port 80.

                                      I have a packet loss of 10%

                                      6aea9569-3cd6-4b5e-87e2-2e722d89f9f9-image.png

                                      Is this related???

                                        johnpoz LAYER 8 Global Moderator @oliverus000

                                        @oliverus000 said in Need Help Resolving "Asymmetric Routing" Issue in a Network with pfSense and Netgear Managed Switch (GS724Tv4):

                                        I have a packet loss of 10%

                                        I wouldn't expect any packet loss on something you're just talking to locally; 10% is quite a lot. Does it come in a bunch, i.e. you see a bunch of loss and then it's all back to normal, or is it a packet here, a packet there out of 1000 that adds up to 10%?

                                        How are you determining that you have 10% packet loss? (edit: oh, I see) Is the loss in clumps now and then, or just random here and there?

                                        If all of the states you see are closing or closed, then yeah, I would expect them to all go away at about the same time. But if you're saying you're losing all states, even active ones, that points to something flushing the state table.

                                        But if you're sending data and getting an answer, the state should be active, unless your traffic is not flowing through pfSense?

                                        The states you show don't show any response; they are all one sided, 8/0 etc. That is not what a normal active conversation would look like.

                                        ESTABLISHED:ESTABLISHED

                                        And you should see packets on both sides of the / like

                                        normalstates.jpg
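The "packets on both sides of the /" check can be automated once the counters are copied out of the state table. A minimal sketch, assuming the counter is available as a string like "8 / 0"; the helper name and the sample rows are hypothetical, apart from the 8/0 value mentioned above.

```python
def is_one_sided(packets):
    """Return True if a packet counter such as "8 / 0" shows traffic one way only."""
    left, right = (int(part.strip().replace(",", "")) for part in packets.split("/"))
    # One-sided means exactly one of the two counters is zero.
    return (left == 0) != (right == 0)


# Hypothetical rows: (description, packet counter as shown in the state table)
rows = [
    ("10.76.28.20 -> 10.76.100.221 (conn 1)", "8 / 0"),
    ("10.76.28.20 -> 10.76.100.221 (conn 2)", "12 / 10"),
]

for label, packets in rows:
    verdict = "one sided, replies bypass the firewall?" if is_one_sided(packets) else "two way"
    print(f"{label}: {packets} -> {verdict}")
```

A state that stays one sided while the application is clearly working means the replies are reaching the client by some path other than the firewall.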

                                          oliverus000 @johnpoz

                                          @johnpoz
                                          I changed from Wi-Fi to a cable and packet loss dropped to almost 0%. So it is most probably not connected to my issue.

                                          BUT your comment most probably leads to something. You are absolutely right: there are only one-sided states, and it never shows "established" when I am connected with a browser to my server. WHAT could this mean?

                                          I only see something like this, but it also looks very one sided:
                                          5ece27d2-3a64-4fbd-9621-19ab325d700e-image.png

                                            johnpoz LAYER 8 Global Moderator @oliverus000

                                            @oliverus000 and your answer is not going back through pfSense.

                                            So in my example above, if client A talking to B sends its SYN through pfSense, pfSense will open a state if the firewall rules allow the traffic. But if the answers do not flow back through pfSense, then there would never be an established connection. And even if you continue to send traffic from A through pfSense, at some point this state will close, and then traffic from A to B would be blocked.

                                            So this points to asymmetrical flow, but in the other direction. So you could have something like this:

                                            reverse.jpg

                                            pfSense will open the state and send your traffic on, but since it never sees any return traffic, at some point these states will expire. Then the sender's traffic will be blocked until it sends a new SYN to open a new state.

                                            These are some examples of why asymmetrical flow is almost never a good idea. @coxhaus mentions he is doing it on purpose? That is horrible design, and it can be very problematic, especially when you have a stateful firewall doing the routing.

                                            You can see this sort of issue with multihomed devices as well.

                                            So for example, my client on 192.168.1.x sends traffic to 192.168.2.x through pfSense. But if the device on 192.168.2.x also has a connection in the 192.168.1.x network and answers via that path, then at some point pfSense will kill off the states, and further traffic will be blocked until a new SYN opens a new state.

                                            multihomed.jpg

                                            Asymmetrical flow and multihomed devices are just asking for problems. They should almost always be avoided.

                                            Now you would hope that the client sending the traffic would be smart enough to figure out: hey, I sent to 192.168.2.x via my gateway's MAC xyz, so why is the response coming from 192.168.1.Y with MAC abc? Such a response could be a security concern. But many clients are stupid and will just accept the answer: hey, I sent to 192.168.2.x from port 4000 to port 443, and the response, even though it is from a different IP and a different MAC address, is to my port 4000 from port 443.

                                            Is the device you're talking to multihomed? I.e. does it have an IP in both networks?

                                            If you're going to talk to a device that has interfaces in networks A and B from a device in network A, you should talk to the device's IP in network A, not B. If you talk to its B address, you are most likely going to have issues.

                                            Multihoming can be very problematic, and it is also a security concern: your firewall has no control over this device talking to devices in other networks, because it has a leg in multiple networks. This can be used to circumvent firewall controls over what can talk to what.
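The multihoming question above can be checked on paper with Python's standard ipaddress module. This is a sketch under stated assumptions: the function name is made up, the second server address 10.76.28.5/24 is purely hypothetical (the thread does not say whether the Proxmox host has an address in VLAN 10), and it assumes the server answers over a directly connected interface when it has one in the client's subnet, which is the usual routing behaviour.

```python
import ipaddress


def reply_bypasses_firewall(client_ip, server_interfaces):
    """Return True if the server has an interface in the client's own subnet.

    In that case the server can answer the client directly on that
    interface, so the reply never passes back through the firewall.
    """
    client = ipaddress.ip_address(client_ip)
    return any(
        client in ipaddress.ip_interface(iface).network
        for iface in server_interfaces
    )


# Server reachable only in VLAN 100: replies must go back via the gateway.
print(reply_bypasses_firewall("10.76.28.20", ["10.76.100.221/24"]))
# → False

# Hypothetical multihomed server with a second leg in VLAN 10.
print(reply_bypasses_firewall("10.76.28.20", ["10.76.100.221/24", "10.76.28.5/24"]))
# → True
```

If the second case matches the real host, the fix suggested above applies: talk to the server's address in the client's own network, or remove the extra leg.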

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.