Multi-wan + vlans = screwed up



  • Hi there,

    I'm completely stuck setting up multi-WAN failover connections. See the attached image for the setup.

    Routing table:

    
    Internet:
    Destination        Gateway            Flags    Refs      Use  Netif Expire
    default            xx.xx.35.192     UGS         0      318 re1_vlan25
    10.4.0.0/16        link#12            U           0        0 re1_vlan442
    10.4.1.44          link#12            UHS         0        0    lo0
    77.88.21.3         10.4.255.254       UGHS        0     1793 re1_vlan442
    87.250.251.3       xx.xx.35.192     UGHS        0     1835 re1_vlan25
    127.0.0.1          link#5             UH          0      162    lo0
    172.26.0.0/16      link#9             U           1   318411 re0_vlan1
    172.26.1.241       link#9             UHS         0        0    lo0
    172.27.0.0/16      link#10            U           0      792 re0_vlan12
    172.27.1.241       link#10            UHS         0        0    lo0
    192.168.211.252/30 link#3             U           0        0    rl0
    192.168.211.254    link#3             UHS         0        0    lo0
    xx.xx.35.0/24    link#11            U           0        0 re1_vlan25
    xx.xx.35.193     link#11            UHS         0        0    lo0
    

    Both gateways are up (see attached picture). NAT and fw rules are set as appropriate.

    The weird thing comes when I'm trying to connect to the internal host from the network behind ISP 1:

    
    tcpdump -ne -i re1 (dot1q interface for ISP1 and ISP2)
    19:32:07.862350 00:15:17:d5:d3:ca > 00:21:91:0c:73:1d, ethertype 802.1Q (0x8100), length 78: vlan 442, p 0, ethertype IPv4, zz.zz.160.13.59962 > yy.yy.120.148.446: Flags [S], seq 819020420, win 5840, options [mss 1460,sackOK,TS val 1554380521 ecr 0,nop,wscale 7], length 0
    19:32:07.862683 00:21:91:0c:73:1d > d8:5d:4c:82:d0:c0, ethertype 802.1Q (0x8100), length 78: vlan 25, p 0, ethertype IPv4, yy.yy.120.148.446 > zz.zz.160.13.59962: Flags [S.], seq 79230050, ack 819020421, win 5792, options [mss 1460,sackOK,TS val 1309007838 ecr 1554380521,nop,wscale 4], length 0
    
    tcpdump -n -i re0_vlan1 (LAN)
    19:32:07.862416 IP zz.zz.160.13.59962 > 172.26.2.76.446: Flags [S], seq 819020420, win 5840, options [mss 1460,sackOK,TS val 1554380521 ecr 0,nop,wscale 7], length 0
    19:32:07.862660 IP 172.26.2.76.446 > zz.zz.160.13.59962: Flags [S.], seq 79230050, ack 819020421, win 5792, options [mss 1460,sackOK,TS val 1309007838 ecr 1554380521,nop,wscale 4], length 0
    
    In plain language, traffic that comes in on vlan 442 goes back out through vlan 25.
    
    The rule that routes all traffic from 172.26.2.76 via ISP1 (vlan 442) is set up in the LAN firewall rules, and it works well for traffic initiated from 172.26.2.76.
    
    If I force the default gateway to ISP 1, connections initiated from vlan 442 start working, while connections initiated from vlan 25 get stuck (SYNs arrive on vlan 25 and SYN-ACKs return via vlan 442).
    
    I tried turning logging on for everything, but nothing helpful came out of it.
    
    Please help me!
    ![untitled.JPG](/public/_imported_attachments_/1/untitled.JPG)
    ![untitled2.JPG](/public/_imported_attachments_/1/untitled2.JPG)
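The asymmetry described in this post can also be spotted mechanically from `tcpdump -ne` output on the 802.1Q parent interface. The following is only a minimal sketch, not part of the original thread: the function names and the regular expression are hypothetical, and the sample lines are shortened copies of the capture above (TCP options dropped).

```python
import re

# Hypothetical pattern: VLAN tag, then "src > dst:" after the IPv4 ethertype.
VLAN_RE = re.compile(r"vlan (\d+),.*?IPv4, (\S+) > (\S+):")

def vlan_per_direction(capture):
    """Map each (src, dst) endpoint pair to the VLAN tag seen on that packet."""
    seen = {}
    for line in capture.splitlines():
        m = VLAN_RE.search(line)
        if m:
            vlan, src, dst = m.groups()
            seen[(src, dst)] = int(vlan)
    return seen

def asymmetric_flows(capture):
    """Return (src, dst, vlan_forward, vlan_reply) where the two VLANs differ."""
    seen = vlan_per_direction(capture)
    result = []
    for (src, dst), vlan in seen.items():
        back = seen.get((dst, src))
        # Skip the mirrored entry of a flow that was already reported.
        if back is not None and back != vlan and (dst, src, back, vlan) not in result:
            result.append((src, dst, vlan, back))
    return result

sample = (
    "19:32:07.862350 00:15:17:d5:d3:ca > 00:21:91:0c:73:1d, ethertype 802.1Q (0x8100), "
    "length 78: vlan 442, p 0, ethertype IPv4, zz.zz.160.13.59962 > yy.yy.120.148.446: Flags [S], length 0\n"
    "19:32:07.862683 00:21:91:0c:73:1d > d8:5d:4c:82:d0:c0, ethertype 802.1Q (0x8100), "
    "length 78: vlan 25, p 0, ethertype IPv4, yy.yy.120.148.446 > zz.zz.160.13.59962: Flags [S.], length 0\n"
)

for src, dst, vlan_in, vlan_out in asymmetric_flows(sample):
    print(f"{src} -> {dst} arrives on vlan {vlan_in}, reply leaves on vlan {vlan_out}")
```

With the two sample lines, the sketch reports the SYN on vlan 442 and the SYN-ACK on vlan 25, which is the symptom described above.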
    


  • As far as I understand, what you are saying is this:

    remote client connects to ISP1 --> reaches device in LAN via the correct port forward --> return traffic goes out ISP2?
    

    This is odd behaviour that I have not seen before, unless it is forced by policy routing in the firewall rules on the interface tabs or in floating rules.

    Are you using automatic outbound NAT (AON)? Check which NAT rules are applied and, if needed, switch to manual NAT to correct them.



  • heper, thank you for your reply!

    @heper:

    As far as I understand, what you are saying is this:

    remote client connects to ISP1 --> reaches device in LAN via the correct port forward --> return traffic goes out ISP2?
    

    Yes, this is exactly what I'm saying.

    @heper:

    This is odd behaviour that I have not seen before, unless it is forced by policy routing in the firewall rules on the interface tabs or in floating rules.

    Are you using automatic outbound NAT (AON)? Check which NAT rules are applied and, if needed, switch to manual NAT to correct them.

    No, I'm using manual NAT.

    I'm attaching some more pictures. Please check them and let me know if you need anything else.

    The interface named INTERNET is just an interface group containing ISP1 and ISP2.










  • And here is what happens when I clear the default gateway checkbox on all gateways (in System:Gateways):

    14:56:06.992023 00:21:91:0c:75:65 > 00:0c:29:6d:06:74, ethertype 802.1Q (0x8100), length 78: vlan 1, p 0, ethertype IPv4, zz.zz.160.13.46351 > 172.26.2.76.446: Flags [S], seq 1056994038, win 5840, options [mss 1460,sackOK,TS val 1624193785 ecr 0,nop,wscale 7], length 0
    14:56:06.992285 00:0c:29:6d:06:74 > 00:21:91:0c:75:65, ethertype 802.1Q (0x8100), length 78: vlan 1, p 0, ethertype IPv4, 172.26.2.76.446 > zz.zz.160.13.46351: Flags [S.], seq 3460358797, ack 1056994039, win 5792, options [mss 1460,sackOK,TS val 1378821896 ecr 1624193785,nop,wscale 4], length 0
    14:56:06.992314 00:21:91:0c:75:65 > 00:0c:29:6d:06:74, ethertype 802.1Q (0x8100), length 106: vlan 1, p 0, ethertype IPv4, 172.26.1.241 > 172.26.2.76: ICMP host 81.222.160.13 unreachable, length 68
    
    This is the case when no gateways are set for the ISP 1 and ISP 2 interfaces. I thought the routing for reply packets might rely on the policy routing defined in the LAN rules.
    
    If I restore the gateways for the ISP 1 and ISP 2 interfaces (still without marking either of them as the system default), I'm back to the original issue.
    
    Looking forward to any help from you guys!
    


  • You seem to have duplicate NAT rules (although that does not explain what is going on).
    Personally, I prefer not to use 1:1 NAT if possible (just forwarding the individual ports or port ranges). 1:1 has a lot of drawbacks!

    What I would do is this:
    -create a backup of your current config
    -remove all rules and change everything back to a default, sane state with just basic outgoing NAT working for both DMZ & LAN. Depending on how critical the environment is, restoring the config from when it first went online may help
    -slowly build everything up one step at a time, using regular port forwards and automatic outbound NAT (AON)

    Start with as little as possible and take it from there.



  • heper, thank you so much for your suggestions!! Here is what caused my issue: I had fw rules defined in the INTERNET tab (group of wan interfaces). After I moved them under each of the wan interfaces - it worked like a charm. Thank you again!

    Side question: what are interface groups used for, then? I thought I could define rules there to avoid duplication. What is the proper way to use them?



  • Well, that is the proper use of them…

    Not sure what went wrong; maybe group rules match after interface rules or something.



  • Just for your and others' information:

    After I figured out what was wrong, I searched for "pfsense nat interface group", and it appears there are a lot of questions on this topic. The answer is that interface groups can't be used with NAT. It looks like they can be used only for blocking rules.

    Digging into the details, each NAT rule on a real interface has a reply-to keyword that states which interface and gateway must be used for reply traffic:

    pass in log quick on re1_vlan442 reply-to (re1_vlan442 10.4.255.254) inet proto tcp from any to 172.26.2.76 port = ddm-rdb flags S/SA keep state label "USER_RULE"
    
    

    When we try to use NAT on an interface group, that rule changes to:

    pass in log quick on INTERNET inet proto tcp from any to 172.26.2.76 port = ddm-rdb flags S/SA keep state label "USER_RULE"
    

    So it appears that, without reply-to, the system has no idea which WAN to send the replies through; judging by the captures earlier in this thread, they simply follow the default route.

    I might be wrong, but it looks like a bug in the pf subsystem. It definitely knows where the traffic came from; why it does not use this information, god only knows.

    Thank you again, heper, and have a nice weekend!
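The difference between the two rules quoted in the last post can be checked mechanically. The following is only a minimal sketch under stated assumptions: `rules_missing_reply_to` is a hypothetical helper, not a pfSense or pf tool, and it assumes the ruleset has already been saved as text (for example from `pfctl -sr`). It flags `pass in` rules that carry no `reply-to` option.

```python
def rules_missing_reply_to(rules_text):
    """Return the 'pass in' rules that carry no reply-to option."""
    missing = []
    for line in rules_text.splitlines():
        rule = line.strip()
        # Plain substring checks are enough for this illustration.
        if rule.startswith("pass in") and "reply-to" not in rule:
            missing.append(rule)
    return missing

# The two rules quoted in the thread: per-interface vs. interface group.
sample = """\
pass in log quick on re1_vlan442 reply-to (re1_vlan442 10.4.255.254) inet proto tcp from any to 172.26.2.76 port = ddm-rdb flags S/SA keep state label "USER_RULE"
pass in log quick on INTERNET inet proto tcp from any to 172.26.2.76 port = ddm-rdb flags S/SA keep state label "USER_RULE"
"""

for rule in rules_missing_reply_to(sample):
    print("no reply-to:", rule)
```

On the sample, only the rule bound to the INTERNET group is reported. Not every `pass in` rule needs `reply-to` (LAN-side rules normally do not), so the output is a list of candidates to review rather than a list of errors.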

