Bug? Using CARP IP as WAN IP in console and fw rules



  • I just upgraded our secondary firewall from 2.3.3 to 2.4.1.
    After the upgrade, the (secondary) firewall shows the wrong WAN IP in the console:
    *** Welcome to pfSense 2.4.1-RELEASE (amd64) on company-pfsense2 ***

    WAN (wan)      -> em2        -> v4: XX.YYY.173.22/27
    LAN (lan)      -> em3.2      -> v4: 10.1.1.4/24
    …etc

    The WAN IP is, and has always been, XX.YYY.173.6.
    The OS seems to get it right:

    [2.4.1-RELEASE][root@company-pfsense2.localdomain]/root: ifconfig em2
    em2: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=4209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,VLAN_HWTSO>
            ether 00:25:90:d8:XX:YY
            hwaddr 00:25:90:d8:XX:YY
            inet6 fe80::225:90ff:XXXX:XXXX%em2 prefixlen 64 scopeid 0x3
            inet XX.YYY.173.6 netmask 0xffffffe0 broadcast XX.YYY.173.31
            inet XX.YYY.173.8 netmask 0xffffffe0 broadcast XX.YYY.173.31 vhid 80
            inet XX.YYY.173.9 netmask 0xffffffe0 broadcast XX.YYY.173.31 vhid 83
            inet XX.YYY.173.10 netmask 0xffffffe0 broadcast XX.YYY.173.31 vhid 85
            inet XX.YYY.173.14 netmask 0xffffffe0 broadcast XX.YYY.173.31 vhid 86
            inet XX.YYY.173.13 netmask 0xffffffe0 broadcast XX.YYY.173.31 vhid 87
            inet XX.YYY.173.15 netmask 0xffffffe0 broadcast XX.YYY.173.31 vhid 88
            inet XX.YYY.173.12 netmask 0xffffffe0 broadcast XX.YYY.173.31 vhid 91
            inet XX.YYY.173.16 netmask 0xffffffe0 broadcast XX.YYY.173.31 vhid 92
            inet XX.YYY.173.17 netmask 0xffffffe0 broadcast XX.YYY.173.31 vhid 93
            inet XX.YYY.173.18 netmask 0xffffffe0 broadcast XX.YYY.173.31 vhid 94
            inet XX.YYY.173.19 netmask 0xffffffe0 broadcast XX.YYY.173.31 vhid 95
            inet XX.YYY.173.20 netmask 0xffffffe0 broadcast XX.YYY.173.31 vhid 96
            inet XX.YYY.173.21 netmask 0xffffffe0 broadcast XX.YYY.173.31 vhid 99
            inet XX.YYY.173.22 netmask 0xffffffe0 broadcast XX.YYY.173.31 vhid 100
            inet XX.YYY.173.23 netmask 0xffffffe0 broadcast XX.YYY.173.31 vhid 101
            inet XX.YYY.173.24 netmask 0xffffffe0 broadcast XX.YYY.173.31 vhid 102
            nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
            media: Ethernet autoselect (1000baseT <full-duplex>)
            status: active
            carp: BACKUP vhid 80 advbase 1 advskew 100
            carp: BACKUP vhid 83 advbase 1 advskew 100
            carp: BACKUP vhid 85 advbase 1 advskew 100
            carp: BACKUP vhid 86 advbase 1 advskew 100
            carp: BACKUP vhid 87 advbase 1 advskew 100
            carp: BACKUP vhid 88 advbase 1 advskew 100
            carp: BACKUP vhid 91 advbase 1 advskew 100
            carp: BACKUP vhid 92 advbase 1 advskew 100
            carp: BACKUP vhid 93 advbase 1 advskew 100
            carp: BACKUP vhid 94 advbase 1 advskew 100
            carp: BACKUP vhid 95 advbase 1 advskew 100
            carp: BACKUP vhid 96 advbase 1 advskew 100
            carp: BACKUP vhid 99 advbase 1 advskew 100
            carp: BACKUP vhid 100 advbase 1 advskew 100
            carp: BACKUP vhid 101 advbase 1 advskew 100
            carp: BACKUP vhid 102 advbase 1 advskew 100
    

    When I change the CARP VIP with vhid 100 from .22 to, for example, .25, everything works again and the reported WAN IP is back to XX.YYY.173.6.

    If it were only the console, I wouldn't care much, but the firewall rules also seem to treat XX.YYY.173.22 as the WAN IP, and therefore mishandle rules whose destination is set to "WAN IP".

    This was a pretty straightforward upgrade, with no config changes.
    The "bad" IP is only mentioned once in my config.xml, as a VIP.

    There must be something wrong with the code that determines the WAN IP, but I'm not sure how exactly I can mitigate it.
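
As a cross-check from the shell, the real interface address can be separated from the CARP VIPs in the ifconfig output above: VIP lines end in `vhid N`, while the interface's own address does not. A minimal sketch, assuming that output layout. The function names `primary_inet` and `carp_inets` are made up for illustration, are not part of pfSense, and say nothing about how pfSense itself picks the address:

```shell
# primary_inet: read `ifconfig <interface>` output on stdin and print the
# first IPv4 address that is NOT tagged with a CARP vhid.
# Assumption: CARP VIP lines contain " vhid N", as in the output above.
primary_inet() {
    awk '$1 == "inet" && $0 !~ / vhid / { print $2; exit }'
}

# carp_inets: print every IPv4 address that IS tagged with a CARP vhid,
# one address per line, in the order ifconfig lists them.
carp_inets() {
    awk '$1 == "inet" && / vhid / { print $2 }'
}
```

The functions are written for sh, so start `sh` first if your login shell is something else. Usage would be `ifconfig em2 | primary_inet`, with "em2" replaced by your WAN interface; if the result differs from what the console banner shows, the banner is picking up a VIP.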



  • Bug found and pull request created:
    https://github.com/pfsense/pfsense/pull/3872



  • Thanks for this, it bit us quite badly on a box we have in AWS.
    The patch fixed the IP for the WAN address. We have one remaining issue, possibly related: our OpenVPN servers are now binding to the wrong IP (the VIP).



  • @nuro:

    Thanks for this, it bit us quite badly on a box we have in AWS.
    The patch fixed the IP for the WAN address. We have one remaining issue, possibly related: our OpenVPN servers are now binding to the wrong IP (the VIP).

    What interface did you select in the openvpn server settings? "WAN"?



  • That is correct. Using the "WAN" interface in the server settings normally gave us the interface ip instead of the virtual ip.



  • @nuro:

    That is correct. Using the "WAN" interface in the server settings normally gave us the interface ip instead of the virtual ip.

    In pfSense web gui, go to Diagnostics / Command Prompt.
    Under "Execute PHP Commands", run:
    printf(find_interface_ip("em2"));

    Replace "em2" with your WAN interface.
    Do you get the correct WAN IP, or the VIP?



  • It returns our virtual ip.
    Our real ip is xxx.xxx.127.4, and the php command returns xxx.xxx.127.22 which is our virtual ip.



  • @nuro:

    It returns our virtual ip.
    Our real ip is xxx.xxx.127.4, and the php command returns xxx.xxx.127.22 which is our virtual ip.

    I think it might be cached. I don't know why the wrong IP was cached, but you can correct the cached value.
    Go to Diagnostics / Edit File.
    Open each of these files, and change the IP to the real WAN IP (if needed):
    /var/db/wan_cacheip
    /var/db/wan_ip
    /var/db/em2_ip

    Once again, replace "em2" in the last file name with your WAN interface.
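
If you would rather check the cache files from a shell before editing them, something like the following could do it. This is only a sketch: `check_ip_cache` is a hypothetical helper, not a pfSense command, and it assumes each cache file holds nothing but the IP address.

```shell
# check_ip_cache EXPECTED_IP FILE...
# Prints one line for each file whose content differs from EXPECTED_IP and
# returns non-zero if any mismatch was found. Missing files are skipped.
# Assumption: each file contains only an IP address (plus optional newline).
check_ip_cache() {
    expected=$1
    shift
    status=0
    for f in "$@"; do
        [ -f "$f" ] || continue
        # Drop whitespace so a trailing newline does not cause a false mismatch.
        cached=$(tr -d ' \t\r\n' < "$f")
        if [ "$cached" != "$expected" ]; then
            echo "$f: cached=$cached expected=$expected"
            status=1
        fi
    done
    return "$status"
}
```

Run under sh, for example `check_ip_cache XX.YYY.173.6 /var/db/wan_cacheip /var/db/wan_ip /var/db/em2_ip`, with your own WAN IP and interface name. It only reports mismatches; the files still have to be corrected by hand as described above.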



  • Ah, that was it, our openvpn servers are back to normal. Thanks for your help.
    I'll keep an eye on it when she goes for a reboot again (not anytime soon), but this thread has given me some insight.



  • @nuro:

    Ah, that was it, our openvpn servers are back to normal. Thanks for your help.
    I'll keep an eye on it when she goes for a reboot again (not anytime soon), but this thread has given me some insight.

    Glad to hear it's working. We haven't put our 2.4.1 into production yet.

    I wonder if it is a coincidence that for both you and me, it was the .22 VIP that caused the problem.



  • Yeah, kinda weird.
    It's the only box where we have a vip on the WAN interface, so it slipped through the cracks when we tested it on another box.

