Netgate Discussion Forum

    VIP addresses stop working

    HA/CARP/VIPs
    12 Posts 3 Posters 2.7k Views
    • TitanSystems

      We have an odd issue where our IP Alias VIPs stop passing traffic. We use them for many tasks, such as OpenVPN, web servers, and the like. After an unpredictable length of time, they simply stop responding. To get them working again, all we have to do is change the subnet to ANYTHING else, then save and apply. Boom, they start working again. They had been working fine since a fresh install on Jan 1 with no changes to the system, and they started failing about a week ago. The primary WAN IP always works; otherwise I would assume failing hardware. I have been using pfSense for a very long time, but this is the first time I am seeing this issue.
      We have ordered replacement hardware just in case, but this just does not seem like hardware failure.

      We are getting this in our dpinger system log:
      send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr 184.174.xxx.xx bind_addr 184.174.xxx.xxx identifier "WANGW "

      Thank you for your help.

      • Derelict LAYER 8 Netgate

        That is a perfectly normal log entry that is logged when dpinger starts or restarts.
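For anyone wondering what that log line actually contains: it is a flat list of name/value pairs describing how dpinger was started, not an error. A minimal Python sketch of splitting it up is below. The helper name `parse_dpinger_params` is made up for illustration, and the addresses in `SAMPLE` are documentation placeholders rather than the poster's real ones.

```python
import re

# Sample line in the shape of the log entry quoted in the first post.
# The addresses are documentation placeholders (192.0.2.0/24).
SAMPLE = (
    'send_interval 500ms loss_interval 2000ms time_period 60000ms '
    'report_interval 0ms data_len 0 alert_interval 1000ms '
    'latency_alarm 500ms loss_alarm 20% dest_addr 192.0.2.1 '
    'bind_addr 192.0.2.83 identifier "WANGW "'
)


def parse_dpinger_params(line):
    """Split a dpinger startup line into a {name: value} dict.

    Values are either a quoted string or a run of non-space characters.
    """
    pairs = re.findall(r'(\w+) ("[^"]*"|\S+)', line)
    return {name: value.strip('"') for name, value in pairs}


if __name__ == "__main__":
    for name, value in parse_dpinger_params(SAMPLE).items():
        print(f"{name:16} {value!r}")
```

Read this way, the entry only says which gateway is being monitored (`dest_addr`), from which local address (`bind_addr`), and with which thresholds.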

        Going to need more information. Are the IP Alias VIPs still on the interface if you run ifconfig -a when it is broken?

        What happens when you try to ping sourced from that VIP?

        ping -S VIP_IP_ADDRESS GATEWAY_ADDRESS

        ping -S VIP_IP_ADDRESS 8.8.8.8

        Chattanooga, Tennessee, USA
        A comprehensive network diagram is worth 10,000 words and 15 conference calls.
        DO NOT set a source address/port in a port forward or firewall rule unless you KNOW you need it!
        Do Not Chat For Help! NO_WAN_EGRESS(TM)

        • TitanSystems

          To answer, here is ifconfig while broken.

          bge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
          options=c019b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,VLAN_HWTSO,LINKSTATE>
          ether 00:e0:66:fd:6c:c3
          hwaddr 00:e0:66:fd:6c:c3
          inet6 fe80::2e0:66ff:fefd:6cc3%bge0 prefixlen 64 scopeid 0x1
          inet 184.xx.xx.83 netmask 0xffffff00 broadcast 184.xx.xx.255
          inet 198.xx.xx.51 netmask 0xffffff00 broadcast 198.xx.xx.255
          inet 184.xx.xx.118 netmask 0xffffff00 broadcast 184.xx.xx.255
          inet 184.xx.xx.84 netmask 0xffffff00 broadcast 184.xx.xx.255
          nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
          media: Ethernet autoselect (1000baseT <full-duplex>)
          status: active

          first ping:
          --- 184.xx.xx.1 ping statistics ---
          16 packets transmitted, 0 packets received, 100.0% packet loss

          Second Ping:
          --- 8.8.8.8 ping statistics ---
          12 packets transmitted, 0 packets received, 100.0% packet loss

          As soon as I change the subnet to anything else and apply, it works again:

          PING 184.xx.xx.1 (184.xx.xx.1) from 184.xx.xx.84: 56 data bytes
          64 bytes from 184.174.168.1: icmp_seq=0 ttl=64 time=1.634 ms
          64 bytes from 184.174.168.1: icmp_seq=1 ttl=64 time=1.124 ms
          64 bytes from 184.174.168.1: icmp_seq=2 ttl=64 time=0.962 ms
          64 bytes from 184.174.168.1: icmp_seq=3 ttl=64 time=1.856 ms
          64 bytes from 184.174.168.1: icmp_seq=4 ttl=64 time=1.716 ms
          64 bytes from 184.174.168.1: icmp_seq=5 ttl=64 time=2.012 ms
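The ifconfig output above shows netmasks in hex (0xffffff00), which makes it easy to misjudge which addresses share a subnet with the gateway. A small Python sketch, using only the standard `ipaddress` module, converts the mask and checks membership. The function names are made up for illustration, and the addresses in the example are documentation placeholders standing in for the masked ones in the post.

```python
import ipaddress


def hex_netmask_to_prefix(hex_mask):
    """Convert an ifconfig-style hex netmask such as '0xffffff00' to a prefix length."""
    mask = ipaddress.IPv4Address(int(hex_mask, 16))
    # IPv4Network rejects non-contiguous masks, so this also validates the input.
    return ipaddress.IPv4Network(f"0.0.0.0/{mask}").prefixlen


def same_subnet(addr_a, addr_b, hex_mask):
    """Return True if addr_b falls inside addr_a's network for the given mask."""
    prefix = hex_netmask_to_prefix(hex_mask)
    network = ipaddress.ip_network(f"{addr_a}/{prefix}", strict=False)
    return ipaddress.ip_address(addr_b) in network


if __name__ == "__main__":
    print(hex_netmask_to_prefix("0xffffff00"))
    # Placeholder VIP and gateway on the same /24:
    print(same_subnet("192.0.2.84", "192.0.2.1", "0xffffff00"))
    # Placeholder VIP from a different block than the gateway:
    print(same_subnet("198.51.100.51", "192.0.2.1", "0xffffff00"))
```

A VIP that is not in the gateway's subnet can only work if the upstream either routes that block to an address on the interface subnet or treats it as a secondary subnet on the same segment.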

          • Derelict LAYER 8 Netgate

            You'll have to pcap and see what's going on on the WAN.

            Which one of those is the interface subnet and which are the VIPs?

            Is the 198 address routed to 184.xx.xx.83 or is it a "secondary" interface subnet?

            When it is broken is your WAN ARPing for 184.xx.xx.1 or is it sending the echo requests to a MAC address from its ARP table? Is there a response? Does the upstream ARP for 184.xx.xx.84? Is there a response? Is it honored or is it ignored?

            Seems like upstream ARP is screwed up to me.

            • TitanSystems

              Afraid I am getting a bit over my head, but will try to keep up.

              Interface subnet is 184.xx.xx.1/24
              Secondary, still on the same WAN interface, is 198.xx.xx.1/24

              I wish they had given me statics all in the same block. I have for testing, however, removed the 198 subnet including the gateway, but still end up with the same issue.

              I have created a gateway for both, and they always show up.

              Your next questions are where I am getting a bit lost on how to answer.

              Would the upstream ARP be my internet provider (EPB Telecom)? Do I need to call them and ask a specific question?

              • Derelict LAYER 8 Netgate

                Diagnostics > Packet Capture

                What we do here sort of depends on how much traffic is on that VIP.

                When it is broken, start this capture:

                Interface: WAN
                Address Family: IPv4
                Protocol: Any
                Host Address: 184.xx.xx.84
                Count: 1000000
                Start the capture

                Then run the above ping tests with -S 184.xx.xx.84. Let it fail for a bit. Then do whatever you do to fix it, note the time you did this, and run another ping test using -S 184.xx.xx.84.

                Then go back to Diagnostics > Packet Capture, stop the capture, and download it. If you've never used it before, this is a good time to download Wireshark and open the file with it. It'll do a lot of the interpretation of the protocols for you.

                I'll send you a chat with a drop box link to upload the capture file to.

                Do you really have the whole /24 or just some addresses on each of them?

                This is EPB in Tennessee?
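If Wireshark is not at hand, the ARP traffic in the downloaded capture can also be listed with a short script. The sketch below uses only the Python standard library and assumes the capture is a classic pcap file of untagged Ethernet frames (no pcapng, no VLAN tags). The function name `arp_packets` and the file name in the usage note are made up for illustration.

```python
import struct

ETHERTYPE_ARP = 0x0806
ARP_OPS = {1: "request", 2: "reply"}


def _mac(raw):
    return ":".join(f"{b:02x}" for b in raw)


def _ip(raw):
    return ".".join(str(b) for b in raw)


def arp_packets(pcap_bytes):
    """Yield a dict for every Ethernet/IPv4 ARP packet in a classic pcap file."""
    magic = pcap_bytes[:4]
    if magic in (b"\xd4\xc3\xb2\xa1", b"\x4d\x3c\xb2\xa1"):
        endian = "<"   # little-endian writer (microsecond or nanosecond variant)
    elif magic in (b"\xa1\xb2\xc3\xd4", b"\xa1\xb2\x3c\x4d"):
        endian = ">"   # big-endian writer
    else:
        raise ValueError("not a classic pcap file (pcapng is not handled here)")
    (linktype,) = struct.unpack_from(endian + "I", pcap_bytes, 20)
    if linktype != 1:
        raise ValueError("expected an Ethernet capture (link type 1)")

    offset = 24  # size of the pcap global header
    while offset + 16 <= len(pcap_bytes):
        _sec, _frac, incl_len, _orig = struct.unpack_from(
            endian + "IIII", pcap_bytes, offset)
        frame = pcap_bytes[offset + 16: offset + 16 + incl_len]
        offset += 16 + incl_len
        if len(frame) < 42:          # 14-byte Ethernet header + 28-byte ARP body
            continue
        (ethertype,) = struct.unpack_from("!H", frame, 12)
        if ethertype != ETHERTYPE_ARP:
            continue
        _htype, _ptype, hlen, plen, oper = struct.unpack_from("!HHBBH", frame, 14)
        if (hlen, plen) != (6, 4):   # only MAC/IPv4 ARP
            continue
        yield {
            "eth_src": _mac(frame[6:12]),
            "eth_dst": _mac(frame[0:6]),
            "op": ARP_OPS.get(oper, str(oper)),
            "sender_mac": _mac(frame[22:28]),
            "sender_ip": _ip(frame[28:32]),
            "target_mac": _mac(frame[32:38]),
            "target_ip": _ip(frame[38:42]),
        }
```

Usage would be along the lines of `for p in arp_packets(open("packetcapture.cap", "rb").read()): print(p)`. Comparing `eth_src` with `sender_mac` for each reply, and checking whether requests for the VIP get any reply at all, covers most of the ARP questions asked earlier in the thread. Note that the Host Address filter in the capture settings may exclude ARP frames that do not mention that address.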

                • TitanSystems

                  Will do. I do not have the entire /24, but that is how they hand them out.

                  Yes, EPB Chattanooga TN. We have a 1 gig fiber connection, hence the need for Netgate / pfSense.

                  I have just called their helpdesk, after your prompting about upstream ARP, and apparently last Wednesday they moved us to some new equipment early in the morning. I started seeing the issues on Thursday. At the moment they have escalated the issue, but the higher-level support will not be in until 9am Eastern. Since it is almost 8pm here, I will probably let it go until tomorrow and let you know what they say. If it is on their end, I will ask as much as possible (including what type of equipment is on their end) so I can add it to the forum in case someone else has a similar issue.

                  Thanks for your help thus far. I have been using pfSense since '07, when I shifted from m0n0wall and Tomato, and have made donations along the way. I have had the various bugs and such, as expected, but never needed to ask for help. Glad it may not be pfSense!

                  • TitanSystems

                    OK, so the problem is solved, sorta.

                    According to EPB, the issue is that pfSense (and Cisco ASA) does not reply to their equipment's ARP host packets for the VIPs. They have seen it with every pfSense / OPNsense / m0n0wall based package. For it to work going forward, they had to direct all traffic sent to the VIPs to the main IP. I don't know if this is something the devs can fix, but I kind of doubt it.

                    Thank you again

                    • Derelict LAYER 8 Netgate

                      They should be routing the packets destined for the new subnet to an address on the existing subnet. That is the proper way to do that.

                      I believe the fix was applied to the correct side there. Good to know EPB came through.

                      • magnus-maximus

                        I am working on the same issue with EPB and the CARP VIP ARP problem.

                        @Derelict, I saw a post saying that you moved to Chattanooga for EPB's 10Gbps fiber network, and your input is appreciated.

                        https://forum.netgate.com/topic/155028/so-i-moved-to-chattanooga-so-i-could-get-the-fastest-internet-in-america

                        We have been looking into this issue with EPB, as the firewall does respond to the VIP ARP request. Still, it seems like EPB's equipment reads the source MAC address from the Ethernet frame carrying the ARP reply, which is the router's physical interface address, instead of using the virtual MAC and virtual IP inside the ARP reply.

                        https://redmine.pfsense.org/issues/9476

                        The ARP reply associates the virtual IP address with a virtual MAC address, and it leaves the local network toward the ISP to register that pairing with the ISP's equipment. However, the Ethernet frame carrying the reply is sourced from the router's physical MAC address, and it looks like the ISP switch discards the frame because of this. The ISP's upstream equipment never sees the ARP reply, even though the reply is present on the wire leaving the client's network ahead of the ISP equipment (ONT, modem, etc.). So from the ISP's side, it looks like the router is not generating an ARP reply at all.

                        https://community.arubanetworks.com/community-home/digestviewer/viewthread?MID=14293

                        As previously stated, the most common solution is to use a single router as a gateway and then redundant firewalls behind the gateway with routed IPs. But you still have the issue of the single point of failure. Ideally, one will have two ISP connections with one to each firewall.

                        A workaround that temporarily registers a VIP, until the ISP's MAC/IP registration timeout (4 to 18 hours) removes the entry, is to install arping and generate an ARP frame that uses the virtual IP and the virtual MAC address as the source MAC.

                        arping -A -i eth1 -s 00:00:5e:00:01:01 -S xxx.xxx.xxx.xxx 255.255.255.255
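The MAC address in that command follows the 00:00:5e:00:01:XX pattern that CARP shares with IPv4 VRRP, where the last byte is the VHID of the VIP. A tiny Python sketch for deriving the address to pass as the source MAC is below. The function name is made up for illustration.

```python
def carp_virtual_mac(vhid):
    """Return the CARP virtual MAC address for a given VHID (1-255)."""
    if not 1 <= vhid <= 255:
        raise ValueError("VHID must be between 1 and 255")
    # Fixed 00:00:5e:00:01 prefix, VHID as the final byte in hex.
    return f"00:00:5e:00:01:{vhid:02x}"


if __name__ == "__main__":
    for vhid in (1, 2, 10):
        print(vhid, carp_virtual_mac(vhid))
```

So the 00:00:5e:00:01:01 in the command corresponds to VHID 1. A VIP configured with a different VHID would need the matching last byte.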

                        I have read many of your posts, @Derelict, where you discuss how some vendors' equipment deviates from the implementation in pfSense CARP, causing much frustration and difficulty for many installations.

                        As you mentioned in another post, the VRRP specification in the RFC is that the source-MAC address is the physical address of the router.

                        https://forum.netgate.com/topic/134297/cox-and-the-carp-mac/18

                        "Note that the source address of the Ethernet frame of this ARP response is the physical MAC address of the physical router. "
                        https://datatracker.ietf.org/doc/html/rfc5798#page-29

                        But some vendors insist that the source-mac address for VRRP should be the virtual mac address.

                        "The virtual MAC address should source ethernet frames from a VRRP router, but not all vendors may implement it this way. "
                        https://kb.juniper.net/InfoCenter/index?page=content&id=KB7109

                        I believe that view could have come from the 2004 VRRP RFC (RFC 3768), as this earlier RFC does not include the note regarding the source MAC address.
                        https://datatracker.ietf.org/doc/html/rfc3768#section-8.2

                        The question I have is, are there any known solutions for interoperability that do not require changing the behavior of pfSense, as doing so would be working backward to try to maintain a non-compliant VRRP implementation?

                        • Derelict LAYER 8 Netgate @magnus-maximus

                          @magnus-maximus I do not have business class service so I have not had to deal with EPB and CARP MAC addresses. (One DHCP address is all I have, which is incompatible with CARP/HA).

                          The ARP response to a CARP VIP is sourced from the interface MAC address and contains the CARP MAC in the IS AT response. This is so layer 3 comms go to the correct MAC address for the CARP VIP on the broadcast domain.

                          The CARP heartbeats source from the CARP MAC address. This is generally to instruct the layer 2 gear which port to use to contact that MAC address.

                          While CARP and VRRP are similar they are not the same and the VRRP RFCs are really meaningless here.

                          Usually the problems people have are with ISP gear that only permits one MAC address per port or some other "port security" type scheme.

                          I have not spent nearly as much time looking at VRRP as I have CARP.

                          Feel free to DM/chat the provisioning information from EPB for your circuit. Their business tiers are just too spendy for me here at the home office so I haven't seen one.
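To make the two MAC addresses in that description concrete, here is a rough Python sketch that builds the bytes of an ARP reply for a CARP VIP the way it is described above: the Ethernet frame is sourced from the interface MAC, while the "is at" answer inside the ARP payload carries the CARP MAC. All addresses are placeholders and the function names are made up for illustration. Nothing is sent on the wire.

```python
import socket
import struct


def mac_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_vip_arp_reply(iface_mac, carp_mac, vip, asker_mac, asker_ip):
    """Build a 42-byte Ethernet + ARP reply frame for a CARP VIP."""
    # Ethernet header: destination, source (interface MAC), EtherType ARP.
    ethernet = mac_bytes(asker_mac) + mac_bytes(iface_mac) + struct.pack("!H", 0x0806)
    # ARP body: Ethernet/IPv4, opcode 2 (reply); sender hardware address is the CARP MAC.
    arp = (
        struct.pack("!HHBBH", 1, 0x0800, 6, 4, 2)
        + mac_bytes(carp_mac) + socket.inet_aton(vip)
        + mac_bytes(asker_mac) + socket.inet_aton(asker_ip)
    )
    return ethernet + arp


if __name__ == "__main__":
    frame = build_vip_arp_reply(
        iface_mac="02:00:00:00:00:01",   # placeholder physical interface MAC
        carp_mac="00:00:5e:00:01:01",    # CARP MAC for VHID 1
        vip="192.0.2.84",                # placeholder VIP
        asker_mac="02:00:00:00:00:99",   # placeholder upstream router MAC
        asker_ip="192.0.2.1",            # placeholder gateway
    )
    print("frame source MAC :", frame[6:12].hex(":"))
    print("ARP 'is at' MAC  :", frame[22:28].hex(":"))
```

A device that builds its neighbor table from the ARP payload learns the CARP MAC. A device that looks only at the frame's source address, or that enforces one MAC per port, sees something different, which is where the interoperability problems discussed in this thread would come from.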

                          • Derelict LAYER 8 Netgate @magnus-maximus

                            @magnus-maximus said in VIP addresses stop working:

                            https://datatracker.ietf.org/doc/html/rfc3768#section-8.2

                            That seems to indicate what is included in the ARP IS AT response in the ARP protocol itself. It is silent about the source MAC address of the frame containing the ARP response.

                            8.2 pretty much describes what CARP does. The MAC address in the ARP response for a CARP VIP is always the virtual CARP MAC address.

                            What, exactly, is the ISP doing that is breaking things? Why are they not issuing another ARP request when they have traffic for an IP address after the ARP cache has expired?

                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.