Netgate Discussion Forum

    FRR and BGP disconnect

      Blade1024

      Hi,

      We are running 25.03-BETA and have hit an issue where FRR and the BGP process become disconnected at the control level. It manifests as BGP being stuck in the Active state from the GUI and FRR point of view (even vtysh agrees), while a bgpd process keeps the connection alive in the background. No routes are installed in the routing table, yet our prefixes are still being announced, as confirmed by our peer.

      Nothing is in the routing table, and the BGP neighbor is in the Active state, so no routes are expected:

      10.206.238.225 4 65228 0 2309 0 0 0 never Active 0 Odido BGP via

      So far this looks consistent, but the TCP session is in fact already established:

      >>> tcpdump -i ipsec2
      07:23:11.642870 IP 10.206.238.225.bgp > 10.206.238.226.49408: Flags [P.], seq 2440502671:2440502690, ack 2016892785, win 11, options [nop,nop,md5 shared secret not supplied with -M, can't check - 2ed14f304978416f8007afca427f988d], length 19: BGP
      07:23:11.642939 IP 10.206.238.226.49408 > 10.206.238.225.bgp: Flags [.], ack 19, win 131, options [nop,nop,md5 shared secret not supplied with -M, can't check - 078b7005ba698e2b636e70eb2c37e234], length 0
      >>>
      USER     COMMAND    PID   FD  PROTO  LOCAL ADDRESS         FOREIGN ADDRESS      
      ...
      frr      bgpd       76872 22  tcp4   10.206.238.226:49408  10.206.238.225:179
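
A minimal sketch of how this mismatch could be checked from a script, assuming the `show bgp summary` row layout and the `sockstat -4` column layout shown in this post (IPv4 peers, one row per neighbor). The function names `neighbor_state` and `has_bgp_socket` are hypothetical helpers, not part of FRR or pfSense:

```shell
# Print the State/PfxRcd column (10th field) for one neighbor.
# Reads `show bgp summary` output on stdin; $1 is the peer address.
neighbor_state() {
    awk -v peer="$1" '$1 == peer { print $10 }'
}

# Succeed if a bgpd process owns a TCP socket to <peer>:179.
# Reads `sockstat -4` output on stdin; $1 is the peer address.
has_bgp_socket() {
    awk -v remote="$1:179" '
        $2 == "bgpd" && $7 == remote { found = 1 }
        END { exit !found }
    '
}

# Example use on the firewall (assumes vtysh and sockstat are in PATH):
#   state=$(vtysh -c 'show bgp summary' | neighbor_state 10.206.238.225)
#   if [ "$state" = "Active" ] && sockstat -4 | has_bgp_socket 10.206.238.225; then
#       echo "FRR reports Active, but a bgpd socket to the peer exists"
#   fi
```

An established neighbor shows a prefix count in that column instead of a state name, so any non-numeric value combined with a live socket would point at the situation described here.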
      
      Restarting FRR doesn't help:
      /usr/local/etc/rc.d/frr restart
      Stopping watchfrr.
      Waiting for PIDS: 2357.
      Starting watchfrr.
      [58970|mgmtd] sending configuration
      Waiting for children to finish applying config...
      [59017|zebra] sending configuration
      [59963|bgpd] sending configuration
      [61500|staticd] sending configuration
      [61157|watchfrr] sending configuration
      [59017|zebra] done
      [58970|mgmtd] done
      [61157|watchfrr] done
      [61500|staticd] done
      [59963|bgpd] done
      

      The new bgpd PID (59963) is different from 76872, and the old process is still running and still owns the socket:

      >>>> ps -ax | grep 76872
      76872  -  Ss        0:02.09 /usr/local/sbin/bgpd -A 127.0.0.1 -F traditional -d
      62041  0  S+        0:00.00 grep 76872
      
      >>>> sockstat -4
      USER     COMMAND    PID   FD  PROTO  LOCAL ADDRESS         FOREIGN ADDRESS      
      ...
      frr      bgpd       76872 22  tcp4   10.206.238.226:49408  10.206.238.225:179
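
The PID comparison above could be scripted roughly as follows. This is only a sketch, assuming the `sockstat -4` column layout shown in this post; `bgpd_socket_pids` and `stale_bgpd_pids` are hypothetical helper names:

```shell
# Print the PIDs of bgpd processes that own a TCP socket to remote port 179.
# Reads `sockstat -4` output on stdin.
bgpd_socket_pids() {
    awk '$2 == "bgpd" && $5 == "tcp4" && $7 ~ /:179$/ { print $3 }' | sort -u
}

# Print socket-owning bgpd PIDs that differ from the expected PID ($1),
# i.e. processes left over from before the restart.
stale_bgpd_pids() {
    bgpd_socket_pids | awk -v want="$1" '$1 != want'
}

# Example use on the firewall, taking the PID printed by the restart
# as the expected one (59963 in the output above):
#   sockstat -4 | stale_bgpd_pids 59963
```

Any PID it prints belongs to a bgpd that the freshly started FRR instance does not know about.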
      

      After killing the old process and restarting FRR, checking the traffic and routes:

      >>> kill -KILL 76872
      >>> ps -ax | grep 76872
      21650  0  S+        0:00.00 grep 76872
      
      >>> /usr/local/etc/rc.d/frr restart
      Stopping watchfrr.
      Waiting for PIDS: 88383.
      Starting watchfrr.
      [27380|mgmtd] sending configuration
      [27540|zebra] sending configuration
      [28677|bgpd] sending configuration
      Waiting for children to finish applying config...
      [27380|mgmtd] done
      [30560|staticd] sending configuration
      [30405|watchfrr] sending configuration
      [27540|zebra] done
      [28677|bgpd] done
      [30560|staticd] done
      [30405|watchfrr] done
      >>> ps -ax | grep bgp
      11708  -  Ss        0:05.87 /usr/local/sbin/bgpd -A 127.0.0.1 -F traditional -d
      31648  0  S+        0:00.00 grep bgp
      >>> tcpdump -i ipsec2
      07:31:08.709787 IP 10.206.238.225.bgp > 10.206.238.226.26294: Flags [P.], seq 1180140056:1180140117, ack 3799507337, win 11, options [nop,nop,md5 shared secret not supplied with -M, can't check - d6b2c0bac2ebb8cf1058d365224d4c5c], length 61: BGP
      07:31:08.709850 IP 10.206.238.226.26294 > 10.206.238.225.bgp: Flags [.], ack 61, win 131, options [nop,nop,md5 shared secret not supplied with -M, can't check - 9dec3ac243f71d5f90e285627b2cd9e5], length 0
      >>> show bgp summary 
      Neighbor        V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd   PfxSnt Desc
      10.206.238.225  4      65228         3       494        5    0    0 00:00:48            4        5 Odido BGP via
      >>> show bgp ipv4 unicast 
          Network          Next Hop            Metric LocPrf Weight Path
       *> 10.204.50.4/32   10.206.238.225                         0 65228 ?
       *> 10.204.50.12/32  10.206.238.225                         0 65228 ?
       *> 10.204.52.4/32   10.206.238.225                         0 65228 ?
       *> 10.206.238.192/27
                          0.0.0.0                  0         32768 ?
       *> 172.27.0.0/16    10.206.238.225                         0 65228 ?
      
      >>> netstat -rn
      ...
      B>* 10.204.50.4/32 [20/0] via 10.206.238.225, ipsec2, weight 1, 03:42:44
      B>* 10.204.50.12/32 [20/0] via 10.206.238.225, ipsec2, weight 1, 03:42:44
      B>* 10.204.52.4/32 [20/0] via 10.206.238.225, ipsec2, weight 1, 03:42:44
      B>* 172.27.0.0/16 [20/0] via 10.206.238.225, ipsec2, weight 1, 03:42:44
      

      Has anyone seen anything like this? We could have lived with BGP being down and no routes, but the stale process keeps announcing, so the destination firewall expects the traffic on the wrong interface.

      Regards

      Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.