upgrade 2.4.3 to 2.4.4 changes bird behavior during filter reload
-
Hi all,
after upgrading my box from 2.4.3 to 2.4.4, every change to filter or NAT rules
(after applying) breaks my iBGP sessions.
The BGP sessions are using BFD.
Normal state:
ATVIED6BNG01 BGP master up 16:42:19 Established
ATVIED6BNG02 BGP master up 17:16:00 Established
ATVIED6INFRAH1_NET_A BGP master up 16:43:16 Established
ATVIED6INFRAH1_NET_B BGP master up 16:43:08 Established
ATVIED6INFRAH2_NET_A BGP master up 16:43:10 Established
ATVIED6INFRAH2_NET_B BGP master up 16:43:04 Established
ATVIED6INFRAH3_NET_A BGP master up 16:43:13 Established
ATVIED6INFRAH3_NET_B BGP master up 16:43:08 Established
ATVIED6INFRAH4_NET_A BGP master up 16:43:05 Established
ATVIED6INFRAH4_NET_B BGP master up 16:43:05 Established
After changing rules:
ATVIED6INFRAH1_NET_A BGP master up 17:17:47 Established
ATVIED6INFRAH1_NET_B BGP master start 17:18:08 Idle Error: BFD session down
ATVIED6INFRAH2_NET_A BGP master start 17:18:08 Passive Received: Cease
ATVIED6INFRAH2_NET_B BGP master start 17:18:08 Passive Received: Cease
ATVIED6INFRAH3_NET_A BGP master start 17:18:08 Passive Received: Cease
ATVIED6INFRAH3_NET_B BGP master start 17:18:08 Idle Received: Cease
ATVIED6INFRAH4_NET_A BGP master start 17:18:08 Idle Received: Cease
ATVIED6INFRAH4_NET_B BGP master start 17:18:08 Passive Received: Cease
Only BGP sessions on the "LAN" side are affected.
Peers on the WAN side (ATVIED6BNG01, ATVIED6BNG02) are Cisco.
Peers on the LAN side (ATVIED6INFRAH1-4) are other BIRD instances.
I have reverted the bird package to 1.6.3 to match the version under 2.4.3, but nothing changed.
It seems something has changed in 2.4.4 regarding the filter reload.
Any ideas?
BR
Stefan
-
Bird is not a supported package on pfSense. IIRC the team recommends the "frr" package for BGP.
-
Yes, I know, but it's the only option to have BGP with BFD support.
It was working all the time - only this update broke it.
I will have a look at the FRR package, but I don't think BFD support is included.
Thanks
-
Ok, so to be clear you installed the FreeBSD bird package? Did it pull in any other dependencies?
One might have been replaced with our version at upgrade.
Steve
-
I have installed it via shell
PACKAGE-INFO
[2.4.4-RELEASE][root@ATVIED6INFRAFW2.as29081.net]/root: pkg info bird
bird-1.6.4
Name : bird
Version : 1.6.4
Installed on : Mon Nov 12 11:33:34 2018 CET
Origin : net/bird
Architecture : FreeBSD:11:amd64
Prefix : /usr/local
Categories : net
Licenses : GPLv2
Maintainer : olivier@FreeBSD.org
WWW : http://bird.network.cz/
Comment : Dynamic IP routing daemon (IPv4 version)
Shared Libs required:
libreadline.so.7
Annotations :
FreeBSD_version: 1102000
flavor : ipv4
repo_type : binary
repository : pfSense
Flat size : 554KiB
Description :
The BIRD project aims to develop a fully functional dynamic IP routing daemon.
- Both IPv4 and IPv6
- Multiple routing tables
- BGP
- RIP
- OSPF
- Static routes
- Inter-table protocol
- Command-line interface
- Soft reconfiguration
- Powerful language for route filtering
WWW: http://bird.network.cz/
bird 1.6.3 was only for testing: I downloaded the package provided by FreeBSD, extracted it, and used only
the binary.
Should I install bird 1.6.3 from the pfSense repo package? (How can I do this?)
-
Hmm, well as far as I know it's completely untested. I didn't even realise it was in our repo until I checked it.
Do you see blocked packets from the peers? Do you see any packets from the peers? Or being sent to the peers?
Steve
-
I will try to do some deeper debugging (tcpdump).
Maybe something has changed (parameters) in the way pfctl is called for the filter reload.
At the moment it seems all TCP sessions connected to the pfSense box stall when the
filter is reloaded (for example, a remote SSH session timed out).
-
Some more information:
every "save" action in the GUI causes a disruption - this could also be the reason for the delayed response
from the GUI.
Syslog shows this after changing the log settings:
Nov 15 10:38:50 sshd 17503 Fssh_packet_write_poll: Connection from user root 62.212.164.58 port 24480: Permission denied
Nov 15 10:38:47 syslogd kernel boot file is /boot/kernel/kernel
Nov 15 10:38:47 syslogd exiting on signal 15
Nov 15 10:38:45 root /etc/rc.d/hostid: WARNING: hostid: unable to figure out a UUID from DMI data, generating a new one
Nov 15 10:38:45 check_reload_status Syncing firewall
In the meantime I have also modified the BGP BFD timers, because tcpdump and the bird logs indicate
that the BGP sessions go down due to a failed BFD session.
After this, birdc gives me the following info after a filter reload:
birdc
ATVIED6INFRAH1_NET_A BGP master start 10:31:15 Passive Socket: Permission denied
ATVIED6INFRAH1_NET_B BGP master start 10:31:15 Passive Socket: Permission denied
ATVIED6INFRAH2_NET_A BGP master start 10:31:15 Passive Socket: Permission denied
ATVIED6INFRAH2_NET_B BGP master start 10:31:14 Passive Socket: Permission denied
ATVIED6INFRAH3_NET_A BGP master start 10:31:15 Passive Socket: Permission denied
ATVIED6INFRAH3_NET_B BGP master start 10:31:15 Passive Error: BFD session down
ATVIED6INFRAH4_NET_A BGP master start 10:31:15 Passive Socket: Permission denied
ATVIED6INFRAH4_NET_B BGP master start 10:31:15 Passive Socket: Permission denied
It seems sockets get cleared during the reload - this would also match the ssh error we can see in the syslog.
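To compare listings like the one above across reloads, a small Python sketch can tally the last-error text per session. It assumes the whitespace-separated layout seen in the listings in this thread (name, protocol, table, state, since, then the BGP state and an optional error text). The function names are hypothetical, and the sample lines are copied from the outputs posted above.

```python
from collections import Counter


def parse_protocol_line(line):
    """Split one status line into fields.

    Assumed layout (taken from the listings in this thread):
    name proto table state since bgp_state [last error text]
    Returns None for lines that do not have at least six fields.
    """
    parts = line.split(None, 5)
    if len(parts) < 6:
        return None
    name, proto, table, state, since, info = parts
    # The first word of the info column is the BGP state; the rest,
    # if present, is the last error text.
    bgp_state, _, last_error = info.partition(" ")
    return {
        "name": name,
        "proto": proto,
        "table": table,
        "state": state,
        "since": since,
        "bgp_state": bgp_state,
        "last_error": last_error.strip(),
    }


def summarize_errors(text):
    """Count last-error texts of all sessions that are not Established."""
    counts = Counter()
    for line in text.splitlines():
        row = parse_protocol_line(line.strip())
        if row is None or row["bgp_state"] == "Established":
            continue
        counts[row["last_error"] or "(no error text)"] += 1
    return counts


# Sample lines copied from the outputs in this thread.
SAMPLE = """\
ATVIED6BNG01 BGP master up 16:42:19 Established
ATVIED6INFRAH1_NET_A BGP master start 10:31:15 Passive Socket: Permission denied
ATVIED6INFRAH3_NET_B BGP master start 10:31:15 Passive Error: BFD session down
"""

if __name__ == "__main__":
    for reason, count in summarize_errors(SAMPLE).items():
        print(count, reason)
```

Running it against a listing saved before and after a filter reload shows at a glance whether the sessions fail with "Socket: Permission denied", "Received: Cease" or "BFD session down".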
-
Hmm, that's curious. Just to confirm if you are ssh'd into the firewall and you reload the filter from Status > Filter Reload in the GUI your ssh session is interrupted? Disconnected?
That should not happen. I've never seen it on any of my test boxes.
Do you lose the firewall state when that happens? Do you have any of the advanced state killing options selected? Like 'Flush all states when a gateway goes down' for example.
Steve
-
Yes, I ssh'd into the box, and after "Status->Filter Reload" or any other action (for example, changing a
rule) the terminal became unresponsive, and after 1 minute I got the message "Session closed".
BUT
I have done some additional tests just now, and it is working at the moment.
I don't know what's going on.
I will watch the situation for a few days ....
Many thanks for your help, Steve
-
And to answer the rest of your question: no advanced state killing options are selected.