CARP fails after a few hours
-
Hi all, first of all, sorry for my English. I will try to explain an annoying issue:
We have had 2 pfSense 2.0.2 boxes in failover that worked fine for months. I don't know why, but recently the network became very slow. After some time we discovered that CARP had failed and some interfaces were MASTER on both firewalls. We had to stop the BACKUP firewall to let the system work properly.
I have searched a lot for an answer, without success. The problem is that we are in production, so we can't leave this situation (multi-master) in place for long in order to investigate.
Yesterday we decided to upgrade to 2.0.3, hoping to solve the issue, but after a few hours the problem returned!
Does anyone have any suggestions to solve the problem?
p.s. some news: I'm watching the 2 pfsense working … just a few minutes ago:
pfs1 (master):
vip50: flags=49<UP,LOOPBACK,RUNNING> metric 0 mtu 1500
inet 172.17.0.20 netmask 0xffffff00
carp: MASTER vhid 50 advbase 1 advskew 0
pfs2 (backup):
vip50: flags=49<UP,LOOPBACK,RUNNING> metric 0 mtu 1500
inet 172.17.0.20 netmask 0xffffff00
carp: MASTER vhid 50 advbase 1 advskew 100
at the same time. It was BACKUP on pfs2 just a few minutes ago.
Now the system logs on pfs2:
Aug 13 11:37:26 pf2 check_reload_status: Linkup starting re2
Aug 13 11:37:26 pf2 kernel: re2: link state changed to UP
Aug 13 11:37:29 pf2 kernel: vip50: link state changed to UP
Aug 13 11:37:33 pf2 dhcpd: timeout waiting for failover peer dhcp0
Aug 13 11:37:33 pf2 dnsmasq[19205]: read /etc/hosts - 103 addresses
Aug 13 11:37:35 pf2 dhcpd: timeout waiting for failover peer dhcp1
Aug 13 11:37:35 pf2 dnsmasq[19205]: read /etc/hosts - 103 addresses
Aug 13 11:38:01 pf2 php: : Stopping haproxy on CARP backup.
Aug 13 11:38:08 pf2 check_reload_status: Linkup starting re2
Aug 13 11:38:08 pf2 kernel: re2: link state changed to DOWN
Aug 13 11:38:09 pf2 kernel: vip50: MASTER -> BACKUP (more frequent advertisement received)
Aug 13 11:38:09 pf2 kernel: vip50: link state changed to DOWN
Aug 13 11:38:11 pf2 check_reload_status: Linkup starting re2
Aug 13 11:38:11 pf2 kernel: re2: link state changed to UP
-
Hi,
How about checking the switch the pfSense boxes are connected to, and then the LAN card?
If the problem is still there after isolating those, try another motherboard. Hope this helps.
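The log you posted shows re2 going UP, DOWN and UP again in under a minute, so it may be worth counting how often that happens before swapping hardware. Here is a rough Python sketch that does this. It assumes log lines in the same format as the ones posted above; the function name `count_link_changes` is just made up for illustration, it is not a pfSense tool.

```python
import re
from collections import Counter

# Matches kernel lines in the format seen in the posted log, e.g.:
#   Aug 13 11:37:26 pf2 kernel: re2: link state changed to UP
LINK_RE = re.compile(r"kernel: (\S+): link state changed to (UP|DOWN)")


def count_link_changes(lines):
    """Count link state changes per (interface, new state) pair."""
    counts = Counter()
    for line in lines:
        match = LINK_RE.search(line)
        if match:
            iface, state = match.groups()
            counts[(iface, state)] += 1
    return counts


# A few lines copied from the log in the original post.
sample = [
    "Aug 13 11:37:26 pf2 kernel: re2: link state changed to UP",
    "Aug 13 11:38:08 pf2 kernel: re2: link state changed to DOWN",
    "Aug 13 11:38:09 pf2 kernel: vip50: link state changed to DOWN",
    "Aug 13 11:38:11 pf2 kernel: re2: link state changed to UP",
]

for (iface, state), n in sorted(count_link_changes(sample).items()):
    print(iface, state, n)
```

If one physical interface shows many more changes than the others, that would point at that port, cable or NIC as the first thing to isolate.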
Thanks