How to debug state sync issues?
-
Hi there,
I think I got my CARP/HA setup working. Status -> CARP (failover) shows the correct state IDs. But I think the firewall might be playing tricks on me. Assume the following setup:
client (192.168.11.43) -- (192.168.11.2 on vtnet2.511) pfsense1 (10.7.200.2 on vtnet2.192) -- target (10.7.200.12)
Then there is a second pfsense box with 192.168.11.3 & 10.7.200.3 respectively. Now when I have an open SSH session from client to target (SSH is just an example, this happens with every protocol) and then enter persistent CARP maintenance mode on pfsense1
the connection will hang. The firewall filter log will then show:
Jun 9 14:28:57 pfSense2 filterlog[94533]: 4,,,1000000103,vtnet2.511,match,block,in,4,0x48,,64,41324,0,DF,6,tcp,124,192.168.11.43,10.7.200.12,51568,22,72,PA,1544295413:1544295485,73422640,501,,nop;nop;TS
Jun 9 14:30:07 pfSense2 filterlog[94533]: 4,,,1000000103,vtnet2.192,match,block,in,4,0x48,,64,44921,0,DF,6,tcp,104,10.7.200.12,192.168.11.43,22,46976,52,PA,716626969:716627021,2743249057,501,,nop;nop;TS
Jun 9 14:30:08 pfSense2 filterlog[94533]: 4,,,1000000103,vtnet2.192,match,block,in,4,0x48,,64,44922,0,DF,6,tcp,104,10.7.200.12,192.168.11.43,22,46976,52,PA,716626969:716627021,2743249057,501,,nop;nop;TS
Jun 9 14:30:08 pfSense2 filterlog[94533]: 4,,,1000000103,vtnet2.192,match,block,in,4,0x48,,64,44923,0,DF,6,tcp,104,10.7.200.12,192.168.11.43,22,46976,52,PA,716626969:716627021,2743249057,501,,nop;nop;TS
Jun 9 14:30:08 pfSense2 filterlog[94533]: 4,,,1000000103,vtnet2.192,match,block,in,4,0x48,,64,44924,0,DF,6,tcp,104,10.7.200.12,192.168.11.43,22,46976,52,PA,716626969:716627021,2743249057,501,,nop;nop;TS
Jun 9 14:30:09 pfSense2 filterlog[94533]: 4,,,1000000103,vtnet2.192,match,block,in,4,0x48,,64,44925,0,DF,6,tcp,104,10.7.200.12,192.168.11.43,22,46976,52,PA,716626969:716627021,2743249057,501,,nop;nop;TS
Jun 9 14:30:11 pfSense2 filterlog[94533]: 4,,,1000000103,vtnet2.192,match,block,in,4,0x48,,64,44926,0,DF,6,tcp,104,10.7.200.12,192.168.11.43,22,46976,52,PA,716626969:716627021,2743249057,501,,nop;nop;TS
which suggests that either the firewall isn't able to associate those packets with a state or the state didn't get synced properly. How can I debug this further?
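For anyone reading along, the raw filterlog entries are easier to reason about once split into named fields. Below is a minimal sketch in Python that does this for IPv4/TCP entries only. The field names are assumptions based on the pfSense raw filter log layout as I understand it, and the helper names parse_filterlog and summarize are made up for illustration. Read this way, the entries are mid-stream PSH+ACK packets blocked inbound by tracker 1000000103, which appears to be the built-in default deny rule for IPv4 (worth verifying against /tmp/rules.debug on the box).

```python
# Sketch only: field names are assumptions based on the documented
# pfSense raw filter log layout for IPv4/TCP entries.
COMMON = ["rule", "subrule", "anchor", "tracker", "iface",
          "reason", "action", "direction", "ipversion"]
IPV4 = ["tos", "ecn", "ttl", "id", "offset", "flags",
        "proto_id", "proto", "length", "src", "dst"]
TCP = ["sport", "dport", "datalen", "tcp_flags", "seq",
       "ack", "window", "urg", "options"]


def parse_filterlog(line):
    """Split one syslog filterlog line (IPv4/TCP only) into a dict."""
    # Everything after "filterlog[pid]: " is the comma-separated payload.
    _, _, payload = line.partition("]: ")
    values = payload.strip().split(",")
    names = COMMON + IPV4 + TCP
    if len(values) < len(names):
        raise ValueError("not enough fields for an IPv4/TCP entry")
    entry = dict(zip(names, values))
    if entry["ipversion"] != "4" or entry["proto"] != "tcp":
        raise ValueError("only IPv4/TCP is handled in this sketch")
    return entry


def summarize(entry):
    """One-line summary of a parsed entry."""
    return (f"{entry['action']} {entry['direction']} on {entry['iface']}: "
            f"{entry['src']}:{entry['sport']} -> "
            f"{entry['dst']}:{entry['dport']} "
            f"flags={entry['tcp_flags']} tracker={entry['tracker']}")


if __name__ == "__main__":
    sample = ("Jun 9 14:28:57 pfSense2 filterlog[94533]: 4,,,1000000103,"
              "vtnet2.511,match,block,in,4,0x48,,64,41324,0,DF,6,tcp,124,"
              "192.168.11.43,10.7.200.12,51568,22,72,PA,"
              "1544295413:1544295485,73422640,501,,nop;nop;TS")
    print(summarize(parse_filterlog(sample)))
```

The sketch deliberately refuses anything that is not IPv4/TCP, since other protocols carry a different set of trailing fields.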
Thanks,
Florian
-
Also: Should I see the synced states via Diagnostics -> States?
-
Ok, so the states are synced properly:
pfctl -s states | grep '10.7.200.12:22'
all tcp 10.7.200.12:22 <- 10.7.22.120:51714 ESTABLISHED:ESTABLISHED
all tcp 10.7.22.120:51714 -> 10.7.200.12:22 ESTABLISHED:ESTABLISHED
This command shows the connection on both hosts. So it seems like the mighty pf might have a problem with me somehow, somewhere.
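Comparing the two state tables by eye gets tedious quickly. A rough sketch of automating the check with Python follows, assuming the output of pfctl -s states from each node has been saved to text. The function names are hypothetical. The interface column is ignored on purpose, since it can differ between the nodes, and lines with extra NAT columns are skipped to keep the sketch simple.

```python
def parse_states(text):
    """Collect (proto, left, direction, right) keys from `pfctl -s states` output.

    Only the plain six-column form is handled; the interface and the
    state name (e.g. ESTABLISHED:ESTABLISHED) are left out of the key.
    """
    keys = set()
    for line in text.splitlines():
        parts = line.split()
        # Expected columns: iface proto addr direction addr state
        if len(parts) != 6 or parts[3] not in ("->", "<-"):
            continue
        _iface, proto, left, direction, right, _state = parts
        keys.add((proto, left, direction, right))
    return keys


def missing_on_backup(primary_text, backup_text):
    """States present on the primary but absent on the backup."""
    return sorted(parse_states(primary_text) - parse_states(backup_text))
```

An empty result from missing_on_backup means every plain state seen on the primary also exists on the backup, which would point away from pfsync transport problems and towards how the backup matches packets against those states.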
-
Digging deeper, I have made the following observation: I did not see the state on the second firewall in the GUI because I tried to filter on the interface, but the backup firewall shows all as the interface instead. Is that correct? Shouldn't pfsync sync the states to the correct iface?
Also, investigating the state with pfctl -s states -v shows that the backup firewall is missing the rule ID that the first firewall has:
[1412868241 + 4294639872] wscale 7 [2149509812 + 16711936] wscale 7
age 00:01:16, expires in 23:58:44, 3:2 pkts, 164:133 bytes, rule 203
Should it replicate the rule number as well?
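To check many states at once for a missing rule number, the verbose output can be scanned with a few lines of Python. This is only a sketch: it assumes each state in pfctl -s states -v starts with a header line in the usual six-column form, followed by detail lines, one of which begins with "age" and may end in "rule N". The function name rules_by_state is made up for illustration.

```python
import re

RULE_RE = re.compile(r"\brule (\d+)")


def rules_by_state(verbose_text):
    """Map each state header to its rule number, or None if none is printed."""
    result = {}
    current = None
    for line in verbose_text.splitlines():
        parts = line.split()
        if len(parts) >= 6 and parts[3] in ("->", "<-"):
            # Header line: key on proto, both addresses and the direction,
            # skipping the interface column.
            current = " ".join(parts[1:5])
            result[current] = None
        elif current is not None and parts and parts[0] == "age":
            match = RULE_RE.search(line)
            if match:
                result[current] = int(match.group(1))
    return result
```

Running this against the output from both nodes and comparing the two dictionaries shows which states carry a rule number on one node but not on the other.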
-
@apollo13 said in How to debug state sync issues?:
Also: Should I see the synced states via Diagnostics -> States?
yes.
What does Status/CARP -> State Synchronization Status show? It should match.
What version pfSense are you on? Hardware interface names had to match before 22.01/2.6.
-
Hi Steve,
@SteveITS said in How to debug state sync issues?:
@apollo13 said in How to debug state sync issues?:
Also: Should I see the synced states via Diagnostics -> States?
yes.
I see them now but the interface shows as "all" and not the actual interface from the other firewall. But I guess that is okay.
What does Status/CARP -> State Synchronization Status show? It should match.
Same for both nodes, I just recently switched them to 1 & 2:
1 (This node) 2 77a0485b fd899c14
What version pfSense are you on? Hardware interface names had to match before 22.01/2.6.
23.05
-
This seems to be the same issue as https://redmine.pfsense.org/issues/13569 -- I'd love to debug this further but I am not sure what else to look into.
-