HA XMLRPC error
-
Are you seeing SYN blocks? If you do not see a SYN in the blocks, then you're only seeing out-of-state traffic.
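A quick way to apply that check to a list of blocked entries, as a rough sketch. It assumes the TCP flags are abbreviated with the usual letters (S, A, F, R, P), e.g. an entry shown as TCP:FA; the sample list is made up for illustration and is not taken from the poster's log.

```python
def has_syn(flags: str) -> bool:
    """True if the flags mark a fresh connection attempt (SYN without ACK)."""
    return "S" in flags and "A" not in flags


def classify_blocks(flag_list):
    """Label each blocked TCP packet as a new connection attempt or out-of-state traffic."""
    return [
        (flags, "new connection (SYN)" if has_syn(flags) else "out of state")
        for flags in flag_list
    ]


# Hypothetical flags from blocked log entries (illustration only).
blocked = ["FA", "PA", "RA", "R"]
for flags, verdict in classify_blocks(blocked):
    print(f"TCP:{flags:<3} -> {verdict}")
```

If none of the blocked entries comes back as a SYN, the firewall is not refusing new connections; it is dropping packets that belong to connections it no longer has a state for.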
-
@bolvar said in HA XMLRPC error:
@johnpoz
But why :D . I made a rule to communicate from the secondary to the primary, but it didn't solve the problem. Could the XMLRPC error come from this?
Because that indicates asymmetric traffic. Like you're forcing it out one interface instead of the one it should or would be taking. Then you end up with traffic that comes from/to an interface where it was not expected. You'd have to show your rules, gateways, routing etc. to see where that may come from. And why do you have traffic from the firewall on port 80? Did you disable HTTPS for the WebUI?
-
That is the sync traffic, so it makes no sense that there would be any issues. But as you stated, we have no actual details of how he has everything configured or connected.
https://docs.netgate.com/pfsense/en/latest/highavailability/configuring-high-availability.html
-
Aye, if there's some "route everything via VPN" or other such tidbits in play, hell knows what hoops the traffic will hop through ;)
-
It would seem odd for it to send traffic out a VPN, since it's a directly attached interface. Maybe he just needs to restart the sync process? I don't have a lot of HA experience, but it would seem almost impossible to mess up, to be honest, since it is supposed to be a dedicated connection via a wire between the 2 boxes.
But all of the traffic looks to be out of state, since there are no SYNs blocked. And it looks like something wants to close the connection, since there are even FIN and RST in there.
Are those outbound blocks? It sure looks like it, I just noticed that - see the little black arrows?
That makes ZERO sense! So some floating rule? I'm not sure how you could even get a default deny rule outbound on an interface.
-
Aye, we have quite some cluster customers, but never seen that on a Sync interface. The original error is one we have on 1-2 locations, too, though, but nothing with that kind of OOS traffic
-
What doesn't make sense is that I think I have found the problem.
Under System/Routing/Gateways, if I disable Gateway Monitoring, the problem is gone... What the f**k.
Does the sync check the interface states? I have only 1 public IP, so on my WAN interface I have a private IP.
Could the problem come from this?
-
I have no idea how you would be getting outbound blocking. It does not do that out of the box - you must have put in a floating rule.
And you probably had something set up to flush states on gateway loss? Not sure why you would set up a gateway on your sync interface?? That makes no sense. What would you even set it to?
-
There is no gateway on the sync interface.
I have no floating rules set up.
So the big question is how this HA really works, because now I feel the outdated Netgate video and setup guide are crap.
-
Very odd to be sure... I would hope someone with more HA experience could chime in. While I have set up HA for play in VMs, I never ran into such blocks.
Is this the video you're talking about?
https://www.slideshare.net/NetgateUSA/high-availability-on-pfsense-24-pfsense-hangout-march-2017
-
Nothing magical about XMLRPC sync. It's just a TCP/HTTPS connection to the webgui port on the secondary.
I suppose on the sending node that could happen if something has killed the state.
Do you have state killing on gateway failure enabled in System > Advanced, Miscellaneous?
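Since the sync is described here as a plain TCP/HTTPS connection to the secondary's GUI port, a basic reachability test from another host on the sync network can rule out the obvious. This is only a sketch: the address 192.0.2.2 and port 443 are placeholders (assumptions, not values from this thread), and it only tests the TCP handshake, not XMLRPC itself.

```python
import socket


def can_open_tcp(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP handshake to host:port completes within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


if __name__ == "__main__":
    # Hypothetical sync-interface address of the secondary and its GUI port.
    SECONDARY = "192.0.2.2"
    GUI_PORT = 443
    print("secondary GUI reachable:", can_open_tcp(SECONDARY, GUI_PORT))
```

A successful handshake only shows that a new state can be created; it says nothing about a connection whose state was killed mid-transfer.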
-
But how/why would it be an outbound block?
-
Sorry I updated ^
-
But I still don't get how it could be an outbound block - that is what the little black arrow thing means. I have to think there are some settings in place that are not default.
I don't run an HA setup, nor do I have that setting set. But it seems odd that there could be outbound blocking even on a state kill?
-
If the primary has an open https connection to the secondary and is trying to send the config changes and something kills the state out from under it, it will continue to try to send the data until it times out and quits.
It will be out-of-state like anything else in that case.
# default deny rules
block in log inet all tracker 1000000103 label "Default deny rule IPv4"
block out log inet all tracker 1000000104 label "Default deny rule IPv4"
1000000104 is the outbound default deny so that would log like that.
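To make the sequence concrete, here is a toy model of it. It is a deliberately simplified sketch and not how pf is implemented: the `ToyFirewall` class, the addresses, and the state key without a source port are all assumptions for illustration. Only the tracker ID comes from the rules quoted above.

```python
from dataclasses import dataclass, field

DEFAULT_DENY_OUT = 1000000104  # tracker ID of the outbound default deny quoted above


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    dport: int
    flags: str  # e.g. "S" for SYN, "PA" for PSH+ACK


@dataclass
class ToyFirewall:
    """Toy stateful filter: only a SYN may create a state (simplification)."""
    states: set = field(default_factory=set)

    def outbound(self, pkt: Packet) -> str:
        key = (pkt.src, pkt.dst, pkt.dport)
        if key in self.states:
            return "pass (existing state)"
        if pkt.flags == "S":
            # A pass rule matches the fresh SYN, so a new state is created.
            self.states.add(key)
            return "pass (new state)"
        # No state and not a SYN: only the default deny is left to match.
        return f"block out (tracker {DEFAULT_DENY_OUT})"

    def kill_states(self) -> None:
        """Stand-in for whatever flushed the state table."""
        self.states.clear()


fw = ToyFirewall()
print(fw.outbound(Packet("10.0.0.1", "10.0.0.2", 443, "S")))   # handshake starts
print(fw.outbound(Packet("10.0.0.1", "10.0.0.2", 443, "PA")))  # data on the open connection
fw.kill_states()                                               # state is killed underneath it
print(fw.outbound(Packet("10.0.0.1", "10.0.0.2", 443, "PA")))  # same connection keeps sending
print(fw.outbound(Packet("10.0.0.1", "10.0.0.2", 443, "S")))   # a new attempt works again
```

The third packet is the interesting one: the sender is the firewall itself, the data belongs to a connection that no longer has a state, and so it is logged as an outbound block.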
-
Yeah I get that it is out of state, but it would be logged as an outbound block?? This is what is confusing me..
-
Sorry. updated again lol
-
Ah, that is why it's blocked as outbound, then.
block out log inet all tracker 1000000104 label "Default deny rule IPv4"
You normally never see outbound blocks. But if it's pfSense itself doing the talking and the state goes away, then that rule would block it since the state is missing, until the process on pfSense creates a new state by sending a SYN.
-
Yeah. Everything that was initially set up by the TCP handshake starting with a SYN going out has been blown away so...
-
Ok that makes sense then - thanks. Even though there is a rule that allows pfsense to talk out, it still needs a valid state.
-
So if they are seeing this block, how do they restart the sync process so that a new state is created? I really need to play more with the HA stuff. Time to fire up some VMs and play with the HA setup ;) My understanding of its inner workings is very lacking - I just have not had a need to play with it.
-
@johnpoz It will kick off another sync when another change is made or there's a button in Status > Filter Reload (of all places).
-
hehe - that image just got better, I was thinking man derelict must be blind if has fonts/resolution set like that ;) Now it looks normal.. Before it was HUGE ;)
-
@johnpoz It plays pretty nice in VMs. If you decide to lab it and have any questions just shout. Nothing special needed in proxmox.
-
But if the sync is having issues talking to the other side, wouldn't it auto send a new syn?
-
@johnpoz I made a folder action that automatically downsizes screencaps from the 4K when they are taken. I have gotten lazy with Cmd-Option-Shift-4 (instead of Cmd-Shift-4) because it automatically sends the capture to the clipboard instead of the disk.
-
@johnpoz said in HA XMLRPC error:
But if the sync is having issues talking to the other side, wouldn't it auto send a new syn?
A config sync is a one-time/as-needed event. If the connection fails it isn't retried - or maybe it is I don't know. Not really sure of why it is coded that way (if it is) and wouldn't understand it if I looked in there.
But that would not change those logged blocks or the logged XMLRPC message. It would just try again and succeed.
-
So you running 4k on your monitor? You Suck! ;) you have all the good toys!
-
@johnpoz 5K iMac with a 4K on each side
-
Yeah you suck! ;) heheheh.. I finally updated main tv to 4k.. But upgrading my pc to do 4k with new monitor is cost prohibitive currently.. Damn budget committee (wife) can be a problem ;)
-
Hi
No changes were made; everything is on default values.
The problem is gone now that I have disabled the gateway monitoring. Now it's a little bit like pfSense has a soul :D
-
@Derelict said in HA XMLRPC error:
@johnpoz It will kick off another sync when another change is made or there's a button in Status > Filter Reload (of all places).
DAMN! Never even saw that/realized it is there. Important tidbit to add to my slides! :)
mutters to self: so many HA setups and never even saw that button... might be going blind in my old age...
-
Status (CARP) seems like a better place for that. There must be...reasons.
Yeah. It's there because it gives progress feedback using the same mechanism as a filter reload.
-
@Derelict said in HA XMLRPC error:
Status (CARP) seems like a better place for that. There must be...reasons.
I'm sure ;) But ... what about bringing it to both places? I must say the filter reload screen is one of the last (and least) ones I was ever using and would have never searched for a HA related sync button there.
-
They probably wouldn't want to duplicate that command output display code on another page but a link to the filter reload page there might be possible.
-
Problem "solved".
I had monitoring on both my WAN gateway and my core router.
I have disabled the monitoring on my WAN gateway and the error is gone. So if you only have 1 public IP, gateway monitoring should be off. Not the best solution, but it is the only thing that works.