HELP … An error code was received while attempting XMLRPC sync



  • Hi,

    I did a simple CARP setup.

    master       backup
    LAN          LAN
    WAN          WAN
    WAN2_opt1    WAN2_opt2
    SYNC_opt2    SYNC_opt1

    I'm using a crossover cable between the SYNC interfaces:
    192.168.222.1 to 192.168.222.2

    No settings synced, so I created the LAN VIP on the backup myself. It became MASTER with "Advertising Frequency" set to "0", and fell back to BACKUP when I set it to "1".

    But still no settings sync in any area.
    The master has almost all of the "synchronise this and that" options checked.
    It has the correct IP of the backup's SYNC interface and the admin password (the same on both boxes).

    I discovered some errors from the system log of the MASTER:

    Jul 18 19:24:34 php: : An error code was received while attempting XMLRPC sync with username admin http://192.168.222.2:80 - Code 104: XML error: not well-formed (invalid token) at line 2653

    Jul 18 19:24:34 php: : New alert found: An error code was received while attempting XMLRPC sync with username admin http://192.168.222.2:80 - Code 104: XML error: not well-formed (invalid token) at line 2653

    Jul 18 19:24:44 kernel: arp_rtrequest: bad gateway 194.168.0.3 (!AF_LINK)

    (This is my LAN VIP address. I also get the message for all other VIPs, but not for real IPs. It also shows up for the one and only LAN VIP on the BACKUP pfSense.)

    What could this mean?
    What could be wrong?
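
When there are many alerts like the ones quoted above, it can help to pull the error code and the reported line number out of each one. The following is only a minimal sketch: the pattern is modelled on the single log line quoted in this post, and `parse_xmlrpc_error` is a made-up helper name, not part of pfSense.

```python
import re

# Pattern modelled on the alert text quoted in this thread; other pfSense
# versions may word the message differently (assumption).
XMLRPC_ERROR = re.compile(
    r"XMLRPC sync with username (?P<user>\S+) "
    r"(?P<url>\S+) - Code (?P<code>\d+): (?P<message>.*?)"
    r"(?: at line (?P<line>\d+))?$"
)


def parse_xmlrpc_error(log_line):
    """Return the fields of an XMLRPC sync alert, or None if the line is unrelated."""
    match = XMLRPC_ERROR.search(log_line.strip())
    if match is None:
        return None
    info = match.groupdict()
    info["code"] = int(info["code"])
    # The "at line N" suffix is optional in the pattern above.
    info["line"] = int(info["line"]) if info["line"] else None
    return info


sample = (
    "Jul 18 19:24:34 php: : An error code was received while attempting "
    "XMLRPC sync with username admin http://192.168.222.2:80 - Code 104: "
    "XML error: not well-formed (invalid token) at line 2653"
)
print(parse_xmlrpc_error(sample))
```

The reported line number refers to the XML that the receiving side tried to parse, so it is only a rough pointer to where the offending text sits.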



  • I found this now:
    http://devwiki.pfsense.org/wikka.php?wakka=CARPConfigurationSyncTroubleShooting

    ~~1. The username must be admin on all nodes~~ (Actually, I found out that in 1.2 it just has to be the same on all pfSense boxes; it doesn't have to be "admin".)
    ~~2. The passwords must match on all nodes~~
    ~~3. You must permit traffic to the webConfigurator port on the interface that you are syncing against~~
    ~~4. The webConfigurator must be on the same port on each cluster node~~
    5. The other node that you are syncing to interface must be enabled
    6. Remove ALL special characters from every description that you are syncing. NAT rules, firewall, etc.

    If 5. means that both pfSense boxes must have the same number of interfaces, with the same names and in the same order, then I fail there!? If it only means that the SYNC interfaces must be "enabled", which I think is what "node" implies, then I pass this one.

    6. I'm going to try with only "sync VIP" checked. I have a lot of rules and rows which may contain some "illegal" characters. This will happen next Monday, as I don't remember the MAC address I need to send a wake-on-LAN call to the backup machine.
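
For point 6, one way to hunt for offending characters is to scan a saved copy of the configuration instead of reading every rule by eye. This is a rough sketch under two assumptions that the thread does not confirm: that descriptions sit in `<descr>` elements, each on a single line, and that "special" means anything outside printable ASCII. `find_suspect_descriptions` is a hypothetical helper.

```python
import re

# Assumption: a "special character" is anything outside printable ASCII.
NON_ASCII = re.compile(r"[^\x20-\x7E]")
# Assumption: descriptions are stored as <descr>...</descr> on one line.
DESCR = re.compile(r"<descr>(.*?)</descr>")


def find_suspect_descriptions(config_text):
    """Return (line_number, description, offending_chars) for each suspect description."""
    hits = []
    for line_no, line in enumerate(config_text.splitlines(), start=1):
        for match in DESCR.finditer(line):
            text = match.group(1)
            bad = sorted(set(NON_ASCII.findall(text)))
            if bad:
                hits.append((line_no, text, bad))
    return hits


# Made-up configuration fragment, for illustration only.
sample_config = (
    "<rule>\n"
    "<descr>Allow web</descr>\n"
    "</rule>\n"
    "<rule>\n"
    "<descr>B\u00fcro \u2192 DMZ</descr>\n"
    "</rule>"
)
for line_no, text, bad in find_suspect_descriptions(sample_config):
    print(line_no, repr(text), bad)
```

Anything it reports is only a candidate: the thread does not say exactly which characters break the sync.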



  • 5. I failed this one too, as I discovered this morning…
    A quote from this thread:
    http://forum.pfsense.org/index.php/topic,7595.0.html
    @heiko:

    and pay attention! the interfaces must have the same order, please take a look at the screenshot…..

    e.g. LAN, WAN, SYNC for the master and LAN,SYNC,WAN is not working......

    I have opt1 and opt2 mixed up…
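
The interface-order rule quoted above can be checked mechanically by comparing which role each slot carries on the two boxes. A minimal sketch, with the assignments typed in by hand from the first post (the dictionaries and the `interface_mismatches` helper are illustrative, nothing here is read from pfSense itself):

```python
def interface_mismatches(master, backup):
    """Return {slot: (role_on_master, role_on_backup)} for every slot that differs."""
    slots = sorted(set(master) | set(backup))
    return {
        slot: (master.get(slot), backup.get(slot))
        for slot in slots
        if master.get(slot) != backup.get(slot)
    }


# Assignments as described at the start of the thread.
master = {"lan": "LAN", "wan": "WAN", "opt1": "WAN2", "opt2": "SYNC"}
backup = {"lan": "LAN", "wan": "WAN", "opt1": "SYNC", "opt2": "WAN2"}

for slot, (on_master, on_backup) in interface_mismatches(master, backup).items():
    print(f"{slot}: master={on_master} backup={on_backup}")
```

An empty result means both boxes use the same slot for the same role, which is what the quoted advice asks for.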



  • So the errors about "(invalid token)" are special-character errors (point 6.).
    The arp_rtrequest: bad gateway errors are only "cosmetic", from what I heard.

    So I fixed the interface order, from
    WAN2_opt1    WAN2_opt2
    SYNC_opt2    SYNC_opt1
    to
    WAN2_opt1    WAN2_opt1
    SYNC_opt2    SYNC_opt2

    I removed all special chars and it seemed to work for a second.
    Then the backup machine became Master! Like, WTF!
    And I began getting XMLRPC sync communication error messages.

    The SYNC interfaces on both machines are old 10 Mbit/s NICs; the others are 10/100 Realteks.
    The backup has an ISA card, the master has a PCI card. The ISA card fails to report "link up" (the " * " mark in the console tree on the monitor). But the card should work fine: in another setup I have two old cards that don't report "link up/down" messages and they work properly.

    So just for fun I swapped the WAN2_opt1 NIC with the SYNC_opt2 NIC on the backup machine: no joy.
    So, to mess things up even more, I swapped the WAN2_opt1 NIC with the SYNC_opt2 NIC on the master as well... and the communication error disappeared!

    Now I got curious: is the 10 Mb NIC on my master damaged somehow?!
    I shut down the backup machine, because it became Master again... damn fool.
    Then I tried using the WAN2_opt1 connection (which now ran over the 10 Mb card) and it worked fine: no errors in the system log or on the console screen.

    And I have to mention that it got quite unstable: mostly the backup system just restarts!
    The backup is a 400 MHz dinosaur with 500 MB of RAM, so it could be hardware, though it worked fine with Win2000. I turned ACPI off, because the ISA card got some errors otherwise.

    Once, both machines restarted at the same time. The master is a 900 MHz P3, also old, but known to be very stable. So it must have something to do with CARP / SYNC.

    I'm not giving up yet. I'm going to get another 400 MHz Pentium for the backup soon, to rule out hardware problems.

    --------------------------------

    1. So is it a known thing that CARP can't tolerate old 10 Mb/s hardware?

    2. And why the hell is my backup becoming Master (especially when I save the CARP config to force the sync)???
      (I have VIPs synced from the master and they all get "100" for the... mmm, the... slot, which should mean these are backup VIPs.)
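
On question 2, it may help to recall how CARP chooses a master, as far as the carp(4) manual describes it: each node advertises every `advbase + advskew/256` seconds, and the node that advertises most often wins. The sketch below models only that rule. It assumes that the pfSense 1.2 "Advertising Frequency" field is the skew and that the base is 1 second; neither is confirmed in this thread. Preemption, link state and ties are not modelled.

```python
def advert_interval(advbase, advskew):
    """Seconds between CARP advertisements for one virtual IP."""
    return advbase + advskew / 256.0


def expected_master(nodes):
    """Return the node with the shortest advertisement interval.

    nodes maps a label to (advbase, advskew). Ties are not handled
    meaningfully here; min() simply returns the first entry.
    """
    return min(nodes, key=lambda name: advert_interval(*nodes[name]))


# Assumed values: base 1 on both boxes, skew 0 on the master and the
# "100" that the synced VIPs show on the backup.
nodes = {"master": (1, 0), "backup": (1, 100)}
print(expected_master(nodes))      # master
print(advert_interval(1, 100))     # 1.390625
```

Under these assumptions a VIP with "100" on the backup should lose the election as long as the master's advertisements actually arrive, so a backup that keeps turning Master hints that it is not seeing them.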



  • arp_rtrequest: bad gateway errors are some "cosmetic" information i heard.

    What version of pfSense are you running?!
    If you are on a 1.2.1 version, please upgrade to a recent snapshot.



  • ::) Nice to know it's fixed… although I was not bothered by it. It bothers me more that my CARP is not functioning.



  • Well, answer the question so we know whether it was related to CARP :P



  • Both are:

    1.2-RELEASE
    built on Sun Feb 24 17:04:58 EST 2008

    Seems ok to me…



  • So today I spent a LOT of time figuring out my errors, but it seems that CARP failover just sucks for multi-WAN.

    I replaced the backup machine's hardware 100%.
    Then I removed the second WAN: I just disabled OPT1 on both boxes and unplugged the cable.
    (The failover "Load balancer" was still intact, though!)

    Then I discovered some weird character errors on SYNC and fixed them.
    It synced fine now, but it is not stable at all.

    I did a partial crash test on the master box: I unplugged the WAN cable.
    Guess what... the backup machine didn't help, because the master's LAN was still plugged in, so THIS box was still the Master, just without a connection.

    OK, a real crash test: unplugging all cables on the master did actually work. But not as advertised. The delay was 5-10 sec, not 0-2. And streaming media (Google Video, YouTube, MediaPlayer radio) DID crash and had to be restarted from zero, as did other connections.

    Oh, and when the master comes back online, it takes 1-2 min and the streaming media breaks down again! During those 1-2 min I saw some traffic going to the master and some to the backup machine, so I checked which one was Master at that moment, and they both were... Nice joke. So it doesn't work.

    Going to do a last confirming test now...



  • Hopeless… there's just no way to get multi-WAN working with CARP failover.

    I have this situation now...
    I figured out that ISP2 gives me only 2 IPs, so I'm screwed... but I'm screwed because CARP sucks, not because of this ISP.

    You see, there is no way to use multi-WAN on the MASTER and a single WAN on the BACKUP (so that only one ISP would be CARP-ed).

    I ended up with the backup making random reboots now. It didn't survive a master crash at all, and after the backup became Master and the real master woke up, it NEVER gave the Master status back to the right box. Basically, after a crash the Internet would never come back automatically.

    Seems I'm on "manual" hardware failover now.

