CARP: LAN is master on both boxes



  • Hello, I have 2 pfSense machines (Soekris 5501) that do CARP.
    Just a simple WAN and LAN.
    Now on the master, WAN and LAN are master, but on the slave the LAN interface is also master.

    This gives me a lot of timeouts and a slow network. If I shut down the slave, things run normally.

    What could be the cause of this?

    Thanks for your time,
    regards,
    Johan



  • We'll need some more details. Is the SYNC on a dedicated interface? Did you allow all traffic via the SYNC interface on both boxes? Can you ping the SYNC interfaces from the other CARP node? Can you ping the LAN interfaces from the other CARP node?
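
Alongside the ping tests, it can help to read the CARP state directly on each node from a shell. The sketch below is a hypothetical helper, `carp_states`, that summarises `ifconfig` output. It assumes the FreeBSD 6.x layout, where each `carpN` pseudo-interface has a line such as `carp: MASTER vhid 1 advbase 1 advskew 0`; adjust the pattern if your output looks different.

```shell
#!/bin/sh
# Print "<interface> <STATE> vhid <N>" for every CARP interface found in
# ifconfig output read from stdin.
# Assumption: FreeBSD 6.x style output with a "carp: STATE vhid N ..." line
# under each carpN interface header.
carp_states() {
    awk '
        /^[a-z0-9_]+: flags=/ { ifname = $1; sub(/:$/, "", ifname) }
        $1 == "carp:"         { print ifname, $2, "vhid", $4 }
    '
}

# Usage on each node (not run here):
#   ifconfig | carp_states
```

On a healthy pair, each vhid should report MASTER on one node and BACKUP on the other. The symptom described in this thread is the LAN vhid reporting MASTER on both.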



  • Both pfSense boxes are on a Soekris 5501 board with 4 interfaces.

    LAN 192.168.0.0 on vr0, WAN 81.x.x.x on vr1, and sync 192.168.200.0 on vr3; vr4 is not used.
    The sync ports are connected through a crossover cable: master 192.168.200.1 and slave 192.168.200.2.
    All changes made on the master show up on the slave; the rules allow all protocols from the sync network.
    (I followed the tutorial about failover.)
    The slave has IP 192.168.0.252, the master has 192.168.0.251, and the virtual IP is 192.168.0.2.

    I can ping from the master (192.168.200.1) on sync to the slave (192.168.200.2) and vice versa.
    I can ping both LAN IPs and the virtual IPs.

    They are connected to the main WAN switch, an old Nortel 10/100 switch.

    The WAN interface (which is working well) is connected to a Netgear 5-port switch.

    I just read in other threads that it could be something to do with multicast and some switches.

    Thanks,
    Johan



  • Generally this happens when your switch doesn't properly support multicast traffic, or is configured to block it, or something of that nature. Check your switch firmware, configuration, etc. and if possible try a different switch.
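
One way to test the multicast theory from a shell: CARP advertisements are IP protocol 112 packets sent to the multicast address 224.0.0.18, and normally only the current master sends them. The sketch below is a hypothetical helper, `carp_senders`, that lists which source addresses appear in a capture. It assumes `tcpdump -n` prints IPv4 packets in the usual `IP <src> > <dst>:` form; the interface name `vr0` in the usage comment is taken from the first setup described above.

```shell
#!/bin/sh
# List the unique source addresses that send packets to the CARP multicast
# group 224.0.0.18, given `tcpdump -n` text output on stdin.
# Assumption: tcpdump prints IPv4 packets as "... IP <src> > <dst>: ...".
carp_senders() {
    awk '{
        for (i = 1; i + 3 <= NF; i++)
            if ($i == "IP" && $(i + 2) == ">" && $(i + 3) == "224.0.0.18:")
                print $(i + 1)
    }' | sort -u
}

# Usage on the backup node's LAN interface (not run here):
#   tcpdump -n -c 20 -i vr0 ip proto 112 | carp_senders
```

If the capture on the backup lists only the backup's own LAN address, the master's advertisements are probably not arriving on that segment, which points at the switch. If both nodes' addresses show up, the packets are reaching the NIC, and it is more likely that something on the host itself is dropping or ignoring them.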



  • Johan,
    I'm seeing exactly the same problem on the lan-carp.
    I'm still trying to figure out what's going on.
    Running 1.2RC3

    Edit:  Tried a different switch and same problem.
    Very simple setup as well.



  • Well then try another switch.  This really does work fine and is one of our most heavily used features…



  • I tried 3 different switches already.
    I think I know what it is now.

    box 1: is master for wan-carp and lan-carp.  Captive portal is running.
    box 2: if captive portal is enabled, the lan-carp switches to master.  As soon as I disable captive portal, the lan-carp shows up as "backup" right away.

    Is pfsync checking the portal state on the LAN interface?
    The portal is running on the LAN interface, not lan-carp, so why should this affect the CARP state?



  • Eh?  Please show the output of "ipfw show" from a command line / console.  We are allowing CARP/PFSYNC traffic.  Or we should be…



  • on box#1 - carp state for both wan and lan are "master"

    on box#2 running pfSense.  If captive portal is disabled, then the lan-carp state is "backup".  ipfw show:

    pfsense2:/#  ipfw show
    ipfw: getsockopt(IP_FW_GET): Protocol not available

    If I enable the captive portal, then the lan-carp state is "master" while wan-carp is still "backup". ipfw show is showing the rules.



  • I need an ipfw show from when CP is ENABLED.



  • I'm using all public ip stuff - I sent it to your private email.
    If you can take a look and help me out that will be great.
    I did not want to include the public IPs here.

    Thanks



  • If I use the packet capture (GUI) in pfSense, which packets should I look for (the missing multicast)?
    I do not have physical access to the router, so I cannot go to the console.

    Secondly, is there a list of switches that work out of the box and of those that do not?

    I did not replace the switch (we do not have any in stock!) but I do not use Captive Portal.



  • @Sullrich,
    here is the capture of ipfw show on the 2nd pfsense box in which the LAN CARP is showing as master instead of backup.

    pfsense2:~#  ipfw show
    00030 2537  886901 skipto 50000 ip from any to any in via rl1 keep-state
    00030  490  98976 skipto 50000 ip from any to any in via rl0 keep-state
    00500  272  15232 allow ip from 128.97.205.3 to any out via rl2
    00501    0      0 allow ip from any to 128.97.205.3 in via rl2
    01000  199  35301 skipto 50000 ip from any to any not layer2 not via rl2
    01001  199  35301 allow ip from any to any layer2 not via rl2
    01100    1      28 allow ip from any to any layer2 mac-type 0x0806
    01100    0      0 allow ip from any to any layer2 mac-type 0x888e
    01100    0      0 allow ip from any to any layer2 mac-type 0x88c7
    01100    0      0 allow ip from any to any layer2 mac-type 0x8863
    01100    0      0 allow ip from any to any layer2 mac-type 0x8864
    01100    0      0 allow ip from any to any layer2 mac-type 0x8863
    01100    0      0 allow ip from any to any layer2 mac-type 0x8864
    01100    0      0 allow ip from any to any layer2 mac-type 0x888e
    01101    0      0 deny ip from any to any layer2 not mac-type 0x0800
    01102  192  10752 skipto 20000 ip from any to any layer2
    01200    0      0 allow udp from any 68 to 255.255.255.255 dst-port 67 in
    01201    0      0 allow udp from any 68 to 128.97.205.3 dst-port 67 in
    01202    0      0 allow udp from 128.97.205.3 67 to any dst-port 68 out
    01203    0      0 allow icmp from 128.97.205.3 to any out icmptypes 8
    01204    0      0 allow icmp from any to 128.97.205.3 in icmptypes 0
    01300    0      0 allow udp from any to 128.97.205.3 dst-port 53 in
    01301    0      0 allow udp from 128.97.205.3 53 to any out
    01302    0      0 allow tcp from any to 128.97.205.3 dst-port 8000 in
    01303    0      0 allow tcp from 128.97.205.3 8000 to any out
    01304    0      0 allow tcp from any to 128.97.205.3 dst-port 8001 in
    01305    0      0 allow tcp from 128.97.205.3 8001 to any out
    10000    0      0 skipto 50000 ip from any to 128.97.186.150 in
    10000    0      0 skipto 50000 ip from 128.97.186.150 to any out
    10001    0      0 skipto 50000 ip from any to 128.97.229.250 in
    10001    0      0 skipto 50000 ip from 128.97.229.250 to any out
    10002    0      0 skipto 50000 ip from any to 164.67.128.1 in
    10002    0      0 skipto 50000 ip from 164.67.128.1 to any out
    10003    0      0 skipto 50000 ip from any to 164.67.128.2 in
    10003    0      0 skipto 50000 ip from 164.67.128.2 to any out
    10004    0      0 skipto 50000 ip from any to 164.67.62.100 in
    10004    0      0 skipto 50000 ip from 164.67.62.100 to any out
    10005    0      0 skipto 50000 ip from any to 164.67.62.101 in
    10005    0      0 skipto 50000 ip from 164.67.62.101 to any out
    10006    0      0 skipto 50000 ip from any to 164.67.62.102 in
    10006    0      0 skipto 50000 ip from 164.67.62.102 to any out
    10007    0      0 skipto 50000 ip from any to 169.232.33.135 in
    10007    0      0 skipto 50000 ip from 169.232.33.135 to any out
    10008    0      0 skipto 50000 ip from any to 169.232.35.150 in
    10008    0      0 skipto 50000 ip from 169.232.35.150 to any out
    10009    0      0 skipto 50000 ip from any to 169.232.46.139 in
    10009    0      0 skipto 50000 ip from 169.232.46.139 to any out
    10010    0      0 skipto 50000 ip from any to 169.232.47.139 in
    10010    0      0 skipto 50000 ip from 169.232.47.139 to any out
    10011    0      0 skipto 50000 ip from any to 169.232.48.139 in
    10011    0      0 skipto 50000 ip from 169.232.48.139 to any out
    10012    0      0 skipto 50000 ip from any to 169.232.48.157 in
    10012    0      0 skipto 50000 ip from 169.232.48.157 to any out
    19902    0      0 fwd 127.0.0.1,8000 tcp from any to any dst-port 80 in
    19903    0      0 allow tcp from any 80 to any out
    19904  192  10752 deny ip from any to any
    29900  192  10752 allow ip from any to any layer2
    65535 3229 1021346 allow ip from any to any



  • I just committed some fixes for this.  Please install a snapshot from http://snapshots.pfsense.com/FreeBSD6/RELENG_1_2/updates/ 2-3 hours from now.



  • I updated firmware on both boxes.  It is showing:

    1.2-RC3
    built on Mon Dec 10 16:14:30 EST 2007

    Anyway, the new firmware is still showing the same status as before.
    On the 2nd box, wan-carp is "backup" and lan-carp is still "master" when captive portal is enabled on the 2nd box.  At the same time, box #1 is showing both wan-carp and lan-carp as master (captive portal also enabled).

    Also, when I manually disable captive portal on the second box, the console on the COM port shows:

    IP firewall unloaded
    Warning: memory type IpFw/IpAcct leaked memory on destroy (1 allocations, 1024 bytes leaked).
    ipfw2 (+ipv6) initialized, divert loadable, rule-based forwarding enabled, default to accept, logging disabled



  • Do you see two new rules in the "ipfw show" output that mention carp and pfsync?
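
This check can be scripted rather than done by eye. The sketch below is a hypothetical helper, `check_sync_rules`, that reads `ipfw show` output and reports whether explicit allow rules for carp and pfsync are present. The rule wording it searches for follows the patched output quoted further down the thread and is an assumption for any other build.

```shell
#!/bin/sh
# Read `ipfw show` output on stdin and report, per protocol, whether an
# explicit "allow <proto> from any to any" rule is present.
# Returns non-zero if either rule is missing.
# Assumption: the rules are worded as in the patched output later in this thread.
check_sync_rules() {
    rules=$(cat)
    status=0
    for proto in carp pfsync; do
        if printf '%s\n' "$rules" | grep -q "allow $proto from any to any"; then
            echo "$proto: allowed"
        else
            echo "$proto: no allow rule"
            status=1
        fi
    done
    return $status
}

# Usage with captive portal enabled (not run here):
#   ipfw show | check_sync_rules
```

Fed the long listing posted above, this would report no allow rule for either protocol, which matches the reply that follows.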



  • no - I did not see that at all.
    I compared the old output and new and they are nearly identical.
    Let me look closer.

    edit:  I did not see anything different.



  • Tell me if /etc/inc/captiveportal.inc has pfsync and carp in it.

    You can do this from a shell: cat /etc/inc/captiveportal.inc | grep pfsync



  • No. nothing there.
    grep for pfsync in /etc/inc/captiveportal.inc does not show any occurrence of pfsync.



  • Then you must have upgraded before the snapshot server created the newer images.

    Please upgrade to an image from http://snapshots.pfsense.com/FreeBSD6/RELENG_1_2/updates/pfSense-Full-Update-1.2-RC3.tgz



  • Do you have the full embedded image? That's what I'm using here.
    Thanks

    edit: I'm getting the pfSense-Embedded-Update-1.2-RC3.tgz - will that work?

    edit 2:  I upgraded both boxes with new image.
    cat /etc/inc/captiveportal.inc | grep pfsync is not showing anything.

    The 2nd box is still showing "master" on lan-carp when captiveportal is enabled.
    I guess the pfSense-Embedded-Update-1.2-RC3.tgz does not have the patch.
    system overview is showing: 
    1.2-RC3
    built on Tue Dec 11 11:52:19 EST 2007



  • Reinstall from a recent snapshot then, please.



  • Do you mean you want me to physdiskwrite the pfSense-Embedded-Update-1.2-RC3.tgz rather than load the firmware via the GUI?



  • Yep.



  • I'm sorry, I think I am missing something elementary here.
    After physdiskwrite using pfSense-Embedded-Update-1.2-RC3.tgz, the box cannot boot.

    Using an older image, I can boot up just fine.

    I notice that with the new image, physdiskwrite gives:

    Found compressed image file
    62023680/62023680 bytes written in total

    and using an older image, I get:
    Found compressed image file
    122441728/122441728 bytes written in total



  • http://snapshots.pfsense.com/FreeBSD6/RELENG_1_2/embedded/pfSense.img.gz .. That other file was an update file.. Sorry.
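
The byte counts quoted above fit this explanation: the update file unpacked to roughly half the size of the working image. A quick way to tell the two kinds of file apart before writing one to flash is to look for the tar magic string inside the decompressed stream. The sketch below is a hypothetical helper, `is_tar_gz`; it assumes the update bundle is a gzip-compressed tar archive in a format that carries the `ustar` magic at byte offset 257, whereas a raw disk image would not.

```shell
#!/bin/sh
# Succeed if the gzip-compressed file in $1 decompresses to a tar archive,
# recognised by the "ustar" magic at byte offset 257. That would mark it as
# an update bundle rather than a raw disk image.
# Assumption: the update .tgz uses a tar format that includes this magic.
is_tar_gz() {
    magic=$(gzip -dc "$1" 2>/dev/null | dd bs=1 skip=257 count=5 2>/dev/null)
    [ "$magic" = "ustar" ]
}

# Usage (file name is illustrative; not run here):
#   if is_tar_gz pfSense.img.gz; then
#       echo "tar archive: do not physdiskwrite this"
#   else
#       echo "not a tar archive: likely a raw image"
#   fi
```

This only rules out tar archives; it does not prove that a file is a bootable image.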



  • Reinstalled from scratch using the suggested snapshot.
    Problem is still there. 
    2nd box is still showing lan-carp as master when captive portal is enabled.

    cat /etc/inc/captiveportal.inc | grep pfsync is not showing anything.

    note: reinstall done on both boxes..



  • Hi,
    Any new development on this ? - Thanks



  • Our snapshot system is busted currently.  There is an open ticket that I need to check into.

    In the meantime, replace /etc/inc/captiveportal.inc with http://pfsense.com/cgi-bin/cvsweb.cgi/pfSense/etc/inc/captiveportal.inc?rev=1.58.2.42.2.6;content-type=text%2Fplain;only_with_tag=RELENG_1_2



  • I replaced captiveportal.inc with the one you provided on both boxes.

    • cat /etc/inc/captiveportal.inc | grep pfsync is showing:
      $cprules =  "add 500 set 1 allow pfsync from any to any\n";

    • ipfw show is showing:
      00500 1347 328072 allow pfsync from any to any
      00500 2086 116816 allow carp from any to any
      00500  11    608 allow ip from 128.97.205.2 to any out via rl2

    Scenario #1

    • box#1 (captiveportal ON), box#2 (captiveportal OFF).
      On box#1 - wan-carp and lan-carp are both master
      On box#2 - wan-carp is backup and lan-carp is master

    Before replacing the file, this showed "backup" for both carps.

    Scenario #2

    • box#1 (captiveportal ON), box#2 (captiveportal ON).
      Same as above.


  • Well I am at my wits end then.  This really should have fixed it.



  • I'm going to tear down everything and start from scratch all over again.
    This time I will configure everything manually rather than upload the configuration.
    I've encountered minor quirks before when uploading the config.  I will report back.
    And thanks for your help.  I really appreciate it.



  • I had something like this with my cluster (but not using CP; I didn't think that worked with CARP. Has this been fixed?).

    After running CARP for ages with no problems, I decided to unplug the KVM from the slave to use it on another machine. I unplugged it and rebooted the slave, and it came up fine. I went back to the WebGUI to check, and after a few minutes of fiddling the slave became master on the LAN on its own. I rebooted and got the same problem, so I plugged the KVM back in and the problem went away.

    It seemed to be having some issue sharing IRQs for the NICs with no KVM attached. In the end I fiddled with the IRQ settings, changing them from auto to fixed, and it has been fine ever since.

    I can't remember what the message was, but it would pop up on the console.

    So it might be worth a look.

