Netgate Discussion Forum

    Secondary pfSense Crash after CARP Configuration

    • Mat1987

      Hi again

      I have deleted all my limiters and it still crashes. Here is the crash report. Help, please!

      Crash report begins.  Anonymous machine information:

      amd64
      11.1-RELEASE-p4
      FreeBSD 11.1-RELEASE-p4 #5 r313908+79c92265a31(RELENG_2_4): Mon Nov 20 08:18:22 CST 2017    root@buildbot2.netgate.com:/builder/ce-242/tmp/obj/builder/ce-242/tmp/FreeBSD-src/sys/pfSense

      Crash report details:

      No PHP errors found.

      Filename: /var/crash/bounds
      1

      Filename: /var/crash/info.0
      Dump header from device: /dev/gptid/78cabbc4-dad6-11e7-b9e4-005056b3ced1
        Architecture: amd64
        Architecture Version: 1
        Dump Length: 106496
        Blocksize: 512
        Dumptime: Sat Dec  9 17:23:14 2017
        Hostname: srvtcfw01
        Magic: FreeBSD Text Dump
        Version String: FreeBSD 11.1-RELEASE-p4 #5 r313908+79c92265a31(RELENG_2_4): Mon Nov 20 08:18:22 CST 2017
          root@buildbot2.netgate.com:/builder/ce-242/tmp/obj/builder/ce-242/tmp/FreeBSD-src/sys/pfSense
        Panic String: pfsync_undefer_state: unable to find deferred state
        Dump Parity: 583539232
        Bounds: 0
        Dump Status: good

      Filename: /var/crash/info.last
      Dump header from device: /dev/gptid/78cabbc4-dad6-11e7-b9e4-005056b3ced1
        Architecture: amd64
        Architecture Version: 1
        Dump Length: 106496
        Blocksize: 512
        Dumptime: Sat Dec  9 17:23:14 2017
        Hostname: srvtcfw01
        Magic: FreeBSD Text Dump
        Version String: FreeBSD 11.1-RELEASE-p4 #5 r313908+79c92265a31(RELENG_2_4): Mon Nov 20 08:18:22 CST 2017
          root@buildbot2.netgate.com:/builder/ce-242/tmp/obj/builder/ce-242/tmp/FreeBSD-src/sys/pfSense
        Panic String: pfsync_undefer_state: unable to find deferred state
        Dump Parity: 583539232
        Bounds: 0
        Dump Status: good

      Filename: /var/crash/minfree
      2048

      Filename: /var/crash/textdump.tar.0
      ddb.txt
      db:0:kdb.enter.default>  run lockinfo
      db:1:lockinfo> show locks
      No such command
      db:1:locks>  show alllocks
      No such command
      db:1:alllocks>  show lockedvnods
      Locked vnodes
      db:0:kdb.enter.default>  show pcpu
      cpuid        = 0
      dynamic pcpu = 0x7ebf00
      curthread    = 0xfffff80007d6f000: pid 15262 "openvpn"
      curpcb      = 0xfffffe0096381cc0
      fpcurthread  = 0xfffff80007d6f000: pid 15262 "openvpn"
      idlethread  = 0xfffff8000351d000: tid 100003 "idle: cpu0"
      curpmap      = 0xfffff80007d47138
      tssp        = 0xffffffff82a73b90
      commontssp  = 0xffffffff82a73b90
      rsp0        = 0xfffffe0096381cc0
      gs32p        = 0xffffffff82a7a3e8
      ldt          = 0xffffffff82a7a428
      tss          = 0xffffffff82a7a418
      db:0:kdb.enter.default>  bt
      Tracing pid 15262 tid 100109 td 0xfffff80007d6f000
      kdb_enter() at kdb_enter+0x3b/frame 0xfffffe0096381250
      vpanic() at vpanic+0x1a3/frame 0xfffffe00963812d0
      panic() at panic+0x43/frame 0xfffffe0096381330
      pfsync_update_state() at pfsync_update_state+0x3c5/frame 0xfffffe0096381380
      pf_test() at pf_test+0x21cf/frame 0xfffffe00963815c0
      pf_check_out() at pf_check_out+0x1d/frame 0xfffffe00963815e0
      pfil_run_hooks() at pfil_run_hooks+0x7b/frame 0xfffffe0096381670
      ip_output() at ip_output+0x22b/frame 0xfffffe00963817c0
      ip_forward() at ip_forward+0x323/frame 0xfffffe0096381860
      ip_input() at ip_input+0x75a/frame 0xfffffe00963818c0
      netisr_dispatch_src() at netisr_dispatch_src+0xa0/frame 0xfffffe0096381910
      tunwrite() at tunwrite+0x226/frame 0xfffffe0096381950
      devfs_write_f() at devfs_write_f+0xe2/frame 0xfffffe00963819b0
      dofilewrite() at dofilewrite+0xc8/frame 0xfffffe0096381a00
      sys_writev() at sys_writev+0x8c/frame 0xfffffe0096381a60
      amd64_syscall() at amd64_syscall+0x6c4/frame 0xfffffe0096381bf0
      Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe0096381bf0
      --- syscall (121, FreeBSD ELF64, sys_writev), rip = 0x8015a581a, rsp = 0x7fffffffde18, rbp = 0x7fffffffde50 ---
      db:0:kdb.enter.default>  ps
        pid  ppid  pgrp  uid  state  wmesg        wchan        cmd
      51715  308  308    0  S      nanslp  0xffffffff828b51e0 php-fpm
      49963 70428  308    0  S      nanslp  0xffffffff828b51e0 sleep
      71262 71220 69187    0  S      nanslp  0xffffffff828b51e0 sleep
      71220    1 69187    0  S      wait    0xfffff8001c105000 sh
      9738  9585  9738 65534  Ss      sbwait  0xfffff800079fa4a4 darkstat
      9585    1  9585 65534  Ss      select  0xfffff8001c237b40 darkstat
      7039 71757  7039    0  Ss      (threaded)                  sshlockout_pf
      100091                  S      piperd  0xfffff8001c39f8e8 sshlockout_pf
      100169                  S      nanslp  0xffffffff828b51e0 sshlockout_pf
      6749    1  6749    0  Ss      select  0xfffff800074f9b40 bsnmpd
      70428    1  308    0  S      wait    0xfffff80007ea4000 sh
      69866    1 69866  136  Ss      select  0xfffff80007f20cc0 dhcpd
      63516    1 63516    59  Ss      kqread  0xfffff80007744900 unbound
      52767    1 52767    0  Ss      (threaded)                  dpinger
      100156                  S      uwait    0xfffff8001c278080 dpinger
      100165                  S      sbwait  0xfffff800079f8144 dpinger
      100166                  S      nanslp  0xffffffff828b51e0 dpinger
      100167                  S      nanslp  0xffffffff828b51e0 dpinger
      100168                  S      accept  0xfffff800079f906c dpinger
      52697    1 52697    0  Ss      (threaded)                  dpinger
      100155                  S      uwait    0xfffff8001c2fbf00 dpinger
      100161                  S      sbwait  0xfffff80007be1144 dpinger
      100162                  S      nanslp  0xffffffff828b51e0 dpinger
      100163                  S      nanslp  0xffffffff828b51e0 dpinger
      100164                  S      accept  0xfffff800076bc3cc dpinger
        415    1  415    0  Ss+    ttyin    0xfffff80007504cb0 getty
        407    1  407    0  Ss+    ttyin    0xfffff800075050b0 getty
        123    1  123    0  Ss+    ttyin    0xfffff800075054b0 getty
      99998    1 99998    0  Ss+    ttyin    0xfffff800075058b0 getty
      99937    1 99937    0  Ss+    ttyin    0xfffff800074d78b0 getty
      99644    1 99644    0  Ss+    ttyin    0xfffff800074b00b0 getty
      99639    1 99639    0  Ss+    ttyin    0xfffff800074ae8b0 getty
      99388    1 99388    0  Ss+    ttyin    0xfffff800074ae4b0 getty
      82736    1 82213    0  S      select  0xfffff8001c0ec340 vmtoolsd
      79421 78825 78825    0  S      nanslp  0xffffffff828b51e0 minicron
      78825    1 78825    0  Ss      wait    0xfffff800079c3000 minicron
      77900 77877 77877    0  S      nanslp  0xffffffff828b51e0 minicron
      77877    1 77877    0  Ss      wait    0xfffff800079c4000 minicron
      77783 77255 77255    0  S      nanslp  0xffffffff828b51e0 minicron
      77255    1 77255    0  Ss      wait    0xfffff800079c5000 minicron
      71757    1 71757    0  Ss      select  0xfffff80007617cc0 syslogd
      32516    1 32516    0  Ss      (threaded)                  ntpd
      100117                  S      select  0xfffff80007daae40 ntpd
      31992    1 31992    0  Ss      nanslp  0xffffffff828b51e0 cron
      31697 31441 31441    0  S      kqread  0xfffff80007d5f100 nginx
      31679 31441 31441    0  S      kqread  0xfffff80007d5f200 nginx
      31441    1 31441    0  Ss      pause    0xfffff8000754d630 nginx
      15591    1 15591    0  Ss      bpf      0xfffff80007bdca00 filterlog
      15262    1 15262    0  Rs      CPU 0                      openvpn
      14091    1 14091    0  Ss      select  0xfffff80007cde3c0 openvpn
      7334    1  7334    0  Ss      select  0xfffff800074f71c0 sshd
        336    1  336    0  Ss      select  0xfffff800074f60c0 devd
        324  322  322    0  S      kqread  0xfffff80007745600 check_reload_status
        322    1  322    0  Ss      kqread  0xfffff80007745500 check_reload_status
        308    1  308    0  Ss      kqread  0xfffff80007746100 php-fpm
        58    0    0    0  DL      mdwait  0xfffff80007524800 [md0]
        25    0    0    0  DL      syncer  0xffffffff829aee00 [syncer]
        24    0    0    0  DL      vlruwt  0xfffff8000754a588 [vnlru]
        23    0    0    0  DL      (threaded)                  [bufdaemon]
      100083                  D      psleep  0xffffffff829ad604 [bufdaemon]
      100092                  D      sdflush  0xfffff8000752b8e8 [/ worker]
        22    0    0    0  DL      -        0xffffffff829ae2bc [bufspacedaemon]
        21    0    0    0  DL      pgzero  0xffffffff829c2a64 [pagezero]
        20    0    0    0  DL      psleep  0xffffffff829bef14 [vmdaemon]
        19    0    0    0  DL      (threaded)                  [pagedaemon]
      100079                  D      psleep  0xffffffff82a72f85 [pagedaemon]
      100086                  D      launds  0xffffffff829beec4 [laundry: dom0]
      100087                  D      umarcl  0xffffffff829be838 [uma]
        18    0    0    0  DL      -        0xffffffff829ace14 [soaiod4]
        17    0    0    0  DL      -        0xffffffff829ace14 [soaiod3]
        16    0    0    0  DL      -        0xffffffff829ace14 [soaiod2]
        15    0    0    0  DL      -        0xffffffff829ace14 [soaiod1]
          9    0    0    0  DL      -        0xffffffff82789700 [rand_harvestq]
          8    0    0    0  DL      pftm    0xffffffff80e930b0 [pf purge]
          7    0    0    0  DL      waiting_ 0xffffffff82a61d70 [sctp_iterator]
          6    0    0    0  DL      -        0xfffff800039c4448 [fdc0]
          5    0    0    0  DL      idle    0xfffffe0000ee1000 [mpt_recovery0]
          4    0    0    0  DL      (threaded)                  [cam]
      100020                  D      -        0xffffffff8265c480 [doneq0]
      100074                  D      -        0xffffffff8265c2c8 [scanner]
          3    0    0    0  DL      crypto_r 0xffffffff829bd3f0 [crypto returns]
          2    0    0    0  DL      crypto_w 0xffffffff829bd298 [crypto]
        14    0    0    0  DL      (threaded)                  [geom]
      100014                  D      -        0xffffffff82a39e20 [g_event]
      100015                  D      -        0xffffffff82a39e28 [g_up]
      100016                  D      -        0xffffffff82a39e30 [g_down]
        13    0    0    0  DL      sleep    0xffffffff82615c70 [ng_queue0]
        12    0    0    0  WL      (threaded)                  [intr]
      100004                  I                                  [swi1: netisr 0]
      100005                  I                                  [swi3: vm]
      100006                  I                                  [swi4: clock (0)]
      100008                  I                                  [swi6: task queue]
      100009                  I                                  [swi6: Giant taskq]
      100012                  I                                  [swi5: fast taskq]
      100021                  I                                  [irq14: ata0]
      100022                  I                                  [irq15: ata1]
      100023                  I                                  [irq17: mpt0]
      100025                  I                                  [irq256: ahci0]
      100026                  I                                  [irq257: pcib3]
      100027                  I                                  [irq258: vmx0]
      100028                  I                                  [irq259: pcib4]
      100029                  I                                  [irq260: pcib5]
      100030                  I                                  [irq261: pcib6]
      100031                  I                                  [irq262: pcib7]
      100032                  I                                  [irq263: pcib8]
      100033                  I                                  [irq264: pcib9]
      100034                  I                                  [irq265: pcib10]
      100035                  I                                  [irq266: pcib11]
      100036                  I                                  [irq267: vmx1]
      100037                  I                                  [irq268: pcib12]
      100038                  I                                  [irq269: pcib13]
      100039                  I                                  [irq270: pcib14]
      100040                  I                                  [irq271: pcib15]
      100041                  I                                  [irq272: pcib16]
      100042                  I                                  [irq273: pcib17]
      100043                  I                                  [irq274: pcib18]
      100044                  I                                  [irq275: pcib19]
      100045                  I                                  [irq276: vmx2]
      100046                  I                                  [irq277: pcib20]
      100047                  I                                  [irq278: pcib21]
      100048                  I                                  [irq279: pcib22]
      100049                  I                                  [irq280: pcib23]
      100050                  I                                  [irq281: pcib24]
      100051                  I                                  [irq282: pcib25]
      100052                  I                                  [irq283: pcib26]
      100053                  I                                  [irq284: pcib27]
      100054                  I                                  [irq285: pcib28]
      100055                  I                                  [irq286: pcib29]
      100056                  I                                  [irq287: pcib30]
      100057                  I                                  [irq288: pcib31]
      100058                  I                                  [irq289: pcib32]
      100059                  I                                  [irq290: pcib33]
      100060                  I                                  [irq291: pcib34]
      100061                  I                                  [irq1: atkbd0]
      100062                  I                                  [irq12: psm0]
      100067                  I                                  [swi1: pf send]
      100068                  I                                  [swi1: pfsync]
        11    0    0    0  RL                                  [idle: cpu0]
          1    0    1    0  SLs    wait    0xfffff80003518588 [init]
        10    0    0    0  DL      audit_wo 0xffffffff82a68f40 [audit]
          0    0    0    0  DLs    (threaded)                  [kernel]
      100000                  D      swapin  0xffffffff82a39e68 [swapper]
      100007                  D      -        0xfffff80003507900 [kqueue_ctx taskq]
      100010                  D      -        0xfffff80003507100 [aiod_kick taskq]
      100011                  D      -        0xfffff80003506e00

      • Mat1987

        I've also been reading that the WAN has to be on the same NIC interface on the backup cluster?

        I'm using VMware on both boxes, so does that mean the same vSwitch?

        • Derelict LAYER 8 Netgate

          Also be sure you remove all the calls to the limiters in the rules.

          Disable state syncing on both nodes and try again. Does it still crash? If so you might be looking at a different problem.

          Also been reading that the WAN has to be on the same NIC interface on backup cluster?

          ALL NICs have to be the same on both nodes in the same order. If WAN is igb0 on the primary, WAN has to be igb0 on the secondary, and so on. Generally not the source of a panic however, just "unexpected" behavior.

          You might want to start again - small, and get WAN+LAN working in a very basic HA pair before moving on to more advanced configurations. They're VMs. It don't cost nothin'.

          Both nodes have to be able to pass multicast between each other.

          Inability to do so will not result in a crash, however; it results in a MASTER/MASTER split-brain issue.
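
To illustrate the multicast point, here is a rough Python sketch of why blocked advertisements lead to two MASTER nodes. It is a deliberately simplified model, not the real CARP state machine: the `Node` class, the `elect` function, and the `advskew` values (0 and 100) are assumptions made up for this example. The only rule modelled is that a node stays BACKUP while it hears advertisements from a peer with a lower skew.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    advskew: int  # illustrative value; lower wins in this simplified model


def elect(nodes, can_hear):
    """Return {node name: "MASTER" or "BACKUP"}.

    can_hear(a, b) is True when node a receives node b's advertisements.
    A node stays BACKUP only while it hears a peer with a lower advskew.
    """
    roles = {}
    for node in nodes:
        hears_better_peer = any(
            other.advskew < node.advskew and can_hear(node.name, other.name)
            for other in nodes
            if other is not node
        )
        roles[node.name] = "BACKUP" if hears_better_peer else "MASTER"
    return roles


nodes = [Node("primary", advskew=0), Node("secondary", advskew=100)]

# Multicast passes between the nodes: one MASTER, one BACKUP.
healthy = elect(nodes, can_hear=lambda a, b: True)

# Multicast is blocked (for example by the virtual switch): both claim MASTER.
split_brain = elect(nodes, can_hear=lambda a, b: False)

print(healthy)
print(split_brain)
```

In the second case neither node ever hears the other, so each one assumes it is alone and claims the virtual IP. That is the split-brain condition described above, and it does not involve a crash.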

          Chattanooga, Tennessee, USA
          A comprehensive network diagram is worth 10,000 words and 15 conference calls.
          DO NOT set a source address/port in a port forward or firewall rule unless you KNOW you need it!
          Do Not Chat For Help! NO_WAN_EGRESS(TM)

          • Mat1987

            Thanks

            I think my problem is that the interfaces aren't the same, so I'll have to try moving things around to get the same interface names.

            So it has to be the same physical NIC; it's not based on the virtual NIC?

            Mat

            • Derelict LAYER 8 Netgate

              An interface has a physical name (em0, re0, igb0, xn0, igb0.1000, lagg2.1001) and an internal name (wan, lan, opt1, opt2, opt3, optX).

              They all have to match exactly on both nodes.

              Use Status > Interfaces to verify.

              This is all covered in detail here: https://portal.pfsense.org/docs/book/highavailability/index.html
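
As a rough aid for that check, the sketch below compares the internal-to-physical assignments of two nodes and lists every mismatch. The function name `assignment_mismatches` and the sample assignments are hypothetical; the dictionaries are assumed to be copied by hand from Status > Interfaces on each node, not read from pfSense itself.

```python
def assignment_mismatches(primary, secondary):
    """Compare internal-name -> physical-name maps from two nodes.

    Returns a list of (internal, primary_physical, secondary_physical)
    tuples, sorted by internal name, for every assignment that differs.
    A name missing on one node is reported with None for that node.
    """
    mismatches = []
    for internal in sorted(set(primary) | set(secondary)):
        a = primary.get(internal)
        b = secondary.get(internal)
        if a != b:
            mismatches.append((internal, a, b))
    return mismatches


# Hypothetical assignments, as read from Status > Interfaces on each node.
primary = {"wan": "vmx0", "lan": "vmx1", "opt1": "vmx2"}
secondary = {"wan": "vmx1", "lan": "vmx0", "opt1": "vmx2"}

for internal, a, b in assignment_mismatches(primary, secondary):
    print(f"{internal}: primary={a} secondary={b}")
```

With these sample values, wan and lan are reported because they are swapped on the secondary, while opt1 matches and is not listed. An empty result means the two nodes agree.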

              • Mat1987

                @Derelict:

                Painful lol.

                Internally they are all named the same, but physically they're not, so I'll have to change some bits around.

                A few more days of playing, then.

                • Mat1987

                  OK, I set up quick test boxes on the same host for now. All of HA works, but I can't ping the LAN virtual IP until I set the MAC as static on the hosts.

                  Now I can ping, but it's up and down like a yo-yo.

                  Any ideas?

                  • Derelict LAYER 8 Netgate

                    https://doc.pfsense.org/index.php/CARP_Configuration_Troubleshooting

                    • Mat1987

                      @Derelict:

                      https://doc.pfsense.org/index.php/CARP_Configuration_Troubleshooting

                      I have done the following:

                      Enable promiscuous mode on the vSwitch
                      Enable "MAC Address changes"
                      Enable "Forged transmits"

                      I have a VM_Prod port group for VMs.

                      I now have another port group, VM_Prod-PF, and have changed the pfSense LAN to this port group.

                      Same problem though.

                      Reply from 192.168.50.254: bytes=32 time=1ms TTL=64
                      Reply from 192.168.50.254: bytes=32 time=1ms TTL=64
                      Request timed out.
                      Request timed out.
                      Reply from 192.168.50.254: bytes=32 time=1ms TTL=64
                      Reply from 192.168.50.254: bytes=32 time=1ms TTL=64
                      Reply from 192.168.50.254: bytes=32 time=1ms TTL=64
                      Reply from 192.168.50.254: bytes=32 time=1ms TTL=64
                      Reply from 192.168.50.254: bytes=32 time=1ms TTL=64
                      Reply from 192.168.50.254: bytes=32 time=1ms TTL=64
                      Reply from 192.168.50.254: bytes=32 time=1ms TTL=64
                      Reply from 192.168.50.254: bytes=32 time=1ms TTL=64
                      Request timed out.
                      Request timed out.
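
To put a number on "up and down", a log like the one above can be summarized with a short script. This is only a sketch that assumes Windows-style ping output as pasted here; `summarize_ping` is a made-up helper name, and the sample is the first five lines of the log above.

```python
def summarize_ping(lines):
    """Count replies and timeouts in Windows-style ping output."""
    replies = 0
    timeouts = 0
    longest_outage = 0  # longest run of consecutive timeouts
    current_outage = 0
    for line in lines:
        text = line.strip()
        if text.startswith("Reply from"):
            replies += 1
            current_outage = 0
        elif text.startswith("Request timed out"):
            timeouts += 1
            current_outage += 1
            longest_outage = max(longest_outage, current_outage)
    total = replies + timeouts
    loss = (100.0 * timeouts / total) if total else 0.0
    return {
        "sent": total,
        "lost": timeouts,
        "loss_percent": loss,
        "longest_outage": longest_outage,
    }


sample = """\
Reply from 192.168.50.254: bytes=32 time=1ms TTL=64
Reply from 192.168.50.254: bytes=32 time=1ms TTL=64
Request timed out.
Request timed out.
Reply from 192.168.50.254: bytes=32 time=1ms TTL=64
""".splitlines()

print(summarize_ping(sample))
```

Running the same summary before and after a change to the vSwitch or port group gives a simple way to compare the loss pattern, rather than eyeballing the output.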

                      • Derelict LAYER 8 Netgate

                        Sorry. Runs great under XenServer. Someone else will have to help with VMware. It's certainly something in your virtual environment.

                        Moving to Virtualization.

                        • Mat1987

                          Thanks for your help up to now anyway.

                          Anyone had this issue?

                          I can't ping the virtual IP until the following are enabled:

                          Enable promiscuous mode on the vSwitch
                          Enable "MAC Address changes"
                          Enable "Forged transmits"

                          Once they are enabled, I start to get ping replies, but they intermittently time out.

                          Reply from 192.168.50.254: bytes=32 time=1ms TTL=64
                          Reply from 192.168.50.254: bytes=32 time=1ms TTL=64
                          Request timed out.
                          Reply from 192.168.50.254: bytes=32 time=1ms TTL=64
                          Reply from 192.168.50.254: bytes=32 time=1ms TTL=64
                          Reply from 192.168.50.254: bytes=32 time=1ms TTL=64
                          Reply from 192.168.50.254: bytes=32 time=1ms TTL=64
                          Reply from 192.168.50.254: bytes=32 time=1ms TTL=64
                          Reply from 192.168.50.254: bytes=32 time=1ms TTL=64
                          Reply from 192.168.50.254: bytes=32 time=40ms TTL=64
                          Reply from 192.168.50.254: bytes=32 time=56ms TTL=64
                          Reply from 192.168.50.254: bytes=32 time=72ms TTL=64
                          Reply from 192.168.50.254: bytes=32 time=90ms TTL=64
                          Reply from 192.168.50.254: bytes=32 time=1ms TTL=64
                          Reply from 192.168.50.254: bytes=32 time=1ms TTL=64
                          Reply from 192.168.50.254: bytes=32 time=1ms TTL=64
                          Reply from 192.168.50.254: bytes=32 time=1ms TTL=64
                          Reply from 192.168.50.254: bytes=32 time=1ms TTL=64
                          Reply from 192.168.50.254: bytes=32 time=1ms TTL=64
                          Reply from 192.168.50.254: bytes=32 time=1ms TTL=64
                          Reply from 192.168.50.254: bytes=32 time=2ms TTL=64
                          Reply from 192.168.50.254: bytes=32 time=1ms TTL=64
                          Reply from 192.168.50.254: bytes=32 time=1ms TTL=64
                          Request timed out.
                          Request timed out.
                          Reply from 192.168.50.254: bytes=32 time=1ms TTL=64

                          • Mat1987

                            Is there anyone who has got this working?
