Netgate Discussion Forum
    • Categories
    • Recent
    • Tags
    • Popular
    • Users
    • Search
    • Register
    • Login

    Pfsync kernel panic after 2.1.5 to 2.2 to Upgrade - pfsync_undefer_state

    Scheduled Pinned Locked Moved Problems Installing or Upgrading pfSense Software
    73 Posts 13 Posters 23.3k Views
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as topic
    Log in to reply
    This topic has been deleted. Only users with topic management privileges can see it.
    • M Offline
      Marlenio
      last edited by

      Hi Steve,
      actually i'm on 2.2.1 64 bit. My CARPs ar in a production environment, so i cannot make too much test with snapshot. After the problem on last beta (Reply #52), i reinstalled official 2.2.1.

      Marlenio

      1 Reply Last reply Reply Quote 0
      • M Offline
        Marlenio
        last edited by

        This is the error log.

        Crash report begins.  Anonymous machine information:

        amd64
        10.1-RELEASE-p6
        FreeBSD 10.1-RELEASE-p6 #0 b69ba8f(releng/10.1)-dirty: Fri Mar 13 08:37:46 CDT 2015    root@pfs22-amd64-builder:/usr/obj.amd64/usr/pfSensesrc/src/sys/pfSense_SMP.10

        Crash report details:

        Filename: /var/crash/bounds
        1

        Filename: /var/crash/info.0
        Dump header from device /dev/da0s1b
          Architecture: amd64
          Architecture Version: 1
          Dump Length: 156160B (0 MB)
          Blocksize: 512
          Dumptime: Wed Apr 15 14:08:59 2015
          Hostname: fw-ld01a.soft.it
          Magic: FreeBSD Text Dump
          Version String: FreeBSD 10.1-RELEASE-p6 #0 b69ba8f(releng/10.1)-dirty: Fri Mar 13 08:37:46 CDT 2015
            root@pfs22-amd64-builder:/usr/obj.amd64/usr/pfSensesrc/src/sys/pfSense_SMP.10
          Panic String:
          Dump Parity: 3106681438
          Bounds: 0
          Dump Status: good

        Filename: /var/crash/info.last
        Dump header from device /dev/da0s1b
          Architecture: amd64
          Architecture Version: 1
          Dump Length: 156160B (0 MB)
          Blocksize: 512
          Dumptime: Wed Apr 15 14:08:59 2015
          Hostname: fw-ld01a.soft.it
          Magic: FreeBSD Text Dump
          Version String: FreeBSD 10.1-RELEASE-p6 #0 b69ba8f(releng/10.1)-dirty: Fri Mar 13 08:37:46 CDT 2015
            root@pfs22-amd64-builder:/usr/obj.amd64/usr/pfSensesrc/src/sys/pfSense_SMP.10
          Panic String:
          Dump Parity: 3106681438
          Bounds: 0
          Dump Status: good

        Filename: /var/crash/textdump.tar.0
        ddb.txt06000014000012513452333  7073 ustarrootwheeldb:0:kdb.enter.default>  run lockinfo
        db:1:lockinfo> show locks
        No such command
        db:1:locks>  show alllocks
        No such command
        db:1:alllocks>  show lockedvnods
        Locked vnodes
        db:0:kdb.enter.default>  show pcpu
        cpuid        = 3
        dynamic pcpu = 0xfffffe00994fd780
        curthread    = 0xfffff800034c9490: pid 12 "irq261: bce0"
        curpcb      = 0xfffffe0000373cc0
        fpcurthread  = none
        idlethread  = 0xfffff80003291000: tid 100006 "idle: cpu3"
        curpmap      = 0xffffffff8217a048
        tssp        = 0xffffffff821950c8
        commontssp  = 0xffffffff821950c8
        rsp0        = 0xfffffe0000373cc0
        gs32p        = 0xffffffff82196b20
        ldt          = 0xffffffff82196b60
        tss          = 0xffffffff82196b50
        db:0:kdb.enter.default>  bt
        Tracing pid 12 tid 100044 td 0xfffff800034c9490
        bce_intr() at bce_intr+0x405/frame 0xfffffe0000373b20
        intr_event_execute_handlers() at intr_event_execute_handlers+0xab/frame 0xfffffe0000373b60
        ithread_loop() at ithread_loop+0x96/frame 0xfffffe0000373bb0
        fork_exit() at fork_exit+0x9a/frame 0xfffffe0000373bf0
        fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0000373bf0
        –- trap 0, rip = 0, rsp = 0xfffffe0000373cb0, rbp = 0 ---
        db:0:kdb.enter.default>  ps
          pid  ppid  pgrp  uid  state  wmesg        wchan        cmd
          59    0    0    0  DL      mdwait  0xfffff80003d7b800 [md0]
          23    0    0    0  DL      kpsusp  0xfffff80003da35e8 [syncer]
          22    0    0    0  DL      kpsusp  0xfffff80003da3aa8 [vnlru]
          21    0    0    0  DL      kpsusp  0xfffff80003da4128 [bufdaemon]
          20    0    0    0  DL      pgzero  0xffffffff82108e0c [pagezero]
          19    0    0    0  DL      pollid  0xffffffff81f64870 [idlepoll]
            9    0    0    0  DL      psleep  0xffffffff82108540 [vmdaemon]
            8    0    0    0  DL      psleep  0xffffffff82194304 [pagedaemon]
            7    0    0    0  DL      waiting_ 0xffffffff82184d70 [sctp_iterator]
            6    0    0    0  DL      pftm    0xffffffff80d055e0 [pf purge]
          18    0    0    0  DL      cooling  0xfffff8000350c558 [acpi_cooling0]
          17    0    0    0  DL      tzpoll  0xffffffff81ea6aa0 [acpi_thermal]
          16    0    0    0  DL      (threaded)                  [usb]
        100074                  D      -        0xfffffe0000b31ef0 [usbus5]
        100073                  D      -        0xfffffe0000b31e98 [usbus5]
        100072                  D      -        0xfffffe0000b31e40 [usbus5]
        100071                  D      -        0xfffffe0000b31de8 [usbus5]
        100069                  D      -        0xfffffe0000b21e18 [usbus4]
        100068                  D      -        0xfffffe0000b21dc0 [usbus4]
        100067                  D      -        0xfffffe0000b21d68 [usbus4]
        100066                  D      -        0xfffffe0000b21d10 [usbus4]
        100065                  D      -        0xfffffe0000b11ef0 [usbus3]
        100064                  D      -        0xfffffe0000b11e98 [usbus3]
        100063                  D      -        0xfffffe0000b11e40 [usbus3]
        100062                  D      -        0xfffffe0000b11de8 [usbus3]
        100060                  D      -        0xfffffe0000b01ef0 [usbus2]
        100059                  D      -        0xfffffe0000b01e98 [usbus2]
        100058                  D      -        0xfffffe0000b01e40 [usbus2]
        100057                  D      -        0xfffffe0000b01de8 [usbus2]
        100055                  D      -        0xfffffe0000af1ef0 [usbus1]
        100054                  D      -        0xfffffe0000af1e98 [usbus1]
        100053                  D      -        0xfffffe0000af1e40 [usbus1]
        100052                  D      -        0xfffffe0000af1de8 [usbus1]
        100050                  D      -        0xfffffe0000a64ef0 [usbus0]
        100049                  D      -        0xfffffe0000a64e98 [usbus0]
        100048                  D      -        0xfffffe0000a64e40 [usbus0]
        100047                  D      -        0xfffffe0000a64de8 [usbus0]
            5    0    0    0  DL      idle    0xfffffe0000a860f0 [ciss_notify0]
            4    0    0    0  DL      (threaded)                  [cam]
        100086                  D      -        0xffffffff81e9ea40 [scanner]
        100033                  D      -        0xffffffff81e9ec00 [doneq0]
            3    0    0    0  DL      crypto_r 0xffffffff82106a10 [crypto returns]
            2    0    0    0  DL      crypto_w 0xffffffff821068b8 [crypto]
          15    0    0    0  DL      -        0xffffffff81ebc100 [rand_harvestq]
          14    0    0    0  DL      (threaded)                  [geom]
        100019                  D      -        0xffffffff821794e0 [g_down]
        100018                  D      -        0xffffffff821794d8 [g_up]
        100017                  D      -        0xffffffff821794d0 [g_event]
          13    0    0    0  DL      (threaded)                  [ng_queue]
        100016                  D      sleep    0xffffffff81e5cf48 [ng_queue3]
        100015                  D      sleep    0xffffffff81e5cf48 [ng_queue2]
        100014                  D      sleep    0xffffffff81e5cf48 [ng_queue1]
        100013                  D      sleep    0xffffffff81e5cf48 [ng_queue0]
          12    0    0    0  RL      (threaded)                  [intr]
        100096                  I                                  [swi1: netisr 3]
        100095                  I                                  [swi1: netisr 2]
        100094                  I                                  [swi1: netisr 1]
        100084                  I                                  [swi1: pfsync]
        100082                  I                                  [swi1: pf send]
        100079                  I                                  [swi0: uart]
        100078                  I                                  [irq1: atkbd0]
        100075                  I                                  [irq14: ata0]
        100070                  I                                  [irq22: uhci4]
        100061                  I                                  [irq19: uhci3]
        100056                  I                                  [irq18: uhci2]
        100051                  I                                  [irq17: uhci1]
        100046                  I                                  [irq16: uhci0 ehci0]
        100045                  I                                  [irq262: bce1]
        100044                  Run    CPU 3                      [irq261: bce0]
        100042                  I                                  [irq260: ciss0]
        100031                  I                                  [swi5: fast taskq]
        100029                  I                                  [swi6: Giant taskq]
        100027                  I                                  [swi6: task queue]
        100012                  I                                  [swi3: vm]
        100011                  I                                  [swi4: clock]
        100010                  I                                  [swi4: clock]
        100009                  I                                  [swi4: clock]
        100008                  L      *Giant    0xfffff80003eb8240 [swi4: clock]
        100007                  I                                  [swi1: netisr 0]
          11    0    0    0  RL      (threaded)                  [idle]
        100006                  CanRun                              [idle: cpu3]
        100005                  Run    CPU 2                      [idle: cpu2]
        100004                  Run    CPU 1                      [idle: cpu1]
        100003                  CanRun                              [idle: cpu0]
            1    0    1    0  RLs    CPU 0                      [init]
          10    0    0    0  DL      audit_wo 0xffffffff8218b8f0 [audit]
            0    0    0    0  DLs    (threaded)                  [kernel]
        100138                  D      -        0xfffff80003d50d00 [dummynet]
        100085                  D      -        0xfffff800032b2300 [CAM taskq]
        100080                  D      -        0xfffff800035da700 [mca taskq]
        100041                  D      -        0xfffff800034bd100 [em3 txq]
        100040                  D      -        0xfffff800034bd300 [em3 que]
        100039                  D      -        0xfffff800034bda00 [em2 txq]
        100038                  D      -        0xfffff800034bdc00 [em2 que]
        100037                  D      -        0xfffff800034be700 [em1 txq]
        100036                  D      -        0xfffff800034be900 [em1 que]
        100035                  D      -        0xfffff800034bf100 [em0 txq]
        100034                  D      -        0xfffff800034bf300 [em0 que]
        100032                  D      -        0xfffff800032b2500 [kqueue taskq]
        100030                  D      -        0xfffff800032b2a00

        Marlenio

        1 Reply Last reply Reply Quote 0
        • M Offline
          Marlenio
          last edited by

          Another strange error: i disabled carp sync on MASTER (uncheck "Synchronize States"), but NOT deleting my Configuration Synchronization Settings. Now MASTER sync with SLAVE without any kind of error log.

          Apr 15 16:25:13 php-fpm[65408]: /rc.filter_synchronize: Filter sync successfully completed with https://192.168.183.2:443.
          Apr 15 16:25:11 php-fpm[65408]: /rc.filter_synchronize: XMLRPC sync successfully completed with https://192.168.183.2:443.
          Apr 15 16:25:09 check_reload_status: Reloading filter
          Apr 15 16:25:08 php-fpm[65408]: /rc.filter_synchronize: Beginning XMLRPC sync to https://192.168.183.2:443.
          Apr 15 16:25:07 check_reload_status: Syncing firewall
          Apr 15 16:24:47 php-fpm[18627]: /rc.filter_synchronize: Filter sync successfully completed with https://192.168.183.2:443.
          Apr 15 16:24:45 php-fpm[18627]: /rc.filter_synchronize: XMLRPC sync successfully completed with https://192.168.183.2:443.
          Apr 15 16:24:43 check_reload_status: Reloading filter
          Apr 15 16:24:42 php-fpm[18627]: /rc.filter_synchronize: Beginning XMLRPC sync to https://192.168.183.2:443.
          Apr 15 16:24:41 check_reload_status: Syncing firewall
          Apr 15 16:23:00 kernel: carp: demoted by 0 to 0 (pfsync bulk fail)
          Apr 15 16:23:00 kernel:
          Apr 15 16:22:38 php-fpm[70216]: /rc.filter_synchronize: Filter sync successfully completed with https://192.168.183.2:443.
          Apr 15 16:22:36 php-fpm[70216]: /rc.filter_synchronize: XMLRPC sync successfully completed with https://192.168.183.2:443.
          Apr 15 16:22:33 php-fpm[70216]: /rc.filter_synchronize: Beginning XMLRPC sync to https://192.168.183.2:443.
          Apr 15 16:22:32 check_reload_status: Syncing firewall
          Apr 15 16:22:11 php-fpm[69480]: /rc.filter_synchronize: Nothing has been configured to be synched. Skipping….

          Here i uncheck "Synchronize States"

          Apr 15 16:22:10 kernel: pfsync_undefer_state: unable to find deferred statepfsync_undefer_state: unable to find deferred statepfsync_undefer_state: unable to find deferred statepfsync_undefer_state: unable to find deferred statepfsync_undefer_state: unable to find deferred statepfsync_undefer_state: unable to find deferred statepfsync_undefer_state: unable to find deferred statepfsync_undefer_state: unable to find deferred statepfsync_undefer_state: unable to find deferred statepfsync_undefer_state: unable to find deferred statepfsync_undefer_state: unable to find deferred statepfsync_undefer_state: unable to find deferred statepfsync_undefer_state: unable to find deferred statepfsync_undefer_state: unable to find deferred statepfsync_undefer_state: unable to find deferred statepfsync_undefer_state: unable to find deferred statepfsync_undefer_state: unable to find deferred state

          Marlenio

          1 Reply Last reply Reply Quote 0
          • F Offline
            fragged
            last edited by

            The log spam should be fixed in 2.2.2. It's pointless to test further on 2.2.1.

            1 Reply Last reply Reply Quote 0
            • M Offline
              Marlenio
              last edited by

              As i said in my post, i tried more than four snapshot, and i found a lot of problems. I made last test today with last snapshot, same problem, except log spam.
              Crash report is for Steve.

              Marlenio

              1 Reply Last reply Reply Quote 0
              • L Offline
                lowprofile
                last edited by

                So this issue has not been fixed in 2.2.2 ?  :(

                1 Reply Last reply Reply Quote 0
                • M Offline
                  Marlenio
                  last edited by

                  Yes, it seems to be fixed. CARP working, limiter too, and cpu load in MASTER server is very low.  :)

                  Marlenio

                  1 Reply Last reply Reply Quote 0
                  • stephenw10S Offline
                    stephenw10 Netgate Administrator
                    last edited by

                    Interesting. I'm not aware of any specific fixes that have gone in for this since the earlier snapshots. Perhaps something else was causing the issue.
                    I'm sure we'll see many more reports in the next few days to condom or deny this.

                    Steve

                    1 Reply Last reply Reply Quote 0
                    • M Offline
                      Marlenio
                      last edited by

                      Two CARP upgraded (each one with two server), the first yesterday, the last one today. Works without problems: low CPU, CARP without error and log spam, and the back CARP with limiter on bandwidth. I don't know why, but 2.2.2 really works.  :)

                      Marlenio

                      1 Reply Last reply Reply Quote 0
                      • M Offline
                        Marlenio
                        last edited by

                        In 2.2.2 there is still a problem i found in 2.2.1, as i said before: on limited rules, i need to unchek  the flag "State Type" to "NO pfsync". If i don't, CARP Slave randomly crash.
                        As i said before, in my CARP configuration limited rules are only HTTP and HTTPS: the state is not really important for me. However, 2.2.2 it is much better than 2.2.1.  :)

                        Marlenio

                        1 Reply Last reply Reply Quote 0
                        • T Offline
                          tdale
                          last edited by

                          Fixed all my issues! I made my limiter rules, set them to 10Mbps, applied the firewall and did a test :)

                          [omitted@omitted ~]# wget cachefly.cachefly.net/100mb.test
                          –2015-04-16 14:02:53--  http://cachefly.cachefly.net/100mb.test
                          Resolving cachefly.cachefly.net... 205.234.175.175
                          Connecting to cachefly.cachefly.net|205.234.175.175|:80... connected.
                          HTTP request sent, awaiting response... 200 OK
                          Length: 104857600 (100M) [application/octet-stream]
                          Saving to: “100mb.test.7”

                          100%[======================================>] 104,857,600 1.18M/s  in 89s

                          2015-04-16 14:04:21 (1.13 MB/s) - “100mb.test.7” saved [104857600/104857600]

                          I tested it a few times :)

                          THANK YOU!!

                          1 Reply Last reply Reply Quote 0
                          • O Offline
                            ondrej
                            last edited by

                            Please is there any progress? I am also still having problems with this bug.
                            Thanks

                            1 Reply Last reply Reply Quote 0
                            • stephenw10S Offline
                              stephenw10 Netgate Administrator
                              last edited by

                              Have you tried a 2.2.3 snapshot, if you are able to?
                              There have been fixes gone in to address this though it seems not all cases have been resolved. Some here have reported success.

                              Steve

                              1 Reply Last reply Reply Quote 0
                              • L Offline
                                lshiry
                                last edited by

                                I sent a couple crash logs for this yesterday.  This is on version 2.2.4.

                                1 Reply Last reply Reply Quote 0
                                • S Offline
                                  SteveITS Rebel Alliance
                                  last edited by

                                  Interesting to find this thread.  We didn't have this issue until upgrading from 2.2.6 to 2.3.1_5.  It does look like it is logged during a nightly copy that triggers a limiter, ending when the traffic stops.

                                  Only install packages for your version, or risk breaking it. Select your branch in System/Update/Update Settings.
                                  When upgrading, allow 10-15 minutes to reboot, or more depending on packages, and device or disk speed.
                                  Upvote 👍 helpful posts!

                                  1 Reply Last reply Reply Quote 0
                                  • S Offline
                                    SteveITS Rebel Alliance
                                    last edited by

                                    In 2.3.1 there is a checkbox "No pfSync" on the rule, but checking that doesn't avoid the error message.  Nor does setting "State type" to None.

                                    I can duplicate by simply using "rsync –dry-run" so it's not like it needs a lot of traffic.

                                    Only install packages for your version, or risk breaking it. Select your branch in System/Update/Update Settings.
                                    When upgrading, allow 10-15 minutes to reboot, or more depending on packages, and device or disk speed.
                                    Upvote 👍 helpful posts!

                                    1 Reply Last reply Reply Quote 0
                                    • A Offline
                                      AngelG
                                      last edited by

                                      I upgrade to 2.3.3 and kermel entry "pfsync_undefer_state: unable to find deferred state" in log persist. Any existing or planned solutions?

                                      1 Reply Last reply Reply Quote 0
                                      • stephenw10S Offline
                                        stephenw10 Netgate Administrator
                                        last edited by

                                        No direct cause has yet been found as far as I know.

                                        Obviously we would like to resolve it as soon as possible.

                                        https://redmine.pfsense.org/issues/4310

                                        Steve

                                        1 Reply Last reply Reply Quote 0
                                        • First post
                                          Last post
                                        Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.