Netgate Discussion Forum

    Firewall hangs and reboots since upgrade to 2.2.3

    General pfSense Questions
    • wricaurte

      Since upgrading to 2.2.3, my home firewall hangs and sometimes also reboots. After these events, the WebConfigurator reports that a core dump was found; I have submitted it several times. This morning I retrieved the dump file, which is attached. I noticed that the firewall had rebooted 5 hours earlier.

      I run:

      • 7 IPsec VPNs (1 with an ALIX running pfSense and 6 with a TP-LINK TL-ER604W)
      • 2 OpenVPN servers
      • 2 OpenVPN clients
      • PPTP Server
      • DNS Resolver
      • DHCP Server
      • Many NAT and firewall rules (Between local and VPN nets)
      • Squid + SquidGuard

      I cannot read the dump properly. I hope somebody can help me make sure I'm not having a hardware-related problem. Last weekend I reinstalled the system on a new SSD; the same configuration worked fine on 2.2.2.

      I only see IPsec-related messages, and the IPsec configuration has not changed from 2.2.2 to 2.2.3.

      I cannot reach my home network because it seems the firewall did not come back after rebooting again. I'll check tonight when I'm home.

      Regards.
      dump_ndxfw-Jul-9-2015.txt

      • jimp Rebel Alliance Developer Netgate

        The actual crash dump/panic appears to be complaining about the filesystem:

        curthread    = 0xfffff8005eb82490: pid 65120 "squid"
        curpcb       = 0xfffffe0036609cc0
        fpcurthread  = none
        idlethread   = 0xfffff80003210920: tid 100004 "idle: cpu1"
        curpmap      = 0xfffff8005e6bf678
        tssp         = 0xffffffff8219cff8
        commontssp   = 0xffffffff8219cff8
        rsp0         = 0xfffffe0036609cc0
        gs32p        = 0xffffffff8219ea50
        ldt          = 0xffffffff8219ea90
        tss          = 0xffffffff8219ea80
        db:0:kdb.enter.default>  bt
        Tracing pid 65120 tid 100256 td 0xfffff8005eb82490
        softdep_disk_io_initiation() at softdep_disk_io_initiation+0xdb0/frame 0xfffffe00366094c0
        ffs_geom_strategy() at ffs_geom_strategy+0x15e/frame 0xfffffe00366094f0
        bufwrite() at bufwrite+0x142/frame 0xfffffe0036609530
        ffs_update() at ffs_update+0x25e/frame 0xfffffe00366095b0
        ffs_write() at ffs_write+0x542/frame 0xfffffe0036609650
        VOP_WRITE_APV() at VOP_WRITE_APV+0x145/frame 0xfffffe0036609760
        vn_write() at vn_write+0x248/frame 0xfffffe00366097e0
        vn_io_fault_doio() at vn_io_fault_doio+0x22/frame 0xfffffe0036609820
        vn_io_fault1() at vn_io_fault1+0x7c/frame 0xfffffe0036609970
        vn_io_fault() at vn_io_fault+0x18b/frame 0xfffffe00366099f0
        dofilewrite() at dofilewrite+0x87/frame 0xfffffe0036609a40
        kern_writev() at kern_writev+0x68/frame 0xfffffe0036609a90
        sys_write() at sys_write+0x63/frame 0xfffffe0036609ae0
        amd64_syscall() at amd64_syscall+0x351/frame 0xfffffe0036609bf0
        Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe0036609bf0
        
        

        As you can see, squid was the active process at the time, and the function calls in the backtrace are mostly filesystem related (ffs, vn, filewrite, etc.).
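The reading of the backtrace above can be sketched in code. This is a minimal Python illustration, not a pfSense or FreeBSD tool: it pulls the function names out of ddb `bt` frame lines and buckets them by name prefix. The prefix table `SUBSYSTEMS` and the helper names `frames`, `classify`, and `summarize` are assumptions made for this example, and real kernel symbols do not always follow such prefixes.

```python
import re
from collections import Counter

# Matches ddb backtrace lines such as:
#   ffs_write() at ffs_write+0x542/frame 0xfffffe0036609650
FRAME_RE = re.compile(r"^\s*(\w+)\(\) at \w+\+0x[0-9a-f]+/frame 0x[0-9a-f]+\s*$")

# Hypothetical prefix-to-subsystem table, for illustration only.
# Order matters: "ip_ipsec" must be tested before the generic "ip_".
SUBSYSTEMS = [
    (("softdep_", "ffs_", "buf", "VOP_", "vn_", "dofile"), "filesystem"),
    (("ipsec", "ip_ipsec", "key_"), "ipsec"),
    (("ip_", "ether_", "netisr_"), "network stack"),
    (("re_",), "re(4) NIC driver"),
]


def frames(dump_text):
    """Return function names from ddb 'bt' frame lines, innermost first."""
    names = []
    for line in dump_text.splitlines():
        m = FRAME_RE.match(line)
        if m:
            names.append(m.group(1))
    return names


def classify(func):
    """Map a function name to a subsystem label by prefix, else 'other'."""
    for prefixes, label in SUBSYSTEMS:
        if func.startswith(prefixes):
            return label
    return "other"


def summarize(dump_text):
    """Count backtrace frames per subsystem label."""
    return Counter(classify(f) for f in frames(dump_text))


# Three frames copied from the backtrace quoted above.
SAMPLE = """
ffs_write() at ffs_write+0x542/frame 0xfffffe0036609650
vn_write() at vn_write+0x248/frame 0xfffffe00366097e0
sys_write() at sys_write+0x63/frame 0xfffffe0036609ae0
"""

if __name__ == "__main__":
    for label, count in summarize(SAMPLE).most_common():
        print(label, count)
```

A frame count like this only hints at where the kernel was executing when it stopped; it does not identify the root cause.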

        Though there are some later IPsec errors, they don't appear to be related to the actual panic/crash:

        ipsec4_checkpolicy: invalid policy 3
        ipsec4_checkpolicy: invalid policy 3
        ipsec4_checkpolicy: invalid policy 3
        
        

        It may be worth giving a 2.2.4 snapshot a try to see if our changes to the filesystem help (we fixed some issues in pw and in config.xml writing that could cause problems, and we turned sync back off).

        Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

        Need help fast? Netgate Global Support!

        Do not Chat/PM for help!

        • wricaurte

          Thanks a lot. I'm going to install the snapshot, and I'll let you know if it fixes the problem.

          Regards.

          • wricaurte

            Sadly, I got a crash dump again and the firewall rebooted again. I attach the dump file for further review.

            The snapshot was "2.2.4-DEVELOPMENT (amd64) built on Fri Jul 10 00:17:53 CDT 2015"; it did not let me make any configuration changes. For now I'm going back to 2.2.2. I still have the old HDD working, so if you want me to run any tests, no problem at all :).

            I'm sure you will fix it as always. Thanks a lot for your help.

            Regards.

            dump_pfsense_ndxfw-Jul-11-2015.txt

            • wricaurte

              This problem persists in 2.2.4-STABLE. I may have found a pattern:

              • Open the serial console
              • Put some traffic on the firewall, something like watching Netflix

              The firewall restarted every time, 1 or 2 minutes after I started watching any movie. Sometimes it restarted when I logged in to the WebConfigurator. I attach the dump file.

              After I closed the serial console, it did not restart while I watched Netflix again. Reboots are less frequent in 2.2.4 than in 2.2.3: I get up to 2 days of uptime with 2.2.4, whereas with 2.2.3 I was getting several reboots a day.

              I run squid in transparent mode, but the client machine generating the traffic has an exception rule, so this traffic passes through the firewall and not through squid:

              no rdr on re1 inet proto tcp from 192.168.30.140 to any port = http
              no rdr on ovpns3 inet proto tcp from 192.168.30.140 to any port = http
              no rdr on ovpns4 inet proto tcp from 192.168.30.140 to any port = http
              no rdr on pptp inet proto tcp from 192.168.30.140 to any port = http
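The four exception rules above differ only in the interface name. As a small sketch of that pattern, the Python below rebuilds the same lines from a list of interfaces and one bypass host. The function `no_rdr_rules` is hypothetical and written for this example; it only formats text in the style of the listing above and does not load anything into pf or reflect how pfSense itself generates its ruleset.

```python
def no_rdr_rules(interfaces, host, port="http", proto="tcp"):
    """Build one 'no rdr' line per interface for a single bypass host.

    The output format copies the rule listing quoted in the post; it is
    plain text and is not validated against pf's grammar.
    """
    return [
        f"no rdr on {ifname} inet proto {proto} from {host} to any port = {port}"
        for ifname in interfaces
    ]


if __name__ == "__main__":
    # Interface names and the client address are taken from the post above.
    rules = no_rdr_rules(["re1", "ovpns3", "ovpns4", "pptp"], "192.168.30.140")
    print("\n".join(rules))
```

Because a matching `no rdr` rule exempts the packet from translation, HTTP traffic from this host is not redirected to the transparent proxy on any of the listed interfaces.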

              Please let me know if you want me to do any test.

              Regards.

              ndxfw_dump-17Aug2015.txt

              • heper

                Remove squid/SquidGuard (and any related proxy packages), and try again.

                • jimp Rebel Alliance Developer Netgate

                  This crash dump was in IPsec processing / NIC drivers:

                  db:0:kdb.enter.default>  show pcpu
                  cpuid        = 1
                  dynamic pcpu = 0xfffffe00984bc800
                  curthread    = 0xfffff800034ae000: pid 12 "irq256: re0"
                  curpcb       = 0xfffffe00344cecc0
                  fpcurthread  = none
                  idlethread   = 0xfffff80003210920: tid 100004 "idle: cpu1"
                  curpmap      = 0xffffffff82181fd8
                  tssp         = 0xffffffff8219cff8
                  commontssp   = 0xffffffff8219cff8
                  rsp0         = 0xfffffe00344cecc0
                  gs32p        = 0xffffffff8219ea50
                  ldt          = 0xffffffff8219ea90
                  tss          = 0xffffffff8219ea80
                  db:0:kdb.enter.default>  bt
                  Tracing pid 12 tid 100051 td 0xfffff800034ae000
                  key_allocsp() at key_allocsp+0x256/frame 0xfffffe00344ce620
                  ipsec_getpolicybyaddr() at ipsec_getpolicybyaddr+0x8d/frame 0xfffffe00344ce690
                  ipsec4_checkpolicy() at ipsec4_checkpolicy+0x29/frame 0xfffffe00344ce6b0
                  ip_ipsec_output() at ip_ipsec_output+0x8a/frame 0xfffffe00344ce6f0
                  ip_output() at ip_output+0x966/frame 0xfffffe00344ce7f0
                  ip_forward() at ip_forward+0x347/frame 0xfffffe00344ce8a0
                  ip_input() at ip_input+0x6ec/frame 0xfffffe00344ce8f0
                  netisr_dispatch_src() at netisr_dispatch_src+0x62/frame 0xfffffe00344ce960
                  ether_demux() at ether_demux+0x149/frame 0xfffffe00344ce990
                  ether_nh_input() at ether_nh_input+0x347/frame 0xfffffe00344ce9f0
                  netisr_dispatch_src() at netisr_dispatch_src+0x62/frame 0xfffffe00344cea60
                  re_rxeof() at re_rxeof+0x4ce/frame 0xfffffe00344ceae0
                  re_intr_msi() at re_intr_msi+0x10b/frame 0xfffffe00344ceb20
                  intr_event_execute_handlers() at intr_event_execute_handlers+0xab/frame 0xfffffe00344ceb60
                  ithread_loop() at ithread_loop+0x96/frame 0xfffffe00344cebb0
                  fork_exit() at fork_exit+0x9a/frame 0xfffffe00344cebf0
                  fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00344cebf0
                  
                  
                  ipsec4_checkpolicy: invalid policy 3
                  
                  Fatal trap 12: page fault while in kernel mode
                  cpuid = 1; apic id = 01
                  fault virtual address	= 0xa40c050150
                  fault code		= supervisor read data, page not present
                  instruction pointer	= 0x20:0xffffffff80cf0d26
                  stack pointer	        = 0x28:0xfffffe00344ce590
                  frame pointer	        = 0x28:0xfffffe00344ce620
                  code segment		= base 0x0, limit 0xfffff, type 0x1b
                  			= DPL 0, pres 1, long 1, def32 0, gran 1
                  processor eflags	= interrupt enabled, resume, IOPL = 0
                  current process		= 12 (irq256: re0)
                  
                  

                  It's completely different from the last panic which was in filesystem code. I'd suspect the hardware at this stage more than anything.
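To compare panics like these side by side, the trap header can be turned into a dictionary. The following is a rough Python sketch under stated assumptions: `parse_trap` and `in_upper_half` are hypothetical helpers written for this example, the parser only understands the `name = value` layout shown in the dump above, and the address check relies on the fact that amd64 kernel addresses lie in the upper canonical half (at or above 0xffff800000000000).

```python
import re

# "name = value" lines, e.g. "fault virtual address = 0xa40c050150".
# Continuation lines that begin with "=" have no name and do not match.
FIELD_RE = re.compile(r"^\s*([a-z][a-z ]*?)\s*=\s*(.+?)\s*$")


def parse_trap(dump_text):
    """Collect the fields that directly follow the first 'Fatal trap' line."""
    info = {}
    in_trap = False
    for line in dump_text.splitlines():
        if not in_trap:
            if line.lstrip().startswith("Fatal trap"):
                in_trap = True
                info["trap"] = line.strip()
            continue
        if not line.strip():
            break  # a blank line ends the trap block in this sketch
        m = FIELD_RE.match(line)
        if m:
            info[m.group(1)] = m.group(2)
    return info


def in_upper_half(addr):
    """True if a hex address string lies in the amd64 upper canonical half."""
    return int(addr, 16) >= 0xFFFF800000000000


# Lines copied from the trap message quoted above (tabs replaced by spaces).
SAMPLE = """
Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address = 0xa40c050150
fault code            = supervisor read data, page not present
instruction pointer   = 0x20:0xffffffff80cf0d26
current process       = 12 (irq256: re0)
"""

if __name__ == "__main__":
    info = parse_trap(SAMPLE)
    print(info["trap"])
    print("process:", info["current process"])
    print("fault address in kernel half:",
          in_upper_half(info["fault virtual address"]))
```

In this dump the faulting address falls outside the kernel half, which would be consistent with the kernel following a bad pointer, though that alone does not distinguish a software bug from failing hardware.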

                  • wricaurte

                    @heper:

                    Remove squid/SquidGuard (and any related proxy packages), and try again.

                    Hi, thanks for the suggestion. I've removed some packages, all related to caching, and ntop as well. There was an improvement: I got 2 days of uptime, but the firewall keeps rebooting. I see the IPsec-related errors in the dump file.

                    Thanks again!!!

                    • wricaurte

                      @jimp:

                      This crash dump was in IPsec processing / NIC drivers:

                      [...]

                      It's completely different from the last panic which was in filesystem code. I'd suspect the hardware at this stage more than anything.

                      Thanks for your help,

                      Is there any way to troubleshoot IPsec? What I do not understand is why the problem does not exist in 2.2.2. I rolled back to 2.2.2 twice, and the problem disappeared with the same configuration. I think some change introduced from 2.2.3 onward is interfering with my configuration :D.

                      Can you please give me any advice on troubleshooting IPsec? If there is no way, I will roll back again. I have two hard disks, so I can test new versions of pfSense with no problem.

                      Regards.

                      • divsys

                        I have no idea if this will help your particular issue, but it may be worth a try to roll forward to the current 2.2.4.

                        There were some IPsec issues resolved in that release.

                        It's only a guess, but reasonably easy to try...

                        -jfp

                        • wricaurte

                          @divsys:

                          I have no idea if this will help your particular issue, but it may be worth a try to roll forward to the current 2.2.4.

                          There were some IPsec issues resolved in that release.

                          It's only a guess, but reasonably easy to try...

                          Hi, thanks for the suggestion. The problem happens in both 2.2.3 and 2.2.4. I downgraded to 2.2.2 again and the problem disappeared. I'll try again with 2.3. There is something wrong with those versions. I've seen some IPsec-related problems reported in the forums. I hope the pfSense team solves this.

                          Thanks..
