Netgate Discussion Forum

    Cannot keep 2.0.2 pfsense firewall from crashing daily in production environment

    General pfSense Questions
    • rclaus

      Running on KVM amd64 virtual hosts. I have 2 firewalls and use CARP for failover.
      I have 4 Intel 1Gb NICs and have made the changes pfSense recommends for Intel NICs on the amd64 platform.

      Running OpenVPN, IPsec, tinydns, and DHCP, with rules in place for access to the 3 internal networks.

      nic e1 is wan
      nic e2 is lan
      nic e3 is control network
      nic e4 is qa network

      I am desperate to figure out why I am having so many crashes.
      I have attached the latest crash dump and also uploaded it to pfSense.
      The public IP of the firewall is 199.255.156.227

      Crash report begins.  Anonymous machine information:

      amd64
      8.1-RELEASE-p13
      FreeBSD 8.1-RELEASE-p13 #1: Fri Dec  7 23:07:32 EST 2012    root@snapshots-8_1-amd64.builders.pfsense.org:/usr/obj./usr/pfSensesrc/src/sys/pfSense_SMP.8

      Crash report details:

      Filename: /var/crash/bounds
      1

      Filename: /var/crash/info.0
      Dump header from device /dev/ad0s1b
        Architecture: amd64
        Architecture Version: 1
        Dump Length: 66560B (0 MB)
        Blocksize: 512
        Dumptime: Sun Mar 17 10:00:23 2013
        Hostname: fw01data.prod.tracsoftware.com
        Magic: FreeBSD Text Dump
        Version String: FreeBSD 8.1-RELEASE-p13 #1: Fri Dec  7 23:07:32 EST 2012
          root@snapshots-8_1-amd64.builders.pfsense.org:/usr/obj./usr/pfSensesrc/src/sys/pfSense_SMP.8
        Panic String:
        Dump Parity: 1173100622
        Bounds: 0
        Dump Status: good

      Filename: /var/crash/textdump.tar.0
      ddb.txt
      db:0:kdb.enter.default>  run lockinfo
      db:1:lockinfo> show locks
      No such command
      db:1:locks>  show alllocks
      No such command
      db:1:alllocks>  show lockedvnods
      Locked vnodes
      db:0:kdb.enter.default>  show pcpu
      cpuid        = 0
      dynamic pcpu    = 0x2fc080
      curthread    = 0xffffff00024477c0: pid 0 "em3 taskq"
      curpcb      = 0xffffff800018cd40
      fpcurthread  = none
      idlethread  = 0xffffff00022a77c0: pid 11 "idle: cpu0"
      curpmap        = 0
      tssp            = 0xffffffff811ddd80
      commontssp      = 0xffffffff811ddd80
      rsp0            = 0xffffff800018cd40
      gs32p          = 0xffffffff811dcbb8
      ldt            = 0xffffffff811dcbf8
      tss            = 0xffffffff811dcbe8
      db:0:kdb.enter.default>  bt
      Tracing pid 0 tid 64036 td 0xffffff00024477c0
      rn_match() at rn_match+0x1b
      pfr_match_addr() at pfr_match_addr+0xcb
      pf_test_udp() at pf_test_udp+0x89b
      pf_test() at pf_test+0x207f
      pf_check_in() at pf_check_in+0x39
      pfil_run_hooks() at pfil_run_hooks+0xa2
      ip_input() at ip_input+0x34e
      netisr_dispatch_src() at netisr_dispatch_src+0x7b
      ether_demux() at ether_demux+0x169
      ether_input() at ether_input+0x174
      lem_rxeof() at lem_rxeof+0x24d
      lem_handle_rxtx() at lem_handle_rxtx+0x51
      taskqueue_run() at taskqueue_run+0x93
      taskqueue_thread_loop() at taskqueue_thread_loop+0x46
      fork_exit() at fork_exit+0x118
      fork_trampoline() at fork_trampoline+0xe
      --- trap 0, rip = 0, rsp = 0xffffff800018cd30, rbp = 0 ---
      db:0:kdb.enter.default>  ps
        pid  ppid  pgrp  uid  state  wmesg        wchan        cmd
      61811 51715    24    0  S      nanslp  0xffffffff8115cc28 sleep
      30064    1 30064    0  Ss      (threaded)                  filterdns
      64134                  S      nanslp  0xffffffff8115cc28 filterdns
      64133                  S      nanslp  0xffffffff8115cc28 filterdns
      64132                  S      nanslp  0xffffffff8115cc28 filterdns
      64131                  S      nanslp  0xffffffff8115cc28 filterdns
      64130                  S      nanslp  0xffffffff8115cc28 filterdns
      64129                  S      nanslp  0xffffffff8115cc28 filterdns
      64128                  S      nanslp  0xffffffff8115cc28 filterdns
      64127                  S      nanslp  0xffffffff8115cc28 filterdns
      64126                  S      nanslp  0xffffffff8115cc28 filterdns
      64125                  S      nanslp  0xffffffff8115cc28 filterdns
      64124                  S      nanslp  0xffffffff8115cc28 filterdns
      64123                  S      nanslp  0xffffffff8115cc28 filterdns
      64122                  S      nanslp  0xffffffff8115cc28 filterdns
      64090                  S      nanslp  0xffffffff8115cc28 filterdns
      64089                  S      nanslp  0xffffffff8115cc28 filterdns
      64088                  S      nanslp  0xffffffff8115cc28 filterdns
      64087                  S      nanslp  0xffffffff8115cc28 filterdns
      64086                  S      nanslp  0xffffffff8115cc28 filterdns
      64085                  S      nanslp  0xffffffff8115cc28 filterdns
      64084                  S      nanslp  0xffffffff8115cc28 filterdns
      64083                  S      nanslp  0xffffffff8115cc28 filterdns
      64082                  S      nanslp  0xffffffff8115cc28 filterdns
      64081                  S      nanslp  0xffffffff8115cc28 filterdns
      64080                  S      nanslp  0xffffffff8115cc28 filterdns
      64118                  S      uwait    0xffffff000f936100 filterdns
      28048 21191 20919    0  S      accept  0xffffff000f979b06 initial thread
      36846 51391 51391    0  S      piperd  0xffffff000265a000 rrdtool
      50910 50856 42225  2009  S      pipewr  0xffffff000265b5b0 tinydns
      50856 42351 42225    0  S      select  0xffffff000fca34c0 supervise
      46070 42077 46070    0  S+      ttyin    0xffffff0002556ca8 sh
      45819 42967 45819    0  S+      ttyin    0xffffff00025568a8 sh
      42967 41688 42967    0  S+      wait    0xffffff000fa2e460 sh
      42077 41599 42077    0  S+      wait    0xffffff000fa89000 sh
      41883 43346 41883    0  Ss      (threaded)                  sshlockout_pf
      64099                  S      nanslp  0xffffffff8115cc28 sshlockout_pf
      64092                  S      piperd  0xffffff0002719b60 initial thread
      41688    1 41688    0  Ss+    wait    0xffffff000f98b8c0 login
      41599    1 41599    0  Ss+    wait    0xffffff000fe31460 login
      41370    1 41370    0  Ss      select  0xffffff000fbad6c0 ntpd
      33855 33840 33840    0  S      nanslp  0xffffffff8115cc28 svscan
      33840    1 33840    0  Ss      wait    0xffffff000fe30460 sh
      31857 31512 31512    0  S      nanslp  0xffffffff8115cc28 minicron
      31512    1 31512    0  Ss      wait    0xffffff00025ed000 minicron
      31455 30665 30665    0  S      nanslp  0xffffffff8115cc28 minicron
      30665    1 30665    0  Ss      wait    0xffffff000f9898c0 minicron
      30551 30311 30311    0  S      nanslp  0xffffffff8115cc28 minicron
      30311    1 30311    0  Ss      wait    0xffffff00025ed8c0 minicron
      52885    1 52885    0  Ss      nanslp  0xffffffff8115cc28 cron
      51715    1    24    0  S+      wait    0xffffff000fa888c0 sh
      42834 42351 42225    0  S      select  0xffffff000f9262c0 supervise
      42518 42225 42225    0  S      piperd  0xffffff000265bb60 multilog
      42351 42225 42225    0  S      nanslp  0xffffffff8115cc28 svscan
      42225    1 42225    0  Ss      wait    0xffffff000faa2000 sh
      39412    1 39412    0  Ss      select  0xffffff000fbae2c0 racoon
      35106    1 35106    0  Ss      (threaded)                  filterdns
      64160                  S      uwrlck  0xffffff000f926180 filterdns
      64097                  S      uwait    0xffffff000f936780 filterdns
      31660    1 31490 65534  S      select  0xffffff000fbadc40 dnsmasq
      31138    1 31138  1002  Ss      select  0xffffff000f936bc0 dhcpd
      21656 21191 20919    0  S      accept  0xffffff000f71d30e initial thread
      21191    1 20919    0  S      kqread  0xffffff000fb6aa00 lighttpd
      51391    1 51391    0  Ss      select  0xffffff000f936240 apinger
      46583    1 46583    0  Ss      (threaded)                  filterdns
      64091                  S      uwrlck  0xffffff000f6bdc00 filterdns
      64063                  S      uwait    0xffffff0002615380 filterdns
      45137    1 45137    0  Ss      select  0xffffff0002614cc0 inetd
      44686    1 44686    0  Ss      select  0xffffff000f6896c0 openvpn
      43835    1    24    0  S+      piperd  0xffffff000265ab60 logger
      43706    1    24    0  S+      bpf      0xffffff000f663c00 tcpdump
      43346    1 43346    0  Ss      select  0xffffff000f6bd340 syslogd
      8284    1  8284    0  Ss      select  0xffffff000f6bd540 sshd
        259    1  259    0  Ss      select  0xffffff0002614740 devd
        245  243  243    0  S      kqread  0xffffff0002609900 check_reload_status
        243    1  243    0  Ss      kqread  0xffffff0002626000 check_reload_status
        39    0    0    0  SL      mdwait  0xffffff0002605800 [md0]
        23    0    0    0  SL      sdflush  0xffffffff811a3738 [softdepflush]
        22    0    0    0  SL      vlruwt  0xffffff00025d6000 [vnlru]
        21    0    0    0  SL      syncer  0xffffffff81180e40 [syncer]
        20    0    0    0  SL      psleep  0xffffffff81180968 [bufdaemon]
        19    0    0    0  SL      pollid  0xffffffff8115ba48 [idlepoll]
        18    0    0    0  SL      pgzero  0xffffffff811a51cc [pagezero]
        17    0    0    0  SL      psleep  0xffffffff811a4568 [vmdaemon]
        16    0    0    0  SL      psleep  0xffffffff811a452c [pagedaemon]
          9    0    0    0  SL      ccb_scan 0xffffffff811229e0 [xpt_thrd]
          8    0    0    0  SL      pftm    0xffffffff80204dd0 [pfpurge]
          7    0    0    0  SL      waiting_ 0xffffffff8118ce00 [sctp_iterator]
        15    0    0    0  SL      (threaded)                  usb
      64032                  D      -        0xffffff8000234ef0 [usbus0]
      64031                  D      -        0xffffff8000234e98 [usbus0]
      64030                  D      -        0xffffff8000234e40 [usbus0]
      64029                  D      -        0xffffff8000234de8 [usbus0]
        14    0    0    0  SL      -        0xffffffff8115c904 [yarrow]
          6    0    0    0  SL      crypto_r 0xffffffff811a2660 [crypto returns]
          5    0    0    0  SL      crypto_w 0xffffffff811a2620 [crypto]
          4    0    0    0  SL      -        0xffffffff811586e8 [g_down]
          3    0    0    0  SL      -        0xffffffff811586e0 [g_up]
          2    0    0    0  SL      -        0xffffffff811586d0 [g_event]
        13    0    0    0  SL      sleep    0xffffffff810d19d0 [ng_queue0]
        12    0    0    0  RL      (threaded)                  intr
      64040                  I                                  [swi0: uart]
      64039                  I                                  [irq7: ppc0]
      64038                  I                                  [irq12: psm0]
      64037                  I                                  [irq1: atkbd0]
      64028                  I                                  [irq11: em0 em1+]
      64027                  I                                  [irq15: ata1]
      64026                  I                                  [irq14: ata0]
      64025                  I                                  [irq9: acpi0]
      64023                  I                                  [swi5: +]
      64021                  I                                  [swi2: cambio]
      64017                  I                                  [swi6: task queue]
      64016                  I                                  [swi6: Giant taskq]
      64007                  I                                  [swi3: vm]
      64006                  RunQ                                [swi4: clock]
      64005                  I                                  [swi1: netisr 0]
        11    0    0    0  RL                                  [idle: cpu0]
          1    0    1    0  SLs    wait    0xffffff00022a48c0 [init]
        10    0    0    0  SL      audit_wo 0xffffffff811a2b90 [audit]
          0    0    0    0  RLs    (threaded)                  kernel
      64036                  Run    CPU 0                      [em3 taskq]
      64035                  D      -        0xffffff0002509580 [em2 taskq]
      64034                  D      -        0xffffff00024de900 [em1 taskq]
      64033                  D      -        0xffffff00024d9100 [em0 taskq]
      64024                  D      -        0xffffff0002444180
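To compare a backtrace like the one above across several crashes, it can help to reduce it to a plain list of function names. The following is a minimal Python sketch, not an official pfSense or FreeBSD tool. The names `parse_backtrace`, `FRAME_RE`, and `SAMPLE` are hypothetical, and the sample text is the first few frames copied from the `bt` output in this post.

```python
import re

# One ddb frame looks like "rn_match() at rn_match+0x1b".
FRAME_RE = re.compile(r"^\s*(\w+)\(\) at (\w+)\+0x([0-9a-fA-F]+)\s*$")


def parse_backtrace(text):
    """Return (function, offset) tuples, innermost frame first."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            # group(1) is the function name, group(3) the hex offset.
            frames.append((match.group(1), int(match.group(3), 16)))
    return frames


# First frames of the backtrace posted above, copied verbatim.
SAMPLE = """\
Tracing pid 0 tid 64036 td 0xffffff00024477c0
rn_match() at rn_match+0x1b
pfr_match_addr() at pfr_match_addr+0xcb
pf_test_udp() at pf_test_udp+0x89b
pf_test() at pf_test+0x207f
pf_check_in() at pf_check_in+0x39
"""

frames = parse_backtrace(SAMPLE)
# Print the call chain from the innermost frame outwards.
print(" <- ".join(name for name, _ in frames))
```

Read this way, the dump shows the `em3 taskq` thread inside `rn_match` (a radix tree lookup) reached from `pfr_match_addr` (a pf table address match) while pf was processing an inbound UDP packet. If later dumps produce the same chain, that suggests the crash is repeatable at the same spot rather than random.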

      • jimp Rebel Alliance Developer Netgate

        Is there a reason you started a new thread rather than continuing your existing thread (http://forum.pfsense.org/index.php/topic,59979.0/topicseen.html)?

        Have you tried 2.0.3 (http://forum.pfsense.org/index.php/topic,58203.0.html)?

        • rclaus

          My previous message did not seem to get much response, and I am struggling to figure out why I am having so many crashes.
          My apologies for jumping the gun and posting a new message, and thank you for your reply.
          I have not tried 2.0.3 yet since this is a production firewall and I am leery of applying beta code.
          Regards,
          Richard

          • jimp Rebel Alliance Developer Netgate

            2.0.3 is more stable than 2.0.2. Don't consider it a "beta". It's practically RELEASE; we are just waiting on FreeBSD to issue an OpenSSL security advisory before we can wrap it up.

            And multiple threads are never the correct answer. One issue, one thread.

            • rclaus

              Noted and understood about multiple threads. I won't do that again.
              Could you tell me where I can get the 2.0.3 code? I am unable to locate it.
              Thanks

              • jimp Rebel Alliance Developer Netgate

                Follow the link in the earlier message I posted.

                • felipe.fsso

                  rclaus,
                  I have the same problem. Please give us feedback after installing 2.0.3.

                  Thanks

                  • rclaus

                    I installed 2.0.3 this morning and it crashed about 2 hours ago >:(

                    • rclaus

                      Looking at the dmesg.boot log, I see the following error on all 4 of the Intel NICs:
                      "Memory Access and/or Bus Master bits were not set!"
                      Could this error be related to my daily crashes on the pfSense 2.0.3 amd64 build?
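One way to check which devices logged that message is to scan the saved boot log text. The following is a minimal Python sketch under stated assumptions: it assumes each kernel line begins with the device name followed by a colon (for example `em0: `), and the sample input is hypothetical, built only from the message quoted above. The names `devices_with_warning`, `LINE_RE`, and `SAMPLE` are illustrative.

```python
import re

# Message text quoted in the post above.
MESSAGE = "Memory Access and/or Bus Master bits were not set!"

# Assumption: each kernel line starts with a device name such as "em0: ".
LINE_RE = re.compile(r"^([a-z]+\d+): " + re.escape(MESSAGE))


def devices_with_warning(dmesg_text):
    """Return device names that logged MESSAGE, in first-seen order."""
    devices = []
    for line in dmesg_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and match.group(1) not in devices:
            devices.append(match.group(1))
    return devices


# Hypothetical input for illustration only; not taken from the real log.
SAMPLE = "\n".join(
    ["em%d: %s" % (i, MESSAGE) for i in range(4)] + ["unrelated boot line"]
)

affected = devices_with_warning(SAMPLE)
print(affected)
```

To run it against a real system, pass the contents of the dmesg.boot file as `dmesg_text`. The function takes text rather than a path so that it makes no assumption about where that file lives.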

                      • rclaus

                        This issue is resolved now. I installed the 2.1 beta version, which has a newer release of FreeBSD and updated Intel NIC drivers, and I am having no more daily crashes.
                        I assume the older version of FreeBSD that pfSense 2.0.x uses had bad em(4) drivers :) The 2.1 beta version is performing well for me.
                        Thanks
                        Richard

                        • jon_pow

                          @rclaus:

                          i assume the older version of FreeBSD that the 2.0.x pfSense uses had bad em(4) drivers :)

                          I can confirm it: there is something wrong with the em(4) drivers, even in the PRE-RELEASE 2.0.3 version (amd64). Tested on multiple machines, all with Intel NICs.
                          Is there a chance of correcting the driver before the 2.0.3 final release?

                          • jimp Rebel Alliance Developer Netgate

                            We've seen a few complaints about the em drivers but nothing specific enough to act on. We can certainly update the drivers, but it would be most helpful to know what is wrong so we can confirm fixes.

                            I have em(4) NICs on practically everything (physical and virtual) and have had zero issues on 2.0.x or 2.1. It's not a universal/general issue; it must be specific to a certain set of chips.

                            • jon_pow

                              Jimp,

                              I opened this thread with lots of info:
                              http://forum.pfsense.org/index.php/topic,60237.0.html

                              I don't know if it's enough.

                              Unfortunately, I can't reproduce the crash in a lab yet.

                              • rclaus

                                I was also unable to reproduce the error, but it happened consistently on a daily basis. My environment was a VM running on the KVM hypervisor.
                                The physical NICs use e1000: Intel(R) PRO/1000 Network Driver - version 7.3.21-k8-NAPI
                                The hypervisor OS is Ubuntu 11.10 (GNU/Linux 3.0.0-16-server x86_64)

                                Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.