    PfSense crashed on Alix

    2.0-RC Snapshot Feedback and Problems - RETIRED
    • J
      jlepthien
      last edited by

      Hi!

      This is the output…

      interrupt                  total    rate
      irq0: clk                 120917      99
      irq4: uart0                  501       0
      irq8: rtc                 154794     127
      irq9: ath0                 73851      61
      irq10: vr0                  6349       5
      irq11: vr1                  9958       8
      irq12: ohci0 ehci0             1       0
      irq14: ata0                11966       9
      Total                     378337     312

      | apple fanboy | music lover | network and security specialist | in love with cisco systems |

      • jimpJ
        jimp Rebel Alliance Developer Netgate
        last edited by

        That looks fine, though is that with or without the USB serial adapter plugged in?

        I suppose it may just be a coincidence that it's the irq process that is active when the panic happens, but it's hard to say for certain.

        Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

        Need help fast? Netgate Global Support!

        Do not Chat/PM for help!

        • J
          jlepthien
          last edited by

          This was taken with the USB adapter plugged in. But it is USB only on the Macintosh side; it is still serial on the ALIX…

          • jimpJ
            jimp Rebel Alliance Developer Netgate
            last edited by

            Ahh… well, that wouldn't explain the ucom then. The ALIX would only see serial, not a USB device on its end.

            • J
              jlepthien
              last edited by

              Again :-(

              Build from 01/06

              db> bt
              Tracing pid 11 tid 64025 td 0xc2456d80
              pim_input(c30a5816,c24fa000,c308cd00,0,0,...) at pim_input+0xb8c
              ip_input(c308cd00,246,c24d8700,c2378bcc,c06fd9b1,...) at ip_input+0x604
              netisr_dispatch_src(1,0,c308cd00,c2378c04,c08e3ecf,...) at netisr_dispatch_src+0x89
              netisr_dispatch(1,c308cd00,c24fa000,c24fa000,c30a5808,...) at netisr_dispatch+0x20
              ether_demux(c24fa000,c308cd00,3,0,3,...) at ether_demux+0x16f
              ether_vlanencap(c24fa000,c308cd00,c2456d80,c2378c5c,c0853f81,...) at ether_vlanencap+0x43f
              ucom_attach(c0d56e6d,c0cd10c0,c2378cb0,c2378c98,0,...) at ucom_attach+0x542b
              ucom_attach(c24ab000,0,109,82593edb,132,...) at ucom_attach+0x89d7
              intr_event_execute_handlers(c2436aa0,c2434680,c0b5910d,4f6,c24346f0,...) at intr_event_execute_handlers+0x14b
              intr_getaffinity(c24f9b50,c2378d38,0,0,0,...) at intr_getaffinity+0x14a
              fork_exit(c080dfe0,c24f9b50,c2378d38) at fork_exit+0x90
              fork_trampoline() at fork_trampoline+0x8
              --- trap 0, eip = 0, esp = 0xc2378d70, ebp = 0 ---
              db>

              • J
                jlepthien
                last edited by

                And again. These are the last lines I could capture:

                processor eflags        = interrupt enabled, resume, IOPL = 0
                current process        = 11 (irq10: vr0)

                • J
                  jlepthien
                  last edited by

                  Okay. Now again. I am testing igmpproxy right now, perhaps it has something to do with it?

                  Fatal trap 12: page fault while in kernel mode
                  fault virtual address  = 0x72636524
                  fault code              = supervisor write, page not present
                  instruction pointer    = 0x20:0xc096993c
                  stack pointer          = 0x28:0xc2378b10
                  frame pointer          = 0x28:0xc2378b64
                  code segment            = base 0x0, limit 0xfffff, type 0x1b
                                          = DPL 0, pres 1, def32 1, gran 1
                  processor eflags        = interrupt enabled, resume, IOPL = 0
                  current process        = 11 (irq10: vr0)

                  • C
                    cmb
                    last edited by

                    @jlepthien:

                    Okay. Now again. I am testing igmpproxy right now, perhaps it has something to do with it?

                    Possibly. Did it happen at all before you started testing it?

                    We'll get Ermal or someone to take a look at the backtraces when time permits.

                    • J
                      jlepthien
                      last edited by

                      Yeah, I think the first two happened before that, or at least the first one did. All the ones from yesterday happened while I was testing the igmp proxy…

                      • J
                        jlepthien
                        last edited by

                        Another one. Most of the time, though, I can only capture the output starting from the db> prompt…

                        db> bt
                        Tracing pid 11 tid 64025 td 0xc2456d80
                        rn_match(c0cd4fcc,c2842200,0,0,c23788a8,...) at rn_match+0x17
                        pfr_match_addr(c288d9b0,c31e5822,2,c2378894,c2378890,...) at pfr_match_addr+0x63
                        pf_test_udp(c2378990,c237898c,1,c2562c00,c278a800,...) at pf_test_udp+0x4db
                        pf_test(1,c24fa000,c2378b54,0,0,...) at pf_test+0xbb5
                        init_pf_mutex(0,c2378b54,c24fa000,1,0,...) at init_pf_mutex+0x5e6
                        pfil_run_hooks(c0cfd140,c2378ba4,c24fa000,1,0,...) at pfil_run_hooks+0x7e
                        ip_input(c278a800,246,c24d2ac0,c2378bcc,c06fd9b1,...) at ip_input+0x278
                        netisr_dispatch_src(1,0,c278a800,c2378c04,c08e3ecf,...) at netisr_dispatch_src+0x89
                        netisr_dispatch(1,c278a800,c24fa000,c24fa000,c31e5808,...) at netisr_dispatch+0x20
                        ether_demux(c24fa000,c278a800,3,0,3,...) at ether_demux+0x16f
                        ether_vlanencap(c24fa000,c278a800,c2456d80,c2378c5c,c0853f81,...) at ether_vlanencap+0x43f
                        ucom_attach(c0d56e6d,c0cd10c0,c2378cb0,c2378c98,0,...) at ucom_attach+0x542b
                        ucom_attach(c24ab000,0,109,cd9a2d5d,38ea,...) at ucom_attach+0x89d7
                        intr_event_execute_handlers(c2436aa0,c2434680,c0b5910d,4f6,c24346f0,...) at intr_event_execute_handlers+0x14b
                        intr_getaffinity(c24f9b50,c2378d38,0,0,0,...) at intr_getaffinity+0x14a
                        fork_exit(c080dfe0,c24f9b50,c2378d38) at fork_exit+0x90
                        fork_trampoline() at fork_trampoline+0x8
                        --- trap 0, eip = 0, esp = 0xc2378d70, ebp = 0 ---
                        db>

                        • J
                          jlepthien
                          last edited by

                          Do we have any info yet? Today this happened again and I really would like to know what this is. I can simply install 1.2.3 again and wait until 2.0 is out of beta, but I want to help the project. So devs, what could be the problem? Anything else I should check?

                          • J
                            jlepthien
                            last edited by

                            Would you guys please be so kind as to give me an answer? Otherwise it is no fun posting these backtraces…
                            Today it happened again...

                            • jimpJ
                              jimp Rebel Alliance Developer Netgate
                              last edited by

                              Unfortunately, with the snapshot server out of commission until the new one is put in place, there isn't much to do or try except keep track of the traces.

                              • J
                                jlepthien
                                last edited by

                                Yeah, well, but this doesn't answer my question about what the real problem is. Or do you think that these problems were silently fixed in a new snapshot?

                                • jimpJ
                                  jimp Rebel Alliance Developer Netgate
                                  last edited by

                                  It's hard to say with any certainty until someone with more in-depth knowledge of the FreeBSD kernel (such as Ermal) can have a look and see if he can tell what is going on.

                                  • J
                                    jlepthien
                                    last edited by

                                    Yep. That's what I'm waiting for ;)

                                    • W
                                      wallabybob
                                      last edited by

                                      I've looked at a lot of FreeBSD dumps. This sort of problem is sometimes fairly straightforward to find but can also be very difficult. It can have a variety of causes, including passing the wrong type of data structure to a function, or freeing a data structure and then reusing it while it's being used for another purpose.

                                      If I were looking at this problem, I expect the most useful items of information to me would be:

                                      • a precise identification of the build on which the problem was observed

                                      • a way of making it happen, even if it happens only one time in four

                                      One of the backtraces shows:
                                      ucom_attach(c0d56e6d,c0cd10c0,c2378cb0,c2378c98,0,...) at ucom_attach+0x542b
                                      ucom_attach(c24ab000,0,109,cd9a2d5d,38ea,...) at ucom_attach+0x89d7
                                      The offsets can be misleading in that static functions don't appear in the symbol table available to the crash-time debugger. Since 0x1000 is 4k, 0x89d7 is at least 32k, and it's pretty unlikely that an attach function would have anything like that amount of code. This offset is likely in some static function whose code starts at a higher address than the code for ucom_attach.
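The offset check described here can be applied mechanically to a captured backtrace. The following is only a rough sketch: the 32 KiB cutoff is an assumption taken from the reasoning in this post, not a rule of the debugger, and the names `parse_frame`, `suspect_frames` and `SUSPECT_OFFSET` are made up for illustration. It pulls the `symbol+0xoffset` part out of each frame line and lists the frames whose offset is too large to plausibly sit inside the named function.

```python
import re

# Tail of a ddb backtrace frame, e.g. "... at ucom_attach+0x89d7".
FRAME_RE = re.compile(r"\bat\s+(\w+)\+0x([0-9a-fA-F]+)\s*$")

# Hypothetical cutoff for "too big to be one function": 32 KiB (0x8000).
SUSPECT_OFFSET = 0x8000


def parse_frame(line):
    """Return (symbol, offset) for a ddb frame line, or None for other lines."""
    match = FRAME_RE.search(line)
    if match is None:
        return None
    return match.group(1), int(match.group(2), 16)


def suspect_frames(lines, limit=SUSPECT_OFFSET):
    """Return frames whose offset from the named symbol is at least `limit` bytes."""
    frames = (parse_frame(line) for line in lines)
    return [f for f in frames if f is not None and f[1] >= limit]


# Frame lines copied from the backtraces posted in this thread.
trace = [
    "ucom_attach(c0d56e6d,c0cd10c0,c2378cb0,c2378c98,0,...) at ucom_attach+0x542b",
    "ucom_attach(c24ab000,0,109,cd9a2d5d,38ea,...) at ucom_attach+0x89d7",
    "fork_exit(c080dfe0,c24f9b50,c2378d38) at fork_exit+0x90",
    "--- trap 0, eip = 0, esp = 0xc2378d70, ebp = 0 ---",
]

for symbol, offset in suspect_frames(trace):
    print(f"{symbol}+{offset:#x} is {offset} bytes past the symbol")
```

A frame flagged this way only says that the symbol name is probably not the real function; it does not identify which static function the address actually belongs to.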

                                      Another of the reports shows:
                                      Fatal trap 12: page fault while in kernel mode
                                      fault virtual address  = 0x72636524
                                      fault code              = supervisor write, page not present
                                      instruction pointer    = 0x20:0xc096993c
                                      stack pointer          = 0x28:0xc2378b10
                                      frame pointer          = 0x28:0xc2378b64

                                      If you look at the virtual address you might notice that it could be considered to be printable text: "$ecr" (0x24 is the ASCII code for "$"). From the reported code it would appear that a data structure referenced by rn_match has a text string where rn_match is expecting it to hold the address of another data structure. The challenge is to find out how that happened.
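The byte-to-character reading of the fault address can be checked with a few lines of Python. This is a sketch under one assumption: the kernel is i386, so a 32-bit value is stored least significant byte first. The helper name `address_as_text` is invented for this example.

```python
import struct


def address_as_text(addr, byteorder="<"):
    """Show a 32-bit value as the four characters its bytes would spell in memory.

    byteorder "<" is little-endian (the assumed i386 layout), ">" is big-endian.
    Bytes outside printable ASCII are shown as "?".
    """
    raw = struct.pack(byteorder + "I", addr)
    return "".join(chr(b) if 0x20 <= b <= 0x7E else "?" for b in raw)


fault_address = 0x72636524  # "fault virtual address" from the trap 12 report
print(address_as_text(fault_address))       # bytes in little-endian memory order
print(address_as_text(fault_address, ">"))  # same bytes, most significant first
```

In memory order the four bytes are 0x24, 0x65, 0x63, 0x72, which spell "$ecr". A real kernel pointer such as the instruction pointer 0xc096993c comes out mostly unprintable, which is why an address made entirely of printable characters stands out.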

                                      • J
                                        jlepthien
                                        last edited by

                                        What I see now is that this happens every 3-4 days. So I guess I will schedule a reboot every night via cron to see whether that stops it until there are better builds…

                                        • J
                                          jlepthien
                                          last edited by

                                          With the daily reboot in place I am not seeing this problem anymore. So what is the status of these problems? Has anyone (Ermal) taken a look at the backtraces? Is this "problem" fixed in newer snaps?

                                          • J
                                            jlepthien
                                            last edited by

                                            Today this happened again. So I cannot use this workaround :(

                                            Here is the bt:

                                            rn_match(c0cd504c,c283fd00,0,c2981718,e2992850,...) at rn_match+0x17
                                            pfr_match_addr(c288b9b0,c2741034,2,e299283c,e2992838,...) at pfr_match_addr+0x63
                                            pf_test_tcp(e2992938,e2992934,1,c26c4600,c272bd00,...) at pf_test_tcp+0x4cb
                                            pf_test(1,c2610400,e2992afc,0,0,...) at pf_test+0x8d2
                                            init_pf_mutex(0,e2992afc,c2610400,1,0,...) at init_pf_mutex+0x5e6
                                            pfil_run_hooks(c0cfd1c0,e2992b4c,c2610400,1,0,...) at pfil_run_hooks+0x7e
                                            ip_input(c272bd00,246,c24d38c0,e2992b74,c06fd9a1,...) at ip_input+0x278
                                            netisr_dispatch_src(1,0,c272bd00,e2992bac,c08e3f0f,...) at netisr_dispatch_src+0x89
                                            netisr_dispatch(1,c272bd00,c2610400,c2610400,c274101a,...) at netisr_dispatch+0x20
                                            ether_demux(c2610400,c272bd00,3,0,3,...) at ether_demux+0x16f
                                            ether_vlanencap(c2610400,c272bd00,ece0,18,c272bd00,...) at ether_vlanencap+0x43f
                                            ieee80211_hostap_detach(c2700000,c315a000,c272bd00,c2532480,c2438d80,...) at ieee80211_hostap_detach+0x362
                                            ieee80211_hostap_detach(c315a000,c272bd00,17,ffffffa0,0,...) at ieee80211_hostap_detach+0x29a7
                                            ath_suspend(c2514000,1,0,c0ca937c,0,...) at ath_suspend+0x1f67
                                            taskqueue_run(c251d100,c251d118,0,c0b53f14,0,...) at taskqueue_run+0x132
                                            taskqueue_thread_loop(c2514270,e2992d38,0,0,0,...) at taskqueue_thread_loop+0x88
                                            fork_exit(c086b060,c2514270,e2992d38) at fork_exit+0x90
                                            fork_trampoline() at fork_trampoline+0x8
                                            --- trap 0, eip = 0, esp = 0xe2992d70, ebp = 0 ---

                                            Please guys. Give me any info. What else do you need? Does nobody use 2.0-beta1 on Alix boards? Can't be...
