Netgate Discussion Forum

    Dnsmasq dying regularly after upgrade from 2.2.2 to 2.2.3

    Problems Installing or Upgrading pfSense Software
    • echoranger

      I just upgraded my office firewalls from 2.2.2 to 2.2.3, mainly for the OpenSSL fixes. The upgrade itself went through without issue, but dnsmasq now dies so frequently that I had to write a restart script that checks for its PID every 2 seconds and restarts it if it is not running. Here's what I'm seeing in my dmesg (note this is from a restart roughly 2 hours ago):

      carp: VHID 2@em1_vlan1: BACKUP -> MASTER (preempting a slower master)
      ovpns1: link state changed to DOWN
      ovpns1: link state changed to UP
      pid 68760 (dnsmasq), uid 65534: exited on signal 11
      pid 58617 (dnsmasq), uid 65534: exited on signal 11
      pid 61503 (dnsmasq), uid 65534: exited on signal 11
      arp: 192.168.1.114 moved from 70:56:81:9a:94:25 to b8:78:2e:57:a6:bc on em1_vlan1
      arp: 192.168.1.114 moved from 70:56:81:9a:94:25 to b8:78:2e:57:a6:bc on em1_vlan1
      pid 65112 (dnsmasq), uid 65534: exited on signal 11
      pid 61131 (dnsmasq), uid 65534: exited on signal 11
      pid 64404 (dnsmasq), uid 65534: exited on signal 11
      pid 6542 (dnsmasq), uid 65534: exited on signal 11
      pid 9089 (dnsmasq), uid 65534: exited on signal 11
      pid 67017 (dnsmasq), uid 65534: exited on signal 11
      pid 70543 (dnsmasq), uid 65534: exited on signal 11
      pid 38794 (dnsmasq), uid 65534: exited on signal 11
      pid 41698 (dnsmasq), uid 65534: exited on signal 11
      pid 78629 (dnsmasq), uid 65534: exited on signal 11
      pid 82340 (dnsmasq), uid 65534: exited on signal 11
      pid 86055 (dnsmasq), uid 65534: exited on signal 11
      pid 30145 (dnsmasq), uid 65534: exited on signal 11
      pid 33429 (dnsmasq), uid 65534: exited on signal 11
      pid 36217 (dnsmasq), uid 65534: exited on signal 11
      pid 88993 (dnsmasq), uid 65534: exited on signal 11
      pid 92712 (dnsmasq), uid 65534: exited on signal 11
      pid 61158 (dnsmasq), uid 65534: exited on signal 11
      pid 64693 (dnsmasq), uid 65534: exited on signal 11
      pid 68395 (dnsmasq), uid 65534: exited on signal 11
      pid 71601 (dnsmasq), uid 65534: exited on signal 11
      pid 3580 (dnsmasq), uid 65534: exited on signal 11
      pid 51100 (dnsmasq), uid 65534: exited on signal 11
      pid 54020 (dnsmasq), uid 65534: exited on signal 11
      pid 58547 (dnsmasq), uid 65534: exited on signal 11
      pid 8416 (dnsmasq), uid 65534: exited on signal 11
      pid 12152 (dnsmasq), uid 65534: exited on signal 11
      pid 79594 (dnsmasq), uid 65534: exited on signal 11
      pid 83020 (dnsmasq), uid 65534: exited on signal 11
      pid 86625 (dnsmasq), uid 65534: exited on signal 11
      pid 88256 (dnsmasq), uid 65534: exited on signal 11
      pid 91213 (dnsmasq), uid 65534: exited on signal 11
      pid 69497 (dnsmasq), uid 65534: exited on signal 11
      pid 71816 (dnsmasq), uid 65534: exited on signal 11
      pid 75650 (dnsmasq), uid 65534: exited on signal 11
      pid 18822 (dnsmasq), uid 65534: exited on signal 11
      pid 22268 (dnsmasq), uid 65534: exited on signal 11
      pid 98940 (dnsmasq), uid 65534: exited on signal 11
      pid 1493 (dnsmasq), uid 65534: exited on signal 11
      pid 6053 (dnsmasq), uid 65534: exited on signal 11
      pid 12206 (dnsmasq), uid 65534: exited on signal 11
      pid 14151 (dnsmasq), uid 65534: exited on signal 11
      pid 16539 (dnsmasq), uid 65534: exited on signal 11
      pid 94270 (dnsmasq), uid 65534: exited on signal 11
      pid 96935 (dnsmasq), uid 65534: exited on signal 11
      pid 40958 (dnsmasq), uid 65534: exited on signal 11
      pid 44402 (dnsmasq), uid 65534: exited on signal 11
      pid 95615 (dnsmasq), uid 65534: exited on signal 11
      pid 25564 (dnsmasq), uid 65534: exited on signal 11
      pid 28003 (dnsmasq), uid 65534: exited on signal 11
      pid 35455 (dnsmasq), uid 65534: exited on signal 11
      pid 37584 (dnsmasq), uid 65534: exited on signal 11
      pid 14473 (dnsmasq), uid 65534: exited on signal 11
      pid 17819 (dnsmasq), uid 65534: exited on signal 11
      pid 21431 (dnsmasq), uid 65534: exited on signal 11
      pid 35032 (dnsmasq), uid 65534: exited on signal 11
      pid 68048 (dnsmasq), uid 65534: exited on signal 11
      pid 17880 (dnsmasq), uid 65534: exited on signal 11
      pid 40663 (dnsmasq), uid 65534: exited on signal 11
      pid 50012 (dnsmasq), uid 65534: exited on signal 11
      pid 52857 (dnsmasq), uid 65534: exited on signal 11
      pid 56719 (dnsmasq), uid 65534: exited on signal 11
      pid 25754 (dnsmasq), uid 65534: exited on signal 11
      pid 30742 (dnsmasq), uid 65534: exited on signal 11
      pid 44637 (dnsmasq), uid 65534: exited on signal 11
      pid 47968 (dnsmasq), uid 65534: exited on signal 11
      pid 29731 (dnsmasq), uid 65534: exited on signal 11
      pid 33589 (dnsmasq), uid 65534: exited on signal 11
      pid 35866 (dnsmasq), uid 65534: exited on signal 11
      pid 64803 (dnsmasq), uid 65534: exited on signal 11
      pid 67341 (dnsmasq), uid 65534: exited on signal 11
      pid 71247 (dnsmasq), uid 65534: exited on signal 11
      pid 40116 (dnsmasq), uid 65534: exited on signal 11
      pid 42881 (dnsmasq), uid 65534: exited on signal 11
      pid 47026 (dnsmasq), uid 65534: exited on signal 11
      pid 59087 (dnsmasq), uid 65534: exited on signal 11
      pid 62523 (dnsmasq), uid 65534: exited on signal 11
      pid 42330 (dnsmasq), uid 65534: exited on signal 11
      pid 46109 (dnsmasq), uid 65534: exited on signal 11
      pid 48776 (dnsmasq), uid 65534: exited on signal 11
      pid 85096 (dnsmasq), uid 65534: exited on signal 11
      pid 88684 (dnsmasq), uid 65534: exited on signal 11
      pid 92489 (dnsmasq), uid 65534: exited on signal 11
      pid 62969 (dnsmasq), uid 65534: exited on signal 11
      pid 66189 (dnsmasq), uid 65534: exited on signal 11
      pid 77844 (dnsmasq), uid 65534: exited on signal 11
      pid 81265 (dnsmasq), uid 65534: exited on signal 11
      pid 69485 (dnsmasq), uid 65534: exited on signal 11
      pid 71999 (dnsmasq), uid 65534: exited on signal 11
      pid 9548 (dnsmasq), uid 65534: exited on signal 11
      pid 11804 (dnsmasq), uid 65534: exited on signal 11
      pid 13966 (dnsmasq), uid 65534: exited on signal 11
      pid 54987 (dnsmasq), uid 65534: exited on signal 11
      pid 87116 (dnsmasq), uid 65534: exited on signal 11
      pid 90712 (dnsmasq), uid 65534: exited on signal 11
      pid 99818 (dnsmasq), uid 65534: exited on signal 11
      pid 2930 (dnsmasq), uid 65534: exited on signal 11
      pid 82728 (dnsmasq), uid 65534: exited on signal 11
      pid 85402 (dnsmasq), uid 65534: exited on signal 11
      pid 22866 (dnsmasq), uid 65534: exited on signal 11
      pid 25727 (dnsmasq), uid 65534: exited on signal 11
      pid 68449 (dnsmasq), uid 65534: exited on signal 11
      pid 71363 (dnsmasq), uid 65534: exited on signal 11
      pid 4146 (dnsmasq), uid 65534: exited on signal 11
      pid 10773 (dnsmasq), uid 65534: exited on signal 11
      pid 13907 (dnsmasq), uid 65534: exited on signal 11
      pid 96777 (dnsmasq), uid 65534: exited on signal 11
      pid 502 (dnsmasq), uid 65534: exited on signal 11
      pid 42592 (dnsmasq), uid 65534: exited on signal 11
      pid 47377 (dnsmasq), uid 65534: exited on signal 11
      pid 49599 (dnsmasq), uid 65534: exited on signal 11
      pid 99488 (dnsmasq), uid 65534: exited on signal 11
      pid 3353 (dnsmasq), uid 65534: exited on signal 11
      pid 23311 (dnsmasq), uid 65534: exited on signal 11
      pid 40213 (dnsmasq), uid 65534: exited on signal 11
      pid 44069 (dnsmasq), uid 65534: exited on signal 11
      pid 46908 (dnsmasq), uid 65534: exited on signal 11
      pid 49554 (dnsmasq), uid 65534: exited on signal 11
      pid 25695 (dnsmasq), uid 65534: exited on signal 11
      pid 28149 (dnsmasq), uid 65534: exited on signal 11
      pid 32890 (dnsmasq), uid 65534: exited on signal 11
      pid 66124 (dnsmasq), uid 65534: exited on signal 11
      pid 72324 (dnsmasq), uid 65534: exited on signal 11
      pid 77280 (dnsmasq), uid 65534: exited on signal 11
      pid 81724 (dnsmasq), uid 65534: exited on signal 11
      pid 22626 (dnsmasq), uid 65534: exited on signal 11
      pid 26410 (dnsmasq), uid 65534: exited on signal 11
      pid 63678 (dnsmasq), uid 65534: exited on signal 11
      pid 66056 (dnsmasq), uid 65534: exited on signal 11
      pid 68401 (dnsmasq), uid 65534: exited on signal 11
      pid 50773 (dnsmasq), uid 65534: exited on signal 11
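For anyone who wants a number rather than a wall of log lines, here is a minimal sketch that counts the dnsmasq signal 11 (SIGSEGV) entries. The function name `count_dnsmasq_segv` is made up for illustration, and the pattern assumes only the message format shown above.

```shell
#!/bin/sh
# Count dnsmasq "exited on signal 11" lines in kernel-message text read from stdin.
# count_dnsmasq_segv is a hypothetical helper name, not part of pfSense.
count_dnsmasq_segv() {
    grep -c '(dnsmasq), uid [0-9]*: exited on signal 11'
}

# Example using three lines copied from the log above (two crashes, one ARP notice).
sample='pid 68760 (dnsmasq), uid 65534: exited on signal 11
arp: 192.168.1.114 moved from 70:56:81:9a:94:25 to b8:78:2e:57:a6:bc on em1_vlan1
pid 58617 (dnsmasq), uid 65534: exited on signal 11'
printf '%s\n' "$sample" | count_dnsmasq_segv
```

On the firewall itself this would presumably be run as `dmesg | count_dnsmasq_segv`.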

      My restart script logs the time after each restart, and what I'm seeing ranges from several restarts a minute to one every few minutes:

      Wed Jul  8 23:08:14 UTC 2015
      Wed Jul  8 23:08:18 UTC 2015
      Wed Jul  8 23:08:24 UTC 2015
      Wed Jul  8 23:12:12 UTC 2015
      Wed Jul  8 23:12:16 UTC 2015
      Wed Jul  8 23:14:56 UTC 2015
      Wed Jul  8 23:15:00 UTC 2015
      Wed Jul  8 23:15:47 UTC 2015
      Wed Jul  8 23:15:51 UTC 2015
      Wed Jul  8 23:18:29 UTC 2015
      Wed Jul  8 23:18:34 UTC 2015
      Wed Jul  8 23:22:20 UTC 2015
      Wed Jul  8 23:22:26 UTC 2015
      Wed Jul  8 23:22:31 UTC 2015
      Wed Jul  8 23:25:05 UTC 2015
      Wed Jul  8 23:25:12 UTC 2015
      Wed Jul  8 23:25:16 UTC 2015
      Wed Jul  8 23:25:58 UTC 2015
      Wed Jul  8 23:26:02 UTC 2015
      Wed Jul  8 23:28:38 UTC 2015
      Wed Jul  8 23:28:44 UTC 2015
      Wed Jul  8 23:28:48 UTC 2015
      Wed Jul  8 23:32:35 UTC 2015
      Wed Jul  8 23:32:41 UTC 2015
      Wed Jul  8 23:35:22 UTC 2015
      Wed Jul  8 23:35:26 UTC 2015
      Wed Jul  8 23:35:32 UTC 2015
      Wed Jul  8 23:36:07 UTC 2015
      Wed Jul  8 23:36:13 UTC 2015
      Wed Jul  8 23:38:55 UTC 2015
      Wed Jul  8 23:38:59 UTC 2015
      Wed Jul  8 23:39:05 UTC 2015
      Wed Jul  8 23:42:46 UTC 2015
      Wed Jul  8 23:42:50 UTC 2015
      Wed Jul  8 23:45:37 UTC 2015
      Wed Jul  8 23:45:41 UTC 2015
      Wed Jul  8 23:45:47 UTC 2015
      Wed Jul  8 23:46:17 UTC 2015
      Wed Jul  8 23:46:21 UTC 2015
      Wed Jul  8 23:49:10 UTC 2015
      Wed Jul  8 23:49:14 UTC 2015
      Wed Jul  8 23:49:20 UTC 2015
      Wed Jul  8 23:52:57 UTC 2015
      Wed Jul  8 23:53:01 UTC 2015
      Wed Jul  8 23:53:05 UTC 2015
      Wed Jul  8 23:55:51 UTC 2015
      Wed Jul  8 23:55:56 UTC 2015
      Wed Jul  8 23:56:26 UTC 2015
      Wed Jul  8 23:56:32 UTC 2015
      Wed Jul  8 23:59:23 UTC 2015
      Wed Jul  8 23:59:29 UTC 2015
      Wed Jul  8 23:59:33 UTC 2015
      Thu Jul  9 00:03:12 UTC 2015
      Thu Jul  9 00:03:16 UTC 2015
      Thu Jul  9 00:06:02 UTC 2015
      Thu Jul  9 00:06:06 UTC 2015
      Thu Jul  9 00:06:11 UTC 2015
      Thu Jul  9 00:06:37 UTC 2015
      Thu Jul  9 00:06:43 UTC 2015
      Thu Jul  9 00:09:40 UTC 2015
      Thu Jul  9 00:09:44 UTC 2015
      Thu Jul  9 00:13:21 UTC 2015
      Thu Jul  9 00:13:25 UTC 2015
      Thu Jul  9 00:13:32 UTC 2015
      Thu Jul  9 00:16:16 UTC 2015
      Thu Jul  9 00:16:22 UTC 2015
      Thu Jul  9 00:16:48 UTC 2015
      Thu Jul  9 00:16:52 UTC 2015
      Thu Jul  9 00:19:49 UTC 2015
      Thu Jul  9 00:19:55 UTC 2015
      Thu Jul  9 00:19:59 UTC 2015
      Thu Jul  9 00:23:36 UTC 2015
      Thu Jul  9 00:23:40 UTC 2015
      Thu Jul  9 00:23:46 UTC 2015
      Thu Jul  9 00:26:26 UTC 2015
      Thu Jul  9 00:26:30 UTC 2015
      Thu Jul  9 00:26:36 UTC 2015
      Thu Jul  9 00:26:56 UTC 2015
      Thu Jul  9 00:27:03 UTC 2015
      Thu Jul  9 00:30:03 UTC 2015
      Thu Jul  9 00:30:09 UTC 2015
      Thu Jul  9 00:30:13 UTC 2015
      Thu Jul  9 00:33:52 UTC 2015
      Thu Jul  9 00:33:56 UTC 2015
      Thu Jul  9 00:34:02 UTC 2015
      Thu Jul  9 00:36:40 UTC 2015
      Thu Jul  9 00:36:47 UTC 2015
      Thu Jul  9 00:37:07 UTC 2015
      Thu Jul  9 00:37:13 UTC 2015
      Thu Jul  9 00:40:20 UTC 2015
      Thu Jul  9 00:40:24 UTC 2015
      Thu Jul  9 00:44:07 UTC 2015
      Thu Jul  9 00:44:11 UTC 2015
      Thu Jul  9 00:44:15 UTC 2015
      Thu Jul  9 00:46:52 UTC 2015
      Thu Jul  9 00:46:56 UTC 2015
      Thu Jul  9 00:47:02 UTC 2015
      Thu Jul  9 00:47:18 UTC 2015
      Thu Jul  9 00:47:22 UTC 2015
      Thu Jul  9 00:50:29 UTC 2015
      Thu Jul  9 00:50:33 UTC 2015
      Thu Jul  9 00:54:22 UTC 2015
      Thu Jul  9 00:54:26 UTC 2015
      Thu Jul  9 00:57:07 UTC 2015
      Thu Jul  9 00:57:11 UTC 2015
      Thu Jul  9 00:57:15 UTC 2015
      Thu Jul  9 00:57:27 UTC 2015
      Thu Jul  9 00:57:31 UTC 2015
      Thu Jul  9 01:00:38 UTC 2015
      Thu Jul  9 01:00:44 UTC 2015
      Thu Jul  9 01:04:30 UTC 2015
      Thu Jul  9 01:04:36 UTC 2015
      Thu Jul  9 01:04:40 UTC 2015
      Thu Jul  9 01:07:20 UTC 2015
      Thu Jul  9 01:07:27 UTC 2015
      Thu Jul  9 01:07:31 UTC 2015
      Thu Jul  9 01:07:37 UTC 2015
      Thu Jul  9 01:07:43 UTC 2015
      Thu Jul  9 01:07:47 UTC 2015
      Thu Jul  9 01:07:51 UTC 2015
      Thu Jul  9 01:10:50 UTC 2015
      Thu Jul  9 01:10:54 UTC 2015
      Thu Jul  9 01:10:58 UTC 2015
      Thu Jul  9 01:14:45 UTC 2015
      Thu Jul  9 01:14:51 UTC 2015
      Thu Jul  9 01:14:55 UTC 2015
      Thu Jul  9 01:15:02 UTC 2015
      Thu Jul  9 01:17:35 UTC 2015
      Thu Jul  9 01:17:42 UTC 2015
      Thu Jul  9 01:17:58 UTC 2015
      Thu Jul  9 01:18:02 UTC 2015
      Thu Jul  9 01:18:06 UTC 2015
      Thu Jul  9 01:21:04 UTC 2015
      Thu Jul  9 01:21:09 UTC 2015
      Thu Jul  9 01:25:05 UTC 2015
      Thu Jul  9 01:25:11 UTC 2015
      Thu Jul  9 01:27:48 UTC 2015
      Thu Jul  9 01:27:52 UTC 2015
      Thu Jul  9 01:27:56 UTC 2015
      Thu Jul  9 01:28:12 UTC 2015
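To make the restart rate easier to read, the timestamps can be grouped by minute. This is a rough sketch, assuming the log holds one default-format `date` line per restart in chronological order, as above; `restarts_per_minute` is a hypothetical name.

```shell
#!/bin/sh
# Group restart timestamps (default `date` output, one per line, chronological)
# by minute and print "<count> <month> <day> <HH:MM>" for each minute seen.
# restarts_per_minute is a hypothetical helper name.
restarts_per_minute() {
    awk '{
        split($4, t, ":")                      # $4 is HH:MM:SS
        key = $2 " " $3 " " t[1] ":" t[2]      # e.g. "Jul 8 23:08"
        if (key != prev) {
            if (NR > 1) print count, prev      # flush the previous minute
            prev = key
            count = 0
        }
        count++
    }
    END { if (NR > 0) print count, prev }'
}

# Example using the first five timestamps from the log above.
printf '%s\n' \
    'Wed Jul  8 23:08:14 UTC 2015' \
    'Wed Jul  8 23:08:18 UTC 2015' \
    'Wed Jul  8 23:08:24 UTC 2015' \
    'Wed Jul  8 23:12:12 UTC 2015' \
    'Wed Jul  8 23:12:16 UTC 2015' | restarts_per_minute
```

Against the real log this would be `restarts_per_minute < /tmp/dnsrestart`, since that is the file the restart script appends to.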

      Has anyone else seen this? Is there anything I can do besides continue running this script and deal with the outage? I'm happy to provide any debug info I can offer.

      Of note, this firewall is in an HA pair, and the secondary has had only 3 restarts in the same time frame. The main difference is that the secondary is second in the DNS search order, so the primary receives far more DNS requests.

      Help?!?

      For reference, my restart script is below:

      #!/bin/sh

      DNSCMDLINE="/usr/local/sbin/dnsmasq --all-servers --server=/10.in-addr.arpa/ --server=/168.192.in-addr.arpa/ --server=/16.172.in-addr.arpa/ --server=/17.172.in-addr.arpa/ --server=/18.172.in-addr.arpa/ --server=/19.172.in-addr.arpa/ --server=/20.172.in-addr.arpa/ --server=/21.172.in-addr.arpa/ --server=/22.172.in-addr.arpa/ --server=/23.172.in-addr.arpa/ --server=/24.172.in-addr.arpa/ --server=/25.172.in-addr.arpa/ --server=/26.172.in-addr.arpa/ --server=/27.172.in-addr.arpa/ --server=/28.172.in-addr.arpa/ --server=/29.172.in-addr.arpa/ --server=/30.172.in-addr.arpa/ --server=/31.172.in-addr.arpa/ --dns-forward-max=5000 --cache-size=10000 --local-ttl=1"

      while true; do
      DNSPID=$(ps axww | fgrep '/usr/local/sbin/dnsmasq --all-servers' | grep -v grep | awk '{print $1}')

      if [ -z "$DNSPID" ]; then $DNSCMDLINE ; date >>/tmp/dnsrestart; fi

      sleep 2;
      done

      • echoranger

        Alternatively, can someone send me the AMD64 binary of dnsmasq from 2.2.2 so I can try that? I can try to extract from the install media when I'm at work tomorrow but this might save me a bit of time.

        Thanks!

        • echoranger

          So I had some time to my lonesome at the local watering hole and was able to extract the 2.2.2 dnsmasq binary from the install media via a loop mount (THANK YOU PFSENSE FOR USING A REGULAR DIRECTORY STRUCTURE!!) and have swapped that in for the time being.

          In the last 10 minutes things are looking much better (no restarts). I don't know what changed with the recent dnsmasq build but something, at least in my case, is a bit funky. I still have the old binary saved should someone want to delve further but the 2.2.2 build seems to be working much better with the 2.2.3 release.

          MD5 sigs for the dnsmasq binaries (for those interested):

          AMD64 dnsmasq from 2.2.2 (GOOD) -> MD5 (/usr/local/sbin/dnsmasq) = 8e9eb7759989bd2c04c0f7bf6c5bf303
          AMD64 dnsmasq from 2.2.3 (ISSUES) -> MD5 (/usr/local/sbin/dnsmasq.old) = 65408562620b5ae48202f28e241706d3
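For anyone wanting to check which build they are running, a small sketch along these lines compares a file's MD5 against a known digest. The helper names `file_md5` and `md5_matches` are invented for illustration; the sketch assumes either FreeBSD's `md5` (with `-q`) or GNU `md5sum` is on the PATH.

```shell
#!/bin/sh
# Print the MD5 hex digest of file "$1".
# Uses FreeBSD's md5(1) when available, otherwise falls back to GNU md5sum.
# file_md5 and md5_matches are hypothetical helper names.
file_md5() {
    if command -v md5 >/dev/null 2>&1; then
        md5 -q "$1"
    else
        md5sum "$1" | awk '{print $1}'
    fi
}

# Succeed (exit status 0) only if file "$1" has MD5 digest "$2".
md5_matches() {
    [ "$(file_md5 "$1")" = "$2" ]
}

# Self-contained example: an empty file always hashes to the digest below.
tmp=$(mktemp)
: > "$tmp"
md5_matches "$tmp" d41d8cd98f00b204e9800998ecf8427e && echo "match"
rm -f "$tmp"
```

On the firewall, `md5_matches /usr/local/sbin/dnsmasq 8e9eb7759989bd2c04c0f7bf6c5bf303` should succeed if the 2.2.2 binary listed above is the one in place.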

          • Derelict (LAYER 8, Netgate)

            em1_vlan1

            What is that?  Seems the untagged, default VLAN should be em1, not em1_vlan1.

            Chattanooga, Tennessee, USA
            A comprehensive network diagram is worth 10,000 words and 15 conference calls.
            DO NOT set a source address/port in a port forward or firewall rule unless you KNOW you need it!
            Do Not Chat For Help! NO_WAN_EGRESS(TM)

            • echoranger

              @Derelict:

              em1_vlan1

              What is that?  Seems the untagged, default VLAN should be em1, not em1_vlan1.

              I have VLANs on my internal interface and VLAN1 is my workstation VLAN, so em1_vlan1 is correct (I have several additional VLANs on that interface as well for other development networks, vlan10, vlan20, etc…). There is no traffic over the untagged interface internally.

              • heper

                Lots of hardware doesn't support tagging VLAN 1 ... I think that's what Derelict is referring to.

                • echoranger

                  In my case both of my Netgear switches support it and have through several iterations of pfSense (2.0 to now). In fact one of the Netgear switches requires management via VLAN1 so I'm somewhat stuck there.

                  In any case, I did try to have dnsmasq listen on all interfaces as well as specific VLAN interfaces when it was flapping yesterday. In both instances I saw the same flapping behavior.

                  I am happy to note that since reverting the dnsmasq binary to the 2.2.2 build I haven't seen a single signal 11 crash; it has stayed remarkably stable on both my primary and secondary firewalls.
