    Unexpected alias behaviour - two ranges / size limits with FQDN

    General pfSense Questions
      SteveITS Galactic Empire @tinfoilmatt

      @tinfoilmatt said in Unexpected alias behaviour - two ranges:

      Can you re-run this without any user error

      I guess that's fair though it would imply deleting the invalid entries still causes a problem? Or at least that doing so doesn't fix the problem. Onwards (read it all)...

      If I delete all four aliases, apply, and re-import them, I do not see the error case today even after a filter reload or reboot. All four aliases are correct (618 total IPs).

      I added "invalid" to alias_512 and applied, same.

      I emptied all four tables, ran a filter reload, and all four remained empty.

      I removed "invalid" and ran a filter reload, all tables remained empty.

      I had to "killall filterdns" and filter reload, and after that the tables populated correctly.

      next:
      empty all tables
      add "invalid" to alias_512, and apply
      all tables remain empty
      killall filterdns, and reload filter
      all tables are populated correctly

      ...so, killing filterdns is suddenly required to get the tables to recreate at all. @stephenw10, does a filter reload actively empty the tables when it runs, or does it leave them and attempt to update them?
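
The flush / kill / reload cycle described above could be scripted so each run is identical. This is only a sketch under assumptions: it supposes that emptying a table corresponds to `pfctl -t <table> -T flush`, and that `/etc/rc.filter_configure` is the script behind a filter reload. Neither is confirmed in this thread. By default it only prints the commands (dry run) rather than executing them.

```python
import subprocess


def recovery_commands(table):
    """Return the argv lists for one flush / kill / reload / show cycle."""
    return [
        ["pfctl", "-t", table, "-T", "flush"],  # empty the pf table (assumed equivalent of the GUI action)
        ["killall", "filterdns"],               # stop the resolver daemon, as done in the post
        ["/etc/rc.filter_configure"],           # assumption: script behind a filter reload
        ["pfctl", "-t", table, "-T", "show"],   # list what the table holds afterwards
    ]


def run_cycle(table, dry_run=True):
    """Print the commands (dry run) or execute them and collect their stdout."""
    results = []
    for argv in recovery_commands(table):
        if dry_run:
            results.append(" ".join(argv))
        else:
            done = subprocess.run(argv, capture_output=True, text=True)
            results.append(done.stdout)
    return results


for line in run_cycle("alias_512"):
    print(line)
```

Comparing the final `show` output before and after the `killall` step would show whether the reload alone repopulates the table.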

      next:
      I started over, imported the aliases with the extra two error lines, just like last night, and was unable to replicate my original observed case (incomplete aliases). Unclear why it is different today. I shut down the VM overnight, which seems irrelevant but did happen.

      It seems there is definitely "something wrong" because the alias tables are either sometimes incomplete or empty, but now I'm confused also.

      Only install packages for your version, or risk breaking it. Select your branch in System/Update/Update Settings.
      When upgrading, allow 10-15 minutes to reboot, or more depending on packages, CPU, and/or disk speed.
      Upvote 👍 helpful posts!

        stephenw10 Netgate Administrator

        Yes I expect it to re-populate the tables based on the loaded ruleset.

        It looks like there are at least two bugs still outstanding related to this. But as far as I know neither is a regression for 2.8.1/25.07.1.

        @Patch you first saw this in 2.8.1? Is it possible it was happening in 2.7.2 and you just didn't notice?

          Patch @stephenw10

          @stephenw10 yes, I first saw this in v2.8.1 and had not tripped it in v2.7.2.

          I then installed v2.7.2 in a VM using the current installer, and explicit testing as per https://forum.netgate.com/post/1229337 showed essentially the same behaviour.

          The only real testing I have done after the error is triggered is to demonstrate that creating a new trivial alias results in an alias table, but that table isn't populated.

          I avoided further testing as I had previously found that repairing the system by further changing the alias definition was difficult. The system behaves as if something has crashed or locked up. My current experience is that data entry errors are handled correctly by pfSense, but the alias table filling error, once triggered, persists. That initially misled me into blaming data entry error handling, hence my very frequent restarts / configuration restores in testing.

            SteveITS Galactic Empire @stephenw10

            @stephenw10 said in Unexpected alias behaviour - two ranges:

            Yes I expect it to re-populate the tables based on the loaded ruleset.

            Yes, but the hair I'm splitting is whether the alias Apply is either 1) not updating the table as expected, or 2) not emptying the tables at the beginning of its run and thus presumably aborting very early in the process. Just thinking about the programming out loud, is all. Because if I manually empty them and they stay empty that implies the prior filter reload maybe didn't get to the point of emptying them.

            I guess I didn't explain it well but it seems like:

            I added "invalid" to alias_512 and applied, same.
            ...is possibly not a great test if I didn't "killall filterdns" and filter reload.

            Seems like one possibility is filterdns gets stuck and thus the tables aren't updated. Which may be what @Patch is talking about when mentioning lockups.

              tinfoilmatt @SteveITS

              @SteveITS said in Unexpected alias behaviour - two ranges:

              Onwards (read it all)...

              Clearly I've been 'reading it all', Steve. Otherwise I wouldn't still be here. Does it concern you that I somehow keep picking the most relevant bits out of the noise to maintain my position here?

              Your focus on the matter at-hand is showing with that comment (which I've of course taken the bait on and obliged you).

              @SteveITS said in Unexpected alias behaviour - two ranges:

              I had to "killall filterdns" and filter reload, and after that the tables populated correctly.

              I had a feeling...

                tinfoilmatt @stephenw10

                @stephenw10 said in Unexpected alias behaviour - two ranges:

                It looks like there are at least two bugs still outstanding related to this.

                Redmine links?

                  tinfoilmatt @SteveITS

                  @SteveITS said in Unexpected alias behaviour - two ranges:

                  so, killing filterdns is suddenly required to get the tables to recreate at all.

                  Or you could just, like—not introduce user error and it probably wouldn't be necessary.

                    SteveITS Galactic Empire @tinfoilmatt

                    @tinfoilmatt said in Unexpected alias behaviour - two ranges:

                    Clearly I've been 'reading it all',

                    That wasn't directed at you, I just meant to read my whole post, there, since the behaviors changed.

                    @tinfoilmatt said in Unexpected alias behaviour - two ranges:

                    I had a feeling...

                    That wasn't the case last night, they did update on Apply.

                    @tinfoilmatt said in Unexpected alias behaviour - two ranges:

                    Or you could just, like—not introduce user error and it probably wouldn't be necessary.

                    What was the error you allege in today's post? AFAIK if I empty a table and filter reload, pfSense is supposed to populate the table.

                      tinfoilmatt @SteveITS

                      @SteveITS said in Unexpected alias behaviour - two ranges:

                      I just meant to read my whole post

                      I would hope anybody participating here and on the entire forum (nay, the entire Internet) thoroughly reads and considers in earnest any communication directed at them by a fellow human being.

                      But back to topic at-hand, anything you did today is preempted by the fact that you didn't start with...

                      [in Unexpected alias behaviour - two ranges:]

                      I created a VM with 2.8.1.
                      I used easyrule to allow access on WAN.
                      I bypassed the GUI setup wizard.

                      ...like you did yesterday. (Some people refer to this methodology colloquially as 'blowing everything out and starting over.') In other words you didn't even consistently recreate your own test.

                      I could break any system with some formulation of rm or system-specific equivalent. What does that tell anyone?

                        stephenw10 Netgate Administrator @tinfoilmatt

                        @tinfoilmatt said in Unexpected alias behaviour - two ranges:

                        I would hope anybody participating here and on the entire forum (nay, the entire Internet) thoroughly reads and considers in earnest any communication directed at them by a fellow human being.

                        😂

                          tinfoilmatt @stephenw10

                          @stephenw10 Hey, at least you're getting paid to placate this behavior. I'm only here to have fun! 🤣

                            Patch @SteveITS

                            @SteveITS
                            After more testing I suspect the root cause of the bug investigated in this thread is

                            • An alias containing one or more FQDNs is limited to a little over 512 entries (total across all such aliases), not the 5000 per alias limit suggested in the manual.
                            • If you go over that limit, further alias table updates are blocked for all aliases.

                            For the bug where duplicates are removed but not restored, discovered again in this thread and referenced in your thread, I can't see that being resolved any time soon. Learning how best to live with it is more sensible imo.

                            Easy bug trigger

                            Tested in a clean pfSense v2.8.1 install with WAN GUI access enabled. Save that configuration as the baseline. This is done to ensure easy repeatability and to reduce the validity of the shooting-the-messenger crap.

                            The easiest way of triggering the bug is entering the following alias of 1024 consecutive elements -> 473 elements are shown in the corresponding alias table.
                            92 Combined FQDN x1 IPv4 x1024 consecutive.jpg
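
Since the test alias only survives here as a screenshot name, a list of that shape can be regenerated for bulk import. This is a minimal sketch with assumed values: the start address 123.123.120.0 is borrowed from the range reported later in this thread, and `example.com` is a placeholder for whatever FQDN was actually used.

```python
import ipaddress


def build_alias_entries(start, count, fqdn):
    """Return one FQDN followed by `count` consecutive IPv4 addresses."""
    first = ipaddress.IPv4Address(start)
    entries = [fqdn]
    entries.extend(str(first + offset) for offset in range(count))
    return entries


# Assumed inputs: the start address and the hostname are placeholders.
entries = build_alias_entries("123.123.120.0", 1024, "example.com")

print(len(entries))             # 1025: one FQDN plus 1024 addresses
print(entries[1], entries[-1])  # first and last generated address
print("\n".join(entries[:5]))   # one entry per line, ready to paste
```

Generating the list rather than typing it removes data entry mistakes as a variable between test runs.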

                            Random IP addresses & lower bound

                            To put a bound on the lower limit and confirm that sequential IP addresses are irrelevant:
                            Reload the baseline configuration and enter a 513-element alias such as the following -> 514 elements are shown in the corresponding alias table.
                            91 Combined FQDN x1 IPv4 x512.jpg

                            Reload the baseline configuration and enter a 1025-element alias (using the method illustrated above) -> 473 elements are shown in the corresponding alias table, which is exactly the same as the initial sequential-IP test case. Waiting longer, a filter reload, and "killall filterdns" all make no difference.

                            Limit per alias or total record count

                            To illustrate that the limit depends on the load implied by other aliases:
                            Restore the baseline configuration.
                            Enter an alias like IP_set1 with 50 IPs + 1 FQDN as shown below -> 52 records in the alias table.
                            93 IP_set1 FQDN x1 IPv4 x50.jpg

                            Enter an alias like IP_set2 with 50 IPs + 1 FQDN using the same method (mine started 202.) -> 52 records shown in the alias table.

                            Now try to enter IP_set3 with 512 IPs + 1 FQDN as shown below -> zero records. After several filter reloads, 271 records. Waiting longer, a filter reload, and "killall filterdns" all make no difference.
                            This contrasts with the 514 records when no other aliases were entered.
                            94 IP_set1 FQDN x1 IPv4 x512.jpg

                            Limit for "mixed mode" alias or also pure FQDN

                            To investigate whether using explicit IP addresses vs FQDNs matters:
                            Reload the baseline configuration and enter an alias of 1024 FQDNs (I copied the first 1024 entries from pfblocker) -> 569 records after waiting about 1 hour and lots of filter reloads. After which:

                            • "killall filterdns" -> No matching processes were found
                            • creating an alias Test_loaded containing a single FQDN -> empty alias table

                            This strongly suggests "mixed mode" aliases are no different from pure FQDN aliases; testing just takes longer.

                            95 Combined FQDN x1024.jpg

                              Patch @Patch

                              Updated the above post to investigate whether using explicit IP addresses vs FQDN records matters.
                              Updated the title to more accurately describe the root cause evident in hindsight.

                                tinfoilmatt @Patch

                                @Patch said in Unexpected alias behaviour - two ranges / size limits with FQDN:

                                Easy bug trigger
                                [ . . . ]
                                The easiest way of triggering the bug is entering the following alias of 1024 consecutive elements -> 473 elements are shown in the corresponding alias table.

                                I can confirm this behavior.

                                ecd46d8a-5569-45ac-b89e-a0000e4d2888-image.png

                                9be71e3c-e06f-41ec-b9f3-686cab04606d-image.png

                                1447f1d9-3d22-4501-8f9e-e5ff333dd837-image.png

                                  tinfoilmatt

                                  I am holding here despite the...

                                  Aliases become Tables when loaded into the active firewall ruleset. The contents displayed on this page reflect the current addresses inside tables used by the firewall.

                                  ...advisory.

                                  Please let me know if there's something I can/should do next that would further assist the thread.

                                    tinfoilmatt

                                    The 473 records are sequential from 123.123.120.0 through 123.123.121.216.
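
As a quick arithmetic check, the reported range does contain exactly 473 addresses. The sketch below counts the range inclusively; the figure for missing entries additionally assumes the alias held 1024 consecutive addresses starting at the same first address.

```python
import ipaddress

first = ipaddress.IPv4Address("123.123.120.0")
last = ipaddress.IPv4Address("123.123.121.216")

# Inclusive count of addresses between first and last.
loaded = int(last) - int(first) + 1
print(loaded)  # 473

# Assumption: the alias held 1024 consecutive addresses from `first`.
missing = 1024 - loaded
print(missing)   # 551
print(last + 1)  # first address absent from the table: 123.123.121.217
```

So the table stops part-way through the second /24 rather than on any address boundary.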

                                      stephenw10 Netgate Administrator

                                      Hmm, I don't think this is a numerical limit. It seems more like a function of timing to me. But either way I would bet the root issue is in filterdns.

                                        Patch

                                        @bbcan177 You have been dealing with far larger sets of IP addresses and FQDNs in pfSense. How have you been working around:

                                        1. The limitation in the cumulative number of entries in aliases containing an FQDN, as described above

                                        2. The possibly related consequence of incremental alias updates with duplicate removal followed by incremental IP removal without restoration of the removed duplicate, as per https://redmine.pfsense.org/issues/13792 and possibly https://redmine.pfsense.org/issues/9296 and https://redmine.pfsense.org/issues/13793

                                        The short term fix

                                        1. is relatively simple: just document the limit, add it to Redmine and say it's being worked on. Something I can easily work with.

                                        2. is more difficult, as that bug essentially requires the user to ensure all aliases containing FQDNs are always non-intersecting (never trigger duplicate removal), which severely limits the value of aliases for me.

                                        The longer term fix may be more difficult, as it may be a program architecture limit rather than a coding bug.

                                        • The limit for "mixed mode" aliases may be resolvable by internally processing the constants (explicitly specified IP addresses) separately from the variable entries (FQDNs), then combining the two "sub-aliases". The constant portion only needs to be evaluated at alias editing or program boot-up. Optimisation to group longer runs of consecutive addresses into networks would be possible.

                                        • Regression analysis, I suspect, would show that the progressive slowing of alias updates and the eventual lock-up investigated in this thread are a result of earlier work to try to avoid the limitations of implicit duplicate removal addressed in https://redmine.pfsense.org/issues/9296

                                        • A solution may be to pre-process so that additions to filterdns are only made when the duplicate count for that address goes from 0->1, and similarly deletions from filterdns are only made when the duplicate count for that address goes from 1->0. Other duplicate count transitions (1<->2<->3 etc.) result in no filterdns calls.

                                        • Alternatively, incremental deletes could be avoided completely by doing all updates as a full alias rebuild to a temporary variable, then fully replacing the old with the new.
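
The duplicate-count idea in the list above can be sketched as a small reference-counted set. This only illustrates the 0->1 and 1->0 rule; it is not how filterdns works internally. The class name is hypothetical and the `calls` list merely stands in for real table updates.

```python
from collections import Counter


class RefCountedTable:
    """Track how many aliases reference each address (hypothetical helper)."""

    def __init__(self):
        self.counts = Counter()
        self.calls = []  # stand-in for real add/delete operations on the table

    def add(self, address):
        self.counts[address] += 1
        if self.counts[address] == 1:  # 0 -> 1: first reference, really add
            self.calls.append(("add", address))

    def remove(self, address):
        if self.counts[address] == 0:  # never added, nothing to do
            return
        self.counts[address] -= 1
        if self.counts[address] == 0:  # 1 -> 0: last reference gone, really delete
            del self.counts[address]
            self.calls.append(("delete", address))


table = RefCountedTable()
table.add("192.0.2.1")     # referenced by a first alias
table.add("192.0.2.1")     # duplicate from a second alias, no table call
table.remove("192.0.2.1")  # one alias drops it, address must stay
print(table.calls)         # [('add', '192.0.2.1')]
table.remove("192.0.2.1")  # last reference removed
print(table.calls)         # [('add', '192.0.2.1'), ('delete', '192.0.2.1')]
```

With this bookkeeping an address shared by two aliases stays in the table until both have dropped it, which is the case the duplicate-removal bug gets wrong.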

                                        None of which sound trivial to me.
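
The first suggestion, splitting constants from FQDNs and grouping consecutive addresses into networks, can at least be prototyped outside pfSense. This is a sketch using Python's standard `ipaddress` module; the function names are hypothetical, the sample alias is an assumed example, and only IPv4 entries are handled.

```python
import ipaddress


def split_alias(entries):
    """Separate literal IPv4 addresses/networks from hostnames."""
    static, dynamic = [], []
    for entry in entries:
        try:
            static.append(ipaddress.IPv4Network(entry, strict=False))
        except ValueError:
            dynamic.append(entry)  # not an IPv4 literal, treat as an FQDN
    return static, dynamic


def collapse_static(networks):
    """Merge adjacent or overlapping networks into the fewest CIDR blocks."""
    return [str(net) for net in ipaddress.collapse_addresses(networks)]


# Assumed sample: one placeholder FQDN plus 1024 consecutive addresses.
start = ipaddress.IPv4Address("123.123.120.0")
alias = ["example.com"] + [str(start + i) for i in range(1024)]

static, dynamic = split_alias(alias)
print(collapse_static(static))  # ['123.123.120.0/22']
print(dynamic)                  # ['example.com']
```

In this example 1024 host entries reduce to a single /22, leaving only the FQDN as a variable entry.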

                                        Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.