Netgate Discussion Forum

    NTP problems

    General pfSense Questions
    20 Posts 3 Posters 3.9k Views
    • rct

      Hello,

      For some time now, my backup pfSense server has failed to synchronize its clock with public NTP servers.
      My main server has no such problem, and its configuration is synchronized to the backup.
      pfSense serves NTP to the local machines.

      I've checked DNS and the NTP servers, and everything appears identical on the two pfSense servers.

      When I run ntpq -np on the backup, it returns this:

           remote           refid      st t when poll reach   delay   offset  jitter
      ==============================================================================
       194.2.0.58      .INIT.          16 u    -  512    0    0.000    0.000   0.000
       194.2.0.28      .INIT.          16 u    -  512    0    0.000    0.000   0.000
      

      Do you have any idea what to do or what to check to get it working again?
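
For anyone who wants to check this state from a script (for example a monitoring plugin), here is a minimal sketch that parses `ntpq -np` output like the table above. The names `Peer`, `parse_ntpq_peers` and `sync_state` are made up for this illustration, and the "OK"/"CRITICAL" labels are an assumption about how you might report it, not something pfSense or Nagios defines. It assumes numeric (`-n`) output and treats a leading `*` as the mark of the selected system peer.

```python
from dataclasses import dataclass
from typing import List

# Characters ntpq may print in front of the remote address (selection tally).
TALLY_CHARS = "*+-#xo.~"


@dataclass
class Peer:
    tally: str    # ' ' when no tally character is present, '*' for the system peer
    remote: str
    refid: str
    stratum: int
    reach: int    # ntpq prints the reach register in octal


def parse_ntpq_peers(text: str) -> List[Peer]:
    """Parse the peer lines of `ntpq -np` output, skipping the header and ruler."""
    peers = []
    for line in text.splitlines():
        s = line.strip()
        if not s or s.startswith("remote") or s.startswith("="):
            continue
        tally, rest = (s[0], s[1:]) if s[0] in TALLY_CHARS else (" ", s)
        fields = rest.split()
        if len(fields) != 10:
            continue  # not a peer line in the expected 10-column layout
        peers.append(Peer(tally, fields[0], fields[1], int(fields[2]), int(fields[6], 8)))
    return peers


def sync_state(peers: List[Peer]) -> str:
    """Hypothetical summary: OK only if some peer is marked as the system peer."""
    return "OK" if any(p.tally == "*" for p in peers) else "CRITICAL"


SAMPLE = """
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 194.2.0.58      .INIT.          16 u    -  512    0    0.000    0.000   0.000
 194.2.0.28      .INIT.          16 u    -  512    0    0.000    0.000   0.000
"""

if __name__ == "__main__":
    peers = parse_ntpq_peers(SAMPLE)
    for p in peers:
        print(p.remote, "stratum", p.stratum, "reach", p.reach)
    print(sync_state(peers))
```

With the sample from this post, both peers come back with stratum 16 and reach 0, meaning no reply has been received from either server yet.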

      • johnpoz LAYER 8 Global Moderator

        And can you query those NTP servers, ntp0.oleane.net and ntp1.oleane.net?

        From ntpq, run the as command. What does that show?

        An intelligent man is sometimes forced to be drunk to spend time with his fools
        If you get confused: Listen to the Music Play
        Please don't Chat/PM me for help, unless mod related
        SG-4860 24.11 | Lab VMs 2.8, 24.11

        • rct

          Hello,

          Thanks for your answer.

          ntpq returns this:

          On the backup server (sync failing):

          ind assID status  conf reach auth condition  last_event cnt
          ===========================================================
            1 32520  8011   yes   yes  none    reject    IP error  1
            2 32521  8011   yes   yes  none    reject    IP error  1
          

          On the main server (sync ok):

          ind assID status  conf reach auth condition  last_event cnt
          ===========================================================
            1 37584  8011   yes   yes  none    reject    IP error  1
            2 37585  963a   yes   yes  none  sys.peer              3
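
The status column in these tables is a 16-bit peer status word. A rough sketch of how it can be unpacked, assuming the bit layout given in the NTP reference documentation (five status flags, a 3-bit selection code, a 4-bit event count and a 4-bit event code). The function name `decode_peer_status` is hypothetical, and the selection names follow the documentation, so an older ntpq may print slightly different spellings.

```python
# Selection codes 0..7, in the order given by the NTP documentation.
SELECT_NAMES = [
    "reject", "falsetick", "excess", "outlier",
    "candidate", "backup", "sys.peer", "pps.peer",
]


def decode_peer_status(word: int) -> dict:
    """Split a 16-bit peer status word into its documented fields."""
    high = (word >> 8) & 0xFF
    return {
        "configured": bool(high & 0x80),
        "auth_enabled": bool(high & 0x40),
        "authenticated": bool(high & 0x20),
        "reachable": bool(high & 0x10),
        "broadcast": bool(high & 0x08),
        "select": SELECT_NAMES[high & 0x07],
        "event_count": (word >> 4) & 0x0F,
        "event_code": word & 0x0F,
    }


if __name__ == "__main__":
    for status in (0x8011, 0x963A):
        print(format(status, "04x"), decode_peer_status(status))
```

Read this way, 963a decodes as configured, reachable and selected as sys.peer with 3 events, which matches the table. 8011 decodes as configured and rejected with the reachable bit clear, which would agree with the reach value of 0 in the ntpq -np output even though the reach column printed here says yes. Treat the decoding as a hint rather than a diagnosis.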
          
          • rct

            From the ntpd.log:

            On the backup server:

            Aug 25 12:19:48 poseidon2 ntpd[70503]: Listening on routing socket on fd #35 for interface updates
            Aug 25 12:20:02 poseidon2 ntpd[70503]: ntpd exiting on signal 15 (Terminated: 15)
            Aug 25 12:20:23 poseidon2 ntpdate[32067]: adjust time server 194.2.0.28 offset -0.000273 sec
            Aug 25 12:20:23 poseidon2 ntp: Successfully synced time after 1 attempts.
            Aug 25 12:20:23 poseidon2 ntp: Starting NTP Daemon.
            

            On the main server:

            Aug 21 19:05:09 poseidon1 ntpd[57849]: Listening on routing socket on fd #35 for interface updates
            Aug 21 19:05:09 poseidon1 ntpd[57849]: restrict default: KOD does nothing without LIMITED.
            Aug 21 19:05:09 poseidon1 ntpd[57849]: restrict ::: KOD does nothing without LIMITED.
            Aug 25 03:03:02 poseidon1 ntpdate[12313]: can't find host ntp1.oleane.net
            Aug 25 03:03:03 poseidon1 ntpdate[12313]: adjust time server 194.2.0.28 offset 0.043526 sec
            Aug 25 03:03:03 poseidon1 ntp: Successfully synced time after 1 attempts.
            Aug 25 03:03:03 poseidon1 ntp: Starting NTP Daemon.
            

            According to the logs, the NTP sync seems to be fine on the backup server, yet that is the one that does not work.
            I don't know what to make of that…

            • rct

              I tried changing the NTP servers, since one of the previous ones seems to be down.

              After the change, the main server synced quickly, and the backup still fails with the same errors.

              • AIMS-Informatique

                Does your backup have a dedicated WAN IP (apart from the CARP/VIP shared by the Backup + Main cluster)?

                It looks like when your Backup sends the request, the Main receives the response from the NTP server, as if the Backup cannot handle the whole connection because it is "speaking" on the Internet with your Main's WAN IP address.

                • rct

                  Edit: yes AIMS, it looks like you've put your finger on it.

                  Hmm, I think I've found why it doesn't work: the main and backup servers have a VIP (CARP) on each interface.
                  It looks like the NTP request issued from the backup uses the LAN interface's VIP as its source address and is then NATted to the WAN interface's VIP, so the return packet cannot find its way back.

                  I could modify the NAT to use the backup server's dedicated WAN IP address, but I think the reply would then try to reach the LAN VIP, and it would fail again.

                  Does anyone have an idea of how to make this work?

                  • johnpoz LAYER 8 Global Moderator

                    See the reject...

                    Yeah, that is not going to work ;) Now we need to figure out why it is rejected, or what is causing the IP error. Normally you should set up more than one NTP server as your sources. Could be a state issue? Just spitballing here. Do both pfSense WANs go through a NAT, or do they have their own public IPs?

                    Add more than one server. Using two different names/IPs for the same source is not really a good setup. Pick some different NTP servers in your region, three or four of them even. Use the pool if you want. This way, even if you have issues with one source, NTP on pfSense will just sync with one of the others.

                    You can run some direct queries against the NTP servers you pick, and do some higher-level troubleshooting into why they might not be answering you, or might even be rejecting you for a specific reason. I am not really familiar with the IP error. But knowing whether your pfSense boxes come from different public IPs or the same one could help us figure out why pfSense 1 can sync to NTP server A but pfSense 2 cannot.

                    Edit: Ah, looks like you have a CARP setup. Yeah, that looks like the root of the issue.
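
A direct query of the kind suggested above can be sketched in a few lines, assuming plain SNTP over UDP port 123 with a 48-byte client packet. The helper names (`build_request`, `parse_response`, `query`) and the `source_ip` parameter are made up for this example. Binding to a specific local address is the interesting part in a CARP setup, because it lets you compare a query sent from the unit's own WAN address with one sent from a shared VIP.

```python
import socket
import struct
from typing import Optional

NTP_EPOCH_OFFSET = 2208988800  # seconds between 1900-01-01 (NTP) and 1970-01-01 (Unix)


def build_request() -> bytes:
    """48-byte SNTP client request: LI=0, VN=4, Mode=3, all other fields zero."""
    return bytes([0x23]) + bytes(47)


def parse_response(data: bytes) -> dict:
    """Extract a few header fields and the transmit timestamp from a reply."""
    if len(data) < 48:
        raise ValueError("short NTP packet")
    first, stratum = data[0], data[1]
    tx_sec, tx_frac = struct.unpack("!II", data[40:48])
    return {
        "leap": first >> 6,
        "version": (first >> 3) & 0x07,
        "mode": first & 0x07,
        "stratum": stratum,
        "unix_time": tx_sec - NTP_EPOCH_OFFSET + tx_frac / 2**32,
    }


def query(server: str, source_ip: Optional[str] = None, timeout: float = 3.0) -> dict:
    """Send one request, optionally from a chosen local address, and parse the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        if source_ip is not None:
            sock.bind((source_ip, 0))  # pick the source address, let the OS pick the port
        sock.sendto(build_request(), (server, 123))
        data, _ = sock.recvfrom(512)
    return parse_response(data)

# Example (needs network access; the source address is a placeholder):
#   print(query("0.fr.pool.ntp.org", source_ip="<dedicated WAN IP of this unit>"))
```

If the query times out when bound to one address and succeeds when bound to another, that points at routing or NAT of the reply rather than at the NTP server itself.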

                    • AIMS-Informatique

                      The design in HA mode is:
                      1 WAN IP for each PF (Main and Backup), plus a CARP (or VIP, depending on the L2/L3 architecture you chose). So in an HA infrastructure, you'll need 3 IPs from your ISP. (And the same on the LAN side!)

                      Moreover, you'll need a dedicated "Sync" interface for the firewalls to talk and exchange their states (this IP is used in HighAvailability -> Synchronize Config to IP).

                      Don't mess with NAT: an HA infrastructure does not rely on NAT. It will propagate your NAT settings from one unit to the other, but it won't use NAT for HA itself!

                      So here is the workaround:
                      WANMAIN1 : 80.90.150.120 (for PF internals / HA purposes; don't declare this IP in any DNS, declare your CARP IP)
                      WANBKP1 : 80.90.150.121 (for PF internals / HA purposes; don't declare this IP in any DNS, declare your CARP IP)
                      WANSHARED (CARP L2 / VIP L3) : 80.90.150.122 (which will be the only one "shown" on the Internet)

                      Same on the LAN side.

                      Use this document to get started with HA: https://doc.pfsense.org/index.php/Configuring_pfSense_Hardware_Redundancy_%28CARP%29

                      • rct

                        I will check this and make the appropriate changes when that is possible.

                        Thank you

                        • AIMS-Informatique

                          Let us know!

                          • rct

                            I just checked the document you linked, and we already have this configuration.
                            For a production system, I'm happy not to have to change too many things!

                            But my problem remains! :)

                            • AIMS-Informatique

                              Can you send us a traceroute to 8.8.8.8 from both the Main and Bkp PF?

                              • rct

                                Thank you too, johnpoz; I didn't see your second message.

                                I'm really bad at working with NTP… I don't know why, but I never really understood how it works or how to debug it.

                                I don't really know the best way to make direct queries to the NTP servers, but I tried this:

                                • stopped ntpd on the backup server
                                • ran ntpdate 0.fr.pool.ntp.org, which succeeded: 25 Aug 18:27:25 ntpdate[84361]: adjust time server 37.59.25.31 offset 0.014577 sec
                                • restarted ntpd on the backup server
                                  => it stays rejected with the same error, but I think my test is simply not a good one (I didn't see any communication on port 123 with ntpdate, whereas I have seen that ntpd does use port 123…)

                                @AIMS
                                From main server
                                traceroute to 8.8.8.8 (8.8.8.8), 64 hops max, 52 byte packets
                                1  rev-XXX-XXX-XXX.isp3.alsatis.net (XXX.XXX.XXX.17)  0.247 ms  0.161 ms  0.098 ms
                                2  rev-97-143-19.isp1.alsatis.net (92.245.143.97)  0.564 ms  0.614 ms  0.619 ms
                                3  ge-1-0-1.tcr1.bal.tls.core.as8218.eu (213.152.0.145)  43.086 ms  59.560 ms  17.504 ms
                                4  xe-1-3-0.tcr2.bal.tls.core.as8218.eu (83.167.56.242)  17.561 ms  17.495 ms  17.516 ms
                                5  xe-1-2-0.ter2.neodc.mpl.core.as8218.eu (83.167.55.52)  18.069 ms  21.494 ms  17.736 ms
                                6  xe-1-3-0.ter1.neodc.mpl.core.as8218.eu (83.167.55.48)  17.728 ms  17.703 ms  17.725 ms
                                7  xe-1-1-0.tcr1.sfr.mrs.core.as8218.eu (83.167.55.64)  17.884 ms  17.771 ms  17.717 ms
                                8  ae0.tcr1.sfr.lyn.core.as8218.eu (83.167.55.18)  17.596 ms  17.728 ms  17.591 ms
                                9  ae8.tcr1.rb.par.core.as8218.eu (83.167.55.12)  17.651 ms  17.934 ms  19.481 ms
                                10  ae3.tcr1.th2.par.core.as8218.eu (83.167.56.221)  17.817 ms
                                    ae0.tcr1.th2.par.core.as8218.eu (83.167.55.22)  17.764 ms
                                    et-1-0-0.tcr2.rb.par.core.as8218.eu (83.167.55.149)  17.587 ms
                                11  et-1-0-0.tcr2.th2.par.core.as8218.eu (83.167.55.47)  17.677 ms  17.693 ms  17.592 ms
                                12  213.152.30.17.static.not.updated.as8218.eu (213.152.30.17)  18.050 ms  17.964 ms  17.890 ms
                                13  72.14.239.145 (72.14.239.145)  18.361 ms
                                    72.14.239.205 (72.14.239.205)  18.283 ms  18.270 ms
                                14  209.85.245.72 (209.85.245.72)  20.166 ms  18.489 ms
                                    209.85.245.70 (209.85.245.70)  18.115 ms
                                15  209.85.254.62 (209.85.254.62)  23.252 ms
                                    216.239.43.233 (216.239.43.233)  37.423 ms  48.552 ms
                                16  google-public-dns-a.google.com (8.8.8.8)  22.598 ms  22.724 ms  22.945 ms

                                From backup server:
                                traceroute to 8.8.8.8 (8.8.8.8), 64 hops max, 52 byte packets
                                1  rev-XXX-XXX-XXX.isp3.alsatis.net (XXX-XXX-XXX.17)  0.225 ms  0.179 ms  0.091 ms
                                2  rev-97-143-19.isp1.alsatis.net (92.245.143.97)  0.565 ms  0.649 ms  0.586 ms
                                3  ge-1-0-1.tcr1.bal.tls.core.as8218.eu (213.152.0.145)  17.749 ms  17.775 ms  17.686 ms
                                4  xe-1-3-0.tcr2.bal.tls.core.as8218.eu (83.167.56.242)  18.991 ms  55.620 ms  17.679 ms
                                5  xe-1-2-0.ter2.neodc.mpl.core.as8218.eu (83.167.55.52)  43.218 ms  17.725 ms  17.683 ms
                                6  xe-1-3-0.ter1.neodc.mpl.core.as8218.eu (83.167.55.48)  20.209 ms  17.745 ms  17.755 ms
                                7  xe-1-1-0.tcr1.sfr.mrs.core.as8218.eu (83.167.55.64)  17.555 ms  52.417 ms  17.544 ms
                                8  ae0.tcr1.sfr.lyn.core.as8218.eu (83.167.55.18)  17.534 ms  17.521 ms  17.559 ms
                                9  ae8.tcr1.rb.par.core.as8218.eu (83.167.55.12)  17.625 ms  17.638 ms  17.620 ms
                                10  ae3.tcr1.th2.par.core.as8218.eu (83.167.56.221)  17.711 ms
                                    ae0.tcr1.th2.par.core.as8218.eu (83.167.55.22)  19.383 ms
                                    ae3.tcr1.th2.par.core.as8218.eu (83.167.56.221)  17.661 ms
                                11  et-1-0-0.tcr2.th2.par.core.as8218.eu (83.167.55.47)  17.755 ms  17.730 ms  17.706 ms
                                12  213.152.30.17.static.not.updated.as8218.eu (213.152.30.17)  17.894 ms  17.807 ms  17.961 ms
                                13  72.14.239.205 (72.14.239.205)  18.128 ms
                                    72.14.239.145 (72.14.239.145)  18.090 ms
                                    72.14.239.205 (72.14.239.205)  18.137 ms
                                14  209.85.245.81 (209.85.245.81)  18.434 ms
                                    209.85.245.72 (209.85.245.72)  18.530 ms
                                    209.85.245.81 (209.85.245.81)  18.457 ms
                                15  209.85.242.132 (209.85.242.132)  22.975 ms
                                    209.85.249.16 (209.85.249.16)  24.813 ms
                                    209.85.248.202 (209.85.248.202)  24.799 ms
                                16  google-public-dns-a.google.com (8.8.8.8)  22.569 ms  22.498 ms
                                    209.85.250.163 (209.85.250.163)  22.973 ms

                                • AIMS-Informatique

                                  You are going out with the same IP, and because your Main is the "Master" unit, it receives the responses to the Bkp's requests.

                                  When you traceroute from a PF unit, it uses its internal (default) GW for 127.0.0.1 (generally the first WAN configured, which is your default GW in the Routing menu).

                                  To me, it looks like you configured your 2 firewalls with the same IP. It works because unit 1 (Main) is set as Master in your HA cluster config. But whenever you test from your Bkp unit, it will fail to receive packets, because the Master receives them instead. Try making your Bkp unit the Master: it should then get its time from your NTP pool.

                                  Are you sure you are using CARP / VIP for the shared WAN IP?

                                  Both firewalls should show you 2 different IPs for the first hop (when tracing a route to somewhere). Traffic should never go out with the CARP IP. What makes the firewall use its CARP (failover) IP is a 1:1 NAT, or AON with a rule NATting something from the LAN to this CARP IP.

                                  You might have missed something in the HA design?

                                  • rct

                                    The traceroute's first-hop IP, XXX.XXX.XXX.17, is the WAN1 provider's router.
                                    Each pfSense has 2 WAN interfaces (WAN1 and WAN2), and each WAN[1|2] interface has a different IP on the main and backup servers.
                                    The shared WAN IP uses CARP.
                                    Each internal network that is authorized to go out on the WAN has an advanced outbound NAT rule.

                                    I reviewed the HA design and, except that we don't run a DHCP server on pfSense, it seems to be OK.

                                    As you said, making the backup server the master makes time sync work!
                                    I'm not really surprised by that, because when I traced NTP packets on the backup server I saw that they were transmitted with our LAN VIP address and were NATted to our WAN1 (active gateway) VIP address. Those packets can't find their way back to the backup server. If they were transmitted from localhost, we could add another NAT rule so that they are NATted to the main/backup server's own WAN address, and it should be OK.
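
The reasoning in this post can be written down as a toy model. This is only an illustration of the logic, not of how pfSense or CARP is implemented: it assumes that a reply addressed to the shared WAN VIP is delivered to whichever unit currently holds CARP master, and that a reply addressed to a unit's own WAN address is delivered to that unit. The addresses are the example ones from AIMS's HA layout in this thread, not the real ones, and `Unit` and `reply_receiver` are names invented for the sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Unit:
    name: str
    wan_ip: str           # dedicated WAN address of this unit
    is_carp_master: bool  # whether this unit currently holds the shared VIP


def reply_receiver(units: List[Unit], wan_vip: str, nat_source: str) -> Optional[str]:
    """Name of the unit that would receive a reply sent to nat_source (toy model)."""
    for unit in units:
        if nat_source == unit.wan_ip:
            return unit.name          # reply goes straight to that unit
    if nat_source == wan_vip:
        for unit in units:
            if unit.is_carp_master:
                return unit.name      # the VIP is answered by the current master
    return None                       # address not owned by either unit


WAN_VIP = "80.90.150.122"
UNITS = [
    Unit("main", "80.90.150.120", is_carp_master=True),
    Unit("backup", "80.90.150.121", is_carp_master=False),
]

if __name__ == "__main__":
    # Backup's NTP request NATted to the shared VIP: the reply lands on the master.
    print(reply_receiver(UNITS, WAN_VIP, nat_source=WAN_VIP))
    # Same request NATted to the backup's own WAN address: the reply comes back to it.
    print(reply_receiver(UNITS, WAN_VIP, nat_source="80.90.150.121"))
```

Under these assumptions the model reproduces what was observed: while the main unit is master, replies to requests NATted to the VIP never reach the backup, and once the backup becomes master they do.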

                                    • AIMS-Informatique

                                      Didn't catch that you had 2 WANs on each…
                                      The HA design you opted for provides HA for the PF boxes (not for the traffic). And I understand now that you need link redundancy as well... OK.

                                      You should first test whether your cluster is OK with 1 link. Then we will see how you can add another link for traffic redundancy.
                                      Basically, take WAN2 offline on both units and delete the NAT entries / rules related to WAN2. Take a basic switch, plug in your 2 WAN1 ports (Master and Slave) and your ISP box. You should end up with 3 cables plugged into the switch.

                                      Then test whether your CARP works and whether you can get the Bkp (Slave) unit to work on the Internet.

                                      Can you tell us precisely what your ISP delivers to you? 2 xDSL links + 3 IPs on each? The same IPs or not?

                                      • rct

                                        AIMS, my English is not good, and I'm not sure whether you misunderstood me or I misunderstand you :)

                                        We have 2 pfSense boxes with several internal networks (LAN, DMZ, …) and two different WANs (different ISPs).
                                        They are configured to fail over if one of the boxes goes down. We have one main WAN and a backup WAN configured to fail over if the main WAN goes down.

                                        We have no problem with our setup, except that for some time the backup server has shown as not syncing time (and I now think the problem has always been there, and that a recent update of the Nagios plugins is what pointed us to it).

                                        To answer your question about the ISPs, we have:

                                        • ISP1: 1 IP for each pfSense box and 1 IP for CARP / SDSL
                                        • ISP2: 1 IP for each pfSense box and 1 IP for CARP / SDSL
                                          These are 2 different IP ranges.

                                        The problem mostly causes an annoying Nagios message, but as we've seen, when the backup server becomes master the NTP sync starts working.

                                        • AIMS-Informatique

                                          :) I'm not English either, so…

                                          First things first... Try to get the system working with only 1 ISP. (That was the meaning of my last post.)

                                          The goal is to check that your HA design doesn't interfere with your problem. I think the problem is not related to your HA design (Master/Slave) but to the way you want to handle the multi-WAN.

                                          Try to get things working with 1 ISP, and then we will move on to adding the second one.

                                          • rct

                                            I will try that later, because I really can't right now.
                                            Thanks for helping.
