Netgate Discussion Forum

    SMTP Notification issue since upgrade from 2.7.2 to 2.8

    General pfSense Questions
    21 Posts 3 Posters 709 Views
    • Gertjan @EyeTap

      @EyeTap said in SMTP Notification issue since upgrade from 2.7.2 to 2.8:

      smtp server residing at the provider which is reached via WAN1

      Do you know which mail port is used?
      If it's "25", then it's normal that you can't send mail when "WAN1" is down: the outgoing mail would go over WAN2, and from WAN2 (another ISP?!) you are not allowed to reach your WAN1 ISP's mail server.
      This is pretty standard behavior.

      Or do you use submission (port 587) or smtps (port 465)? In that case, you could send mail using the ISP providing WAN1, over WAN2.

      @EyeTap said in SMTP Notification issue since upgrade from 2.7.2 to 2.8:

      (as the provider might not accept smtp from ip addresses from outside of his range)

      Exactly, so you're aware of this gotcha.
      Not really a problem, as sending mail using port "25" should be 'forbidden' anyway. Use port 587 or 465 (conditions might apply) and suddenly you can send mail using your ISP from everywhere on the planet.
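
Whether the provider's submission port is reachable over a particular WAN can be checked with a plain TCP connect bound to that WAN's address. The following is a minimal sketch, not part of the original thread: the function name `can_connect` and the host and address in the usage note are hypothetical placeholders, and a successful connect only shows that the port is reachable, not that the server will accept mail.

```python
import socket


def can_connect(host, port, source_ip=None, timeout=5.0):
    """Return True if a TCP connection to (host, port) can be opened.

    source_ip (optional) binds the outgoing socket to a local address,
    which is one way to exercise a specific WAN on a multi-WAN host.
    """
    source = (source_ip, 0) if source_ip else None
    try:
        with socket.create_connection(
            (host, port), timeout=timeout, source_address=source
        ):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unroutable sources.
        return False
```

A call such as `can_connect("smtp.example.net", 587, source_ip="198.51.100.10")` (both values are placeholders) would then indicate whether port 587 answers when the connection originates from that address.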

      @EyeTap said in SMTP Notification issue since upgrade from 2.7.2 to 2.8:

      Jun 27 03:49:20 php-cgi 17083 notify_monitor.php: Message sent to my@emailaddress.at OK
      Jun 27 03:48:50 php-cgi 17083 notify_monitor.php: Could not send the message to my@emailaddress.at -- Error: Failed to add recipient: my@emailaddress.at [SMTP: Invalid response code received from server (code: -1, response: )]

      At 03:48:50 the mail couldn't be sent,
      but 30 seconds later, at 03:49:20, it was sent.

      So, did a working gateway come up, or did gateway failover start to work?

      No "help me" PM's please. Use the forum, the community will thank you.
      Edit : and where are the logs ??

      • EyeTap @Gertjan

        @Gertjan said in SMTP Notification issue since upgrade from 2.7.2 to 2.8:

        @EyeTap said in SMTP Notification issue since upgrade from 2.7.2 to 2.8:

        smtp server residing at the provider which is reached via WAN1

        Do you know which mail port is used?
        If it's "25", then it's normal that you can't send mail when "WAN1" is down: the outgoing mail would go over WAN2, and from WAN2 (another ISP?!) you are not allowed to reach your WAN1 ISP's mail server.
        This is pretty standard behavior.

        I fear we have a misunderstanding. What I tried to outline is that if WAN2 (backup only) goes down / is omitted from the gateway group, notification mails aren't sent (port 587), even though they should be transferred via WAN1 (main line, never went down) anyway.

        Or do you use submission (port 587) or smtps (port 465)? In that case, you could send mail using the ISP providing WAN1, over WAN2.

        Port 587 is used.

        @EyeTap said in SMTP Notification issue since upgrade from 2.7.2 to 2.8:

        (as the provider might not accept smtp from ip addresses from outside of his range)

        Exactly, so you're aware of this gotcha.
        Not really a problem, as sending mail using port "25" should be 'forbidden' anyway. Use port 587 or 465 (conditions might apply) and suddenly you can send mail using your ISP from everywhere on the planet.

        Agreed - but still...

        @EyeTap said in SMTP Notification issue since upgrade from 2.7.2 to 2.8:

        Jun 27 03:49:20 php-cgi 17083 notify_monitor.php: Message sent to my@emailaddress.at OK
        Jun 27 03:48:50 php-cgi 17083 notify_monitor.php: Could not send the message to my@emailaddress.at -- Error: Failed to add recipient: my@emailaddress.at [SMTP: Invalid response code received from server (code: -1, response: )]

        At 03:48:50 the mail couldn't be sent,
        but 30 seconds later, at 03:49:20, it was sent.

        So isn't that odd, given that the main line WAN1, which connects to the provider with the SMTP service, didn't change at all?

        So, did a working gateway come up, or did gateway failover start to work?

        Well, the basically unused backup WAN2 came up. That's all, and that's what confuses me so much...

        • Gertjan @EyeTap

          @EyeTap said in SMTP Notification issue since upgrade from 2.7.2 to 2.8:

          I fear we have a misunderstanding. What I tried to outline is that if WAN2 (backup only) goes down / is omitted from the gateway group,

          OK, I get it (better ^^):
          An unused (non-essential) WAN2, which is part of the gateway group, goes down while WAN1 is still up.
          I agree, sending outgoing SMTP mail should be possible.

          When WAN2 is down, what happens (see also the System log) when you press this button:

          2345c3eb-f275-4351-9e1b-4b7b330631bf-image.png

          ?

          • EyeTap @Gertjan

            @Gertjan
            I tested by switching off the provider gateway / modem for WAN2. Sending a test message works flawlessly in this state.
            Neither a "gateway down" nor a "gateway up" notification is sent, though.

            Here are the log contents when the gateway went down:

            f9ea9079-4bde-41ee-854c-1af4e1dcb38b-image.png

            Here are the log contents when the gateway went down (not sure what those "sendto error: 13" entries mean, though. They show up quite a few times between the two parts.):

            3bc65185-57e5-4662-9811-71483f01e9cc-image.png

            • EyeTap @EyeTap

              @Gertjan

              Just noticed something strange: when WAN2 recovers, this impacts WAN1 connections.
              I can even see this in the monitoring charts of the WAN1/WAN2 interface traffic.
              (I would assume that, due to some counter reset, the data is skipped because the values would otherwise run backwards.)

              Not really what I would expect of a backup line, though. But maybe I should reread the release notes of 2.8 again. I have something in the back of my mind about flushing states on gateway recovery being mentioned...

              380afe52-fd83-4747-941d-990ae360eea1-image.png

              • Gertjan @EyeTap

                @EyeTap

                These "sendto error 13" are puzzling.
                It's FreeBSD so this, afaik, applies.

                13 EACCES Permission denied. An attempt was made to access a file in a
                way forbidden by its file access permission

                A permission error ?

                Something is 'not good' about the gateways ... Can't say more.
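
The errno lookup quoted above can be reproduced with Python's standard `errno` module. This is only an illustrative sketch (the helper name `describe_errno` is made up here). As background, and as an assumption to verify rather than a diagnosis: on FreeBSD, `sendto` failing with EACCES is commonly reported when a local packet filter rule blocks the outbound packet, so the "file access" wording of the generic description can be misleading for sockets.

```python
import errno
import os


def describe_errno(code):
    """Format an errno value as '<number> <symbol>: <system message>'."""
    symbol = errno.errorcode.get(code, "UNKNOWN")
    return f"{code} {symbol}: {os.strerror(code)}"


# Look up the code seen in the "sendto error: 13" log entries.
print(describe_errno(13))
```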

                • stephenw10 Netgate Administrator

                  Snort/Suricata blocking it?

                  • EyeTap @stephenw10

                    @stephenw10
                    Nope, definitely not. Neither of those installed…
                    In any case it’s confusing me that WAN2 outages / recoveries impact connections being active on WAN1…
                    From my understanding that shouldn’t happen…
                    I’ve never seen that on 2.7.2 or earlier…

                    • stephenw10 Netgate Administrator

                      Hmm, have you tried changing the firewall state policy back to floating as a test?

                      Hard to see how that would end up with blocked traffic since all outbound traffic is allowed by default but that is something that's changed in 2.8:
                      https://docs.netgate.com/pfsense/en/latest/releases/2-8-0.html#general

                      Unless you added a custom block rule outbound maybe?

                      • EyeTap @stephenw10

                        @stephenw10
                        Thanks for pointing that out- I will give this a try and give feedback once done.
                        Outbound block rules: yes, but for port 138 and similar. I don’t know them by heart right now…

                        • stephenw10 Netgate Administrator

                          Ah, if you have custom outbound block rules, then you may have been allowing the traffic in 2.7.2 by virtue of the floating state policy, and that is no longer present.

                          If that does turn out to be the case then you almost certainly have a bad rule that was just being hidden. Should be easy enough to find.

                          • EyeTap @stephenw10

                            @stephenw10

                            So, I went to System / Advanced / Firewall & NAT and switched Firewall State Policy from Interface Bound States to Floating States.
                            Subsequently I rebooted, just to be on the safe side.

                            I then tested by switching off the WAN2 provider modem again, and the test SMTP message arrived properly.
                            After switching the modem back on, it took a while until the respective gateway status switched from Unknown to Offline, but a mail was sent properly:

                            Notifications in this message: 1

                            19:20:26 MONITOR: WAN2_Gateway has packet loss, omitting from routing group MULTIPLE_WAN 9.9.9.10|192.168.100.19|WAN2_Gateway|0ms|0ms|100%|down|highloss

                            After the WAN2 came back functional also the "all good" mail was sent:

                            Notifications in this message: 1

                            19:21:26 MONITOR: WAN2_Gateway is available now, adding to routing group MULTIPLE_WAN 9.9.9.10|212.186.xx.yy|WAN2_Gateway|26.378ms|4.516ms|0.0%|online|none

                            What I noticed, though, is that for a brief time BOTH WAN1 and WAN2 were shown with status "unknown".

                            Again the traffic graph shows an interruption on both gateways.

                            Running another test with an active stream (should have used WAN1) worked without any interruption - contrary to yesterday (with Interface Bound States) where I was kicked from the stream.
                            Again the traffic graph still shows an interruption on both gateways.

                            I attached the log, maybe it helps....
                            pfs_280_1.txt

                            Here my rules:

                            920ad160-36e5-46d0-9656-2f179e4219a2-image.png

                            The Gateways:

                            703adc03-678e-4f36-8753-4ac454e6a0d9-image.png

                            Last but not least the Gateway groups:

                            73a03637-b0b9-43f6-97dd-fa9355dca7d4-image.png

                            Thanks a lot for your kind support!

                            • stephenw10 Netgate Administrator

                              What is your default gateway set to?

                              Do you have any floating outbound rules? That's what could cause a problem for SMTP connections from the firewall itself.

                              • EyeTap @stephenw10

                                @stephenw10

                                I implemented the CoDel limiter for both WAN lines, plus the fixes for traceroute and so on:

                                4cb4f0b9-e42d-4934-8310-236f1b2de4ca-image.png

                                • EyeTap @stephenw10

                                  @stephenw10

                                  Tested again now: no interruption on WAN1, and the mails reporting the status of WAN2 were sent properly...
                                  Maybe it really was Interface Bound States that caused the issues, although I wouldn't understand the reason. But I am by far no expert in this area, just glad when I get things working the way I hope they will....

                                  • stephenw10 Netgate Administrator

                                    Hmm, and you never saw any blocked outbound traffic with those sendto error 13 logs?

                                    I'd guess it's somehow reusing an open state on a different WAN, and that fails to match because the states are bound. But I can't see how that could happen.

                                    You could try setting those CoDel rules individually as floating (in the advanced options) whilst keeping the global option as interface bound as a test. That would prove it if the issue is there.

                                    • EyeTap @stephenw10

                                      @stephenw10 said in SMTP Notification issue since upgrade from 2.7.2 to 2.8:

                                      Hmm, and you never saw any blocked outbound traffic with those sendto error 13 logs?

                                      No. I didn't notice anything being blocked, but I have to admit that I didn't check the logs, and I usually don't log blocks anyway.
                                      But there was nothing obvious that I'd have noticed...

                                      I'd guess it's somehow reusing an open state on a different WAN, and that fails to match because the states are bound. But I can't see how that could happen.

                                      You could try setting those CoDel rules individually as floating (in the advanced options) whilst keeping the global option as interface bound as a test. That would prove it if the issue is there.

                                      I will give this a try but I have to ask for patience for a bit.. not sure I will manage to do tomorrow, the days after are rather busy, but on the weekend it should work out...

                                      • EyeTap @stephenw10

                                        @stephenw10 said in SMTP Notification issue since upgrade from 2.7.2 to 2.8:

                                        You could try setting those CoDel rules individually as floating (in the advanced options) whilst keeping the global option as interface bound as a test. That would prove it if the issue is there.

                                        OK, I found time to give that a try. I set the "State Policy" within the rule configuration of all floating rules to "Floating States" and subsequently set the "Firewall State Policy" back to "Interface Bound States".

                                        When I simulated an outage of WAN2 again (switching off the provider's modem (CPE?)), I received the mail once pfSense listed the status of WAN2 as "Offline".
                                        When the WAN2 status came "Online" again, I also received the respective mail.

                                        Strangely, though, at exactly the same time and in the same mail I was told that WAN1 had packet loss and was omitted from the routing group. Nevertheless, this is not reflected in the quality monitoring, nor did I notice any outage....

                                        Notifications in this message: 1
                                        ================================
                                        
                                        15:05:09 MONITOR: WAN2_Gateway has packet loss, omitting from routing group MULTIPLE_WAN 9.9.9.10|192.168.100.19|WAN2_Gateway|0ms|0ms|100%|down|highloss
                                        
                                        Notifications in this message: 2
                                        ================================
                                        
                                        15:06:10 MONITOR: WAN1_Gateway has packet loss, omitting from routing group MULTIPLE_WAN 1.1.1.3|93.83.uv.xy|WAN1_Gateway|7.788ms|0ms|50%|down|highloss
                                        15:06:10 MONITOR: WAN2_Gateway is available now, adding to routing group MULTIPLE_WAN 9.9.9.10|212.186.ab.cd|WAN2_Gateway|25.869ms|4.865ms|0.0%|online|none
                                        

                                        Giving this another try (switching off the CPE for WAN2 and on after a couple of minutes) I didn't get any notification for WAN1 (as expected)

                                        Notifications in this message: 1
                                        ================================
                                        
                                        15:29:01 MONITOR: WAN2_Gateway has packet loss, omitting from routing group MULTIPLE_WAN 9.9.9.10|192.168.100.19|WAN2_Gateway|0ms|0ms|100%|down|highloss
                                        
                                        Notifications in this message: 1
                                        ================================
                                        
                                        15:30:01 MONITOR: WAN2_Gateway is available now, adding to routing group MULTIPLE_WAN 9.9.9.10|212.186.ab.cd|WAN2_Gateway|25.295ms|4.465ms|0.0%|online|none
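
The MONITOR lines above end in a pipe-separated record, which makes them easy to compare across tests. The sketch below splits one such line into named fields. The field names are an assumption inferred from the values shown in this thread (monitor IP, source IP, gateway name, two latency figures, loss, status, substatus), not taken from pfSense documentation, and `parse_monitor_line` is a hypothetical helper.

```python
# Assumed meaning of the eight pipe-separated values (inferred from the
# sample lines in this thread, not from official documentation).
FIELDS = (
    "monitor_ip", "source_ip", "gateway", "rtt",
    "rtt_stddev", "loss", "status", "substatus",
)


def parse_monitor_line(line):
    """Split a notification line into its message text and record fields."""
    # The record is the last whitespace-delimited token on the line.
    message, _, record = line.strip().rpartition(" ")
    values = record.split("|")
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    parsed = dict(zip(FIELDS, values))
    parsed["message"] = message
    return parsed


sample = (
    "15:05:09 MONITOR: WAN2_Gateway has packet loss, omitting from routing "
    "group MULTIPLE_WAN 9.9.9.10|192.168.100.19|WAN2_Gateway|0ms|0ms|100%|down|highloss"
)
info = parse_monitor_line(sample)
print(info["gateway"], info["source_ip"], info["status"])
```

Parsed this way, the "down" record shows the private 192.168.100.19 source address, while the later "online" records show the public 212.186.ab.cd address.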
                                        

                                        Here the respective quality charts for WAN 1 and WAN 2:

                                        d2721793-8c3b-400c-bc64-9733a8618586-image.png

                                        bc31e319-7ee3-4ac7-916e-959cdf783694-image.png

                                        WAN 2 more details:

                                        835b5307-2e63-428a-a42d-c8e4a715e906-image.png

                                        What I noticed, though, and I hadn't truly been aware of this before: the WAN2 interface changes its IP from n/a to 192.168.100.19 (obviously when the port comes up, at which point the status changes to Offline / packet loss) and later to the official one, 212.186.ab.cd, before the status changes to "online" again a few seconds later. (The interface is set to DHCP as directed by the ISP.)
                                        Not sure if or to which extent this could add up to the observed behavior - so I thought I'd better mention explicitly..

                                        • stephenw10 Netgate Administrator

                                          Ah, that private IP might be coming from the modem before it syncs with the upstream connection. That's quite commonly provided so you can diagnose any issues. You can set the WAN to refuse leases from the local server, though.

                                          • EyeTap @stephenw10

                                            @stephenw10
                                            As long as this temporary private IP isn't causing issues I don't mind at all.

                                            At the end of the day what counts is, that things work properly - and even while the first test looked a bit odd it's looking like setting the floating rules' "State Policy" to "Floating States" did the trick...

                                            Thanks a lot for your precious help! 😄

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.