SMTP Notification issue since upgrade from 2.7.2 to 2.8
-
@stephenw10
Nope, definitely not. Neither of those installed…
In any case, it confuses me that WAN2 outages / recoveries affect connections that are active on WAN1…
From my understanding that shouldn’t happen…
I’ve never seen that on 2.7.2 or earlier…
-
Hmm, have you tried changing the firewall state policy back to floating as a test?
Hard to see how that would end up with blocked traffic, since all outbound traffic is allowed by default, but that is something that's changed in 2.8:
https://docs.netgate.com/pfsense/en/latest/releases/2-8-0.html#general
Unless you added a custom outbound block rule maybe?
-
@stephenw10
Thanks for pointing that out - I will give this a try and report back once done.
Outbound block rules - yes, but for port 138 and similar - I don’t know them by heart right now…
-
Ah, if you have custom outbound block rules then you may have been allowing the traffic in 2.7.2 by virtue of the floating state policy, and that is no longer present.
If that does turn out to be the case then you almost certainly have a bad rule that was just being hidden. Should be easy enough to find.
-
So.. I went to System / Advanced / Firewall & NAT and switched Firewall State Policy: Interface Bound States ==> Floating States.
Subsequently I rebooted - just to be on the safe side.
I then tested by switching off the WAN2 provider modem again, and the test SMTP mail arrived properly.
After switching the modem back on it took a while until the respective gateway status switched from Unknown to Offline - but a mail was sent properly:
Notifications in this message: 1
19:20:26 MONITOR: WAN2_Gateway has packet loss, omitting from routing group MULTIPLE_WAN 9.9.9.10|192.168.100.19|WAN2_Gateway|0ms|0ms|100%|down|highloss
After WAN2 came back up, the "all good" mail was sent as well:
Notifications in this message: 1
19:21:26 MONITOR: WAN2_Gateway is available now, adding to routing group MULTIPLE_WAN 9.9.9.10|212.186.xx.yy|WAN2_Gateway|26.378ms|4.516ms|0.0%|online|none
What I noticed though is that for a brief time BOTH WAN 1 and WAN 2 were shown with status "unknown".
Again the traffic graph shows an interruption on both gateways.
Running another test with an active stream (which should have used WAN1) worked without any interruption - contrary to yesterday (with Interface Bound States), when I was kicked from the stream.
Again the traffic graph still shows an interruption on both gateways.
I attached the log, maybe it helps....
pfs_280_1.txt
Here are my rules:
The Gateways:
Last but not least the Gateway groups:
Thanks a lot for your kind support!
-
What is your default gateway set to?
Do you have any floating outbound rules? That's what could cause a problem for SMTP connections from the firewall itself.
-
Tested again now - no interruption on WAN 1, mails about the status of WAN 2 sent properly...
Maybe it really was Interface Bound States that caused the issues.. although I wouldn't understand the reason.. but I am by faaaaaar no expert in this area.. just glad when I get things working the way I hope they do....
-
Hmm, and you never saw any blocked outbound traffic with those sendto error 13 logs?
I'd guess it's somehow reusing an open state on a different WAN, and that fails to match because the states are bound. But I can't see how that could happen.
You could try setting those CoDel rules individually as floating (in the advanced options) whilst keeping the global option as interface bound as a test. That would prove it if the issue is there.
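To make that guess concrete, here is a toy model of the difference between the two state policies. It is only a conceptual sketch, not how pf stores or looks up states; the `State` tuple, the `find_state` helper and the addresses (taken from documentation ranges) are all made up for illustration.

```python
from typing import NamedTuple, Optional, Sequence


class State(NamedTuple):
    """Hypothetical, simplified state entry (real pf states hold far more)."""
    iface: str   # interface the state was created on
    src: str
    dst: str
    dport: int


def find_state(states: Sequence[State], iface: str, src: str, dst: str,
               dport: int, policy: str) -> Optional[State]:
    """Return the first state matching the packet, or None.

    policy "floating": a state matches regardless of interface.
    policy "if-bound": a state matches only on the interface that created it.
    """
    for st in states:
        if (st.src, st.dst, st.dport) != (src, dst, dport):
            continue
        if policy == "if-bound" and st.iface != iface:
            continue
        return st
    return None


# A state created while the connection was leaving via WAN2 (made-up values).
states = [State("WAN2", "192.0.2.10", "203.0.113.25", 587)]

# The same connection now shows up on WAN1.
print(find_state(states, "WAN1", "192.0.2.10", "203.0.113.25", 587, "floating"))
print(find_state(states, "WAN1", "192.0.2.10", "203.0.113.25", 587, "if-bound"))  # -> None
```

With floating states the packet on WAN1 still finds the state created on WAN2; with interface-bound states it does not, so it would have to be evaluated against the rules again.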
-
@stephenw10 said in SMTP Notification issue since upgrade from 2.7.2 to 2.8:
Hmm, and you never saw any blocked outbound traffic with those sendto error 13 logs?
No. I didn't notice anything being blocked - but I have to admit that I didn't check the logs, and I usually don't log blocks anyway.
But nothing obvious that I'd have noticed...
I'd guess it's somehow reusing an open state on a different WAN, and that fails to match because the states are bound. But I can't see how that could happen.
You could try setting those CoDel rules individually as floating (in the advanced options) whilst keeping the global option as interface bound as a test. That would prove it if the issue is there.
I will give this a try but I have to ask for a bit of patience.. not sure I will manage to do it tomorrow, the days after are rather busy, but on the weekend it should work out...
-
@stephenw10 said in SMTP Notification issue since upgrade from 2.7.2 to 2.8:
You could try setting those CoDel rules individually as floating (in the advanced options) whilst keeping the global option as interface bound as a test. That would prove it if the issue is there.
OK, I found time to give that a try - I set the "State Policy" from within the rules configuration for all floating rules to "Floating States" and subsequently set the "Firewall State Policy" back again to "Interface Bound States".
When I simulated an outage of WAN2 again (switching off the provider's modem (CPE?)) I received the mail once pfSense listed the status of WAN2 as "Offline".
When the WAN2 status came back "Online" I also received the respective mail.
Strange though that at exactly the same time, in the same mail, I was told that WAN1 had packet loss and was omitted from the routing group. Nevertheless this is not reflected in the Quality Monitoring, nor did I notice any outage....
Notifications in this message: 1
================================
15:05:09 MONITOR: WAN2_Gateway has packet loss, omitting from routing group MULTIPLE_WAN 9.9.9.10|192.168.100.19|WAN2_Gateway|0ms|0ms|100%|down|highloss

Notifications in this message: 2
================================
15:06:10 MONITOR: WAN1_Gateway has packet loss, omitting from routing group MULTIPLE_WAN 1.1.1.3|93.83.uv.xy|WAN1_Gateway|7.788ms|0ms|50%|down|highloss
15:06:10 MONITOR: WAN2_Gateway is available now, adding to routing group MULTIPLE_WAN 9.9.9.10|212.186.ab.cd|WAN2_Gateway|25.869ms|4.865ms|0.0%|online|none
Giving this another try (switching off the CPE for WAN2 and on after a couple of minutes) I didn't get any notification for WAN1 (as expected)
Notifications in this message: 1
================================
15:29:01 MONITOR: WAN2_Gateway has packet loss, omitting from routing group MULTIPLE_WAN 9.9.9.10|192.168.100.19|WAN2_Gateway|0ms|0ms|100%|down|highloss

Notifications in this message: 1
================================
15:30:01 MONITOR: WAN2_Gateway is available now, adding to routing group MULTIPLE_WAN 9.9.9.10|212.186.ab.cd|WAN2_Gateway|25.295ms|4.465ms|0.0%|online|none
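For anyone comparing several of these mails, the pipe-delimited part at the end of each MONITOR line can be split into fields. This is a minimal sketch; the field names (monitor IP, source IP, gateway name, RTT, RTT deviation, loss, status, substatus) are my reading of the values in the mails above, not something confirmed in this thread.

```python
from dataclasses import dataclass


@dataclass
class GatewayStatus:
    # Field names are assumptions based on how the values read in the mails.
    monitor_ip: str
    source_ip: str       # kept as text; the public IPs above are redacted
    name: str
    rtt_ms: float
    rtt_stddev_ms: float
    loss_percent: float
    status: str
    substatus: str


def _number(value: str, unit: str) -> float:
    """Strip a trailing unit such as 'ms' or '%' and convert to float."""
    if value.endswith(unit):
        value = value[: -len(unit)]
    return float(value)


def parse_status(line: str) -> GatewayStatus:
    """Parse one 'a|b|c|...' status string with exactly eight fields."""
    parts = line.strip().split("|")
    if len(parts) != 8:
        raise ValueError(f"expected 8 fields, got {len(parts)}")
    monitor_ip, source_ip, name, rtt, stddev, loss, status, substatus = parts
    return GatewayStatus(
        monitor_ip=monitor_ip,
        source_ip=source_ip,
        name=name,
        rtt_ms=_number(rtt, "ms"),
        rtt_stddev_ms=_number(stddev, "ms"),
        loss_percent=_number(loss, "%"),
        status=status,
        substatus=substatus,
    )


wan1 = parse_status("1.1.1.3|93.83.uv.xy|WAN1_Gateway|7.788ms|0ms|50%|down|highloss")
print(wan1.name, wan1.loss_percent, wan1.status)
```

Read this way, the odd WAN1 entry at 15:06:10 reports 50% loss with a normal-looking round-trip time, while the WAN2 outage entries report 100% loss and 0ms.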
Here the respective quality charts for WAN 1 and WAN 2:
WAN 2 more details:
What I noticed though - and I hadn't truly been aware of this before - is that the WAN2 interface changes its IP from n/a to 192.168.100.19 (obviously when the port comes up ==> status changes to Offline / packet loss at this point in time) and later to the official one, 212.186.ab.cd, before the status changes to "online" again a few seconds later (the interface is set to DHCP as directed by the ISP).
Not sure if or to what extent this could contribute to the observed behavior - so I thought I'd better mention it explicitly..
-
Ah, that private IP might be coming from the modem before it syncs with the upstream connection. That's quite commonly provided so you can diagnose any issues. You can set the WAN to refuse leases from the local server though.
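A quick way to tell such a temporary lease apart from the ISP's public one is to check whether the address is in a private range. This is just a small sketch using Python's standard `ipaddress` module; the helper name `looks_like_modem_lease` is made up, and a private address is only a hint that the lease came from the modem, not proof.

```python
import ipaddress


def looks_like_modem_lease(addr: str) -> bool:
    """Return True if addr is in a private range (e.g. 192.168.0.0/16)."""
    return ipaddress.ip_address(addr).is_private


# 192.168.100.19 is the temporary WAN2 address from the notifications above.
print(looks_like_modem_lease("192.168.100.19"))
# 9.9.9.10 is the public monitor IP from the same notifications.
print(looks_like_modem_lease("9.9.9.10"))
```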
-
@stephenw10
As long as this temporary private IP isn't causing issues I don't mind at all.
At the end of the day what counts is that things work properly - and even though the first test looked a bit odd, it looks like setting the floating rules' "State Policy" to "Floating States" did the trick...
Thanks a lot for your precious help!