Help me understand a packet path inside of the pfSense please
-
Hello everyone! I've been trying to solve the following issue without any luck for about a year and a half. Maybe you guys can help me!
Here's my setup:
Single WAN on the onboard NIC re0. DHCP from the ISP.
LAN is on the onboard re1.
OpenVPN client to NordVPN, with a custom firewall rule that passes traffic for a list of websites through that interface.
Everything works after a fresh start, but after a while something happens (I suspect glitches on the ISP side; I see some timeouts in the Gateway section of the logs) and VPN routing stops working. All the interfaces are green and I can ping the VPN server from within pfSense, but the websites on the list are unreachable from the LAN interface.
At first I was blaming the ISP or the VPN provider, but the rest of the websites (those not on the list) work as they should, and restarting the VPN service does not resolve the issue either. Only a full restart of the system helps for some time.
Is there a way to get a list of the services that control that routing behavior, so I can try to find the one that misbehaves?
I did try a fresh installation and got the same results.
Can you please help me track down the root of the issue?
-
First thing I would check is the default route in Diag > Routes when it's in the broken state.
If you have the OpenVPN interface assigned it will create a gateway. If your default gateway in System > Routing > Gateways is set to automatic the system may be choosing the VPN gateway as default but that will disappear when it goes down. You may end up with the wrong or no default route. You may have the policy rule routing to a gateway that no longer exists. If you resave the LAN firewall rules that would likely correct it if that is the cause.
Steve
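A minimal sketch of the default-route check described above, for anyone who prefers to do it from the shell instead of Diag > Routes. It assumes the FreeBSD-style `netstat -rn -f inet` column layout (Destination, Gateway, Flags, Netif); the sample output and its addresses are made up for illustration, and `find_default_route` is a hypothetical helper name.

```python
from typing import Optional, Tuple


def find_default_route(netstat_output: str) -> Optional[Tuple[str, str]]:
    """Return (gateway, interface) for the IPv4 default route, or None."""
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed row layout: Destination Gateway Flags Netif [Expire]
        if len(fields) >= 4 and fields[0] == "default":
            return fields[1], fields[3]
    return None


# Hypothetical sample of `netstat -rn -f inet` output (addresses are made up).
SAMPLE = """\
Routing tables

Internet:
Destination        Gateway            Flags     Netif Expire
default            192.0.2.1          UGS         re0
10.7.2.0/24        10.7.2.1           UGS      ovpnc1
192.168.1.0/24     link#2             U           re1
"""

route = find_default_route(SAMPLE)
if route is None:
    print("no default route")
elif route[1] != "re0":  # re0 is the WAN NIC in this thread
    print(f"default route points at {route[1]}, not the WAN")
else:
    print(f"default route via {route[0]} on {route[1]}")
```

Running the same check once while things work and once in the broken state should show whether the default route has moved to the VPN interface or disappeared.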
-
@stephenw10 Thanks! That sounds like a plan.
I will check whether the routing is wrong. If that's the case, do you think there's a way to fix the routing without rebooting the machine?
-
Yes, if you set the default IPv4 gateway to the WAN specifically it will not ever switch to some other gateway.
You may also want to try using 'Skip rules when gateway is down'. That is an option in System > Advanced > Misc:
By default, when a rule has a gateway specified and this gateway is down, the rule is created omitting the gateway. This option overrides that behavior by omitting the entire rule instead.
If your VPN goes down traffic to those sites will be passed via the WAN. States there may remain open if traffic is using them for some time.
Steve
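To make the two behaviors concrete, here is a toy model of the option's help text. It is not pfSense code: `Rule`, `effective_rule`, and the gateway name `VPN_GW` are all hypothetical, and the model only captures the "drop the gateway" versus "drop the whole rule" distinction.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Rule:
    description: str
    gateway: Optional[str]  # None means "follow the routing table"


def effective_rule(rule: Rule, gateway_up: bool, skip_when_down: bool) -> Optional[Rule]:
    """Return the rule as it would be loaded, or None if it is omitted."""
    if rule.gateway is None or gateway_up:
        return rule  # nothing special to do
    if skip_when_down:
        return None  # option enabled: the entire rule is omitted
    return replace(rule, gateway=None)  # default: rule kept, gateway dropped


vpn_rule = Rule("listed sites via VPN", gateway="VPN_GW")  # hypothetical name
print(effective_rule(vpn_rule, gateway_up=False, skip_when_down=False))
print(effective_rule(vpn_rule, gateway_up=False, skip_when_down=True))
```

In the default case the rule still passes traffic, just without the policy route, which is why the listed sites fall back to the WAN while the VPN gateway is down.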
-
@stephenw10 That will definitely help, but maybe I can run a script to force the routing rules to refresh once in a while, since the sites that are supposed to be accessed via the VPN are not accessible via the WAN interface.
-
The rules should be reloaded when a gateway changes state anyway. I would not expect a problem with what you're attempting here normally.
Steve
-
@stephenw10 Well, it didn't work.
The default gateway is set to WAN and I enabled the option to skip the rule if the gateway is down. This morning the ISP had a hiccup on the line and everything went south again.
These are the kind of messages I see in the Gateway log:
Mar 18 09:34:32 dpinger send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr "VPN_address_here" bind_addr "VPN_address_here" identifier "OPT1_DHCP "
Mar 18 09:28:03 dpinger send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr "ISP_serverd_address_here" bind_addr "ISP_dhcp_address_here" identifier "WAN_DHCP "
What's interesting: the WAN log entry contains different addresses for destination and bind, while the VPN entry contains the same address for both.
-
Those entries are just dpinger restarting. Are there any entries showing 'alarm'? That's what you would see if the gateway goes down.
Otherwise check the system log entries at that time.
Steve
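A small sketch of how the alarm check could be scripted against an exported gateway log. One catch: the dpinger start-up lines quoted earlier contain `latency_alarm` and `loss_alarm`, so a plain search for "alarm" would match them too. The second sample line below is a hypothetical alarm entry; its exact wording and address are assumptions, not taken from this system.

```python
import re

# Whole-word match, so the "latency_alarm" / "loss_alarm" settings that dpinger
# prints when it starts are not mistaken for an alarm event.
ALARM = re.compile(r"\balarm\b", re.IGNORECASE)


def alarm_lines(log_lines):
    """Return the dpinger lines that report an alarm."""
    return [ln for ln in log_lines if "dpinger" in ln and ALARM.search(ln)]


sample = [
    # Start-up line, shortened from the gateway log quoted in this thread.
    'Mar 18 09:34:32 dpinger send_interval 500ms latency_alarm 500ms loss_alarm 20% identifier "OPT1_DHCP "',
    # Hypothetical alarm line; the wording and address are assumptions.
    "Mar 18 09:47:20 dpinger OPT1_DHCP 10.7.3.1: Alarm latency 0us stddev 0us loss 100%",
]

for line in alarm_lines(sample):
    print(line)
```

If this returns nothing for the time of the outage, the gateway never went into alarm and the system log is the next place to look.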
-
@stephenw10 That's how the general logs look at the moment the issue begins.
That pattern more or less appears on each event.
This entry looks suspicious: "Mar 18 09:47:35 php-fpm 369 /rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - 10.7.3.5 -> 10.7.2.3 - Restarting packages." but I'm not quite sure why it's happening.
Mar 18 09:47:36 php-fpm 15429 /rc.start_packages: Restarting/Starting all packages.
Mar 18 09:47:35 check_reload_status Starting packages
Mar 18 09:47:35 php-fpm 369 /rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - 10.7.3.5 -> 10.7.2.3 - Restarting packages.
Mar 18 09:47:33 php-fpm 369 /rc.newwanip: Creating rrd update script
Mar 18 09:47:31 dhcpleases kqueue error: unknown
Mar 18 09:47:31 dhcpleases Could not deliver signal HUP to process because its pidfile (/var/run/unbound.pid) does not exist, No such process.
Mar 18 09:47:31 dhcpleases /etc/hosts changed size from original!
Mar 18 09:47:26 php-fpm 369 /rc.newwanip: IP Address has changed, killing all states (ip_change_kill_states is set).
Mar 18 09:47:26 dhcpleases /etc/hosts changed size from original!
Mar 18 09:47:26 php-fpm 369 /rc.newwanip: rc.newwanip: on (IP address: 10.7.2.3) (interface: OPT1[opt1]) (real interface: ovpnc1).
Mar 18 09:47:26 php-fpm 369 /rc.newwanip: rc.newwanip: Info: starting on ovpnc1.
Mar 18 09:47:25 check_reload_status rc.newwanip starting ovpnc1
Mar 18 09:47:25 kernel ovpnc1: link state changed to UP
Mar 18 09:47:24 check_reload_status Reloading filter
Mar 18 09:47:24 kernel ovpnc1: link state changed to DOWN
-
Ah OK, so it looks like the OpenVPN link is going down and that flips its gateway, triggering a bunch of stuff.
You might try setting the gateway monitoring action to disabled in the OpenVPN gateway. That will prevent it triggering those reloads, but it might also prevent some things reloading that need to... Try it and see; it might be preferable for you.
Steve
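To see how often the tunnel flaps, the link-state entries can be pulled out of an exported system log. This is a rough sketch that assumes the line format shown in the log excerpt above; `link_events` is a hypothetical helper, and the sample lines are copied from that excerpt.

```python
import re

# Matches e.g. "Mar 18 09:47:24 kernel ovpnc1: link state changed to DOWN"
LINK_EVENT = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+[\d:]{8})\s+kernel\s+(?P<ifname>\S+): "
    r"link state changed to (?P<state>UP|DOWN)"
)


def link_events(log_lines, ifname):
    """Return (timestamp, state) pairs for one interface, in input order."""
    events = []
    for line in log_lines:
        m = LINK_EVENT.match(line)
        if m and m.group("ifname") == ifname:
            events.append((m.group("ts"), m.group("state")))
    return events


log = [
    "Mar 18 09:47:25 check_reload_status rc.newwanip starting ovpnc1",
    "Mar 18 09:47:25 kernel ovpnc1: link state changed to UP",
    "Mar 18 09:47:24 check_reload_status Reloading filter",
    "Mar 18 09:47:24 kernel ovpnc1: link state changed to DOWN",
]

# The excerpt lists the newest entries first, so reverse for chronological order.
for ts, state in reversed(link_events(log, "ovpnc1")):
    print(ts, state)
```

Comparing these timestamps with the times the listed sites stop working would confirm whether every outage starts with an ovpnc1 flap.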
-
@stephenw10 Alright. I did switch both the monitoring and the monitoring action settings (that was empty, but I'm not taking any chances).
Let's see if that helps.
Thanks Steve! I really appreciate your time and effort!