Routing between networks is hanging



  • Evening

    I've got a pfsense box with 3 Intel NICs connecting 3 private networks (10.0.).  I upgraded to 2.1 last night and ever since, the connection between the networks hangs about every 30s for about 20s.

    2.1-BETA0 (i386)
    built on Wed Nov 7 02:44:11 EST 2012
    FreeBSD 8.3-RELEASE-p4

    Each network can talk to the pfsense box just fine (use the web interface etc).  But traffic between the networks is hanging.  Occasionally, traffic out the WAN is also hanging.  I'm also seeing thousands of these messages which I suspect are related…

    Nov 7 23:11:44 check_reload_status: Reloading filter
    Nov 7 23:11:44 check_reload_status: Restarting OpenVPN tunnels/interfaces
    Nov 7 23:11:44 check_reload_status: Restarting ipsec tunnels
    Nov 7 23:11:44 check_reload_status: Updating all dyndns
    Nov 7 23:11:24 check_reload_status: Reloading filter
    Nov 7 23:11:24 check_reload_status: Restarting OpenVPN tunnels/interfaces
    Nov 7 23:11:24 check_reload_status: Restarting ipsec tunnels
    Nov 7 23:11:24 check_reload_status: Updating all dyndns
    Nov 7 23:11:04 check_reload_status: Reloading filter
    Nov 7 23:11:04 check_reload_status: Restarting OpenVPN tunnels/interfaces
    Nov 7 23:11:04 check_reload_status: Restarting ipsec tunnels
    Nov 7 23:11:04 check_reload_status: Updating all dyndns
    Nov 7 23:10:44 check_reload_status: Reloading filter
    Nov 7 23:10:44 check_reload_status: Restarting OpenVPN tunnels/interfaces
    Nov 7 23:10:44 check_reload_status: Restarting ipsec tunnels
    Nov 7 23:10:44 check_reload_status: Updating all dyndns
    Nov 7 23:10:24 check_reload_status: Reloading filter
    Nov 7 23:10:24 check_reload_status: Restarting OpenVPN tunnels/interfaces
    Nov 7 23:10:24 check_reload_status: Restarting ipsec tunnels
    Nov 7 23:10:24 check_reload_status: Updating all dyndns
    Nov 7 23:10:04 check_reload_status: Reloading filter
    Nov 7 23:10:04 check_reload_status: Restarting OpenVPN tunnels/interfaces
    Nov 7 23:10:04 check_reload_status: Restarting ipsec tunnels
    Nov 7 23:10:04 check_reload_status: Updating all dyndns
    Nov 7 23:09:44 check_reload_status: Reloading filter
    Nov 7 23:09:44 check_reload_status: Restarting OpenVPN tunnels/interfaces
    Nov 7 23:09:44 check_reload_status: Restarting ipsec tunnels
    Nov 7 23:09:44 check_reload_status: Updating all dyndns
    Nov 7 23:09:24 check_reload_status: Reloading filter
    Nov 7 23:09:24 check_reload_status: Restarting OpenVPN tunnels/interfaces
    Nov 7 23:09:24 check_reload_status: Restarting ipsec tunnels
    Nov 7 23:09:24 check_reload_status: Updating all dyndns
    Nov 7 23:09:04 check_reload_status: Reloading filter
    Nov 7 23:09:04 check_reload_status: Restarting OpenVPN tunnels/interfaces
    Nov 7 23:09:04 check_reload_status: Restarting ipsec tunnels
    Nov 7 23:09:04 check_reload_status: Updating all dyndns
    Nov 7 23:08:44 check_reload_status: Reloading filter
    Nov 7 23:08:44 check_reload_status: Restarting OpenVPN tunnels/interfaces
    Nov 7 23:08:44 check_reload_status: Restarting ipsec tunnels

    I've tried reinstalling all the packages, disabling and enabling all services, and dumping the config out to XML and reloading it, all to no avail.  I also notice that check_reload_status is using close to 100% CPU.  Looking around, I see a lot of posts about this problem, and I've tried nearly all the "fixes" people mention without success.

    Any assistance greatly appreciated.

    Thanks.
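
The log excerpt in the post above shows the same group of messages repeating at a fixed interval. A minimal sketch for measuring that interval from a saved excerpt, assuming the syslog-style line format shown in the post; the function name `reload_intervals` and the `year` default are made up for this example, since the log lines themselves carry no year:

```python
from datetime import datetime

# A few lines copied from the log excerpt in the post (newest first).
LOG_LINES = [
    "Nov 7 23:11:44 check_reload_status: Reloading filter",
    "Nov 7 23:11:44 check_reload_status: Restarting OpenVPN tunnels/interfaces",
    "Nov 7 23:11:24 check_reload_status: Reloading filter",
    "Nov 7 23:11:04 check_reload_status: Reloading filter",
    "Nov 7 23:10:44 check_reload_status: Reloading filter",
]


def reload_intervals(lines, marker="check_reload_status: Reloading filter", year=2012):
    """Return the gaps, in seconds, between consecutive filter reloads."""
    stamps = []
    for line in lines:
        if marker not in line:
            continue
        # The first three whitespace-separated fields are month, day, time.
        month, day, clock = line.split()[:3]
        # The year is an assumption; syslog lines do not include it.
        stamps.append(
            datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")
        )
    stamps.sort()
    return [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]


print(reload_intervals(LOG_LINES))
```

A steady gap like this suggests something is triggering the reload on a timer rather than in response to config changes.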



  • OK, I have managed to fix this.  For a very long time, the relayd service has been broken on my pfsense system (I start the service and then it crashes).  Anyway, I tried to restart it under 2.1 and suddenly the load on the system plummeted and I stopped getting the error messages in the logs.

    relayd still crashes…

    php: /status_services.php: The command '/usr/local/sbin/relayd -f /var/etc/relayd.conf' returned exit code '1', the output was '/var/etc/relayd.conf:7: syntax error no redirections, nothing to do unused protocol: dnsproto'

    but at least check_reload_status is no longer chewing CPU, reloading my firewall rules, or killing my connections.



  • Oops, spoke too soon.  Change anything on my router and it starts hanging again.



  • Doing a killall -9 check_reload_status stops the thousands of messages and also allows the router to pass traffic… but it makes it nearly impossible to apply any other settings.

    I finally got all the settings I wanted applied (and wasn't able to get external access through the router while doing so), then killed check_reload_status, and now the router at least functions as expected...


  • Rebel Alliance Developer Netgate

    Status > Gateways, make sure your gateway setup is proper and shows online when it's actually up.

    Sounds like it's repeatedly restarting the services because it thinks your WAN is going up/down.

    Also check the Gateway log under the system logs.



  • The gateways are set up correctly - once I kill the check_reload_status script/daemon all routing works as expected.  All gateways are shown as online and are pingable.  There has been no entry in the gateway log for 9 days now…



  • I finally worked out what was going on.  Two of my external WAN routers didn't respond to ping, so the gateways were continually being marked as down.  I changed the IP address that was pinged and routing started working, and the check_reload_status process is now well behaved.
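
The root cause in the last post was that the address being pinged for the gateway check never answered. A minimal sketch for finding an address that does reply before using it as the ping target, assuming a Unix-like `ping` that accepts `-c 1`; the function names `ping_once` and `pick_monitor_ip` are illustrative and not part of pfsense:

```python
import subprocess


def ping_once(host, timeout_s=3):
    """Send one ICMP echo request with the system ping; True if a reply came back."""
    try:
        result = subprocess.run(
            ["ping", "-c", "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,  # enforced by Python, not by a ping flag
        )
    except (subprocess.TimeoutExpired, OSError):
        # No reply in time, or ping could not be run at all.
        return False
    return result.returncode == 0


def pick_monitor_ip(candidates, probe=ping_once):
    """Return the first candidate that answers the probe, or None if none do."""
    for host in candidates:
        if probe(host):
            return host
    return None
```

Calling `pick_monitor_ip` with a list of candidate addresses (for example the upstream router followed by a host further along the path) returns the first one that answers. The `probe` argument can be swapped for a stub to try the selection logic without sending any traffic.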

