2.3 stops routing traffic every day, driving me nuts.



  • Hi everyone,

    I'm having a very strange problem with 2.3: at around 3-5am just about every day (though sometimes every other day), pfSense pretty much stops routing traffic. Some traffic does manage to pass through, but even the web interface becomes unreachable at times. If I connect to the console over IPMI, everything seems fine, except that I still can't reach anything over the WAN (remote VPN interfaces are still reachable, though). These problems started after an upgrade from 2.2.6 and persist after a completely fresh install from scratch, without restoring my config.

    The only notable services I have running are one OpenVPN server, one OpenVPN client, and one IPsec client. I've looked through the system logs but have never been able to find anything obvious. I never had these problems in years of running this exact setup. Are there any logs I can provide that may be of some help? Thanks!



  • Could you get a status tgz dump from status.php ("hidden" page with no menu) and get that to me somehow? Can email to cmb at pfsense dot org, or PM me here with a link or other means of obtaining it.



  • @cmb:

    Could you get a status tgz dump from status.php ("hidden" page with no menu) and get that to me somehow? Can email to cmb at pfsense dot org, or PM me here with a link or other means of obtaining it.

    Sent, thank you so much!



  • I'm having the exact same problem… having to reboot the FW every morning, I can see the FW stops between 1 and 5 every morning...



  • Thanks. Couple things stick out from the gateway log, probably outside of what you're referring to.

    Apr 13 20:31:07 pfSense dpinger: WAN_DHCP 192.168.1.1: sendto error: 64
    Apr 13 20:31:07 pfSense dpinger: WAN_DHCP 192.168.1.1: sendto error: 64
    Apr 13 20:31:08 pfSense dpinger: WAN_DHCP 192.168.1.1: sendto error: 64

    Apr 18 20:20:26 pfSense dpinger: WAN_DHCP 192.168.1.1: sendto error: 64
    Apr 18 20:20:27 pfSense dpinger: WAN_DHCP 192.168.1.1: sendto error: 64
    Apr 18 20:20:27 pfSense dpinger: WAN_DHCP 192.168.1.1: sendto error: 64

    Something is issuing your WAN DHCP leases on 192.168.1.x with a gateway of 192.168.1.1. Since that gateway conflicts with the LAN IP and the subnet conflicts with the LAN subnet, that'll leave you in a mess. Is the firewall possibly issuing that lease to itself, with WAN and LAN plugged into the same broadcast domain? If so, they can't be; make sure there's no interconnection there.
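
For anyone who wants to sanity-check this kind of conflict, here is a minimal Python sketch using the standard `ipaddress` module. The function names are made up for illustration. The LAN address of 192.168.1.1/24 is inferred from the description above, and the WAN lease address 192.168.1.50/24 is purely hypothetical, since the actual lease isn't shown in the log.

```python
import ipaddress


def subnets_conflict(wan_cidr: str, lan_cidr: str) -> bool:
    """Return True when the WAN and LAN networks share any addresses."""
    # strict=False lets us pass an interface address such as 192.168.1.50/24
    wan = ipaddress.ip_network(wan_cidr, strict=False)
    lan = ipaddress.ip_network(lan_cidr, strict=False)
    return wan.overlaps(lan)


def gateway_inside_lan(gateway_ip: str, lan_cidr: str) -> bool:
    """Return True when the WAN gateway address falls inside the LAN subnet."""
    lan = ipaddress.ip_network(lan_cidr, strict=False)
    return ipaddress.ip_address(gateway_ip) in lan


# Assumed values: LAN inferred from the post, WAN lease address hypothetical.
LAN = "192.168.1.1/24"
WAN_LEASE = "192.168.1.50/24"
WAN_GATEWAY = "192.168.1.1"  # gateway address seen in the dpinger log lines

if __name__ == "__main__":
    print("subnets conflict:", subnets_conflict(WAN_LEASE, LAN))
    print("gateway inside LAN:", gateway_inside_lan(WAN_GATEWAY, LAN))
```

If either check comes back true for your real addresses, WAN and LAN are on overlapping networks and one side needs to be renumbered or physically separated.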

    That looks to be separate from the 4 AM issue you described, but something you should fix if you haven't already.
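
To see whether errors like these line up with the 3-5am window, a rough sketch along these lines buckets the dpinger lines by day and hour. The function names and the regex are my own and assume the exact line layout quoted above; a real gateway log may need a looser pattern.

```python
import re
from collections import Counter

# The six lines quoted from the gateway log above.
LOG = """\
Apr 13 20:31:07 pfSense dpinger: WAN_DHCP 192.168.1.1: sendto error: 64
Apr 13 20:31:07 pfSense dpinger: WAN_DHCP 192.168.1.1: sendto error: 64
Apr 13 20:31:08 pfSense dpinger: WAN_DHCP 192.168.1.1: sendto error: 64
Apr 18 20:20:26 pfSense dpinger: WAN_DHCP 192.168.1.1: sendto error: 64
Apr 18 20:20:27 pfSense dpinger: WAN_DHCP 192.168.1.1: sendto error: 64
Apr 18 20:20:27 pfSense dpinger: WAN_DHCP 192.168.1.1: sendto error: 64
"""

# Assumed layout: "<Mon> <day> <HH:MM:SS> <host> dpinger: <gw> <ip>: sendto error: <n>"
PATTERN = re.compile(
    r"^(?P<month>\w{3})\s+(?P<day>\d{1,2})\s+(?P<hour>\d{2}):\d{2}:\d{2}\s+\S+\s+"
    r"dpinger:\s+(?P<gateway>\S+)\s+(?P<ip>\S+):\s+sendto error:\s+(?P<errno>\d+)$"
)


def sendto_errors_by_hour(text: str) -> Counter:
    """Count dpinger sendto errors per (month, day, hour)."""
    counts = Counter()
    for line in text.splitlines():
        match = PATTERN.match(line.strip())
        if match:
            key = (match["month"], int(match["day"]), int(match["hour"]))
            counts[key] += 1
    return counts


def hours_in_window(counts: Counter, start: int, end: int) -> list:
    """Return the (month, day, hour) keys whose hour lies in [start, end]."""
    return [key for key in counts if start <= key[2] <= end]


if __name__ == "__main__":
    counts = sendto_errors_by_hour(LOG)
    for key, n in sorted(counts.items()):
        print(key, n)
    print("in 3-5am window:", hours_in_window(counts, 3, 5))
```

On the quoted lines, every error falls in the 20:00 hour, which fits the reading that this is unrelated to the early-morning outages.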

    Your monitor IP is set to your gateway IP, which often isn't a good idea as it's not a good indicator of actual connectivity, and especially with cable ISPs the replies will be rate limited, dropped or delayed, at times. System>Routing, edit your WAN gateway, set monitor IP to 8.8.8.8. Save, apply changes.

    What were the symptoms you were having at the time? Nothing in the logs suggests traffic wasn't passing. You get alarms on only some of your gateways, so traffic is still passing in some regard at least.



  • @denmly:

    I'm having the exact same problem… having to reboot the FW every morning, I can see the FW stops between 1 and 5 every morning...

    You have a similar symptom. There isn't some "stop working at 1-5 AM" problem. Please start a new thread as it's a near certainty it has no relation to OP's issue and that will just clutter up his thread.

    In your post, please include:
    system log
    gateways log
    type of WAN
    Can you get to the web interface when it's not working? If so what does Status>Interfaces show?

    and I'll reply to you on your thread.



  • Hello,

    I have now had this issue twice since upgrading to 2.3, and never before that! It happens only on the Alix board, not on any of our VMware installations. This issue is critical!

    HW:
    AMD G-T40E Processor
    2 CPUs: 1 package(s) x 2 core(s)

    Version:
    2.3-RELEASE (i386)
    built on Mon Apr 11 18:12:06 CDT 2016
    FreeBSD 10.3-RELEASE

    WAN with load balancing, with both interfaces set to Tier 1.

    My symptom is that pfSense 2.3 stops working about once a week. The first case was last Friday and the latest was this Friday :-)



  • cmb has been more than helpful trying to look into this issue (thanks man, you rock!) but unfortunately I've gone back to 2.2.6 for now which so far has resolved the problem. I just can't afford to have the internet go down daily/every 2 days regardless of the cause right now.



  • @Clouseau:

    I have now had this issue twice since upgrading to 2.3, and never before that! It happens only on the Alix board, not on any of our VMware installations. This issue is critical!

    HW:
    AMD G-T40E Processor
    2 CPUs: 1 package(s) x 2 core(s)

    Version:
    2.3-RELEASE (i386)
    built on Mon Apr 11 18:12:06 CDT 2016
    FreeBSD 10.3-RELEASE

    That's not an ALIX but an APU with that CPU, so why aren't you using the recommended amd64 version?



  • That's not an ALIX but an APU with that CPU, so why aren't you using the recommended amd64 version?

    I wonder the same thing! Why run an i386 image on fully 64-bit capable hardware?



  • @BlueKobold:

    That's not an ALIX but an APU with that CPU, so why aren't you using the recommended amd64 version?

    I wonder the same thing! Why run an i386 image on fully 64-bit capable hardware?

    It's a long story why, but is the i386 image broken or somehow not supported on 64-bit hardware? And true, it's an APU board (it used to be an Alix).



  • Long story why -

    In the old days some users or customers chose i386 because it ran more smoothly than the
    amd64 image. That may have been true for a while, but not today! Sooner or later, and
    perhaps sooner than we expect, the i386 (32-bit) image will be gone or will no longer be
    maintained. If 64-bit hardware is in the "game", you should also use the 64-bit pfSense
    image. This is only my impression, but the 32-bit image may then no longer be current or
    support all kinds of different 64-bit hardware, and then something may occur that cannot
    really be explained or solved. That can be a failure, an issue, or a malfunction like the
    one described in this thread, although it does not have to be! For around two years or more
    this has been the mainstream recommendation: if your hardware is 64-bit capable, use only
    the 64-bit version.

    but is i386 image broken or somehow not supported in 64bit hardware?

    If a 32-bit driver cannot handle some piece of 64-bit hardware, it could cause an
    issue that no longer appears once the 64-bit version is installed, where proper 64-bit
    drivers play nicely with the 64-bit hardware.

    And true - it's APU board (used to be Alix)

    Alix boards are 32-bit boards, and the newer APU and APU2 series are true 64-bit boards.
    In my opinion you could try a 64-bit pfSense image and restore the config.xml file that
    you can save now from the 32-bit system. Then let's see what happens. In my eyes that
    would be the best way to go.



  • I have the same problem with pfSense 2.3 on a Dell PowerEdge 2900.
    On virtualized pfSense instances (VMware 5) this problem does not occur.



  • Ha, Friday again and a total freeze of my pfSense 2.3 on the APU1 with the 32-bit image!

    I'm switching to the 64-bit image to see what happens…



  • Hey guys…  (and by guys I mean you hijacking the thread...)  Some people subscribe to the thread they start and get email notifications when someone answers.

    They probably do not want to see all the hijacked input.

    https://forum.pfsense.org/index.php?topic=70.msg314#msg314

    Thank you for understanding.    :)



  • @chpalmer:

    Hey guys…  (and by guys I mean you hijacking the thread...)  Some people subscribe to the thread they start and get email notifications when someone answers.

    They probably do not want to see all the hijacked input.

    https://forum.pfsense.org/index.php?topic=70.msg314#msg314

    Thank you for understanding.    :)

    Hello chpalmer, what are you talking about? What hijacked input, where, who? Not understanding your post at all????



  • @Clouseau:

    Hello chpalmer, what are you talking about? What hijacked input, where, who? Not understanding your post at all? ???

    Well-  diablo266 started this thread about his/her problem. CMB began to interface with the original poster after which others began to post about their problems. This is hijacking.

    Here CMB asked a new poster to post his own thread.

    Yet several came along and continued to hijack this same thread.

    Please start a new post with your problems.  My post was directed at the hijackers so if the shoe fits….    ;D



  • @chpalmer:

    @Clouseau:

    Hello chpalmer, what are you talking about? What hijacked input, where, who? Not understanding your post at all? ???

    Well-  diablo266 started this thread about his/her problem. CMB began to interface with the original poster after which others began to post about their problems. This is hijacking.

    Here CMB asked a new poster to post his own thread.

    Yet several came along and continued to hijack this same thread.

    Please start a new post with your problems.  My post was directed at the hijackers so if the shoe fits….    ;D

    Okay, I have the same problem as diablo266. Should I start a new thread?



  • @Clouseau:

    Okay, I have the same problem as diablo266. Should I start a new thread?

    He says "where seemingly around 3-5am"  same symptoms?  (we're still hijacking here…)  If so then engage him and not the other hijackers.

    But read the original post and if it really is identical then stick around.  If not, it's not a big deal to start your own thread.  Just edicate.  (or however the hell you spell that.)  ;D



  • If not, it's not a big deal to start your own thread.  Just edicate.  (or however the hell you spell that.)

    Were you trying to say: "Just elucidate" (ie. describe the problem more completely)?

    Not trying to put anyone down, just hoping to add a little clarity. Feel free to ignore me if you feel it's warranted….  ;)



  • @Clouseau:

    Okay, I have the same problem as diablo266. Should I start a new thread?

    Not if you actually have the same problem. At this point, this is a well-established known issue, and we don't need any further specifics as we have a replicable test case and are working on finding the root cause. Can follow here for updates:
    https://redmine.pfsense.org/issues/6296

