May 2nd Snapshot doesn't work, breaks everything! Beware



  • Updated this morning to the latest development snapshot and my router stopped working. I constantly get an nginx 502 bad gateway and internet will not work (goes in and out).

    I figured it might have something to do with all my plugins. I formatted and installed a fresh copy of the development snapshot and it's having the same issue. Nothing done to it, just a fresh install. It won't properly route anything and the login page won't load. Very odd!

    Going to install a fresh copy of April 30th snapshot and will post updates.

    Just a word of warning to anyone thinking about updating



  • Can confirm.

    Build is borked.

    May 2 18:53:38 php-fpm 29900 /rc.interfaces_wan_configure: The command '/sbin/dhclient -c /var/etc/dhclient_opt1.conf igb2 > /tmp/igb2_output 2> /tmp/igb2_error_output' returned exit code '15', the output was ''

    May 2 18:50:47 php-fpm 47005 /rc.linkup: The command '/sbin/dhclient -c /var/etc/dhclient_opt1.conf igb2 > /tmp/igb2_output 2> /tmp/igb2_error_output' returned exit code '1', the output was ''

    May 2 18:50:34 php-fpm 15234 /rc.linkup: The command '/usr/local/sbin/unbound -c /var/unbound/unbound.conf' returned exit code '1', the output was '[1525308634] unbound[86621:0] error: bind: address already in use [1525308634] unbound[86621:0] fatal error: could not open ports'



  • Installed the version before this (took forever to download on hotspot).

    I back up my pfSense settings every night. Restored the file and VOILA, everything is back to working perfectly.

    On a side note: Their backup feature is just amazing. I can't believe how well it puts everything back. Every package, every setting of that package, just WOW

    If anyone is wondering what I use to back up:
    https://github.com/KoenZomers/pfSenseBackup



  • I'm using the May 2 version just fine. I'm running 2.4.4.a.20180502.1614. It's a VM inside ESXi.



  • I’ve been having issues with snapshots for the last week or so now. I thought it was just the WebGUI dying, but that was only what I happened to notice first: routing etc. was still working, but DHCP has been dying too. Power cycling cures the problems for a day, but by the next morning the issues have returned.

    When I try to restart from the console I get the message "init: some processes would not die; ps axl advised", which I haven’t followed up on yet (next time it dies I’ll shut down from the console). Are there any other checks you folks would advise while I’m there?



  • I have updated to the May 2nd snapshot and my WAN connection refuses to receive an address via IPv4 DHCP. It received an IPv6 DHCP address and an IPv4 gateway, and that is it. No IPv4 address is assigned to the interface. My other two statically addressed interfaces for other WANs work fine. It's just that any interface with IPv4 DHCP doesn't seem to work.



  • Just updated this morning, pfsense doesn't boot :(

    I don't have any backups. Where do I go from here (linux noob)?

    -Jamie M.



  • I believe I also remember seeing that the May 2nd build was based on FreeBSD 11.2 (currently 11.1).

    Maybe this has something to do with it.


  • Rebel Alliance Developer Netgate

    May 2 snapshots are working fine for me here on both real and virtual hardware, with static and DHCP WAN setups (IPv4 and IPv6).

    Any other logs, details, errors, etc?


  • Rebel Alliance Developer Netgate

    @toysareforboys:

    Just updated this morning, pfsense doesn't boot :(

    I don't have any backups. Where do I go from here (linux noob)?

    That looks more like your disk ran out of space or something else caused the kernel to be corrupted. You'll need to reinstall. Luckily, the installer has an option to rescue the config.xml off the target drive so you might not be completely out of luck. That's most likely unrelated to this thread, though, so if you need more help start a fresh thread for that.



  • @jimp:

    May 2 snapshots are working fine for me here on both real and virtual hardware, with static and DHCP WAN setups (IPv4 and IPv6).

    Any other logs, details, errors, etc?

    What other logs do you need?

    I have a Multi-WAN setup with 1 DHCP and 1 Static connection.

    I also keep sending in this crash report:

    Crash report details:

    PHP Errors:
    [03-May-2018 01:11:30 America/Regina] PHP Warning:  file_get_contents(/tmp/igb2_router): failed to open stream: No such file or directory in /etc/inc/gwlb.inc on line 1225

    No FreeBSD crash data found.


  • Rebel Alliance Developer Netgate

    Anything else that seems relevant from any other logs. Like the DHCP log for dhclient, for example.



  • Should I see the snapshot? My SG-3100 says I am up to date with the last update being from April 25th.
    "built on Wed Apr 25 02:28:12 CDT 2018"


  • Rebel Alliance Developer Netgate

    @gsmornot:

    Should I see the snapshot? My SG-3100 says I am up to date with the last update being from April 25th.
    "built on Wed Apr 25 02:28:12 CDT 2018"

    If you see that, you're probably on the Factory images and not CE. Factory needs more work before the FreeBSD 11-STABLE (11.2-PRERELEASE) builds are ready.



  • @jimp:

    @gsmornot:

    Should I see the snapshot? My SG-3100 says I am up to date with the last update being from April 25th.
    "built on Wed Apr 25 02:28:12 CDT 2018"

    If you see that, you're probably on the Factory images and not CE. Factory needs more work before the FreeBSD 11-STABLE (11.2-PRERELEASE) builds are ready.

    Yes, that's it. The SG-3100 is still within the first year so it gets the factory image. No rush, just curious.


  • Rebel Alliance Developer Netgate

    @gsmornot:

    @jimp:

    @gsmornot:

    Should I see the snapshot? My SG-3100 says I am up to date with the last update being from April 25th.
    "built on Wed Apr 25 02:28:12 CDT 2018"

    If you see that, you're probably on the Factory images and not CE. Factory needs more work before the FreeBSD 11-STABLE (11.2-PRERELEASE) builds are ready.

    Yes, that's it. The SG-3100 is still within the first year so it gets the factory image. No rush, just curious.

    You can use the factory image indefinitely, not just the first year.


  • Rebel Alliance Developer Netgate

    I suspect I may have found part of the issue people are seeing in this thread:

    https://redmine.pfsense.org/issues/8495

    It's possible that your firewall never rebooted after updating and is running half updated. If you run /etc/rc.reboot, that will do most of the process, but the last command fails; you can then safely run 'reboot' afterward to finish the job.

    For example:

    /etc/rc.reboot
    /sbin/reboot
    

    Once you're on a snap with the fix I just committed for that bug it will restart itself OK again.



  • @jimp:

    I suspect I may have found part of the issue people are seeing in this thread:

    https://redmine.pfsense.org/issues/8495

    It's possible that your firewall never rebooted after updating and is running half updated. If you run /etc/rc.reboot, that will do most of the process, but the last command fails; you can then safely run 'reboot' afterward to finish the job.

    For example:

    /etc/rc.reboot
    /sbin/reboot
    

    Once you're on a snap with the fix I just committed for that bug it will restart itself OK again.

    Ran the commands and my DHCP connection still isn't coming back.

    May 3 17:12:25 php-fpm 324 /interfaces.php: The command '/sbin/dhclient -c /var/etc/dhclient_opt1.conf igb2 > /tmp/igb2_output 2> /tmp/igb2_error_output' returned exit code '15', the output was ''



  • I just updated to 2.4.4 built on Thu May 03 18:39:20 CDT 2018 and it's generating a crash report, which I sent.

    I'm still having DHCPv6 problems. I put Wireshark on it and there do not seem to be any IPv6 messages at all.



  • Getting this error now in console with build: 2.4.4.a.20180503.1839



  • Something is definitely wrong with those builds, I can't halt or reboot via GUI or SSH…



  • I reverted to 2.4.3 release and it works fine.



  • Same here: DNS Resolver and DHCP were not starting after the update. The update never rebooted and the system ran half updated for 1-2 days, until I reverted back to stable 2.4.3 with no problems.


  • Rebel Alliance Developer Netgate

    @Dazog:

    Getting this error now in console with build: 2.4.4.a.20180503.1839

    I can replicate that, but it appears harmless: https://redmine.pfsense.org/issues/8497


  • Rebel Alliance Developer Netgate

    @maverick_slo:

    Something is definitely wrong with those builds, I can't halt or reboot via GUI or SSH…

    Please read the thread. That has been addressed.



  • /etc/rc.reboot
    /sbin/reboot

    These commands helped. Runtime is now reset.



  • I can confirm this issue is still in the latest 5-04 snapshot
    Running the above commands did not help
    /etc/rc.reboot
    /sbin/reboot



  • I am still having issues with the new snapshot (5/4) too. I have 3 WAN connections and the two statics still seem to be working, but the appliance I have screeches to a halt after upgrading. I also cannot get a DHCP address on the third connection. Attached are the errors I get on initial boot; subsequent reboots give the same result.

    ![IMG_20180504_211145 - Copy.jpg](/public/imported_attachments/1/IMG_20180504_211145 - Copy.jpg)


  • Rebel Alliance Developer Netgate

    @tmushy:

    I can confirm this issue is still in the latest 5-04 snapshot
    Running the above commands did not help
    /etc/rc.reboot
    /sbin/reboot

    There is no "this issue" in this thread. You need to provide details about exactly what is not working, with console and/or log entries related to the issue.


  • Rebel Alliance Developer Netgate

    @LostInIgnorance:

    I am still having issues with the new snapshot (5/4) too. I have 3 WAN connections and the two statics still seem to be working, but the appliance I have screeches to a halt after upgrading. I also cannot get a DHCP address on the third connection. Attached are the errors I get on initial boot; subsequent reboots give the same result.

    Please at least post the DHCP log and any dhclient entries from it, and anything that looks relevant in the system or routing logs as well.

    I can't replicate any DHCP client issues here, mine are all working OK.
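For anyone who only has console or SSH access, the request above could be handled with something like the following. This is a rough sketch, not an official procedure: the function name `last_log_lines` and the `LOG_READER` variable are made up for illustration, and the log paths in the usage note are assumptions about where a 2.4.x install keeps its DHCP, system, and routing logs. `clog` is the circular-log reader mentioned later in this thread.

```shell
#!/bin/sh
# LOG_READER: command that prints a log file to stdout.
# Assumption: the logs are circular logs, so clog is needed rather than cat.
LOG_READER="${LOG_READER:-clog}"

# last_log_lines N FILE...
# Print a header and the last N lines of each log file given.
last_log_lines() {
    lines="$1"
    shift
    for log in "$@"; do
        echo "== $log =="
        "$LOG_READER" "$log" | tail -n "$lines"
    done
}
```

Possible usage, assuming those paths exist on your build: `last_log_lines 12 /var/log/dhcpd.log /var/log/system.log /var/log/routing.log`. Piping the DHCP log through `grep dhclient` first would narrow it to the DHCP client entries.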



  • Unfortunately I had to rebuild it so I could post this so I only have my syslog to go back to. I am using the 2.4.4-DEVELOPMENT (amd64) built on Thu Apr 26 14:32:50 CDT 2018 FreeBSD 11.1-STABLE snapshot to restore to and the C2758 board you used to use. When I upgrade to the latest snapshot, I am unable to do much of anything with the appliance. It looks like it just keeps bouncing the interface for that wan.

    dpinger: HOME_DHCP 47.34.34.1: sendto error: 65
    check_reload_status: Configuring interface wan
    php-fpm[87613]: /rc.newwanip: rc.newwanip: Failed to update wan IP, restarting…
    php-fpm[87613]: /rc.newwanip: rc.newwanip: on (IP address: ) (interface: HOME[wan]) (real interface: igb2).
    php-fpm[87613]: /rc.newwanip: rc.newwanip: Info: starting on igb2.
    dpinger: HOME_DHCP 47.34.34.1: sendto error: 65
    kernel: arpresolve: can't allocate llinfo for 47.34.34.1 on igb2
    dhclient[20017]: exiting.
    dhclient[20017]: connection closed
    dpinger: HOME_DHCP 47.34.34.1: sendto error: 65
    kernel: arpresolve: can't allocate llinfo for 47.34.34.1 on igb2
    kernel: arpresolve: can't allocate llinfo for 47.34.34.1 on igb2
    php-fpm[43905]: /rc.linkup: HOTPLUG: Configuring interface wan
    php-fpm[43905]: /rc.linkup: DEVD Ethernet attached event for wan
    dhclient: /sbin/route add default 47.34.34.1
    dhclient: Adding new routes to interface: igb2
    dhclient: New Routers (igb2): 47.34.34.1
    dhclient: New Broadcast Address (igb2): 255.255.255.255
    dhclient: New Subnet Mask (igb2): 255.255.254.0
    dhclient: New IP Address (igb2): 47.34.X.X
    charon: 13[KNL] 47.34.X.X appeared on igb2
    charon: 13[KNL] 47.34.X.X disappeared from igb2
    dhclient: ifconfig igb2 inet 47.34.X.X netmask 255.255.254.0 broadcast 255.255.255.255
    dhclient: Starting add_new_address()
    dhclient: REBOOT
    kernel: igb2: link state changed to DOWN
    check_reload_status: Linkup starting igb2
    HOME_DHCP 47.34.34.1: sendto error: 64



  • JimP, I can send you a 4m syslog from the time of upgrade if you would like.

    After thumbing through more of the syslog, it seems pretty consistent on these repeated lines:

    php-fpm[43905]: /rc.linkup: DEVD Ethernet attached event for wan
    php-fpm[43905]: /rc.linkup: HOTPLUG: Configuring interface wan
    charon: 04[KNL] 47.34.X.X disappeared from igb2
    kernel: arpresolve: can't allocate llinfo for 47.34.34.1 on igb2
    kernel: arpresolve: can't allocate llinfo for 47.34.34.1 on igb2
    dpinger: HOME_DHCP 47.34.34.1: sendto error: 65
    dhclient[20017]: connection closed
    dhclient[20017]: exiting.
    kernel: arpresolve: can't allocate llinfo for 47.34.34.1 on igb2
    dpinger: HOME_DHCP 47.34.34.1: sendto error: 65
    php-fpm[87613]: /rc.newwanip: rc.newwanip: Info: starting on igb2.
    php-fpm[87613]: /rc.newwanip: rc.newwanip: on (IP address: ) (interface: HOME[wan]) (real interface: igb2).
    php-fpm[87613]: /rc.newwanip: rc.newwanip: Failed to update wan IP, restarting…
    check_reload_status: Configuring interface wan
    dpinger: HOME_DHCP 47.34.34.1: sendto error: 65



  • I will have to roll back to April Build until this is fixed.

    My DHCP connection has the same errors as the poster above.



  • I've been trying every new development build (didn't try 5-9) and the issue seems to keep happening. I too have to roll back to the last April build.

    What's odd is I ran a virtual appliance and pfSense ran fine in it. I'm starting to wonder if it's a hardware compatibility issue. I'm using a Qotom box.


  • Rebel Alliance Developer Netgate

    I don't doubt there is a problem here but I need a lot more detail than "it's broken" or "the same errors". Post the errors (even if they are duplicates), log entries, route table contents, anything you can come up with. I need to know exactly what isn't working, with detail. For example: interfaces missing addresses, missing or incorrect routes, services not running (exactly which ones are not running, and any relevant logs from them), and so on.

    I still can't replicate any issues here in my lab. We might have one person here who is able to replicate this but they're still testing to find out if it's similar, too soon to say if it's related.



  • @tmushy:

    I've been trying every new development build (didn't try 5-9) and the issue seems to keep happening. I too have to roll back to the last April build.

    What's odd is I ran a virtual appliance and pfSense ran fine in it. I'm starting to wonder if it's a hardware compatibility issue. I'm using a Qotom box.

    Me too.

    pfSense starts normally, but there is no internet connection on pfSense itself or on the LAN…

    The logs contain a lot of "route has not been found" entries.

    This started after updating pfSense 2.4 on 05/09. How do you roll back to an old version?

    Thanks



  • @jimp:

    I don't doubt there is a problem here but I need a lot more detail than "it's broken" or "the same errors". Post the errors (even if they are duplicates), log entries, route table contents, anything you can come up with. I need to know exactly what isn't working, with detail. For example: interfaces missing addresses, missing or incorrect routes, services not running (exactly which ones are not running, and any relevant logs from them), and so on.

    I still can't replicate any issues here in my lab. We might have one person here who is able to replicate this but they're still testing to find out if it's similar, too soon to say if it's related.

    Hello
    I'd be happy to give you log files of the errors. I'm just not sure which ones you want. Can you please tell me the location of the log files? The WebGUI is not accessible, so I would need to pull them by SSH.



  • I would love to send in logs as I have a 4m CSV dump from my syslog server, but still I have not been told where to send them. As they are raw dumps, I am not posting them into the forums but would gladly send them to one of the developers.


  • Rebel Alliance Developer Netgate

    I don't need 4M worth of records. I don't have time to sort through all of that. Just the last dozen or so lines of each log file is sufficient.

    I think we have a lead on part of the problem, I pushed a fix for one potential path that could break it but there is one other that I haven't tracked down yet.

    https://redmine.pfsense.org/issues/8504

    More interesting to me now than logs are two things:

    1. The <gateways> section of your configuration(s) before and after upgrade, or at least after. You can redact IP addresses but do not alter anything else.
    2. Whether or not you have a default route for IPv4 or IPv6 in "netstat -rnW" after upgrade.
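The default-route check in item 2 can be reduced to a two-line answer with a small filter. This is only a sketch: `report_default_routes` is a hypothetical helper name, and it assumes the usual FreeBSD `netstat -rn` layout in which the IPv4 table follows an `Internet:` header and the IPv6 table follows an `Internet6:` header.

```shell
#!/bin/sh
# report_default_routes: read "netstat -rnW" output on stdin and report
# whether a default route is present for each address family.
report_default_routes() {
    awk '
        $1 == "Internet:"  { family = "IPv4" }   # start of the IPv4 table
        $1 == "Internet6:" { family = "IPv6" }   # start of the IPv6 table
        $1 == "default" && family != "" { gw[family] = $2 }
        END {
            print "IPv4 default:", (("IPv4" in gw) ? gw["IPv4"] : "MISSING")
            print "IPv6 default:", (("IPv6" in gw) ? gw["IPv6"] : "MISSING")
        }'
}
```

On the firewall it would be run as `netstat -rnW | report_default_routes`. A `MISSING` line for IPv4 after upgrading is the kind of detail being asked for here.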


  • Rebel Alliance Developer Netgate

    OK, there are at least three separate issues here from the looks of it:

    0. Harmless route errors spamming the console/logs https://redmine.pfsense.org/issues/8497  (Fixed now)
    1. An issue with the upgrade code not converting and handling default gateways properly in some cases https://redmine.pfsense.org/issues/8504 (Also fixed)
    2. An issue where certain DHCP WANs (igb interfaces at least) constantly link cycle which leads to all sorts of other symptoms (services not running, IP addresses/routes missing, GUI inaccessible, etc) https://redmine.pfsense.org/issues/8506

    We're still working on that last one.

    Now what I need to know is:

    • What hardware are you running where this is happening?
    • What type of network interface is it happening to? (Both systems here, and the logs posted in the thread are all igb, but we don't know if that's a coincidence or not)
    • Check "clog /var/log/system.log | grep link" and/or "dmesg | grep link" output to see if the link is flapping
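To make the link-flap check easier to read, the matching lines can be counted per interface and state. The sketch below is an illustration only: `count_link_changes` is a made-up helper, and it assumes kernel messages of the form seen earlier in this thread ("igb2: link state changed to DOWN").

```shell
#!/bin/sh
# count_link_changes: read log text on stdin and print one line per
# interface and state, e.g. "igb2 DOWN 2".
count_link_changes() {
    awk '/link state changed to/ {
        iface = ""
        for (i = 2; i <= NF; i++) {
            # the interface name is the field just before "link", e.g. "igb2:"
            if ($i == "link") { iface = $(i - 1); break }
        }
        sub(/:$/, "", iface)      # drop the trailing colon
        count[iface " " $NF]++    # the last field is UP or DOWN
    }
    END {
        for (k in count) print k, count[k]
    }' | sort
}
```

It can be fed from either source named above: `clog /var/log/system.log | count_link_changes` or `dmesg | count_link_changes`. A large and growing count on the DHCP WAN interface would suggest the link cycling described in issue 8506.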
