May 2nd Snapshot doesn't work, breaks everything! Beware
-
/etc/rc.reboot
/sbin/reboot
This code helped. Runtime is now reset.
-
I can confirm this issue is still in the latest 5-04 snapshot
Running the above commands did not help
/etc/rc.reboot
/sbin/reboot
-
I am still having issues with the new snapshot (5/4) too. I have 3 WAN connections, and the two static ones still seem to be working, but my appliance screeches to a halt after upgrading. I also cannot get a DHCP address on the third connection. Attached are the errors I get on initial boot; subsequent reboots give the same result.
![IMG_20180504_211145 - Copy.jpg](/public/imported_attachments/1/IMG_20180504_211145 - Copy.jpg)
-
> I can confirm this issue is still in the latest 5-04 snapshot
> Running the above commands did not help
> /etc/rc.reboot
> /sbin/reboot

There is no "this issue" in this thread. You need to provide details about exactly what is not working, with console and/or log entries related to the issue.
-
> I am still having issues with the new snapshot (5/4) too. I have 3 WAN connections, and the two static ones still seem to be working, but my appliance screeches to a halt after upgrading. I also cannot get a DHCP address on the third connection. Attached are the errors I get on initial boot; subsequent reboots give the same result.

Please at least post the DHCP log and any dhclient entries from it, and anything that looks relevant in the system or routing logs as well.
I can't replicate any DHCP client issues here, mine are all working OK.
-
Unfortunately, I had to rebuild it in order to post this, so I only have my syslog to go back to. I am restoring to the 2.4.4-DEVELOPMENT (amd64) snapshot built on Thu Apr 26 14:32:50 CDT 2018 (FreeBSD 11.1-STABLE), and I am using the C2758 board you used to use. When I upgrade to the latest snapshot, I am unable to do much of anything with the appliance. It looks like it just keeps bouncing the interface for that WAN.
dpinger: HOME_DHCP 47.34.34.1: sendto error: 65
check_reload_status: Configuring interface wan
php-fpm[87613]: /rc.newwanip: rc.newwanip: Failed to update wan IP, restarting…
php-fpm[87613]: /rc.newwanip: rc.newwanip: on (IP address: ) (interface: HOME[wan]) (real interface: igb2).
php-fpm[87613]: /rc.newwanip: rc.newwanip: Info: starting on igb2.
dpinger: HOME_DHCP 47.34.34.1: sendto error: 65
kernel: arpresolve: can't allocate llinfo for 47.34.34.1 on igb2
dhclient[20017]: exiting.
dhclient[20017]: connection closed
dpinger: HOME_DHCP 47.34.34.1: sendto error: 65
kernel: arpresolve: can't allocate llinfo for 47.34.34.1 on igb2
kernel: arpresolve: can't allocate llinfo for 47.34.34.1 on igb2
php-fpm[43905]: /rc.linkup: HOTPLUG: Configuring interface wan
php-fpm[43905]: /rc.linkup: DEVD Ethernet attached event for wan
dhclient: /sbin/route add default 47.34.34.1
dhclient: Adding new routes to interface: igb2
dhclient: New Routers (igb2): 47.34.34.1
dhclient: New Broadcast Address (igb2): 255.255.255.255
dhclient: New Subnet Mask (igb2): 255.255.254.0
dhclient: New IP Address (igb2): 47.34.X.X
charon: 13[KNL] 47.34.X.X appeared on igb2
charon: 13[KNL] 47.34.X.X disappeared from igb2
dhclient: ifconfig igb2 inet 47.34.X.X netmask 255.255.254.0 broadcast 255.255.255.255
dhclient: Starting add_new_address()
dhclient: REBOOT
kernel: igb2: link state changed to DOWN
check_reload_status: Linkup starting igb2
HOME_DHCP 47.34.34.1: sendto error: 64
-
JimP, I can send you a 4m syslog from the time of the upgrade if you would like.
After thumbing through more of the syslog, I see these lines repeating pretty consistently:
php-fpm[43905]: /rc.linkup: DEVD Ethernet attached event for wan
php-fpm[43905]: /rc.linkup: HOTPLUG: Configuring interface wan
charon: 04[KNL] 47.34.X.X disappeared from igb2
kernel: arpresolve: can't allocate llinfo for 47.34.34.1 on igb2
kernel: arpresolve: can't allocate llinfo for 47.34.34.1 on igb2
dpinger: HOME_DHCP 47.34.34.1: sendto error: 65
dhclient[20017]: connection closed
dhclient[20017]: exiting.
kernel: arpresolve: can't allocate llinfo for 47.34.34.1 on igb2
dpinger: HOME_DHCP 47.34.34.1: sendto error: 65
php-fpm[87613]: /rc.newwanip: rc.newwanip: Info: starting on igb2.
php-fpm[87613]: /rc.newwanip: rc.newwanip: on (IP address: ) (interface: HOME[wan]) (real interface: igb2).
php-fpm[87613]: /rc.newwanip: rc.newwanip: Failed to update wan IP, restarting…
check_reload_status: Configuring interface wan
dpinger: HOME_DHCP 47.34.34.1: sendto error: 65
-
I will have to roll back to the April build until this is fixed.
My DHCP connection has the same errors as the poster above.
-
I've been trying every new development build (I didn't try 5-9), and the issue keeps happening. I too have to roll back to the last April build.
What's odd is that I ran a virtual appliance and pfSense ran fine in it. I'm starting to wonder if it's a hardware compatibility issue. I'm using a Qotom box.
-
I don't doubt there is a problem here but I need a lot more detail than "it's broken" or "the same errors". Post the errors (even if they are duplicates), log entries, route table contents, anything you can come up with. I need to know exactly what isn't working, with detail. For example: interfaces missing addresses, missing or incorrect routes, services not running (exactly which ones are not running, and any relevant logs from them), and so on.
I still can't replicate any issues here in my lab. We might have one person here who is able to replicate this but they're still testing to find out if it's similar, too soon to say if it's related.
-
> I've been trying every new development build (I didn't try 5-9), and the issue keeps happening. I too have to roll back to the last April build.
> What's odd is that I ran a virtual appliance and pfSense ran fine in it. I'm starting to wonder if it's a hardware compatibility issue. I'm using a Qotom box.

Me too.
pfSense starts normally, but there is no internet connection from pfSense itself or from the LAN.
The logs contain a lot of "route has not been found" messages.
This started after updating pfSense 2.4 on 05/09. How do you roll back to an older version?
Thanks
-
> I don't doubt there is a problem here but I need a lot more detail than "it's broken" or "the same errors". Post the errors (even if they are duplicates), log entries, route table contents, anything you can come up with. I need to know exactly what isn't working, with detail. For example: interfaces missing addresses, missing or incorrect routes, services not running (exactly which ones are not running, and any relevant logs from them), and so on.
> I still can't replicate any issues here in my lab. We might have one person here who is able to replicate this but they're still testing to find out if it's similar, too soon to say if it's related.

Hello,
I'd be happy to give you log files of the errors. I'm just not sure which ones you want. Can you please tell me the location of the log files? The WebGUI is not accessible, so I would need to pull them over SSH.
-
I would love to send in logs as I have a 4m CSV dump from my syslog server, but still I have not been told where to send them. As they are raw dumps, I am not posting them into the forums but would gladly send them to one of the developers.
-
I don't need 4M worth of records. I don't have time to sort through all of that. Just the last dozen or so lines of each log file is sufficient.
I think we have a lead on part of the problem, I pushed a fix for one potential path that could break it but there is one other that I haven't tracked down yet.
https://redmine.pfsense.org/issues/8504
More interesting to me now than logs are two things:
1. The <gateways> section of your configuration(s) before and after upgrade, or at least after. You can redact IP addresses but do not alter anything else.
2. Whether or not you have a default route for IPv4 or IPv6 in "netstat -rnW" after upgrade.
-
OK, there are at least three separate issues here from the looks of it:
0. Harmless route errors spamming the console/logs https://redmine.pfsense.org/issues/8497 (Fixed now)
1. An issue with the upgrade code not converting and handling default gateways properly in some cases https://redmine.pfsense.org/issues/8504 (Also fixed)
2. An issue where certain DHCP WANs (igb interfaces at least) constantly link cycle which leads to all sorts of other symptoms (services not running, IP addresses/routes missing, GUI inaccessible, etc) https://redmine.pfsense.org/issues/8506
We're still working on that last one.
Now what I need to know is:
- What hardware are you running where this is happening?
- What type of network interface is it happening to? (Both systems here, and the logs posted in the thread are all igb, but we don't know if that's a coincidence or not)
- Check "clog /var/log/system.log | grep link" and/or "dmesg | grep link" output to see if the link is flapping
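For anyone who would rather count link transitions than eyeball the grep output, here is a minimal Python sketch. It assumes the output of one of the commands above has been saved as text and that the kernel messages look like the "link state changed to UP/DOWN" lines posted in this thread. The function names and the threshold of 4 are made up for illustration; they are not part of pfSense.

```python
import re
from collections import defaultdict

# Matches kernel messages of the form seen in this thread, e.g.
#   "kernel: igb0: link state changed to DOWN"
LINK_RE = re.compile(r"kernel: (\w+): link state changed to (UP|DOWN)")


def count_link_transitions(lines):
    """Return {interface: number of link state changes} for the given log lines."""
    counts = defaultdict(int)
    for line in lines:
        match = LINK_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return dict(counts)


def flapping_interfaces(lines, threshold=4):
    """Return interfaces with at least `threshold` state changes.

    The threshold is an arbitrary assumption; pick whatever fits the
    time window your log excerpt covers.
    """
    counts = count_link_transitions(lines)
    return sorted(name for name, n in counts.items() if n >= threshold)


# Sample lines in the format of the system log entries posted in this thread.
sample = [
    "May 11 17:55:36 pfSense php-fpm[22628]: /rc.linkup: HOTPLUG: Configuring interface wan",
    "May 11 17:55:37 pfSense kernel: igb0: link state changed to UP",
    "May 11 17:55:37 pfSense kernel: igb0: link state changed to DOWN",
    "May 11 17:55:42 pfSense kernel: igb0: link state changed to UP",
    "May 11 17:55:43 pfSense kernel: igb0: link state changed to DOWN",
]

print(count_link_transitions(sample))
print(flapping_interfaces(sample))
```

Four state changes within a few seconds on one interface, as in the sample, is the kind of pattern that suggests the link is cycling rather than a one-off cable event.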
-
Updated to the latest beta and still getting issues.
I'm using a Qotom box.
May 11 17:55:36 pfSense php-fpm[22628]: /rc.linkup: DEVD Ethernet attached event for wan
May 11 17:55:36 pfSense php-fpm[22628]: /rc.linkup: HOTPLUG: Configuring interface wan
May 11 17:55:37 pfSense kernel: igb0: link state changed to UP
May 11 17:55:37 pfSense kernel: igb0: link state changed to DOWN
May 11 17:55:42 pfSense kernel: igb0: link state changed to UP
May 11 17:55:43 pfSense php-fpm[22628]: /rc.linkup: The command '/usr/local/sbin/unbound -c /var/unbound/unbound.conf' returned exit code '1', the output was '[1526086543] unbound[66133:0] error: bind: address already in use [1526086543] unbound[66133:0] fatal error: could not open ports'
May 11 17:55:43 pfSense kernel: igb0: link state changed to DOWN
May 11 17:55:45 pfSense php-fpm[71870]: /rc.linkup: DEVD Ethernet detached event for wan
It's just looping the same thing over and over.
-
JimP, let us know when we can begin testing snapshots again as I can't keep rebuilding and restoring my firewall.
-
> JimP, let us know when we can begin testing snapshots again as I can't keep rebuilding and restoring my firewall.

Which is why you don't run snapshots on important production firewalls, at least not without proper lab testing first.
No progress since my last post except that an additional issue has been found:
3. Interface MTU being set incorrectly in some cases https://redmine.pfsense.org/issues/8507 – This can lead to what appears to be partially working connectivity. Some sites will load, others will fail, and some may partially work and be partially broken due to resources that can't be fetched. Browsers may return a blank page rather than an error, or fail to fetch links at all.
-
JimP, this is not an important firewall. It is only used for my home environment, but I get to listen to my wife complain about not being able to get online. Reloading it is more of an annoyance than anything else. Let me know if there are more logs or testing you need on this.
-
> I get to listen to my wife complain about not being able to get online.

If it's carrying your wife's traffic then that is THE very definition of an important production firewall :-)
> Let me know if there are more logs or testing you need on this.

I think we have an OK grasp of the general issues at the moment but a lack of leads on where the problem lies. So far all I've seen are symptoms and not the root cause yet, but since it's so tricky to reproduce in a lab setup it's a pain to try to dig into it for any length of time.