No Web Interface on Thu May 29 08:48:37 CDT 2014
-
Is sad.
No, it's alpha
Thank you, Captain Obvious.
We're all aware it's Alpha.
Yes, but it sounded (to me) like you were expecting more from alpha.
To keep this post somewhat on-topic, I noticed an option in the installation menu to "rescue config.xml":
< quick/easy install > < custom install > < rescue config.xml > < reboot > < exit >
I haven't tried to use it, but perhaps it would be easier for some rather than using the other methods described in this thread.
-
Has anyone downloaded the last snapshot from today at 23:36?
-
I updated to amd64-20140529-1755, issue is still present. Tried updating again from SSH, didn't fix it. Restarting webConfigurator from SSH doesn't fix it. Routing and firewall functionality seems to be unaffected.
-
Mmm, the most recent update file is still far too small, <1MB for the Nano 1G update.
Steve
-
Playing with this morning's amd-64 snapshot as a fresh install in a VM. As noted earlier, the boot hangs after 'Starting CRON'. The guilty line seems to be this one in /etc/rc:
/usr/local/sbin/fcgicli -f /etc/rc.start_packages
If you comment out only that line in /etc/rc, then the boot completes and goes to the console menu.
Run that line at a console prompt in a recent snapshot and it hangs until you ^C out of it. Run it in an earlier, working snapshot and it completes and returns to the shell. Note that I'm testing without any packages installed, so the command should not actually do anything for me.
This explains the hanging boot, and may be related to the missing webgui, but I haven't dug that far yet.
As a side note, if you do a fresh install, you can run /etc/sshd by hand to generate the keys. sshd won't start up until keys are present.
[edit: noted as amd-64 arch]
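For anyone who would rather script the "comment out that line" workaround than edit /etc/rc by hand, here is a minimal sketch. The helper name comment_out_line is hypothetical (it is not a pfSense tool), and it assumes the line in your /etc/rc matches the text quoted above. Keep a backup of the file before trying it.

```sh
#!/bin/sh
# Sketch only: comment out every not-yet-commented line in a file that
# contains a given fixed string. comment_out_line is a made-up helper name.
comment_out_line() {
    rc_file="$1"    # file to edit, e.g. /etc/rc
    needle="$2"     # fixed string that identifies the line(s) to disable
    tmp="${rc_file}.tmp.$$"
    # Prefix matching, uncommented lines with '#'; pass everything else through.
    awk -v needle="$needle" '
        index($0, needle) > 0 && $0 !~ /^[[:space:]]*#/ { print "#" $0; next }
        { print }
    ' "$rc_file" > "$tmp" && cat "$tmp" > "$rc_file"
    rm -f "$tmp"
}

# Usage (as root, after backing up the original):
#   cp /etc/rc /etc/rc.orig
#   comment_out_line /etc/rc "fcgicli -f /etc/rc.start_packages"
```

Writing the result back with cat rather than mv keeps the original file's permissions, and running the helper twice does not add a second '#'.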
-
The following cases all show a missing webgui and boot hang symptoms described above:
- amd-64, fresh install, 30-May image
- i386, fresh install, 30-May image
- amd-64, fresh install, 23-May image (confirmed working, completed the wizard), then confirmed failing after auto-upgrade to the 30-May update
-
Still no go for web interface on amd64-20140530-1557.
EDIT: I can get the bootup to complete by running:
ps aux | grep -i rc
Then finding the PID of /usr/local/sbin/fcgicli -f /etc/rc.start_packages and running:
kill -9 xxxxx
Where xxxxx is the PID of the process. This doesn't bring up the web interface though.
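The "find the PID, then kill it" step above can be sketched as a small filter over ps output. The function name list_fcgicli_pids is hypothetical, and the sketch assumes the usual `ps aux` layout where the PID is column 2 and the command starts at column 11.

```sh
#!/bin/sh
# Sketch only: read `ps aux`-style output on stdin and print the PID of
# every process whose executable is /usr/local/sbin/fcgicli.
# Matching on the command column also skips the grep/awk processes themselves.
list_fcgicli_pids() {
    awk '$11 == "/usr/local/sbin/fcgicli" { print $2 }'
}

# Usage:
#   ps aux | list_fcgicli_pids
#   kill -9 <pid printed above>
```

As noted in the post, killing the stuck process only lets the boot finish; it does not bring the web interface back.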
-
I am running the 30th snapshot on i386 and sshd works for me after running it manually the last time I booted to create the sshd keys. The GUI doesn't work for me though.
I noticed the same /usr/local/sbin/fcgicli -f /etc/rc.start_packages command in the process list, apparently stuck. I killed it, and then a few more with different rc scripts as arguments were launched (by minicron, I think). Those are stuck now too.
I ran truss manually on one of the command lines, and it appears to lock up around the time of writing to /var/run/php-fpm.socket. php-fpm is running.
truss /usr/local/sbin/fcgicli -f /etc/rc.update_alias_url_data
connect(3,{ AF_UNIX "/var/run/php-fpm.socket" },106) = 0 (0x0)
__sysctl(0xbfbfe6c4,0x2,0xbfbfe708,0xbfbfe6c0,0x0,0x0) = 0 (0x0)
__sysctl(0xbfbfe6c4,0x2,0xbfbfe808,0xbfbfe6c0,0x0,0x0) = 0 (0x0)
__sysctl(0xbfbfe6c4,0x2,0xbfbfe908,0xbfbfe6c0,0x0,0x0) = 0 (0x0)
__sysctl(0xbfbfe6c4,0x2,0xbfbfea08,0xbfbfe6c0,0x0,0x0) = 0 (0x0)
__sysctl(0xbfbfe6c4,0x2,0xbfbfeb08,0xbfbfe6c0,0x0,0x0) = 0 (0x0)
madvise(0x28804000,0x1000,0x5,0x281c15f8,0xbfbfe4f4,0x28120ccf) = 0 (0x0)
madvise(0x28816000,0x1000,0x5,0x281c15f8,0xbfbfe4f4,0x28120ccf) = 0 (0x0)
madvise(0x28818000,0x1000,0x5,0x281c15f8,0xbfbfe56c,0x28120ccf) = 0 (0x0)
madvise(0x28803000,0x3000,0x5,0x281c15f8,0xbfbfe57c,0x28120ccf) = 0 (0x0)
write(3,"\^A\^A\0\^A\0\b\0\0\0\^A\0\0\0\0"...,263) = 263 (0x107)
It just sits there forever. Any /usr/local/sbin/fcgicli command executed even by hand gets stuck there.
-
As soon as I ran
/usr/local/sbin/fcgicli -f /etc/rc.start_packages
the webgui stopped working. Any attempt to use php-fpm locks up the process writing to the fpm socket again.
It appears something in the command above locks up php-fpm. If I restart php-fpm it works again. I am going to comment out the command above in /etc/rc and see if the firewall starts up properly. I have a feeling it will. I will just need to start the packages manually after a reboot. This is at home so it isn't a big deal :).
-
Well… It is not specifically start_packages that kills it. It seems to lock up with other fcgicli commands during boot. If I restart php-fpm and then execute the few fcgicli commands from /etc/rc in order, one will eventually cause php-fpm to block on writing to its socket. I will just manually kill php-fpm and restart it after every boot for now. It appears the GUI doesn't lock it up (I didn't test everything though... only viewing some of the pages).
-
I just updated to the 31st snapshot and the problem is still there. I just manually kill php-fpm and restart it per how it is started in /etc/rc
2.2-ALPHA (i386)
built on Sat May 31 10:32:02 CDT 2014
FreeBSD 10.0-STABLE
-
The web interface stopped working again when I uninstalled the Patches package. I killed and restarted php-fpm and it started working again.
-
Great work! I know that commenting out the start_packages line from /etc/rc is not enough to get the webgui working, as you found out. There are a few minicron entries after that line in /etc/rc that use fcgicli as well, one hourly account expire and one daily alias url updater. If I understand the problem correctly, they should be commented out as well, right?
In the old days, I'd look through the recent commits to the pfSense-tools tree, but I haven't taken the steps to regain access to that yet.
Hopefully the devs can find a fix, now that you've narrowed down the problem even more.
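To see which lines in /etc/rc would need the same treatment, a quick listing of every still-active fcgicli invocation might look like the sketch below. The function name list_fcgicli_lines is hypothetical, and what it prints depends entirely on the /etc/rc of the snapshot you are running.

```sh
#!/bin/sh
# Sketch only: print "<line number>:<line>" for each line of an rc file
# that mentions fcgicli and is not already commented out.
list_fcgicli_lines() {
    rc_file="${1:-/etc/rc}"
    # grep -n prefixes each match with its line number; the second grep
    # drops matches whose text starts with '#'.
    grep -n 'fcgicli' "$rc_file" | grep -v '^[0-9]*:[[:space:]]*#'
}

# Usage:
#   list_fcgicli_lines /etc/rc
```

The line numbers make it easy to jump to each entry in an editor and comment it out.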
-
I am still getting the 100% CPU by check_reload_status too. I killed that and restarted it and the CPU went back to normal again.
-
I'm in the process of cloning the pfsense-tools repo to have a look through the commits. Can someone give me a timeframe for when this issue started showing up?
-
If the previous snapshots are available I can start installing them backwards and see when the issue disappears.
The first version I noticed the problem was the 29th or the 30th build.
EDIT: I am not sure what version I was running before the 29th build. It might have been the 26th or 27th. I don't see anything in the logs to show that I rebooted on the 28th. I put in a request for logging the version on boot so that I can easily keep track of which version was installed by going through the logs on my remote syslog server. I am sure the devs are too busy to worry about such things, though :).
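Until something like that exists, a local stopgap could be a tiny script run at boot that writes the version to syslog. This is only a sketch: build_version_message is a made-up name, the "bootversion" tag is arbitrary, and it assumes the version and build time are stored in /etc/version and /etc/version.buildtime, which you should check on your own install.

```sh
#!/bin/sh
# Sketch only: build a one-line "what am I running" message from two files.
# The default paths are assumptions; pass other paths as arguments if needed.
build_version_message() {
    version_file="${1:-/etc/version}"
    buildtime_file="${2:-/etc/version.buildtime}"
    version="unknown"
    buildtime="unknown"
    # Only read the files if they exist and are readable.
    [ -r "$version_file" ] && version=$(head -n 1 "$version_file")
    [ -r "$buildtime_file" ] && buildtime=$(head -n 1 "$buildtime_file")
    printf 'booted version=%s built=%s\n' "$version" "$buildtime"
}

# Usage: send the message to syslog, which a remote syslog server then picks up.
#   logger -t bootversion "$(build_version_message)"
```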
-
http://snapshots.pfsense.org/FreeBSD_stable/10/amd64/pfSense_HEAD/updates/?C=M;O=D
There were a few versions that showed up on the 29th. Look at the 4G non-VGA images from 21:41 and 23:36 as examples. The update files are still too small, up to the latest snapshots.
-
I'm having dramas trying to clone the repo, not too sure what's going on but it doesn't look like I'll be able to pull the commit logs any time soon.
-
Also, be aware that there are some issues with the iso and update image names taking an earlier date (in the filename) than they should have. Just FYI, but it can add to the confusion when trying to back out what image was built when, and identify when a problem showed up.
https://forum.pfsense.org/index.php?topic=76744.0