No Web Interface on Thu May 29 08:48:37 CDT 2014


  • Hi

    I can't use the web interface on the nanobsd 1G 64-bit and Full 64-bit images installed on Hyper-V.

    I've tried restarting webConfigurator and killing php and lighttpd; nothing helped. After a reboot, on one of my tries, I got the GUI, but only long enough to authenticate, see the button to reinstall packages, and press it.


  • This is confirmed for memstick version x86 and x64.
    Boot hangs after cron, and the webgui won't come up.

    After the snapshot to fix this, is there some way to update without the GUI or fresh install?


  • I can confirm that.  Fresh (today's) snapshot doesn't complete booting for me: on the console, it stops here:

    Starting webConfigurator ... done.
    Configuring CRON ... done.
    Starting DNS forwarder ... done.
    Starting NTP time client ... done
    Starting DHCP service ... done.
    Starting DHCPv6 service ... done.
    Configuring Firewall ... done.
    Generating RRD graphs ... done.
    Starting syslog ... done.
    Starting CRON ... done.
    
    

    Though I can ^C out of that to /bin/sh and look around.  check_reload_status does not dump core, so that's an improvement.  The VM can ping, and be pinged, on the LAN.  Looks like it may be hanging in /etc/rc, right after starting CRON.  If I comment out the rest of the commands in /etc/rc and reboot, the boot continues and I get the console menu.

    Still no web server at that point, and as before, sshd can't be started.


  • @m3usv0x:

    After the snapshot to fix this, is there some way to update without the GUI or fresh install?

    Perhaps the easiest would be to save your configuration and do a fresh install.  To get your config file, boot into single user mode and get it from /cf/conf/config.xml (and older copies from the backup directory if you want).

    Alternately you can hack /etc/rc as I described above (commenting out everything past cron); this will get you to the console menu where you can choose item 13 "Upgrade from Console"
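
    For the first option above, a minimal sketch of saving the configuration before a fresh install. It assumes /cf and the destination (for example a USB stick) are already mounted and writable; backup_config is a hypothetical helper, not a pfSense tool, and /mnt/usb in the usage line is only an example path.

```shell
#!/bin/sh
# Sketch: copy config.xml to a timestamped backup before reinstalling.
# backup_config is a hypothetical helper name, not part of pfSense.
backup_config() {
    src=$1      # e.g. /cf/conf/config.xml (path from the post above)
    destdir=$2  # assumed to be an already-mounted, writable directory
    [ -f "$src" ] || { echo "missing: $src" >&2; return 1; }
    mkdir -p "$destdir" || return 1
    dest="$destdir/config-$(date +%Y%m%d-%H%M%S).xml"
    # -p keeps the original modification time; print the new path on success
    cp -p "$src" "$dest" && echo "$dest"
}

# Usage (not executed here):
#   backup_config /cf/conf/config.xml /mnt/usb
```

    The timestamp in the file name keeps repeated backups from overwriting each other.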


  • @charliem:

    @m3usv0x:

    After the snapshot to fix this, is there some way to update without the GUI or fresh install?

    Perhaps the easiest would be to save your configuration and do a fresh install.  To get your config file, boot into single user mode and get it from /cf/conf/config.xml (and older copies from the backup directory if you want).

    Alternately you can hack /etc/rc as I described above (commenting out everything past cron); this will get you to the console menu where you can choose item 13 "Upgrade from Console"

    How does one "comment out" as you explained?
    After I select "upgrade from console", how do I upgrade? As before with a usb thumbdrive?


  • Config from last install borks a fresh install from an earlier version.

    Is sad.


  • @m3usv0x:

    Is sad.

    No, it's alpha


  • @charliem:

    @m3usv0x:

    Is sad.

    No, it's alpha

    Thank you, Captain Obvious.

    We're all aware it's Alpha.

  • Rebel Alliance

    It's the same on the "amd64-20140529-1554" snapshot (no webConfigurator, boot stuck at starting CRON…), but I was able to log in via SSH and perform an "upgrade" ;)

    
    login as: root
    Using keyboard-interactive authentication.
    Password for root@pfsense.d510:
    *** Welcome to pfSense 2.2-ALPHA-pfSense (amd64) on pfsense ***
    
     WAN (wan)       -> pppoe0     -> v4/PPPoE: 1.1.1.1/32
     LAN (lan)       -> re1        -> v4: 192.168.1.254/24
     GEST (opt1)     -> re0        -> v4: 172.20.254.6/29
     W311U (opt2)    -> run0_wlan1 -> v4: 192.168.0.1/24
    
     0) Logout (SSH only)                  8) Shell
     1) Assign Interfaces                  9) pfTop
     2) Set interface(s) IP address       10) Filter Logs
     3) Reset webConfigurator password    11) Restart webConfigurator
     4) Reset to factory defaults         12) pfSense Developer Shell
     5) Reboot system                     13) Upgrade from console
     6) Halt system                       14) Disable Secure Shell (sshd)
     7) Ping host                         15) Restore recent configuration
    
    Enter an option: 13
    
    Starting the pfSense console firmware update system..
    
    1) Update from a URL
    2) Update from a local file
    Q) Quit
    
    Please select an option to continue: 1
    
    Enter the URL to the .tgz or .img.gz update file.
    Type 'auto' to use http://snapshots.pfsense.org/FreeBSD_stable/10/amd64/pfSense_HEAD/.updaters//latest.tgz
    > http://snapshots.pfsense.org/FreeBSD_stable/10/amd64/pfSense_HEAD/updates/pfSense-Full-Update-2.2-DEVELOPMENT-amd64-20140526-1601.tgz
    
    Fetching file size...
    
    File size: 80085938
    
    Fetching file...
    looking up snapshots.pfsense.org
    connecting to snapshots.pfsense.org:80
    requesting http://snapshots.pfsense.org/FreeBSD_stable/10/amd64/pfSense_HEAD/updates/pfSense-Full-Update-2.2-DEVELOPMENT-amd64-20140526-1601.tgz
    remote size / mtime: 80085938 / 1401202722
    /root/firmware.tgz                              6% of   76 MB  106 kBps 11m34s
    
    

    edit: fix code tag ;)


  • You can update through the console or SSH (this one has SSH).

    Option 13 -> 1 -> "auto"


  • @mais_um:

    You can update through the console or SSH (this one has SSH).

    Option 13 -> 1 -> "auto"

    I've already done this, but for the sake of knowing…
    If the boot gets stuck at cron again, how do I access the console?

  • Netgate Administrator

    As Charlie said, hit ^C (Control-C).

    Steve


  • @m3usv0x:

    @charliem:

    @m3usv0x:

    Is sad.

    No, it's alpha

    Thank you, Captain Obvious.

    We're all aware it's Alpha.

    Yes, but it sounded (to me) like you were expecting more from alpha.

    To keep this post somewhat on-topic, I noticed an option in the installation menu to "rescue config.xml":

    < quick/easy install >
    < custom install >
    < rescue config.xml >
    < reboot >
    < exit >
    

    I haven't tried to use it, but perhaps it would be easier for some rather than using the other methods described in this thread.


  • Has anyone downloaded the last snapshot from today at 23:36?


  • I updated to amd64-20140529-1755, and the issue is still present. Updating again from SSH didn't fix it. Restarting webConfigurator from SSH doesn't fix it either. Routing and firewall functionality seem to be unaffected.

  • Netgate Administrator

    Mmm, the most recent update file is still far too small, <1MB for the Nano 1G update.

    Steve


  • Playing with this morning's amd-64 snapshot, as a fresh install in a VM.  As noted earlier, the boot hangs after 'Starting CRON'.  The guilty line seems to be this one in /etc/rc:

     /usr/local/sbin/fcgicli -f /etc/rc.start_packages
    

    If you comment out only that line in /etc/rc, then the boot completes and goes to the console menu.
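
    For anyone unsure how to "comment out" a line, here is a minimal sketch. It assumes /etc/rc is writable at that point; comment_out_line is a hypothetical helper, not something shipped with pfSense. It prefixes every matching line with # and keeps a copy of the original next to it.

```shell
#!/bin/sh
# Sketch: comment out every line containing a literal string.
# comment_out_line is a hypothetical helper name.
comment_out_line() {
    file=$1
    text=$2   # literal text to look for (no regex)
    # Keep a copy of the file; this overwrites any earlier .bak
    cp -p "$file" "$file.bak" || return 1
    # Prefix matching lines with '#', skipping lines already commented out.
    # Writing through the redirect keeps the file's permissions.
    awk -v t="$text" '
        index($0, t) && $0 !~ /^[ \t]*#/ { print "#" $0; next }
        { print }
    ' "$file.bak" > "$file"
}

# Usage (not executed here):
#   comment_out_line /etc/rc 'fcgicli -f /etc/rc.start_packages'
```

    To undo the change, copy /etc/rc.bak back over /etc/rc, provided the helper has not been run a second time.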

    If you run that line at a console prompt in a recent snapshot, it hangs until you ^C out of it.  Run the same line in an earlier, working snapshot, and it completes and returns to the shell.  Note that I'm testing without any packages installed, so the command should not actually do anything for me.

    This explains the hanging boot, and may be related to the missing webgui, but I haven't dug that far yet.

    As a side note, if you do a fresh install, you can run /etc/sshd by hand to generate the keys.  sshd won't start up until keys are present.

    [edit: noted as amd-64 arch]


  • The following cases all show the missing webgui and boot hang symptoms described above:

    • amd-64, fresh install, 30-May image
    • i386, fresh install, 30-May image
    • amd-64, fresh install of the 23-May image (confirmed working, wizard completed), then confirmed failing after auto-upgrading to the 30-May update

  • Still no go for web interface on amd64-20140530-1557.

    EDIT: I can get the bootup to complete by running:

    
    ps aux | grep -i rc
    
    

    Then finding the PID of /usr/local/sbin/fcgicli -f /etc/rc.start_packages and running:

    
    kill -9 xxxxx
    
    

    Where xxxxx is the PID of the process. This doesn't bring up the web interface though.
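
    The grep above matches a lot of unrelated processes. A narrower way to pick out the PID could look like the sketch below; stuck_fcgicli_pids is a hypothetical filter name, and it assumes ps prints the command exactly as it appears in /etc/rc.

```shell
#!/bin/sh
# Sketch: read "PID COMMAND ARGS..." lines on stdin and print the PIDs of
# fcgicli processes running one particular rc script.
# stuck_fcgicli_pids is a hypothetical helper name.
stuck_fcgicli_pids() {
    script=$1   # e.g. /etc/rc.start_packages
    awk -v s="$script" '
        $2 == "/usr/local/sbin/fcgicli" && $3 == "-f" && $4 == s { print $1 }
    '
}

# Usage (not executed here): list the matching PIDs, then kill them by hand.
#   ps ax -o pid= -o command= | stuck_fcgicli_pids /etc/rc.start_packages
```

    Printing the PIDs first, instead of piping them straight into kill, gives a chance to check what is about to be killed.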


  • I am running the 30th snapshot on i386 and sshd works for me after running it manually the last time I booted to create the sshd keys.  The GUI doesn't work for me though.

    I noticed the same /usr/local/sbin/fcgicli -f /etc/rc.start_packages command in the process list, apparently stuck.  I killed it, and then a few more fcgicli processes with different rc scripts as arguments were launched, by minicron I think.  Those are stuck now too.

    I ran the tracing command truss manually on one of the command lines, and it appears to lock up around the time of writing to /var/run/php-fpm.socket.  php-fpm is running.

    truss /usr/local/sbin/fcgicli -f /etc/rc.update_alias_url_data

    connect(3,{ AF_UNIX "/var/run/php-fpm.socket" },106) = 0 (0x0)
    __sysctl(0xbfbfe6c4,0x2,0xbfbfe708,0xbfbfe6c0,0x0,0x0) = 0 (0x0)
    __sysctl(0xbfbfe6c4,0x2,0xbfbfe808,0xbfbfe6c0,0x0,0x0) = 0 (0x0)
    __sysctl(0xbfbfe6c4,0x2,0xbfbfe908,0xbfbfe6c0,0x0,0x0) = 0 (0x0)
    __sysctl(0xbfbfe6c4,0x2,0xbfbfea08,0xbfbfe6c0,0x0,0x0) = 0 (0x0)
    __sysctl(0xbfbfe6c4,0x2,0xbfbfeb08,0xbfbfe6c0,0x0,0x0) = 0 (0x0)
    madvise(0x28804000,0x1000,0x5,0x281c15f8,0xbfbfe4f4,0x28120ccf) = 0 (0x0)
    madvise(0x28816000,0x1000,0x5,0x281c15f8,0xbfbfe4f4,0x28120ccf) = 0 (0x0)
    madvise(0x28818000,0x1000,0x5,0x281c15f8,0xbfbfe56c,0x28120ccf) = 0 (0x0)
    madvise(0x28803000,0x3000,0x5,0x281c15f8,0xbfbfe57c,0x28120ccf) = 0 (0x0)
    write(3,"\^A\^A\0\^A\0\b\0\0\0\^A\0\0\0\0"...,263) = 263 (0x107)
    

    It just sits there forever.  Any /usr/local/sbin/fcgicli command executed even by hand gets stuck there.


  • I got the GUI working.  I killed php-fpm and restarted it.  So this is related to php-fpm somehow not starting up properly or something hanging it up.

    killall php-fpm

    /usr/local/sbin/php-fpm -c /usr/local/lib/php.ini -y /usr/local/lib/php-fpm.conf -RD 2>&1 >/dev/null


  • As soon as I ran

    /usr/local/sbin/fcgicli -f /etc/rc.start_packages

    the webgui stopped working again.  Any attempt to use php-fpm locks up the process writing to the fpm socket again.

    It appears something in the command above locks up php-fpm.  If I restart php-fpm it works again.  I am going to comment out the command above in /etc/rc and see if the firewall starts up properly.  I have a feeling it will.  I will just need to start the packages manually after a reboot.  This is at home so it isn't a big deal :).


  • Well… It is not specifically rc.start_packages which kills it.  It seems to lock up with other fcgicli commands during boot.  If I restart php-fpm and then execute the few fcgicli commands from /etc/rc in order, one will eventually cause php-fpm to block on writing to its socket.  I will just manually kill php-fpm and restart it after every boot for now.  It appears the GUI doesn't lock it up (I didn't test everything though... only viewing some of the pages).


  • I just updated to the 31st snapshot and the problem is still there.  I just manually kill php-fpm and restart it per how it is started in /etc/rc

    2.2-ALPHA (i386)
    built on Sat May 31 10:32:02 CDT 2014
    FreeBSD 10.0-STABLE


  • The web interface stopped working again when I uninstalled the Patches package.  I killed and restarted php-fpm and it started working again.


  • Great work!  I know that commenting out the start_packages line from /etc/rc is not enough to get the webgui working, as you found out.  There are a few minicron entries after that line in /etc/rc that use fcgicli as well, one hourly account expire and one daily alias url updater.  If I understand the problem correctly, they should be commented out as well, right?

    In the old days, I'd look through the recent commits to the pfSense-tools tree, but I haven't taken the steps to regain access to that yet.

    Hopefully the devs can find a fix, now that you've narrowed down the problem even more.


  • I am still getting the 100% CPU by check_reload_status too.  I killed that and restarted it and the CPU went back to normal again.


  • I'm in the process of cloning the pfsense-tools repo to have a look through the commits. Can someone give me a timeframe for when this issue started showing up?


  • If the previous snapshots are available I can start installing them backwards and see when the issue disappears.

    The first version in which I noticed the problem was the 29th or the 30th build.

    EDIT:  I am not sure what version I was running before the 29th build.  It might have been the 26th or 27th.  I don't see anything in the logs to show that I rebooted on the 28th.  I put in a request for logging the version on boot so that I can easily keep track of what version was installed by going through the logs on my remote syslog server.  I am sure the devs are too busy to worry about such things though :).
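
    Until something like that exists, a rough local workaround could be sketched as below. It assumes the version string and build time live in /etc/version and /etc/version.buildtime; version_message and the "bootversion" tag are hypothetical names, and nothing in this thread says pfSense calls anything like it at boot.

```shell
#!/bin/sh
# Sketch: build a one-line message describing the running snapshot.
# version_message is a hypothetical helper; the default paths are assumptions.
version_message() {
    vfile=${1:-/etc/version}
    bfile=${2:-/etc/version.buildtime}
    # Fall back to "unknown" if either file cannot be read
    version=$(cat "$vfile" 2>/dev/null) || version=unknown
    built=$(cat "$bfile" 2>/dev/null) || built=unknown
    echo "booted version ${version} built ${built}"
}

# Usage (not executed here): send the message to syslog, which would also
# reach a remote syslog server if one is configured.
#   logger -t bootversion "$(version_message)"
```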


  • @adam65535:

    If the previous snapshots are available I can start installing them backwards and see when the issue disappears.

    The first version in which I noticed the problem was the 29th or the 30th build.

    EDIT:  I am not sure what version I was running before the 29th build.  It might have been the 26th or 27th.  I don't see anything in the logs to show that I rebooted on the 28th.  I put in a request for logging the version on boot so that I can easily keep track of what version was installed by going through the logs on my remote syslog server.  I am sure the devs are too busy to worry about such things though :).

    http://snapshots.pfsense.org/FreeBSD_stable/10/amd64/pfSense_HEAD/updates/?C=M;O=D

    There were a few versions that showed up on the 29th.  Look at the 4G non-VGA images from 21:41 and 23:36 as examples.  The update files are still too small, right up to the latest snapshots.


  • I'm having dramas trying to clone the repo, not too sure what's going on but it doesn't look like I'll be able to pull the commit logs any time soon.


  • Also, be aware that there are some issues with the iso and update image names taking an earlier date (in the filename) than they should have.  Just FYI, but it can add to the confusion when trying to back out what image was built when, and identify when a problem showed up.
    https://forum.pfsense.org/index.php?topic=76744.0


  • Updated to the version below and same issue just as an FYI…

    2.2-ALPHA (i386)
    built on Mon Jun 02 06:28:31 CDT 2014
    FreeBSD 10.0-STABLE

    Manually restarting php-fpm through ssh gets the GUI working again.

    killall php-fpm; sleep 2; /usr/local/sbin/php-fpm -c /usr/local/lib/php.ini -y /usr/local/lib/php-fpm.conf -RD 2>&1 >/dev/null

    I notice that check_reload_status sometimes goes to 100% CPU too (mainly after a reboot).  I manually kill and restart it, and it goes back to normal low CPU usage.  I have to force this one with -9.

    killall -9 check_reload_status; sleep 2; /usr/bin/nice -n20 /usr/local/sbin/check_reload_status
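
    The two one-liners above can be wrapped so they are easy to repeat after each boot. This is only a sketch of the workaround, not a fix. The run wrapper and the function names are hypothetical; the commands and flags are the ones from this post, with the output redirection left out.

```shell
#!/bin/sh
# run: hypothetical wrapper; prints the command when DRY_RUN=1, otherwise runs it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Kill and restart php-fpm the way the post above does it.
restart_php_fpm() {
    run killall php-fpm
    run sleep 2
    run /usr/local/sbin/php-fpm -c /usr/local/lib/php.ini \
        -y /usr/local/lib/php-fpm.conf -RD
}

# check_reload_status needs -9 according to the post above.
restart_check_reload_status() {
    run killall -9 check_reload_status
    run sleep 2
    run /usr/bin/nice -n20 /usr/local/sbin/check_reload_status
}

# Usage (not executed here):
#   DRY_RUN=1; restart_php_fpm    # only print what would run
#   DRY_RUN=0; restart_php_fpm    # actually restart php-fpm
```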


  • Any time I make a change to Suricata, php-fpm has to be restarted again.  I tried changing the log file size.  The web page just sat there forever, waiting for the POST response I assume.  I restarted php-fpm and found the change to the setting had been made.  I then tried to start Suricata and got the same endless wait.  I restarted php-fpm and the web GUI started working again, so I looked and the service did start.  It seems like the commands are getting through before the GUI stops working (at least enough to make it look like they did, anyway).

    I am thinking about uninstalling Suricata for testing 2.2 for now just so I don't have to deal with that.

    EDIT:  I just tried stopping Suricata and it stopped.  php-fpm just seems to stop working at random.  Suricata might be a different issue, as it goes to 100% CPU when I try to start it and doesn't seem to start anymore.  Suricata does eventually go back to normal CPU usage, but the web UI never returns after telling it to start.  I still have to restart php-fpm.

    Too many things to troubleshoot right now so I am removing Suricata.


  • I'm surprised that more things are not broken, given what you've found with php-fpm.

    Do you have other packages installed that work OK, with just Suricata being a problem?  The author of Suricata package did suggest that problems be posted in the packages sub-forum, but your problem is quite likely an issue with current state of 2.2 rather than the package:
    https://forum.pfsense.org/index.php?topic=77311.msg421820#msg421820


  • I am sure more things will break php-fpm or are broken by php-fpm… whichever the case may be.  I have only been messing with Suricata, so that is where I was seeing the issues.

    I went ahead and added a rule and applied the changes, and that worked.  I then went to change the client DHCP range in the openvpn config, and that locked up php-fpm too.

    So this is a more general failure of php-fpm it seems.

    EDIT:  I just checked the openvpn config after restarting php-fpm and it did make the change to the openvpn configuration even though php-fpm (and gui) stopped working.


  • Maybe it's time to create a bug in redmine, pointing back to these threads; so far, we don't even know if the devs are aware of the issue.  I'll do that later tonight unless someone else can get to it first.

    That would also be the right place to enter the feature request for version info to go into remote syslog files.



  • @charliem:

    I'm surprised that more things are not broken, given what you've found with php-fpm.

    Do you have other packages installed that work OK, with just Suricata being a problem?  The author of Suricata package did suggest that problems be posted in the packages sub-forum, but your problem is quite likely an issue with current state of 2.2 rather than the package:
    https://forum.pfsense.org/index.php?topic=77311.msg421820#msg421820

    I had the current Suricata package working fine on an earlier 2.2 snapshot (before the php-fpm and web GUI hang-ups started).  So I think Suricata is OK on 2.2, but for the moment 2.2 itself seems to have issues that often manifest themselves with any action using the GUI.

    Bill


  • Looks like everything's working again in the latest build (amd64-20140602-1822).