• Hi

    ssh has been down in the last 3 or 4 snapshots, and I can't restart it from the GUI.

    To restart it I use the console with the command /etc/sshd restart. I have to type the full path, otherwise I get an error saying an absolute path is needed (Hyper-V 64-bit full).

    Edited:
    Tested on Hyper-V 64-bit full and nanobsd 1G 64-bit.


  • 2.2-ALPHA (amd64)  4G
    built on Fri May 23 23:45:59 CDT 2014

    Confirmed here as well…


  • I'm using

    2.2-ALPHA (amd64)
    built on Fri May 23 23:45:59 CDT 2014

    , too, and I can confirm this! :-\


  • I spun up a VM, and see that check_reload_status dumps core when trying to start sshd from the console text menu.

    php: rc.initialtoggle_sshd: sent_event: sent service reload sshd got
    kernel: pid 42825 (check_reload_status), uid 0: exited on signal 11 (core dumped)
    pfSense check_reload_status: Reloading check_reload_status becuase it exited from an error!
    

    I can't dig any deeper into check_reload_status (looking for recent problem commits) without signing the TLA.

    If I start /usr/sbin/sshd directly, it bails after failing to read any host keys.

    edit: check_reload_status still dumps core, even if I generate keys manually to get sshd running.

    [Side note: sshd by default is also looking for an ed25519 host key, but that key pair is not generated by /etc/sshd (only rsa1, rsa, dsa and ecdsa are generated).  Generating the ed25519 key by hand quiets sshd when starting up.]


  • Please test the next snapshot.


  • No good, ermal: sshd is still not loading on 2.2-DEVELOPMENT-amd64-20140527-1021. I'm also still seeing a lot of this in the logs:

    
    May 28 15:07:59	check_reload_status: check_reload_status is starting.
    May 28 15:07:59	check_reload_status: fcgipath /var/run/php-fpm.socket
    May 28 15:08:17	php-fpm[66529]: /firewall_rules_edit.php: Beginning https://portal.pfsense.org configuration backup.
    May 28 15:08:18	check_reload_status: Reloading check_reload_status because it exited from an error!
    May 28 15:08:18	kernel: pid 64474 (check_reload_status), uid 0: exited on signal 11 (core dumped)
    
    

    This appears to be preventing firewall rule changes from being loaded until a reboot is performed.

  • Developer Netgate Administrator

    Please try the new one


  • Today's snapshot has no GUI, but ssh works.

    I will try the new one, but the latest nanobsd build is too small, only 849668 bytes (latest-nanobsd-vga-1g.img.gz  29-May-2014 23:36  849668). The full image seems OK.

    I'm updating my Hyper-V VM with the 64-bit full image.


  • 2.2-ALPHA (i386)
    built on Thu May 29 06:53:30 CDT 2014
    FreeBSD 10.0-STABLE

    I am using the snapshot from the 29th and sshd doesn't start for me either.  If I run '/etc/sshd start' from the console it does start up, as someone else mentioned.  If I kill it from the GUI it stops.

    If I try to send a message to the /var/run/check_reload_status Unix socket using send_event("service restart sshd"); from the PHP 'Command Prompt' menu item under Diagnostics, I see the log event saying sshd was started, but it doesn't actually start.  I don't believe it is executing sshd, though, because the SSH keys were never generated when using the GUI (which uses send_event()).  When I started sshd from the console with the '/etc/sshd start' command, I did see the GUI message system pop up an alert that the SSH keys were being generated.

    It seems like the problem is in /usr/local/sbin/check_reload_status, or in something it doesn't do correctly.

    Note:  I applied the latest patches from several hours ago related to sshd before doing these tests.  I also rebooted just to make sure before doing the tests above.

    Improve /etc/sshd
    http://github.com/pfsense/pfsense/commit/2d6e7bfb45ef798b4914b74a4ef71497709bacf6.patch

    Fix typo (sshd key)
    http://github.com/pfsense/pfsense/commit/33b42689019ec9483c4e83e844ba9f12774e870a.patch

    Add @ to silent any possible return of posix_kill  (sshd)
    http://github.com/pfsense/pfsense/commit/5125c7462648131dff08a8d88577ce6e3cce797d.patch

    glob() is already called by unlink_if_exists (sshd)
    http://github.com/pfsense/pfsense/commit/8490ba0fc6119415a0925ab5b80018ad6709969a.patch


  • I don't know if this is helpful, but as I said in my other posts:
    Reloading for any reason causes a loop that never resolves itself.
    I'm posting here regarding the OP's issue with SSH because, in an attempt to get the GUI working, I reverted to a snapshot from the 26th and found that the reload still loops endlessly, even though I had mistakenly assumed everything was functioning OK at that earlier point.


  • As I posted in another thread, my CPU is at 100% all the time because the check_reload_status process is eating it all.  The GUI still seems functional for me, although some things are slow at times.  I really wish I could run strace on check_reload_status, like I do on programs running on Linux systems, to see where it is spending its time or to look for errors in system calls.

  • Moderator

    I haven't tried it but would this work?

    http://www.freebsd.org/cgi/man.cgi?query=truss


  • Thanks!  That is what I was looking for in freebsd.

    It looks like it is in an infinite loop trying to read from a bunch of file descriptors (or Unix sockets?).  I don't know enough about check_reload_status to know what it is supposed to do, but it seems there is no data to read, judging by the = 0 return values from recvfrom.  Maybe the wait in the loop is not actually waiting for some reason.  It is eating 100% CPU just checking whether data is available.

    I assume kevent is supposed to block until data is available (or until some timeout), but it appears to return almost immediately even when there is no data.

    kevent(4,{},0,{0x9,EVFILT_READ,EV_EOF,0,0x0,0x28824b60 0xb,EVFILT_READ,EV_EOF,0,0x0,0x28825560 0xd,EVFILT_READ,EV_EOF,0,0x0,0x28825f60 0xf,EVFILT_READ,EV_EOF,0,0x0,0x28826960 0x11,EVFILT_READ,EV_EOF,0,0x0,0x28827360 0x7,EVFILT_READ,EV_EOF,0,0x0,0x28824160 0x10,EVFILT_READ,EV_EOF,0,0x0,0x28827d60 0x13,EVFILT_READ,EV_EOF,0,0x0,0x28828760},64,0x0) = 8 (0x8)
    recvfrom(9,0xbfbfebf8,8,0x0,NULL,0x0)            = 0 (0x0)
    recvfrom(11,0xbfbfebf8,8,0x0,NULL,0x0)           = 0 (0x0)
    recvfrom(13,0xbfbfebf8,8,0x0,NULL,0x0)           = 0 (0x0)
    recvfrom(15,0xbfbfebf8,8,0x0,NULL,0x0)           = 0 (0x0)
    recvfrom(17,0xbfbfebf8,8,0x0,NULL,0x0)           = 0 (0x0)
    recvfrom(7,0xbfbfebf8,8,0x0,NULL,0x0)            = 0 (0x0)
    recvfrom(16,0xbfbfebf8,8,0x0,NULL,0x0)           = 0 (0x0)
    recvfrom(19,0xbfbfebf8,8,0x0,NULL,0x0)           = 0 (0x0)
    kevent(4,{},0,{0x9,EVFILT_READ,EV_EOF,0,0x0,0x28824b60 0xb,EVFILT_READ,EV_EOF,0,0x0,0x28825560 0xd,EVFILT_READ,EV_EOF,0,0x0,0x28825f60 0xf,EVFILT_READ,EV_EOF,0,0x0,0x28826960 0x11,EVFILT_READ,EV_EOF,0,0x0,0x28827360 0x7,EVFILT_READ,EV_EOF,0,0x0,0x28824160 0x10,EVFILT_READ,EV_EOF,0,0x0,0x28827d60 0x13,EVFILT_READ,EV_EOF,0,0x0,0x28828760},64,0x0) = 8 (0x8)
    recvfrom(9,0xbfbfebf8,8,0x0,NULL,0x0)            = 0 (0x0)
    recvfrom(11,0xbfbfebf8,8,0x0,NULL,0x0)           = 0 (0x0)
    recvfrom(13,0xbfbfebf8,8,0x0,NULL,0x0)           = 0 (0x0)
    recvfrom(15,0xbfbfebf8,8,0x0,NULL,0x0)           = 0 (0x0)
    recvfrom(17,0xbfbfebf8,8,0x0,NULL,0x0)           = 0 (0x0)
    recvfrom(7,0xbfbfebf8,8,0x0,NULL,0x0)            = 0 (0x0)
    recvfrom(16,0xbfbfebf8,8,0x0,NULL,0x0)           = 0 (0x0)
    recvfrom(19,0xbfbfebf8,8,0x0,NULL,0x0)           = 0 (0x0)
    kevent(4,{},0,{0x9,EVFILT_READ,EV_EOF,0,0x0,0x28824b60 0xb,EVFILT_READ,EV_EOF,0,0x0,0x28825560 0xd,EVFILT_READ,EV_EOF,0,0x0,0x28825f60 0xf,EVFILT_READ,EV_EOF,0,0x0,0x28826960 0x11,EVFILT_READ,EV_EOF,0,0x0,0x28827360 0x7,EVFILT_READ,EV_EOF,0,0x0,0x28824160 0x10,EVFILT_READ,EV_EOF,0,0x0,0x28827d60 0x13,EVFILT_READ,EV_EOF,0,0x0,0x28828760},64,0x0) = 8 (0x8)
    recvfrom(9,0xbfbfebf8,8,0x0,NULL,0x0)            = 0 (0x0)
    recvfrom(11,0xbfbfebf8,8,0x0,NULL,0x0)           = 0 (0x0)
    recvfrom(13,0xbfbfebf8,8,0x0,NULL,0x0)           = 0 (0x0)
    recvfrom(15,0xbfbfebf8,8,0x0,NULL,0x0)           = 0 (0x0)
    recvfrom(17,0xbfbfebf8,8,0x0,NULL,0x0)           = 0 (0x0)
    recvfrom(7,0xbfbfebf8,8,0x0,NULL,0x0)            = 0 (0x0)
    recvfrom(16,0xbfbfebf8,8,0x0,NULL,0x0)           = 0 (0x0)
    recvfrom(19,0xbfbfebf8,8,0x0,NULL,0x0)           = 0 (0x0)
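
    One possible reading of the trace above, offered as an assumption and not as a diagnosis of check_reload_status itself: every event carries EV_EOF and every recvfrom returns 0, which on a stream socket means the peer has closed the connection, not that data is merely absent. A descriptor left registered in that state is reported as ready on every poll, so the loop never blocks. The sketch below reproduces that behaviour in Python with the standard selectors module (which uses kqueue on FreeBSD and epoll on Linux) instead of calling kevent directly; count_wakeups is a hypothetical helper written for this illustration.

```python
import selectors
import socket


def count_wakeups(close_on_eof, max_iterations=5):
    """Poll a socket whose peer has closed; count wakeups that deliver 0 bytes."""
    ours, peer = socket.socketpair()   # a connected pair of stream sockets
    peer.close()                       # the other side goes away
    sel = selectors.DefaultSelector()
    sel.register(ours, selectors.EVENT_READ)
    empty_wakeups = 0
    try:
        for _ in range(max_iterations):
            if not sel.get_map():      # nothing left to watch, stop polling
                break
            for key, _events in sel.select(timeout=0.1):
                data = key.fileobj.recv(8)
                if data == b"":        # 0 bytes read means end-of-file
                    empty_wakeups += 1
                    if close_on_eof:
                        # Stop watching the dead descriptor and release it.
                        sel.unregister(key.fileobj)
                        key.fileobj.close()
    finally:
        sel.close()
        ours.close()                   # harmless if it was already closed
    return empty_wakeups


if __name__ == "__main__":
    print("left registered:", count_wakeups(close_on_eof=False))
    print("closed on EOF:  ", count_wakeups(close_on_eof=True))
```

    If the descriptor stays registered, every poll returns immediately with a zero-byte read, which matches the pattern in the truss output. Unregistering and closing it on the first zero-byte read ends the spin. Whether this is what actually went wrong in check_reload_status cannot be confirmed from the trace alone.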
    

  • Good news and bad news….

    The good:
    I updated to the latest snapshot from the 30th and now ssh works and check_reload_status doesn't eat up the CPU anymore!

    The bad:
    Now the GUI doesn't work for me anymore.  I just get a timeout.  Doh.

    last pid:  5548;  load averages:  0.02,  0.40,  0.32                                                                                            up 0+00:07:57  01:39:37
    30 processes:  1 running, 29 sleeping
    CPU:  0.0% user,  0.0% nice,  0.8% system,  0.0% interrupt, 99.2% idle
    Mem: 38M Active, 67M Inact, 101M Wired, 84M Buf, 736M Free
    Swap: 8192M Total, 8192M Free
    
      PID USERNAME    THR PRI NICE   SIZE    RES STATE    TIME    WCPU COMMAND
     5548 root          1  20    0 11308K  2432K RUN      0:00   0.10% top
      270 root          1  40   20 10184K  1756K kqread   1:25   0.00% check_reload_status
      255 root          1  20    0 82148K 28260K sbwait   0:01   0.00% php-fpm
     6866 root          1  20    0 10256K  1968K select   0:01   0.00% syslogd
    17469 root          1  20    0 10316K  1760K bpf      0:00   0.00% filterlog
    91049 root          1  20    0 15828K  4928K select   0:00   0.00% sshd
    37354 nobody        1  20    0 11336K  3640K select   0:00   0.00% dnsmasq
    45161 root          1  52   20 10544K  1992K wait     0:00   0.00% sh
    92564 root          1  20    0 12976K 13004K select   0:00   0.00% ntpd
    30599 root          1  20    0 13028K  5100K kqread   0:00   0.00% lighttpd
    21526 root          1  20    0 10100K  1740K select   0:00   0.00% apinger
      256 root          1  24    0 82148K 25644K sbwait   0:00   0.00% php-fpm
      257 root          1  21    0 82148K 26440K sbwait   0:00   0.00% php-fpm
    19005 root          1  20    0 10252K  1916K select   0:00   0.00% inetd
       22 root          1  52    0 10548K  1556K pause    0:00   0.00% sh
    37426 dhcpd         1  20    0 16260K  9200K select   0:00   0.00% dhcpd
    26092 root          1  20    0 10900K  2896K pause    0:00   0.00% tcsh
      253 root          1  20    0 82148K 19636K kqread   0:00   0.00% php-fpm
    19438 root          1  44    0 10544K  2272K wait     0:00   0.00% sh
    21774 root          1  20    0 11460K  2476K piperd   0:00   0.00% rrdtool
    20221 root          1  52    0 10544K  2168K wait     0:00   0.00% sh
    18890 root          1  20    0 12648K  4432K select   0:00   0.00% openvpn
    25727 root          1  20    0 10200K  1948K nanslp   0:00   0.00% cron
      283 root          1  20    0  8964K  3232K select   0:00   0.00% devd
    25826 root          1  52    0 10056K  1684K sbwait   0:00   0.00% fcgicli
     8914 root          1  20    0 13076K  4220K select   0:00   0.00% sshd
    15272 _dhcp         1  20    0 10276K  1932K select   0:00   0.00% dhclient
     5481 root          1  52   20  5948K  1624K nanslp   0:00   0.00% sleep
    10258 root          1  52    0 10276K  1876K select   0:00   0.00% dhclient
      272 root          1  52   20 10184K  1676K kqread   0:00   0.00% check_reload_status
    
    

  • It does seem to be getting a connection.  If I go to the IP address in a browser instead of using the name, I get a certificate warning and then just a white screen, with the status bar showing that it is waiting for data from the IP.  Eventually Chrome gives a 'No data received' error.

  • Moderator

    @adam65535:

    Thanks!  That is what I was looking for in freebsd.

    NP, Glad it worked for you!


  • @Renato:

    Please try the new one

    Bringing back the GUI left sshd down again.

  • Developer Netgate Administrator

    I found the issue and pushed a fix a few minutes ago. Please wait for the new snapshot.