Netgate Discussion Forum

    Two nodes with v2.3.2 - ssh faulty on one?

    General pfSense Questions
    • JeGr, LAYER 8 Moderator

      Hi all,

      As per the topic title, we installed two new nodes with pfSense 2.3.2 (latest stable) and imported our configuration. Besides a few minor problems with VIPs or packages from the old version, this went fine.
      We have a third machine running Ubuntu 14.04 LTS which is configured to SSH into both nodes with a specific user and an SSH key. That ran smoothly for the last 3 years. After the two new nodes were brought in and replaced the old machines, I now see a strange effect:

      The new node fwl01 is working fine. The server (crimson) is ssh'ing into it and scp'ing some files. Great!
      The other node, fwl02, is working about 1 out of 10 times. :O If I manually hop onto crimson and just call a simple "ssh fwl02", I get a connection about 1 out of 10 times. SSH debugging mode didn't help either; crimson simply doesn't get a response from pfSense on the other end. pfSense itself logs that access with a strange SSH failure:

      Sep 23 11:40:11 fwl02 sshd[94279]: fatal: Fssh_ssh_dispatch_run_fatal: Connection from <ip> port 38615: Operation not permitted [preauth]
      Sep 23 11:43:29 fwl02 sshd[28060]: fatal: Fssh_ssh_dispatch_run_fatal: Connection from <ip> port 58669: Operation not permitted [preauth]
      Sep 23 11:59:00 fwl02 sshd[66932]: Accepted publickey for nbackup from <ip> port 58670 ssh2: RSA SHA256:9OtuB6wUUNpHyZSNB/B+BG2nlFUr9WvGDoGw9caQY7Y
      Sep 23 11:59:00 fwl02 sshd[67355]: Received disconnect from <ip> port 58670:11: disconnected by user

      As can be seen, the first two connection attempts failed, while the one a few minutes later got through and actually worked. I can reproduce that at will on that server with normal SSH as well as SCP. As Ubuntu 14.04 LTS already ships with Ed25519 support, new(er) SSH KEX, MACs or ciphers shouldn't be a problem here. Additionally, fwl01 seems to have no trouble at all with connections from crimson. Even if you fire a dozen SSH requests at fwl01, it answers every one of them. fwl02, not so much. As they were both installed from the exact same USB image and got literally the same configuration (they are running as a CARP cluster), I'm out of clues as to what triggers such a phenomenon. I've never had problems with pfSense's SSH implementation at all.

      Any clues?

      Greets
      Jens
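A minimal sketch for putting a number on the "1 out of 10" observation, assuming Python 3 is available on crimson and that `ssh fwl02` already logs in non-interactively with the existing key. The helper names `try_ssh` and `tally` are made up for illustration. The subprocess timeout bounds each attempt in case the client never returns; `ConnectTimeout` alone only covers the TCP connect.

```python
import subprocess


def try_ssh(host, timeout=15):
    """Return True if a non-interactive `ssh host true` exits with status 0."""
    cmd = ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5", host, "true"]
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,  # kills a client that hangs after connecting
        )
    except (subprocess.TimeoutExpired, OSError):
        # Hung client, or ssh binary not found: count as a failed attempt.
        return False
    return result.returncode == 0


def tally(host, attempts=20, probe=try_ssh):
    """Run `probe` against host `attempts` times; return (successes, attempts)."""
    successes = sum(1 for _ in range(attempts) if probe(host))
    return successes, attempts
```

Calling `tally("fwl01")` and `tally("fwl02")` back to back from crimson would give two directly comparable counts. The `probe` parameter exists only so the counting logic can be exercised without a real SSH server.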

      Don't forget to upvote 👍 those who kindly offered their time and brainpower to help you!

      If you're interested, I'm available to discuss details of German-speaking paid support (for companies) if needed.

      • jimp, Rebel Alliance Developer Netgate

        If it works inconsistently, then it's either something with the server itself (hardware, perhaps?) or something on the network between them. Perhaps the network layer is being interrupted or there is some other inconsistency there.

        Googling that error turns up nothing, which is even more perplexing.

        Are there any other errors in your system log around the ssh errors?

        Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

        Need help fast? Netgate Global Support!

        Do not Chat/PM for help!

        • JeGr, LAYER 8 Moderator

          "with the server itself (hardware, perhaps?)"

          Checked that multiple times already. No faulty hardware found up to this point. Even the ECC RAM checked out without errors.

          "Googling that error turns up nothing, which is even more perplexing."

          I hear you. You should have seen my expression at the Google results…

          "Are there any other errors in your system log around the ssh errors?"

          Checked that. No, the fatal error message is the only one triggered by an SSH connection request. No others are logged (checked by running clog -f on system.log at the same time).
          Strange thing: from a Windows machine running MobaXterm with a shell-like environment, I can connect to SSH on fwl02 multiple times without running into the same error. I'm at a loss on this one. I never had any trouble with the backup VM, though, as it is an almost naked minimal install of Ubuntu Trusty that otherwise has no problems connecting anywhere. And as fwl01 isn't showing the same errors (nor are our two office pfSense installations), I'm going bonkers about what the heck this may be.
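To compare nodes or clients over a longer period, the sshd entries can be counted from a saved copy of the system log. This is only a sketch: the two patterns are modelled on the log lines quoted in the first post, the function name `summarize` is made up, and other sshd message formats are simply ignored.

```python
import re
from collections import Counter

# Patterns modelled on the sshd lines quoted earlier in this thread (assumption:
# other releases or configurations may word these messages differently).
FATAL = re.compile(r"sshd\[\d+\]: fatal: .*\[preauth\]")
ACCEPTED = re.compile(r"sshd\[\d+\]: Accepted \w+ for \S+")


def summarize(lines):
    """Count pre-auth fatal errors and accepted logins in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        if FATAL.search(line):
            counts["preauth_fatal"] += 1
        elif ACCEPTED.search(line):
            counts["accepted"] += 1
    return counts
```

The input could be the text output of clog on system.log saved to a file and passed in as an open file object. Running it once per node would show whether fwl01 ever logs the same fatal line.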

          • johnpoz, LAYER 8 Global Moderator

            Well, did you turn on debug in your SSH client so you might glean some actual information to work with? Most likely some sort of issue with which cipher/algorithm to use, etc.

            An intelligent man is sometimes forced to be drunk to spend time with his fools
            If you get confused: Listen to the Music Play
            Please don't Chat/PM me for help, unless mod related
            SG-4860 25.07.1 | Lab VMs 2.8, 25.07.1

            • JeGr, LAYER 8 Moderator

              @johnpoz: Yeah, already done, and no: as soon as I see the SSH error message on fwl02, there's no more feedback from sshd to the client. The communication simply stops (for the client) with

              debug1: SSH2_MSG_KEXINIT sent
              

              Every other debug output is exactly the same. If the connection works, it gets a response with

              
              debug1: SSH2_MSG_KEXINIT received
              debug2: kex_parse_kexinit: curve25519-sha256@libssh.org,ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,diffie-hellman-group-exchange-sha256,diffie-hellman-group-exchange-sha1,diffie-hellman-group14-sha1,diffie-hellman-group1-sha1
              debug2: kex_parse_kexinit: ssh-ed25519-cert-v01@openssh.com,ssh-ed25519,ecdsa-sha2-nistp256-cert-v01@openssh.com,ecdsa-sha2-nistp384-cert-v01@openssh.com,ecdsa-sha2-nistp521-cert-v01@openssh.com,ssh-rsa-cert-v01@openssh.com,ssh-dss-cert-v01@openssh.com,ssh-rsa-cert-v00@openssh.com,ssh-dss-cert-v00@openssh.com,ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521,ssh-rsa,ssh-dss
              debug2: kex_parse_kexinit: aes128-ctr,aes192-ctr,aes256-ctr,arcfour256,arcfour128,aes128-gcm@openssh.com,aes256-gcm@openssh.com,chacha20-poly1305@openssh.com,aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,aes192-cbc,aes256-cbc,arcfour,rijndael-cbc@lysator.liu.se
              debug2: kex_parse_kexinit: aes128-ctr,aes192-ctr,aes256-ctr,arcfour256,arcfour128,aes128-gcm@openssh.com,aes256-gcm@openssh.com,chacha20-poly1305@openssh.com,aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,aes192-cbc,aes256-cbc,arcfour,rijndael-cbc@lysator.liu.se
              debug2: kex_parse_kexinit: hmac-md5-etm@openssh.com,hmac-sha1-etm@openssh.com,umac-64-etm@openssh.com,umac-128-etm@openssh.com,hmac-sha2-256-etm@openssh.com,hmac-sha2-512-etm@openssh.com,hmac-ripemd160-etm@openssh.com,hmac-sha1-96-etm@openssh.com,hmac-md5-96-etm@openssh.com,hmac-md5,hmac-sha1,umac-64@openssh.com,umac-128@openssh.com,hmac-sha2-256,hmac-sha2-512,hmac-ripemd160,hmac-ripemd160@openssh.com,hmac-sha1-96,hmac-md5-96
              debug2: kex_parse_kexinit: hmac-md5-etm@openssh.com,hmac-sha1-etm@openssh.com,umac-64-etm@openssh.com,umac-128-etm@openssh.com,hmac-sha2-256-etm@openssh.com,hmac-sha2-512-etm@openssh.com,hmac-ripemd160-etm@openssh.com,hmac-sha1-96-etm@openssh.com,hmac-md5-96-etm@openssh.com,hmac-md5,hmac-sha1,umac-64@openssh.com,umac-128@openssh.com,hmac-sha2-256,hmac-sha2-512,hmac-ripemd160,hmac-ripemd160@openssh.com,hmac-sha1-96,hmac-md5-96
              debug2: kex_parse_kexinit: none,zlib@openssh.com,zlib
              debug2: kex_parse_kexinit: none,zlib@openssh.com,zlib
              ...
              ...etc etc...
              
              

              But other than that there is no visible "error". The client just never gets a response back from pfSense after that. Even worse, the ssh client binary never times out and just sits there waiting forever :/

              fatal: Fssh_ssh_dispatch_run_fatal: Connection from <ip> port 38615: Operation not permitted [preauth]
              

              Old keys/MACs/ciphers were my first guess, as we had quite a few of those problems with hosting customers after Trusty and Xenial upped their game with tighter OpenSSH server settings and quite a few customer tools with bad, old SSH settings no longer worked. But that error is a first for us.
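One way to narrow this down without the full ssh client is a raw probe that only checks whether the server's KEXINIT ever arrives. The sketch below reads the server identification line, sends a made-up client identification string (`SSH-2.0-kexprobe_0.1` is an assumption, not a real client), and then waits for the first binary packet. It relies on the unencrypted packet layout from RFC 4253 (4-byte length, 1-byte padding length, then the payload, whose first byte is the message number) and on `SSH_MSG_KEXINIT` being message 20. It assumes the server sends its KEXINIT once it has seen the client's identification line, as OpenSSH does.

```python
import socket
import struct

SSH_MSG_KEXINIT = 20  # message number assigned in RFC 4253


def first_message_number(packet):
    """Message number of an unencrypted SSH binary packet (needs >= 6 bytes)."""
    if len(packet) < 6:
        raise ValueError("need at least 6 bytes of the packet")
    (packet_length,) = struct.unpack(">I", packet[:4])
    if packet_length < 2:
        raise ValueError("packet too short to carry a payload")
    return packet[5]  # byte 4 is padding_length, byte 5 starts the payload


def _read_exact(sock, n):
    """Read exactly n bytes or raise ConnectionError if the peer closes."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return buf


def _read_line(sock):
    """Read one newline-terminated line, byte by byte."""
    line = b""
    while not line.endswith(b"\n"):
        line += _read_exact(sock, 1)
    return line


def probe_kexinit(host, port=22, timeout=10.0):
    """Return (server_banner, got_kexinit) for one connection attempt."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        banner = _read_line(sock)
        while not banner.startswith(b"SSH-"):  # skip any pre-banner lines
            banner = _read_line(sock)
        sock.sendall(b"SSH-2.0-kexprobe_0.1\r\n")  # hypothetical client id
        text = banner.decode("ascii", "replace").strip()
        try:
            header = _read_exact(sock, 6)
        except OSError:  # timeout, reset, or close before KEXINIT arrived
            return text, False
        return text, first_message_number(header) == SSH_MSG_KEXINIT
```

If `probe_kexinit("fwl02")` returns a banner but `False` on the attempts that coincide with the fatal log line, that would match the "SSH2_MSG_KEXINIT sent" dead end above and point at the server side after the identification exchange. It would not explain the cause by itself.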

              • johnpoz, LAYER 8 Global Moderator

                So it seems that when your client asks for KEX, the sshd is crashing. That's what I take from you saying it's sent but nothing comes back.

                • JeGr, LAYER 8 Moderator

                  Aye, that's what I assume. As to why, I'm stuck, as it doesn't "crash" every time (as if that whole thing wasn't crazy enough already).

                  Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.