Netgate Discussion Forum

    Intermittently Unresponsive in Hyper-V

    General pfSense Questions
    24 Posts 3 Posters 2.4k Views
      • S
        spittlbm @stephenw10
        last edited by

        @stephenw10 errant GW disabled. The system processed traffic overnight, except that nginx went down with a 502 Bad Gateway. Restarting PHP-FPM got it back online.

        System.log was full of this, to the point that it rolled over:

        Aug 18 09:45:05 firewall check_reload_status[66343]: Could not connect to /var/run/php-fpm.socket
        

        nginx had a few:

        Aug 18 09:47:10 firewall nginx: 2023/08/18 09:47:10 [error] 12916#100237: *2631 connect() to unix:/var/run/php-fpm.socket failed (61: Connection refused) while connecting to upstream, client: 192.168.1.195, server: , request: "GET / HTTP/2.0", upstream: "fastcgi://unix:/var/run/php-fpm.socket:", host: "192.168.1.1"
        

        kern.ipc.soacceptqueue = 128, so I bumped that up to 1024 via a system tunable and verified it applied.
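When the log is flooded like this, it can help to see roughly when php-fpm stopped answering before the file rolls over. The following is only a minimal sketch: the function name socket_errors_per_minute is hypothetical, and the log path in the usage comment (/var/log/system.log) is an assumption that is not stated in the thread. It counts the check_reload_status socket errors per minute from syslog-style lines on standard input.

```sh
#!/bin/sh
# Hypothetical helper: count php-fpm socket connection errors per minute.
# Reads syslog-style lines ("Mon DD HH:MM:SS host ...") on stdin and prints
# "<count> <Mon> <DD> <HH:MM>" for every minute that had at least one error.
socket_errors_per_minute() {
    awk '
        /Could not connect to \/var\/run\/php-fpm\.socket/ {
            # Key on month, day, and the HH:MM part of the timestamp.
            minute = $1 " " $2 " " substr($3, 1, 5)
            count[minute]++
        }
        END {
            for (minute in count) print count[minute], minute
        }
    ' | sort -k2
}

# Example usage (assumed log path, adjust to the actual system):
#   socket_errors_per_minute < /var/log/system.log
```

A sudden jump from zero to dozens of errors per minute would mark the moment the socket went away, which is the time window worth checking in the other logs.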

        • stephenw10S
          stephenw10 Netgate Administrator
          last edited by

          Hmm, nothing logged to show why php stopped responding?

          I suspect bumping that sysctl won't help if php never responds.

          • S
            spittlbm @stephenw10
            last edited by

            @stephenw10 what's the best way to increase logging for PHP?

            • stephenw10S
              stephenw10 Netgate Administrator
              last edited by

              There's no easy way to do that. You can check /tmp/php_errors.txt but I'd expect any php errors to be shown as an alert.

              • S
                spittlbm @stephenw10
                last edited by spittlbm

                Logged in tonight to see SWAP was at 52%.

                e5d15bf3-35fd-4ed7-9b65-cdc315471e78-image.png

                Ran top and see this:

                65112b36-e2fe-4833-873c-5e1d081fa2bb-image.png

                • stephenw10S
                  stephenw10 Netgate Administrator
                  last edited by

                  Ah, so some stuck script using all the RAM. Interesting.

                  Try running top as top -HaSP; that may show more.

                  You might also try running ps -auxwwd; that may show you what script is running.

                  • S
                    spittlbm @stephenw10
                    last edited by

                    SWAP use is down to 5% a few (15) hours later without my intervention:

                    a6279156-e65f-4a41-a273-e3e4804cd3c9-image.png

                    Sorted and filtered for Swap:
                    2ce2fc31-8ba2-4f24-b11d-ec85cf798a3f-image.png

                    ps -auxwwd results:
                    094ef2e0-38ae-42bb-bee6-c71b58501e54-image.png

                    Stopped ntopng service and the stats seem similar:
                    414c786f-b035-49a2-9efd-20387afaf69a-image.png

                    top sorted by size:
                    e4942dd0-88dc-47ce-b796-465846c0ba0c-image.png

                    top sorted by Swap:
                    bec3393c-0306-43c8-8e0b-bb229280fe61-image.png

                    and lastly:
                    06779aa0-c38b-41ad-811c-a1e9300751a5-image.png

                    • stephenw10S
                      stephenw10 Netgate Administrator
                      last edited by

                      Nothing shown in the process tree under php-fpm in the ps output?

                      • S
                        spittlbm @stephenw10
                        last edited by spittlbm

                        Not nothing...

                        790bc722-7a59-4444-bea2-35093c3e354a-image.png

                        Sorry for truncating that output in the previous message.

                        • stephenw10S
                          stephenw10 Netgate Administrator
                          last edited by

                          Hmm, disappointing, I was hoping it might show there.

                          I'm not really aware of anything that behaves like that, but I don't run Hyper-V. You might ask in the virtualisation sub forum.

                          • X
                            xpxp2002
                            last edited by xpxp2002

                            Just posting here to let you know that I've also had similar issues with a system that has been running stable for 5+ years on Hyper-V. Since upgrading to 23.05 (and now 23.05.1), this has begun happening to me. The only difference is that it occurs after about 14 days of uptime, so it's not easy to reproduce or troubleshoot.

                            I just had it happen about 45 minutes ago, but log rotation has already purged the relevant system logs. I also send syslog to another server, but of course it didn't receive anything during the outage. I doubled the log size for rotation and extended the retained logs to 21. I'll post another update to this thread next time it happens, and see if I can capture the relevant logs before they get purged.
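Since the evidence tends to be gone by the time anyone logs in, one option is to save the output of the commands suggested earlier in the thread (top -HaSP and ps -auxwwd) to timestamped files at regular intervals. This is a rough sketch only: the function name capture_snapshot and the default directory /root/snapshots are hypothetical, and the extra top flags (-b for batch mode, -d 1 for a single display) are assumptions that should be checked against top(1) on the installed version.

```sh
#!/bin/sh
# Hypothetical helper: write one diagnostic snapshot to a timestamped file
# and print the file name. The default directory is an assumption.
capture_snapshot() {
    out_dir=${1:-/root/snapshots}
    mkdir -p "$out_dir" || return 1
    out_file="$out_dir/snap-$(date +%Y%m%d-%H%M%S).txt"
    {
        echo "=== $(date) ==="
        echo "--- top -HaSP ---"
        # -b and -d 1 are assumed to give one non-interactive display.
        top -b -d 1 -HaSP 2>&1 || echo "top failed"
        echo "--- ps -auxwwd ---"
        ps -auxwwd 2>&1 || echo "ps failed"
    } > "$out_file"
    echo "$out_file"
}

# Example usage, e.g. from a periodic job (the schedule is an assumption):
#   capture_snapshot /root/snapshots
```

Run every few minutes, this leaves a trail on disk that survives log rotation, so the last snapshot before an outage can be compared with earlier ones. Old snapshot files are not pruned by this sketch and would need to be cleaned up separately.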

                            • stephenw10S
                              stephenw10 Netgate Administrator
                              last edited by

                              Unless anyone is seeing this on another platform this thread would probably be better in the virtualization sub.

                              • S
                                spittlbm @xpxp2002
                                last edited by

                                  We've been stable for 4-5 days without touching anything. If it happens again, I'll hop to the other sub. I don't love that something is using a bit of swap and RAM usage is always at 2 GB+, but it's working well.

                                • S
                                  spittlbm @spittlbm
                                  last edited by

                                  So it's been about 6 weeks since anything happened and we had a crash today. Just replying to add 2 pieces of information in case anyone comes across this post in the future. Today we rebooted the VM host for the first time in 8 weeks after pushing some Microsoft updates. Second, I tried to reboot from shell and it hung on stopping ntopng. I think the reboot is anecdotal/coincidental.

                                  Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.