Netgate Discussion Forum

    Captive portal random deaths

    Captive Portal
    15 Posts 4 Posters 3.5k Views
    • C
      carzin

      OK, so we had another event yesterday. The captive portal was throwing Internal Server Error. Below are the last 50 entries from the system log. What do I need to look at next?

      Last 50 system log entries
      Oct 26 15:21:21 lighttpd[32515]: (mod_fastcgi.c.1754) connect failed: Connection refused on unix:/tmp/php-fastcgi-cpzone.socket-2
      Oct 26 15:21:21 lighttpd[32515]: (mod_fastcgi.c.3021) backend died; we'll disable it for 1 seconds and send the request to another backend instead: reconnects: 1 load: 1085
      Oct 26 15:21:21 lighttpd[32515]: (mod_fastcgi.c.1754) connect failed: Connection refused on unix:/tmp/php-fastcgi-cpzone.socket-0
      Oct 26 15:21:21 lighttpd[32515]: (mod_fastcgi.c.3021) backend died; we'll disable it for 1 seconds and send the request to another backend instead: reconnects: 2 load: 1085
      Oct 26 15:21:21 lighttpd[32515]: (mod_fastcgi.c.3587) all handlers for /index.php?zone=cpzone&redirurl=/sj/data.gif?intype=32&andver=5.0&rom=0&actionname=kbd_main&kong=0&imei=351776064868573&mcc=310&serial=b888b894&root=0&prodid=2&channel=10000014&kvercode=4171000&androidid=221da9d43d0fbaaa&pid=3517760648685737445ee16a99140d388c9ae9ca3046d34&did=ims8xwf5fexpfhgwtmlsw54jqwhl&mac=48:5A:3F:03:6A:3F&busi_type=2&intime=20140502&newer=0&osver=21&cl=en&click=charging_dialog_show&display=10801920&brand=samsung&mode=SM-N9005&kbdver=4.17.1&gaid=7e7ba8fd-f29b-4656-a0a1-9e864c89df3c on .php are down.
      Oct 26 15:21:23 php-fpm[93200]: /index.php: Successful login for user 'admin' from: REMOVED
      Oct 26 15:21:23 php-fpm[93200]: /index.php: Successful login for user 'admin' from: REMOVED
      Oct 26 15:21:24 lighttpd[32515]: (mod_fastcgi.c.2779) fcgi-server re-enabled: 0 /tmp/php-fastcgi-cpzone.socket
      Oct 26 15:21:24 lighttpd[32515]: (mod_fastcgi.c.2779) fcgi-server re-enabled: 0 /tmp/php-fastcgi-cpzone.socket
      Oct 26 15:21:24 lighttpd[32515]: (mod_fastcgi.c.2779) fcgi-server re-enabled: 0 /tmp/php-fastcgi-cpzone.socket
      Oct 26 15:21:24 lighttpd[32515]: (mod_fastcgi.c.1754) connect failed: Connection refused on unix:/tmp/php-fastcgi-cpzone.socket-4
      Oct 26 15:21:24 lighttpd[32515]: (mod_fastcgi.c.3021) backend died; we'll disable it for 1 seconds and send the request to another backend instead: reconnects: 0 load: 1085
      Oct 26 15:21:24 lighttpd[32515]: (mod_fastcgi.c.1754) connect failed: Connection refused on unix:/tmp/php-fastcgi-cpzone.socket-2
      Oct 26 15:21:24 lighttpd[32515]: (mod_fastcgi.c.3021) backend died; we'll disable it for 1 seconds and send the request to another backend instead: reconnects: 1 load: 1085
      Oct 26 15:21:24 lighttpd[32515]: (mod_fastcgi.c.1754) connect failed: Connection refused on unix:/tmp/php-fastcgi-cpzone.socket-0
      Oct 26 15:21:24 lighttpd[32515]: (mod_fastcgi.c.3021) backend died; we'll disable it for 1 seconds and send the request to another backend instead: reconnects: 2 load: 1085
      Oct 26 15:21:24 lighttpd[32515]: (mod_fastcgi.c.3587) all handlers for /index.php?zone=cpzone&redirurl=/sj/data.gif?intype=32&andver=5.0&rom=0&actionname=kbd_main&kong=0&imei=351776064868573&mcc=310&serial=b888b894&root=0&prodid=2&channel=10000014&kvercode=4171000&androidid=221da9d43d0fbaaa&pid=3517760648685737445ee16a99140d388c9ae9ca3046d34&did=ims8xwf5fexpfhgwtmlsw54jqwhl&mac=48:5A:3F:03:6A:3F&busi_type=2&intime=20140502&newer=0&osver=21&cl=en&click=REPORT_ACTIVE_UM_V5&display=1080*1920&brand=samsung&mode=SM-N9005&kbdver=4.17.1&gaid=7e7ba8fd-f29b-4656-a0a1-9e864c89df3c on .php are down.
      Oct 26 15:21:24 lighttpd[32515]: (request.c.1125) POST-request, but content-length missing -> 411
      Oct 26 15:21:27 lighttpd[32515]: (mod_fastcgi.c.2779) fcgi-server re-enabled: 0 /tmp/php-fastcgi-cpzone.socket
      Oct 26 15:21:27 lighttpd[32515]: (mod_fastcgi.c.2779) fcgi-server re-enabled: 0 /tmp/php-fastcgi-cpzone.socket
      Oct 26 15:21:27 lighttpd[32515]: (mod_fastcgi.c.2779) fcgi-server re-enabled: 0 /tmp/php-fastcgi-cpzone.socket
      Oct 26 15:21:27 lighttpd[32515]: (mod_fastcgi.c.1754) connect failed: Connection refused on unix:/tmp/php-fastcgi-cpzone.socket-4
      Oct 26 15:21:27 lighttpd[32515]: (mod_fastcgi.c.3021) backend died; we'll disable it for 1 seconds and send the request to another backend instead: reconnects: 0 load: 1085
      Oct 26 15:21:27 lighttpd[32515]: (mod_fastcgi.c.1754) connect failed: Connection refused on unix:/tmp/php-fastcgi-cpzone.socket-2
      Oct 26 15:21:27 lighttpd[32515]: (mod_fastcgi.c.3021) backend died; we'll disable it for 1 seconds and send the request to another backend instead: reconnects: 1 load: 1085
      Oct 26 15:21:27 lighttpd[32515]: (mod_fastcgi.c.1754) connect failed: Connection refused on unix:/tmp/php-fastcgi-cpzone.socket-0
      Oct 26 15:21:27 lighttpd[32515]: (mod_fastcgi.c.3021) backend died; we'll disable it for 1 seconds and send the request to another backend instead: reconnects: 2 load: 1085
      Oct 26 15:21:27 lighttpd[32515]: (mod_fastcgi.c.3587) all handlers for /index.php?zone=cpzone&redirurl=/sj/data.gif?intype=32&andver=5.0&rom=0&actionname=kbd_main&kong=0&imei=351776064868573&mcc=310&serial=b888b894&root=0&prodid=2&channel=10000014&kvercode=4171000&androidid=221da9d43d0fbaaa&pid=3517760648685737445ee16a99140d388c9ae9ca3046d34&did=ims8xwf5fexpfhgwtmlsw54jqwhl&mac=48:5A:3F:03:6A:3F&busi_type=2&intime=20140502&newer=0&osver=21&cl=en&display=10801920&brand=samsung&mode=SM-N9005&kbdver=4.17.1&gaid=7e7ba8fd-f29b-4656-a0a1-9e864c89df3c&REPORT_ACTIVE=SelfAlarm_1445872006093_1445872005790_rescd_500 on .php are down.
      Oct 26 15:21:30 lighttpd[32515]: (mod_fastcgi.c.2779) fcgi-server re-enabled: 0 /tmp/php-fastcgi-cpzone.socket
      Oct 26 15:21:30 lighttpd[32515]: (mod_fastcgi.c.2779) fcgi-server re-enabled: 0 /tmp/php-fastcgi-cpzone.socket
      Oct 26 15:21:30 lighttpd[32515]: (mod_fastcgi.c.2779) fcgi-server re-enabled: 0 /tmp/php-fastcgi-cpzone.socket
      Oct 26 15:21:30 lighttpd[32515]: (mod_fastcgi.c.1754) connect failed: Connection refused on unix:/tmp/php-fastcgi-cpzone.socket-4
      Oct 26 15:21:30 lighttpd[32515]: (mod_fastcgi.c.3021) backend died; we'll disable it for 1 seconds and send the request to another backend instead: reconnects: 0 load: 1085
      Oct 26 15:21:30 lighttpd[32515]: (mod_fastcgi.c.1754) connect failed: Connection refused on unix:/tmp/php-fastcgi-cpzone.socket-2
      Oct 26 15:21:30 lighttpd[32515]: (mod_fastcgi.c.3021) backend died; we'll disable it for 1 seconds and send the request to another backend instead: reconnects: 1 load: 1085
      Oct 26 15:21:30 lighttpd[32515]: (mod_fastcgi.c.1754) connect failed: Connection refused on unix:/tmp/php-fastcgi-cpzone.socket-0
      Oct 26 15:21:30 lighttpd[32515]: (mod_fastcgi.c.3021) backend died; we'll disable it for 1 seconds and send the request to another backend instead: reconnects: 2 load: 1085
      Oct 26 15:21:30 lighttpd[32515]: (mod_fastcgi.c.3587) all handlers for /index.php?zone=cpzone&redirurl=/sj/data.gif?intype=32&andver=5.0&rom=0&actionname=kbd_main&kong=0&imei=351776064868573&mcc=310&serial=b888b894&root=0&prodid=2&channel=10000014&kvercode=4171000&androidid=221da9d43d0fbaaa&pid=3517760648685737445ee16a99140d388c9ae9ca3046d34&did=ims8xwf5fexpfhgwtmlsw54jqwhl&mac=48:5A:3F:03:6A:3F&busi_type=2&intime=20140502&newer=0&osver=21&cl=en&display=1080*1920&brand=samsung&mode=SM-N9005&kbdver=4.17.1&gaid=7e7ba8fd-f29b-4656-a0a1-9e864c89df3c&REPORT_ACTIVE=SelfAlarm_1445872006093_1445872005790_rescd_500 on .php are down.
      Oct 26 15:21:30 lighttpd[32515]: (mod_evasive.c.183) 172.18.8.102 turned away. Too many connections.
      Oct 26 15:21:33 lighttpd[32515]: (mod_fastcgi.c.2779) fcgi-server re-enabled: 0 /tmp/php-fastcgi-cpzone.socket
      Oct 26 15:21:33 lighttpd[32515]: (mod_fastcgi.c.2779) fcgi-server re-enabled: 0 /tmp/php-fastcgi-cpzone.socket
      Oct 26 15:21:33 lighttpd[32515]: (mod_fastcgi.c.2779) fcgi-server re-enabled: 0 /tmp/php-fastcgi-cpzone.socket
      Oct 26 15:21:33 lighttpd[32515]: (mod_fastcgi.c.1754) connect failed: Connection refused on unix:/tmp/php-fastcgi-cpzone.socket-4
      Oct 26 15:21:33 kernel: sonewconn: pcb 0xfffff8002c506e10: Listen queue overflow: 193 already in queue awaiting acceptance (63 occurrences)
      Oct 26 15:21:33 lighttpd[32515]: (mod_fastcgi.c.3021) backend died; we'll disable it for 1 seconds and send the request to another backend instead: reconnects: 0 load: 1085
      Oct 26 15:21:33 lighttpd[32515]: (mod_fastcgi.c.1754) connect failed: Connection refused on unix:/tmp/php-fastcgi-cpzone.socket-2
      Oct 26 15:21:33 lighttpd[32515]: (mod_fastcgi.c.3021) backend died; we'll disable it for 1 seconds and send the request to another backend instead: reconnects: 1 load: 1085
      Oct 26 15:21:33 lighttpd[32515]: (mod_fastcgi.c.1754) connect failed: Connection refused on unix:/tmp/php-fastcgi-cpzone.socket-0
      Oct 26 15:21:33 lighttpd[32515]: (mod_fastcgi.c.3021) backend died; we'll disable it for 1 seconds and send the request to another backend instead: reconnects: 2 load: 1085
      Oct 26 15:21:33 lighttpd[32515]: (mod_fastcgi.c.3587) all handlers for /index.php?zone=cpzone&redirurl=/sj/data.gif?intype=32&andver=5.0&rom=0&actionname=kbd_main&kong=0&imei=351776064868573&mcc=310&serial=b888b894&root=0&prodid=2&channel=10000014&kvercode=4171000&androidid=221da9d43d0fbaaa&pid=3517760648685737445ee16a99140d388c9ae9ca3046d34&did=ims8xwf5fexpfhgwtmlsw54jqwhl&mac=48:5A:3F:03:6A:3F&busi_type=2&intime=20140502&newer=0&osver=21&cl=en&click=charging_dialog_show&display=1080*1920&brand=samsung&mode=SM-N9005&kbdver=4.17.1&gaid=7e7ba8fd-f29b-4656-a0a1-9e864c89df3c on .php are down.

      • C
        cmb

        The root cause there is PHP dying. Since it's using FastCGI, I guess that's a 2.1.x or older version on there. Upgrade to 2.2.4 first: php-fpm is better in that regard if it's a scalability issue, and you could be triggering some problem in the old PHP version.

        • GertjanG
          Gertjan

          This guy:
          @carzin:

          Oct 26 15:21:30 lighttpd[32515]: (mod_evasive.c.183) 172.18.8.102 turned away. Too many connections.

          Is it a client on the captive portal?

          If so, it's probably a case of a badly written 'app' that doesn't understand what a 'portal' is and keeps hammering your portal. The portal sends over a 'login page', the client (172.18.8.102) doesn't want that page, and keeps asking again and again … until there are 'no more resources' and PHP breaks.

          But, hey, that's just a thought. I can't remember these issues with ancient versions very well ;)
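
The 'bad client' idea can be checked against the log itself. Here is a minimal Python sketch, assuming the log lines look like the ones posted in this thread: it counts which addresses lighttpd's mod_evasive turned away, and which `mac=` values show up in the "all handlers ... are down" requests. The function names are made up for illustration and are not part of pfSense or lighttpd.

```python
import re
from collections import Counter
from urllib.parse import parse_qs


def turned_away_clients(log_lines):
    """Count mod_evasive 'turned away' events per client IP address."""
    counts = Counter()
    for line in log_lines:
        match = re.search(r"\(mod_evasive\.c\.\d+\) (\S+) turned away", line)
        if match:
            counts[match.group(1)] += 1
    return counts


def down_events_by_mac(log_lines):
    """Count 'all handlers ... are down' events per mac= value in the URL."""
    counts = Counter()
    for line in log_lines:
        match = re.search(r"all handlers for (\S+)", line)
        if not match:
            continue
        # Everything after the first '?' is treated as the query string.
        query = match.group(1).split("?", 1)[-1]
        mac = parse_qs(query).get("mac", ["unknown"])[0]
        counts[mac] += 1
    return counts
```

Feeding it the system log (for example every line of an exported log file) and printing `most_common(5)` from each counter should show quickly whether one device accounts for most of the failures.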

          No "help me" PM's please. Use the forum, the community will thank you.
          Edit : and where are the logs ??

          • C
            cmb

            @Gertjan:

            This guy:
            @carzin:

            Oct 26 15:21:30 lighttpd[32515]: (mod_evasive.c.183) 172.18.8.102 turned away. Too many connections.

            Is it a client on the captive portal?

            If so, it's probably a case of a badly written 'app' that doesn't understand what a 'portal' is and keeps hammering your portal. The portal sends over a 'login page', the client (172.18.8.102) doesn't want that page, and keeps asking again and again … until there are 'no more resources' and PHP breaks.

            Yes, that would be a client. The fact that the per-client connection limit is being hit should prevent it from exhausting PHP's resources. But that is along the lines of what I was thinking, except that something it was doing repeatedly caused PHP to crash rather than just run out of resources.

            • C
              carzin

              All: this box was running 2.2.4, so I'm on the latest and greatest. I've had this problem since we started using pfSense years ago, across multiple builds.

              • GertjanG
                Gertjan

                It's probably not PHP. On a lower level you have this:

                Oct 26 15:21:33  kernel: sonewconn: pcb 0xfffff8002c506e10: Listen queue overflow: 193 already in queue awaiting acceptance (63 occurrences)

                Google "FreeBSD sonewconn" (so you know that you are not the only one) and try what the first result proposes.

                The other results will help you nail down the process, the port, etc.

                • C
                  carzin

                  I need some spoon feeding.  I am not a Linux guru.  From the searches, I ran the following command (netstat -Lan) and saw a bunch of:

                  tcpX 0/0/128  which should tell me the queue size is 128.

                  The instructions tell you to issue the command:
                  sysctl kern.ipc.somaxconn=2048 and I get a readout of:

                  kern.ipc.somaxconn:128 -> 2048

                  However, when I run the netstat -Lan command again, it still shows a queue value of 128.  What else do I need to do?
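
A possible reason the queue still reads 128, offered as an assumption and not confirmed anywhere in this thread: kern.ipc.somaxconn only caps the backlog at the moment a program calls listen(), so sockets that were already listening keep their old limit until the service is restarted (and a program can also ask for a smaller backlog than the cap). The Python sketch below parses text in the qlen/incqlen/maxqlen layout that netstat -Lan prints and lists the sockets still below the wanted limit. The helper names are invented for illustration.

```python
def parse_listen_queues(netstat_output):
    """Parse 'netstat -Lan' style text into one dict per listening socket."""
    rows = []
    for line in netstat_output.splitlines():
        parts = line.split()
        # Data rows look like: proto, qlen/incqlen/maxqlen, local address.
        if len(parts) < 3 or parts[1].count("/") != 2:
            continue
        try:
            qlen, incqlen, maxqlen = (int(n) for n in parts[1].split("/"))
        except ValueError:
            continue  # header or otherwise non-numeric line
        rows.append({
            "proto": parts[0],
            "qlen": qlen,
            "incqlen": incqlen,
            "maxqlen": maxqlen,
            "local": parts[2],
        })
    return rows


def sockets_below_limit(rows, wanted):
    """Return the sockets whose maximum queue length is below 'wanted'."""
    return [row for row in rows if row["maxqlen"] < wanted]
```

Running `sockets_below_limit(parse_listen_queues(text), 2048)` on saved netstat output, once before and once after restarting the service in question, would show whether the new limit was picked up.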

                  • GertjanG
                    Gertjan

                    @carzin:

                    I need some spoon feeding.  I am not a Linux guru.  …..

                    It's even worse: pfSense runs on FreeBSD, which is not Linux (at all).

                    Anyway, without putting my hands on your system, I cannot explain why your pfSense, identical to mine, is behaving differently from mine.
                    Enlarging the queues is just a countermeasure, because either
                    -> your system can't handle the load (the queues are filling up faster than pfSense can drain them)
                    or
                    -> something is flooding it, so analyze this 'load': what's coming into your pfSense? Is it the WAN, the LAN, or some other interface that is being flooded?

                    Can you limit the number of users?

                    Can tcpdump tell you something?

                    What did you change from the default setup?

                    Note that I'm not a network expert either, but these are the steps that I would take to dig up the problem.

                    • C
                      carzin

                      Well, there isn't much I can do to limit the users.  The pfSense virtual machines (4 of them) are what I use to authenticate users when they connect to a setup SSID and funnel them to the appropriate configuration website.  I use the DNS forwarding functionality to limit what they have access to after they connect.  So, I have no control over how the users connect, or really, how many connect.

                      I suspect I see a lot more load on my boxes than most of you.  At peak, I can have 100s of users connecting through at a single instance.  And the box works just fine with that load.  The pfSense death happens for apparently no reason, and is not generally associated with load.  Which is why I liked the idea of a 'bad client' basically beating the hell outta the server until it dies.

                      • GertjanG
                        Gertjan

                        Just a thought.

                        You said:

                        Well, there isn't much I can do to limit the users

                        but you really 'nag' them with this:

                        I use the DNS forwarding functionality to limit what they have access to after they connect.

                        What I make of it:
                        The user's device knows it is connected (there is a DNS server, a gateway): the link seems up.
                        But many DNS requests will not receive a reply, or will receive a wrong reply.
                        What does the 'app' do in this situation? A request to resolve e.g. facebook.com will yield many retries because it 'won't work'.

                        So: use tcpdump on incoming port 53, both UDP and TCP, to see whether your DNS resolver is getting swamped …

                        => This is just an idea ....
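
To turn such a capture into numbers, here is a minimal Python sketch. It assumes the capture was saved as plain text from tcpdump run with -n on port 53 (the interface name is left to you), with query lines of the usual form `IP <client>.<port> > <server>.53: <id>+ A? <name>`. It handles IPv4 lines only, and the function name is invented for illustration.

```python
import re
from collections import Counter

# Matches a query sent to port 53; replies come *from* port 53 and are skipped.
QUERY_RE = re.compile(
    r"IP (\d+(?:\.\d+){3})\.\d+ > \S+\.53: .*?([A-Z]+)\? (\S+)"
)


def dns_queries_per_client(tcpdump_lines):
    """Count DNS queries per (client IP, queried name) pair."""
    counts = Counter()
    for line in tcpdump_lines:
        match = QUERY_RE.search(line)
        if match:
            client, _record_type, name = match.groups()
            counts[(client, name)] += 1
    return counts
```

A client that repeats the same name hundreds of times within a few seconds would stand out at the top of `dns_queries_per_client(lines).most_common(10)`.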

                        • C
                          carzin

                          This is fun.  Another zone, different from the last time, died.  And this is in the syslog:

                          Nov 1 10:58:17 lighttpd[34493]: (connections.c.305) SSL: 1 error:1408A0C1:SSL routines:SSL3_GET_CLIENT_HELLO:no shared cipher
                          Nov 1 10:58:17 lighttpd[34493]: (connections.c.305) SSL: 1 error:1408A10B:SSL routines:SSL3_GET_CLIENT_HELLO:wrong version number
                          Nov 1 11:08:19 lighttpd[34493]: (connections.c.305) SSL: 1 error:1408A0C1:SSL routines:SSL3_GET_CLIENT_HELLO:no shared cipher
                          Nov 1 11:08:19 lighttpd[34493]: (connections.c.305) SSL: 1 error:1408A10B:SSL routines:SSL3_GET_CLIENT_HELLO:wrong version number
                          Nov 1 11:18:21 lighttpd[34493]: (connections.c.305) SSL: 1 error:1408A0C1:SSL routines:SSL3_GET_CLIENT_HELLO:no shared cipher
                          Nov 1 11:18:21 lighttpd[34493]: (connections.c.305) SSL: 1 error:1408A10B:SSL routines:SSL3_GET_CLIENT_HELLO:wrong version number
                          Nov 1 11:28:23 lighttpd[34493]: (connections.c.305) SSL: 1 error:1408A0C1:SSL routines:SSL3_GET_CLIENT_HELLO:no shared cipher
                          Nov 1 11:28:23 lighttpd[34493]: (connections.c.305) SSL: 1 error:1408A10B:SSL routines:SSL3_GET_CLIENT_HELLO:wrong version number
                          Nov 1 11:38:25 lighttpd[34493]: (connections.c.305) SSL: 1 error:1408A0C1:SSL routines:SSL3_GET_CLIENT_HELLO:no shared cipher
                          Nov 1 11:38:25 lighttpd[34493]: (connections.c.305) SSL: 1 error:1408A10B:SSL routines:SSL3_GET_CLIENT_HELLO:wrong version number
                          Nov 1 11:48:27 lighttpd[34493]: (connections.c.305) SSL: 1 error:1408A0C1:SSL routines:SSL3_GET_CLIENT_HELLO:no shared cipher
                          Nov 1 11:48:27 lighttpd[34493]: (connections.c.305) SSL: 1 error:1408A10B:SSL routines:SSL3_GET_CLIENT_HELLO:wrong version number
                          Nov 1 11:51:04 lighttpd[34493]: (connections.c.305) SSL: 1 error:1407609C:SSL routines:SSL23_GET_CLIENT_HELLO:http request
                          Nov 1 11:54:23 lighttpd[34493]: (connections.c.305) SSL: 1 error:1407609C:SSL routines:SSL23_GET_CLIENT_HELLO:http request
                          Nov 1 11:58:29 lighttpd[34493]: (connections.c.305) SSL: 1 error:1408A0C1:SSL routines:SSL3_GET_CLIENT_HELLO:no shared cipher
                          Nov 1 11:58:29 lighttpd[34493]: (connections.c.305) SSL: 1 error:1408A10B:SSL routines:SSL3_GET_CLIENT_HELLO:wrong version number
                          Nov 1 12:02:27 lighttpd[34493]: (connections.c.305) SSL: 1 error:1407609C:SSL routines:SSL23_GET_CLIENT_HELLO:http request
                          Nov 1 12:05:33 lighttpd[34493]: (connections.c.305) SSL: 1 error:1407609C:SSL routines:SSL23_GET_CLIENT_HELLO:http request
                          Nov 1 12:08:31 lighttpd[34493]: (connections.c.305) SSL: 1 error:1408A0C1:SSL routines:SSL3_GET_CLIENT_HELLO:no shared cipher
                          Nov 1 12:08:31 lighttpd[34493]: (connections.c.305) SSL: 1 error:1408A10B:SSL routines:SSL3_GET_CLIENT_HELLO:wrong version number

                          • GertjanG
                            Gertjan

                            Probably a client connecting to port 443 (HTTPS) without actually speaking HTTPS.
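
The timestamps in that log are worth a look too. Here is a minimal Python sketch that groups the lighttpd SSL errors by their reason text and measures the gap between consecutive occurrences. The syslog lines carry no year, so 2015 is an assumption, and the function names are invented for illustration.

```python
import re
from collections import defaultdict
from datetime import datetime

LINE_RE = re.compile(
    r"^(\w{3} +\d+ \d\d:\d\d:\d\d) lighttpd\[\d+\]: "
    r".* SSL: \d+ error:[0-9A-F]+:SSL routines:\w+:(.+)$"
)


def ssl_error_times(log_lines, year=2015):
    """Map each SSL error reason to the list of times it was logged."""
    times = defaultdict(list)
    for line in log_lines:
        match = LINE_RE.match(line.strip())
        if match:
            stamp = datetime.strptime(
                f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S"
            )
            times[match.group(2)].append(stamp)
    return times


def gaps_in_seconds(stamps):
    """Seconds between consecutive timestamps, in logged order."""
    return [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]
```

On the lines posted above, the "no shared cipher" and "wrong version number" pairs recur roughly every ten minutes, which might point to an automated check or probe and not a person. The "http request" errors are irregular, which would fit a client sending plain HTTP to the HTTPS port.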

                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.