RADIUS authentication failing (timed out) and dumping core
-
We have three offices with nearly identical pfSense configurations. We have
VPN servers for clients to connect to and a local RADIUS server to provide
authentication for them. On one of our pfSense boxes, it fails to
authenticate using the local RADIUS server. It sends requests and
access-accept messages are returned. However, the php-fpm process appears to
ignore the responses and then crashes.
I'm getting blocked, so the rest is here.
-
Program received signal SIGSEGV, Segmentation fault.
Address not mapped to object.
0x000000084f71083f in rad_get_attr () from /usr/local/lib/php/20220829/radius.so
(gdb) bt
#0 0x000000084f71083f in rad_get_attr () from /usr/local/lib/php/20220829/radius.so
#1 0x000000084f70eced in zif_radius_get_attr () from /usr/local/lib/php/20220829/radius.so
#2 0x00000000006c2295 in ?? ()
#3 0x000000000068c768 in execute_ex ()
But that's pretty useless without debug symbol information.
-
@opoplawski said in RADIUS authentication failing (timed out) and dumping core:
authentication for them. On one of our pfSense boxes, it fails to
You have more boxes with the same setup?
That's a gold mine: you can compare the working one with the failing one.
This is the starting point: the binaries are the same at the bit level.
Only the local settings differ.
@opoplawski said in RADIUS authentication failing (timed out) and dumping core:
without debug symbol information
It's open source, so you can have a look at the failing lines and see the test condition where it failed.
That way you'll know which variable (for example) was not set, out of range, or unknown but mandatory.
From there you work upwards.
-
@opoplawski said in RADIUS authentication failing (timed out) and dumping core:
Interestingly, the other offices are able to use the RADIUS server in the
problem office to authenticate. And the problem pfSense box is able to
authenticate using the RADIUS servers in the other offices.
Mmm, that is interesting. Same pfSense versions? Same architectures?
-
@Gertjan Yeah, I was hoping to spot some differences between the different machines, but I haven't been able to find any that seem relevant yet.
Looking at:
select(10,{ 9 },0x0,0x0,{ 5.000000 }) = 1 (0x1)
recvfrom(9,"^B\M-)\0\M-t\rv\M-]O~\M-$}\M-4~"...,4096,MSG_WAITALL,{ AF_INET
RADIUS:1812 },0x820d022fc) = 244 (0xf4)
recvfrom(9,0x37bbc7deab51,4096,MSG_WAITALL,0x820d02300,0x820d022fc) ERR#35
'Resource temporarily unavailable'
select(10,{ 9 },0x0,0x0,{ 4.708796 }) = 0 (0x0)
This seems to be the recvfrom() call:
https://github.com/LawnGnome/php-radius/blob/1.4.0b1/radlib.c#L503
What I don't understand from reading the code is how we could possibly be calling recvfrom() twice in a row without any other system calls in between. Or maybe there is something I just don't understand about the truss output.
I'd love to be able to step through the code via gdb, but I can't without debug symbols.
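As a sanity check on the truss output above, the first four bytes of the reply can be decoded by hand. This is only a minimal Python sketch. It assumes truss's `^B` and `\M-x` notation means Ctrl-B and "character x with the high bit set" (vis(3)-style), and that the packet uses the RFC 2865 header layout (code, identifier, 16-bit big-endian length). The `CODES` table and variable names are just for illustration.

```python
import struct

# First four bytes of the reply as shown by truss: ^B \M-) \0 \M-t
# Assumption: ^B is Ctrl-B (0x02) and \M-x is the character x with bit 7 set.
header = bytes([0x02, 0x80 | ord(")"), 0x00, 0x80 | ord("t")])

# RFC 2865 header: 1-byte code, 1-byte identifier, 2-byte big-endian length.
code, identifier, length = struct.unpack("!BBH", header)

# A few RADIUS packet codes from RFC 2865 (illustrative subset).
CODES = {1: "Access-Request", 2: "Access-Accept", 3: "Access-Reject", 11: "Access-Challenge"}

summary = f"{CODES.get(code, 'unknown')} id={identifier} length={length}"
print(summary)
```

Under those assumptions the packet decodes as an Access-Accept whose length field is 244, which agrees with the 244 bytes that recvfrom() returned, so the datagram at least looks complete on the wire.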
-
Hi.
Truss shows only the contents of system calls and nothing else.
As far as I can see, the socket is opened in non-blocking mode:
fcntl(9,F_SETFL,O_RDWR|O_NONBLOCK) = 0 (0x0)
Therefore, the error for this mode,
35 EAGAIN Resource temporarily unavailable. This is a temporary condition and later calls to the same routine may complete normally.
is entirely expected.
The main question for me is why the client rejects the "good" first response from the server.
For example,
recvfrom(9,"^B\M-)\0\M-t\rv\M-]O~\M-$}\M-4~"...,4096,MSG_WAITALL,{ AF_INET
RADIUS:1812 },0x820d022fc) = 244 (0xf4)
RADIUS:1812 - is this the real output of the truss utility, or has the real IP address been replaced?
recvfrom(9,"^B\M-7\0\M-tL\M-6`\M^T\M^V\M^L"...,4096,MSG_WAITALL, { AF_INET 10.10.11.10:1812 } 0x820d022fc) = 244 (0xf4)
Are there any entries in the client's log?
Is there a way to intercept the packet exchange using tcpdump?
If it is possible to run the dtrace utility, then you can see what happens during the connection:
dtrace -n 'fbt::rad_send_request:return , fbt::rad_continue_send_request:return , fbt::rad_init_send_request:return , fbt::is_valid_response:return {printf("=>%d",arg1)}'
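To illustrate the EAGAIN point above, here is a minimal Python sketch. It uses a loopback UDP socket rather than the RADIUS library itself, and the payload and variable names are made up for the demo. It shows that a second recvfrom() on a non-blocking socket fails with EAGAIN once the only queued datagram has been read.

```python
import errno
import select
import socket

# Receiver: non-blocking UDP socket on loopback (port picked by the OS).
rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
rx.bind(("127.0.0.1", 0))
rx.setblocking(False)  # equivalent to setting O_NONBLOCK with fcntl()

# Sender: one datagram standing in for the server's reply.
tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
tx.sendto(b"reply", rx.getsockname())

# select() reports the socket as readable, as in the truss output.
readable, _, _ = select.select([rx], [], [], 5.0)

# First read succeeds: it returns the queued datagram.
first, _ = rx.recvfrom(4096)

# Second read finds nothing queued and fails immediately instead of blocking.
try:
    rx.recvfrom(4096)
    second_errno = None
except BlockingIOError as exc:
    second_errno = exc.errno  # EAGAIN: 35 on FreeBSD, 11 on Linux

print(first, second_errno == errno.EAGAIN)

tx.close()
rx.close()
```

So the ERR#35 on the second call does not by itself indicate a fault. It only means no further datagram was queued at that moment, which leaves the question of why the first reply was not accepted.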
-
What pfSense version are you using here?
-
@stephenw10 pfSense 2.7.2-RELEASE (amd64)
-
@Konstanti RADIUS:1812 is a redaction of the IP address. dtrace looks like a bridge too far to get working on pfSense.
This is the only thing that ends up in the logs:
openvpn[94373]: /openvpn.auth-user.php: Error during RADIUS authentication : Operation timed out
-
Are you able to test this in Plus, where dtrace is available?
-
@stephenw10 I get:
dtrace: invalid probe specifier fbt::rad_send_request:return , fbt::rad_continue_send_request:return , fbt::rad_init_send_request:return , fbt::is_valid_response:return {printf("=>%d",arg1)}: probe description fbt::rad_send_request:return does not match any probes
-
That's in 24.11? Looks like it's working, but the query you're using is invalid. I won't pretend to be any sort of expert with dtrace though!
-
@stephenw10 24.03 - looks like I was stuck on previous stable for some reason. I'll try to update soon.
Seems to me like dtrace is for kernel stuff, not just C-library tracing, but I'm completely unfamiliar with dtrace.
-
Hmm, OK you're seeing the same error in radius from all three versions?
-
@opoplawski
DTrace is a comprehensive dynamic tracing framework ported from Solaris. DTrace provides a powerful infrastructure that permits administrators, developers, and service personnel to concisely answer arbitrary questions about the behavior of the operating system and user programs.
Here is an example of working in user space; maybe you need to know the name of the process that calls the necessary functions
or
-
@Konstanti As I understand it, I would need probes for the php radius library in order to trace it. When I look at the output of dtrace -l to show the available probes I don't see anything relevant to what I want to trace. There are kernel functions, malloc stuff, syscalls (but I already have that with truss), etc.
It appears that php can be built with dtrace support, but it appears that it hasn't been in pfSense Plus. But again, what I really want to trace is php-pecl-radius.
-
Hi
You were right, I was wrong: dtrace does not work with all functions. That is why we see a probe error.