RADIUS authentication failing (timed out) and dumping core
- 
 We have three offices with nearly identical pfSense configurations. We have 
 VPN servers for clients to connect to and a local RADIUS server to provide
 authentication for them. On one of our pfSense boxes, it fails to
 authenticate using the local RADIUS server. It sends requests and
 access-accept messages are returned. However, the php-fpm process appears to
 ignore the responses and then crashes.I'm getting blocked so the rest is here 
- 
 Program received signal SIGSEGV, Segmentation fault. 
 Address not mapped to object.
 0x000000084f71083f in rad_get_attr () from /usr/local/lib/php/20220829/radius.so
 (gdb) bt
 #0 0x000000084f71083f in rad_get_attr () from /usr/local/lib/php/20220829/radius.so
 #1 0x000000084f70eced in zif_radius_get_attr () from /usr/local/lib/php/20220829/radius.so
 #2 0x00000000006c2295 in ?? ()
 #3 0x000000000068c768 in execute_ex ()But that's pretty useless without debug symbol information 
- 
 @opoplawski said in RADIUS authentication failing (timed out) and dumping core: authentication for them. On one of our pfSense boxes, it fails to You have more boxes with the same setup ? 
 That a gold mine : you can compare the working one with the failing one.
 This is the starting point : the binaries are teh same at a bit level.
 Only the local setting differ.@opoplawski said in RADIUS authentication failing (timed out) and dumping core: without debug symbol information Its open source, so you can gave a look at the failing lines and see the test condition where it failed. 
 So you'll know what variable (example) was not set or out of range or not known but mandatory.
 from there you work upwards.
- 
 @opoplawski said in RADIUS authentication failing (timed out) and dumping core: Interestingly, the other offices are able to use the RADIUS server in the 
 problem office to authenticate. And the problem pfSense box is able to
 authenticate using the RADIUS servers in the other offices.Mmm, that is interesting. Same pfSense versions? Same architectures? 
- 
 @Gertjan Yeah, I was hoping to spot some differences between the different machines, but I haven't been able to find any that seem relevant yet. Looking at: 
 select(10,{ 9 },0x0,0x0,{ 5.000000 }) = 1 (0x1)
 recvfrom(9,"^B\M-)\0\M-t\rv\M-]O~\M-$}\M-4~"...,4096,MSG_WAITALL,{ AF_INET
 RADIUS:1812 },0x820d022fc) = 244 (0xf4)
 recvfrom(9,0x37bbc7deab51,4096,MSG_WAITALL,0x820d02300,0x820d022fc) ERR#35
 'Resource temporarily unavailable'
 select(10,{ 9 },0x0,0x0,{ 4.708796 }) = 0 (0x0)This seems to be the recvfrom() call: 
 https://github.com/LawnGnome/php-radius/blob/1.4.0b1/radlib.c#L503What I don't understand from reading the code how we could possibly be calling recvfrom() twice in a row without any other system calls in between - or maybe there is something I just don't understand about the truss output. I'd love to be able to step through the code via gdb, but I can't without debug symbols. 
- 
 Hi. 
 Truss shows only the contents of system calls and nothing else.
 As far as I can see, the socket opens in non-blocking modefcntl(9,F_SETFL,O_RDWR|O_NONBLOCK) = 0 (0x0) Therefore, the error for this mode is 
 35 EAGAIN Resource temporarily unavailable. This is a temporary condition and later calls to the same routine may complete normally.
 it is quite realThe main question for me is why the client rejects the "good" first response from the server 
 For example ,
 recvfrom(9,"^B\M-)\0\M-t\rv\M-]O~\M-$}\M-4~"...,4096,MSG_WAITALL,{ AF_INET
 RADIUS:1812 },0x820d022fc) = 244 (0xf4)RADIUS: 1812 - is this the real output of the truss utility ? or has the real ip address been replaced? recvfrom(9,"^B\M-7\0\M-tL\M-6`\M^T\M^V\M^L"...,4096,MSG_WAITALL, { AF_INET 10.10.11.10:1812 } 0x820d022fc) = 244 (0xf4) Are there any entries in the client's log? 
 Is there a way to intercept packet exchange using tcpdump?if it is possible to run the dtrace utility, then you can see what happens during the connection dtrace -n 'fbt::rad_send_request:return , fbt::rad_continue_send_request:return , fbt::rad_init_send_request:return , fbt::is_valid_response:return {printf("=>%d",arg1)}' 
- 
 What pfSense version are you using here? 
- 
 @stephenw10 pfSense 2.7.2-RELEASE (amd64) 
- 
 @Konstanti RADIUS:1812 is a redaction of the IP address. dtrace looks like a bridge to far to get working on pfSense. This is the only thing that ends up in the logs: openvpn[94373]: /openvpn.auth-user.php: Error during RADIUS authentication : Operation timed out 
- 
 Are you able to test this in Plus? Where dtrace is available. 
- 
 @stephenw10 I get: dtrace: invalid probe specifier fbt::rad_send_request:return , fbt::rad_continue_send_request:return , fbt::rad_init_send_request:return , fbt::is_valid_response:return {printf("=>%d",arg1)}: probe description fbt::rad_send_request:return does not match any probes 
- 
 That's in 24.11? Looks like it's working but the query you're using in invalid. I won't pretend to be any sort of expert with dtrace though! 
- 
 @stephenw10 24.03 - looks like I was stuck on previous stable for some reason. I'll try to update soon. Seems to me like dtrace is for kernel stuff, not just C-library tracing, but I'm completely unfamiliar with dtrace. 
- 
 Hmm, OK you're seeing the same error in radius from all three versions? 
- 
 @opoplawski 
 DTrace is a comprehensive dynamic tracing framework ported from Solaris. DTrace provides a powerful infrastructure that permits administrators, developers, and service personnel to concisely answer arbitrary questions about the behavior of the operating system and user programs.here is an example of working in user space, maybe you need to know the name of the process that calls the necessary functions  or  
- 
 @Konstanti As I understand it, I would need probes for the php radius library in order to trace it. When I look at the output of dtrace -l to show the available probes I don't see anything relevant to what I want to trace. There are kernel functions, malloc stuff, syscalls (but I already have that with truss), etc. It appears that php can be built with dtrace support, but it appears that it hasn't in pfSense Plus. But again what I really want to trace is php-pecl-radius 
- 
 Hi 
 You were right, I was wrong, dtrace does not work with all functions. Therefore, as a result, we see a probe errorP.S. 
 I think I've found a solution for the userland programs- we find the pid of the php-fpm process
 ps -awx | grep php
 342 - Ss 8:55.06 php-fpm: master process (/usr/local/lib/php-fpm.conf) (php-fpm)
 23960 - I 3:31.30 php-fpm: pool nginx (php-fpm)
 27761 - I 2:07.14 php-fpm: pool nginx (php-fpm)
 31652 - I 3:16.94 php-fpm: pool nginx (php-fpm)
 37038 - I 3:12.25 php-fpm: pool nginx (php-fpm)
 46849 - I 0:55.34 php-fpm: pool nginx (php-fpm)
 55575 - I 3:18.29 php-fpm: pool nginx (php-fpm)
 94877 - I 2:03.82 php-fpm: pool nginx (php-fpm)
 2 using dtrace 
 dtrace -qn 'pid342:radius.so::entry { @[probefunc] = count();}'dtrace -l | grep pid 70870 pid342 radius.so zif_radius_put_vendor_addr entry 
 70871 pid342 radius.so rad_close entry
 70872 pid342 radius.so rad_create_request entry
 70873 pid342 radius.so rad_put_vendor_int entry
 70874 pid342 radius.so rad_demangle entry
 70875 pid342 radius.so zif_radius_get_vendor_attr entry
 70876 pid342 radius.so rad_strerror entry
 70877 pid342 radius.so zif_radius_send_request entry
 70878 pid342 radius.so zif_radius_get_tagged_attr_tag entry
 70879 pid342 radius.so zif_radius_demangle entry
 70880 pid342 radius.so zif_radius_request_authenticator entry
 70881 pid342 radius.so zif_radius_put_int entry
 70882 pid342 radius.so rad_salt_value entry
 70883 pid342 radius.so zif_radius_salt_encrypt_attr entry
 70884 pid342 radius.so zif_radius_strerror entry
 70885 pid342 radius.so zif_radius_config entry
 70886 pid342 radius.so zif_radius_cvt_int entry
 70887 pid342 radius.so rad_acct_open entry
 70888 pid342 radius.so zif_radius_cvt_string entry
 70889 pid342 radius.so zif_radius_server_secret entry
 70890 pid342 radius.so zif_radius_get_tagged_attr_data entry
 70891 pid342 radius.so rad_put_vendor_attr entry
 70892 pid342 radius.so rad_send_request entry
 70893 pid342 radius.so rad_put_vendor_addr entry
 70894 pid342 radius.so zif_radius_add_server entry
 70895 pid342 radius.so zif_radius_close entry
 70896 pid342 radius.so rad_get_vendor_attr entry
 70897 pid342 radius.so rad_init_send_request entry
 70898 pid342 radius.so zif_radius_auth_open entry
 70899 pid342 radius.so _radius_close entry
 70900 pid342 radius.so rad_continue_send_request entry
 70901 pid342 radius.so get_module entry
 70902 pid342 radius.so rad_put_int entry
 70903 pid342 radius.so rad_put_attr entry
 70904 pid342 radius.so rad_auth_open entry
 70905 pid342 radius.so rad_get_attr entry
 70906 pid342 radius.so rad_cvt_int entry
 70907 pid342 radius.so zif_radius_acct_open entry
 70908 pid342 radius.so rad_cvt_addr entry
 70909 pid342 radius.so rad_put_addr entry
 70910 pid342 radius.so rad_server_secret entry
 70911 pid342 radius.so zif_radius_put_attr entry
 70912 pid342 radius.so zif_radius_get_attr entry
 70913 pid342 radius.so zif_radius_cvt_addr entry
 70914 pid342 radius.so zif_radius_put_addr entry
 70915 pid342 radius.so zm_startup_radius entry
 70916 pid342 radius.so rad_config entry
 70917 pid342 radius.so zm_info_radius entry
 70918 pid342 radius.so zif_radius_demangle_mppe_key entry
 70919 pid342 radius.so zif_radius_create_request entry
 70920 pid342 radius.so rad_open entry
 70921 pid342 radius.so zif_radius_put_vendor_int entry
 70922 pid342 radius.so zif_radius_put_string entry
 70923 pid342 radius.so rad_cvt_string entry
 70924 pid342 radius.so rad_demangle_mppe_key entry
 70925 pid342 radius.so rad_put_string entry
 70926 pid342 radius.so rad_request_authenticator entry
 70927 pid342 radius.so zif_radius_put_vendor_string entry
 70928 pid342 radius.so zif_radius_put_vendor_attr entry
 70929 pid342 radius.so zm_shutdown_radius entry
 70930 pid342 radius.so rad_put_vendor_string entry
 70931 pid342 radius.so rad_add_server entryfor example , dtrace -n 'pid342:radius.so:rad_send_request:return {printf("=>%d",arg1)}' 
 dtrace: description 'pid342:radius.so:rad_send_request:return ' matched 1 probe
- we find the pid of the php-fpm process
- 
 @Konstanti well, that was an interesting if ultimately unfruitful direction, thanks. I created the following script to create a dtruss script: #!/usr/local/bin/perl open(PS,'ps -awx |'); while (<PS>) { next unless /php-fpm: pool/; ($pid) = split; push(@pids, $pid); } close(PS); print "syscall::recvfrom:return\n"; print "/pid == " . join(" || pid == ", @pids) . "/\n"; print "{\n"; print 'printf("\n%d:\trecvfrom() = %d\n", pid, arg1);' . "\n"; print "}\n"; my $last_pid = pop(@pids); foreach $pid (@pids) { print "pid" . $pid . ":radius.so::entry,\n"; } print "pid" . $last_pid . ":radius.so::entry\n"; print "{\n"; print 'printf("%d:\t%s(%d, 0x%x, %4d)\t", pid, probefunc, arg0, arg1, arg2);' . "\n"; print "}\n"; foreach $pid (@pids) { print "pid" . $pid . ":radius.so:rad_strerror:return,\n"; } print "pid" . $last_pid . ":radius.so:rad_streeor:return\n"; print "{\n"; print 'printf("(%d)\t = %s\n", pid, copyinstr(arg1));' . "\n"; print "}\n"; foreach $pid (@pids) { print "pid" . $pid . ":radius.so::return,\n"; } print "pid" . $last_pid . ":radius.so::return\n"; print "{\n"; print 'printf("(%d)\t = %d\n", pid, arg1);' . "\n"; print "}\n";It produces something like: syscall::recvfrom:return /pid == 14656 || pid == 30511 || pid == 31820 || pid == 53410 || pid == 74857 || pid == 99423 || pid == 99617/ { printf("\n%d:\trecvfrom() = %d\n", pid, arg1); } pid14656:radius.so::entry, pid30511:radius.so::entry, pid31820:radius.so::entry, pid53410:radius.so::entry, pid74857:radius.so::entry, pid99423:radius.so::entry, pid99617:radius.so::entry { printf("%d:\t%s(%d, 0x%x, %4d)\t", pid, probefunc, arg0, arg1, arg2); } pid14656:radius.so:rad_strerror:return, pid30511:radius.so:rad_strerror:return, pid31820:radius.so:rad_strerror:return, pid53410:radius.so:rad_strerror:return, pid74857:radius.so:rad_strerror:return, pid99423:radius.so:rad_strerror:return, pid99617:radius.so:rad_streeror:return { printf("(%d)\t = %s\n", pid, copyinstr(arg1)); } pid14656:radius.so::return, pid30511:radius.so::return, pid31820:radius.so::return, pid53410:radius.so::return, pid74857:radius.so::return, pid99423:radius.so::return, pid99617:radius.so::return { printf("(%d)\t = %d\n", pid, arg1); }And I got (cleaned up): 53410: rad_send_request(45709006907392, 0x299275926a00, 80) 53410: recvfrom() = 244 53410: recvfrom() = -1 53410: recvfrom() = 244 53410: recvfrom() = -1 53410: recvfrom() = 244 53410: recvfrom() = -1 (53410) = 4294967295 53410: rad_strerror(45709006907392, 0x299275926a00, 80) (53410) = Operation timed outWhich is pretty much what I knew before. Still no idea why rad_send_request() does not accept the response that it gets. On the up side, I do know a lot more know about dtrace. 
- 
 @opoplawski 
 Hi
 , what happens when the recvfrom function is executed has been known for a long timethe question is what happens when the client receives a response (244 bytes) 
 during the execution of this part of the program, therad_send_request 
 rad_continue_send_request
 rad_init_send_request
 is_valid_responsefunctions are used the most interesting function, in my opinion, is 
 is_valid_responsebut, apparently, it is impossible to get the result of its execution, this is an internal function that dtrace does not "see". you can try to see what result the rest of the functions from this list return using dtrace print 'printf("(%d)\t %s= %d\n", pid,probefunc,arg1);' . "\n"; I also recommend using tcpdump to analyze traffic on port 1812 suspect that the problem is precisely the result that the is_valid_response function returns 
 in case of an error, it returns 0 and forces the program to receive data from the socket (the second attempt to call the recvfrom function), as expected, the socket is empty and the function returns an error 35 EAGAIN53410: recvfrom() = 244 
 recvfrom(9,"^B\M-)\0\M-t\rv\M-]O~\M-$}\M-4~"...,4096,MSG_WAITALL,{ AF_INET
 RADIUS:1812 },0x820d022fc) = 244 (0xf4)53410: recvfrom() = -1 
 recvfrom(9,0x37bbc7deab51,4096,MSG_WAITALL,0x820d02300,0x820d022fc) ERR#35
 'Resource temporarily unavailable'unfortunately, I do not find in the library code (radius.so) any implementation of zeroing the data reception buffer before calling the recvfrom function. Therefore, it is not known what is in the buffer after receiving the data, 
 is_valid_response- Check the source address
 if ok, then
- Check the message length
 if ok , then
- Check the response authenticator
 if ok, the packet has been verified (return 1)
 otherwise (return 0)
 it is necessary to use tcpdump to analyze the radius server response 
- Check the source address
- 
 @Konstanti said in RADIUS authentication failing (timed out) and dumping core: it is necessary to use tcpdump to analyze the radius server response Don't forget 
 radtest
 radsniff
 radsniffand 
 running radius manuallyradiusd -X(kill in first in the GUI then start it on the console or SSH ) 

