The AP was rebooting itself every 20 minutes. I was thrown off the trail because the packet loss was showing up only every 40 minutes, and the rate of loss didn't appear consistent, except in chunks of 24 hours. The inconsistency can be explained by rounding: the RRD samples are 5 minutes long, while each outage lasted less than a minute, so the loss gets averaged down and sometimes split across two samples. What I still can't explain is why every second outage didn't show up in the RRD graph at all.
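To put rough numbers on the rounding effect, here is a minimal sketch of how a short outage gets diluted by 5-minute averaging. The 45-second outage length and the offsets are assumptions for illustration (all I know is that the downtime was under a minute), and `loss_per_bin` is a made-up helper, not anything pfSense or RRD provides. It does not reproduce the missing every-second outage.

```python
def loss_per_bin(period_s=1200, outage_s=45, offset_s=0,
                 step_s=300, total_s=3600):
    """Fraction of each step_s-long sample during which the link was down.

    period_s: time between reboots (20 minutes, from the post)
    outage_s: assumed length of each outage (hypothetical, under a minute)
    offset_s: assumed start of the first outage relative to the first sample
    """
    down_seconds = [0] * (total_s // step_s)
    for t in range(total_s):
        # The link is down for outage_s seconds at the start of each period.
        if (t - offset_s) % period_s < outage_s:
            down_seconds[t // step_s] += 1
    return [down / step_s for down in down_seconds]


if __name__ == "__main__":
    # Outage falls entirely inside one 5-minute sample.
    print(loss_per_bin(offset_s=0))
    # Same outage straddling a sample boundary, so it is split in two.
    print(loss_per_bin(offset_s=280))
```

With these assumed numbers, an outage that lands inside one sample reads as 15% loss for that sample, while the same outage straddling a boundary reads as two smaller bumps, which is the kind of inconsistency I was seeing in the graph.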
As for the SSH sessions hanging, you're right: I didn't have the box checked to override state killing on gateway failure, so pfSense was killing all states whenever that backhaul went down.