@vbman213
I don't know if this will help any. I've been running SIP through pfSense for more than 12 years now and have come across various difficulties that I've mostly beaten into submission at this point... although, the maturity of all the moving pieces has helped a lot along the way. The following laundry list is not specific to your issue, just something to review and consider what the impact would be, if any, in your specific setup.
SIP challenges
IP layer addresses/ports are written in Application Layer headers.
PBX may be trying to mitigate problems by predicting the public address of a server on a private subnet. It’s important to understand what behavior is enabled and determine if it’s suitable to the architecture, including all edge cases.
SIP application layer header rewriting rules may be required, and if so, care must be taken to mitigate issues with edge cases.
When STUN is in use, there will be problems if it’s used incorrectly, e.g. reached through the wrong path or returning the wrong answer. Care must be taken to avoid race conditions if/when failovers can dynamically change the public IP.
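To make the "IP addresses in application layer headers" point concrete, here is a rough Python sketch that scans a SIP message for private IPv4 addresses in the places they typically show up (Via, Contact, and the SDP `o=`/`c=` lines). The sample message, the `private_addresses` helper, and the short list of headers checked are all made up for illustration; a real PBX or capture will have more places to look.

```python
import ipaddress
import re

# Lines that commonly carry IP-layer addresses (illustrative, not exhaustive).
ADDRESS_BEARING = re.compile(r"^(Via|Contact|c=|o=)", re.IGNORECASE)
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def private_addresses(sip_message: str) -> list[tuple[str, str]]:
    """Return (line, address) pairs where a private IPv4 address is embedded."""
    found = []
    for line in sip_message.splitlines():
        if not ADDRESS_BEARING.match(line):
            continue
        for candidate in IPV4.findall(line):
            try:
                addr = ipaddress.ip_address(candidate)
            except ValueError:
                continue  # looked like an address but is not one (e.g. 999.1.1.1)
            if addr.is_private:
                found.append((line.strip(), candidate))
    return found


# Hypothetical INVITE as it might leave a PBX sitting on a private subnet.
SAMPLE = """INVITE sip:100@example.com SIP/2.0
Via: SIP/2.0/UDP 192.168.1.10:5060;branch=z9hG4bK776
Contact: <sip:200@192.168.1.10:5060>
Content-Type: application/sdp

v=0
o=- 1 1 IN IP4 192.168.1.10
c=IN IP4 192.168.1.10
m=audio 10000 RTP/AVP 0
"""

for line, addr in private_addresses(SAMPLE):
    print(f"{addr} leaked in: {line}")
```

Running something like this against a packet capture taken on the WAN side is a quick way to see whether the PBX, a rewriting rule, or STUN is actually replacing the private address before the message leaves.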
Out-of-Band SIP application protocol negotiates a transport stream protocol with specific IP layer address/port pairs.
Usually this is implemented by opening a static block to be utilized for S/RTP port assignments, rather than synchronizing the negotiation with rules dynamically. This pool MUST be synchronized with the PBX.
You need to be clear on which side can initiate the S/RTP stream so the rules are in the right place.
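As a sanity check on the "pool MUST be synchronized" point, a few lines of Python can compare the RTP range configured on the PBX against the range opened on the firewall. The port numbers below are placeholders I picked for the example, not defaults of any particular PBX or of pfSense.

```python
def uncovered_ports(pbx_range: range, firewall_range: range) -> list[int]:
    """Ports the PBX may hand out for S/RTP that the firewall rule does not cover."""
    return [port for port in pbx_range if port not in firewall_range]


# Hypothetical values: the PBX hands out 10000-20000, the rule only opens 10000-15000.
pbx_rtp = range(10000, 20001)
firewall_rtp = range(10000, 15001)

missing = uncovered_ports(pbx_rtp, firewall_rtp)
if missing:
    print(f"{len(missing)} RTP ports not covered, from {missing[0]} to {missing[-1]}")
```

A mismatch like this tends to show up as intermittent one-way audio, since only the calls that happen to land on an uncovered port fail.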
Packet filters generally can’t distinguish separate application layer streams over UDP, so a SIP REGISTER (outbound) will enable a SIP INVITE (inbound) to PASS even when there is no functional inbound rule, as long as the State continues to exist (assuming both sides are using port 5060).
SIP should be implemented over TCP since the control protocol benefits from being reliable.
S/RTP streams should be implemented over UDP because reliability is undesirable. In real time applications, packets arriving late or out of order have no value. You can’t play audio packets out of order, and can’t hold up real-time streams for retransmission of lost packets without introducing an unrecoverable delay.
Reliable tunnels can exacerbate the problem, and will certainly contribute to jitter. Real-time traffic basically needs to follow a now-or-never pattern.
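The "now or never" idea can be shown with a toy playout check. The 20 ms frame size and 60 ms buffer are assumptions chosen for the example, and real jitter buffers are adaptive and considerably more involved.

```python
def playable(packets, frame_ms: int = 20, buffer_ms: int = 60) -> list[int]:
    """Return sequence numbers that arrive in time to be played.

    packets: iterable of (seq, arrival_ms), with arrival measured from the
    moment the first packet was sent. Frame `seq` is played at
    buffer_ms + seq * frame_ms; anything arriving later is useless.
    """
    played = []
    for seq, arrival_ms in packets:
        deadline = buffer_ms + seq * frame_ms
        if arrival_ms <= deadline:
            played.append(seq)
    return sorted(played)


# Packet 2 was lost and retransmitted by a reliable transport, arriving at 200 ms.
arrivals = [(0, 5), (1, 25), (3, 65), (2, 200)]
print(playable(arrivals))  # packet 2 missed its 100 ms deadline
```

The retransmitted packet does arrive, but too late to be of any use, which is the argument for carrying S/RTP over UDP and keeping it out of reliable tunnels.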
Silence is not golden. In some configurations, silence, like muting a phone while on a conference call, will cause no S/RTP packets to be sent.
I’ve had issues with silence lasting more than 5 minutes causing the call to terminate. It appears that something in the path decides to time out if no S/RTP packets are received for 5 minutes, and declares the call to be abandoned.
Firewalls may expire state table entries considered stale even though the call is active from the application perspective.
There is usually an option to “generate silence packets”, which actually creates packets with pseudo-background noise. That maintains a consistent S/RTP stream and lessens the likelihood that calls will drop due to S/RTP timeouts. FYI: I’ve seen some phones that were too smart for their own good, generating silence packets when the audio was silent, but as soon as Mute was activated the S/RTP stream stopped anyway.
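Here is a small simulation of the silence problem, assuming the 5-minute idle timeout I observed (I don't know which device in the path enforces it, so treat the 300 seconds as a guess at the mechanism). The function name and the packet timings are made up for the example.

```python
def call_survives(packet_times, call_length: float, idle_timeout: float = 300) -> bool:
    """True if no gap between S/RTP packets (or up to call end) exceeds idle_timeout.

    packet_times: seconds since call start at which an S/RTP packet was seen.
    """
    last_seen = 0.0
    for t in sorted(packet_times):
        if t > call_length:
            break
        if t - last_seen > idle_timeout:
            return False  # something in the path gave up during the gap
        last_seen = t
    return call_length - last_seen <= idle_timeout


# 10-minute call, muted from 1:00 to 7:40 with no packets sent while muted.
muted = list(range(0, 60)) + list(range(460, 600))
# Same call, but the phone sends a silence/comfort-noise packet every 20 seconds.
with_silence_packets = list(range(0, 600, 20))

print(call_survives(muted, call_length=600))                 # False
print(call_survives(with_silence_packets, call_length=600))  # True
```

If a call only drops after a long mute, comparing the gap length against the state and RTP timeouts of each device in the path is usually the fastest way to find out which one is hanging up.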