IP Phones one way audio on 2nd WAN

pfrickroll

I now have a new problem with IP phones. I don't know much about SIP/RTP. We rent our phones from a provider; they configure each phone and ship it to us. I just plug it into our network and it works. Sometimes I set up a VLAN on the phone if needed, and that's it.

After setting up Failover with Double WAN I got another problem. During testing when I unplug WAN 1 STATIC, it takes about 2.5 mins for the phone to be back online on WAN 2 DHCP (which is fine). However, if I call out from that phone to anywhere the other party cannot hear me but I can hear them.

When I plug WAN 1 back in, failover switches back from WAN 2 to WAN 1 just fine, but the phones still go out through WAN 2. If I kill the states on WAN 2 and dial from the phone, it still goes out through WAN 2 with one-way audio.

I don't have any strange configs: no VLANs, no VPNs, just failover set up and the appropriate firewall rule on LAN.

I also tested in my office with another firewall with the same settings, except it had WAN 1 on DHCP (not static) and WAN 2 on DHCP. When failing over back and forth, the first call wouldn't have audio, but the next call and all others after it were fine.

NogBadTheBad

Tried installing the siproxd package?

pfrickroll

@NogBadTheBad I did before, but I don't really know how to set it up. I looked at the pfSense docs but they're too vague and short. For example, "outbound interface": which do I pick? My primary WAN? What about the failover WAN?

This is what I dug up from the IP phone itself (Model: Mitel 6865i)

TypeService DSCP:
TypeService SIP 24
TypeService RTP 46
TypeService RTCP 46

Nat SIP Port 51620
NAT RTP Port 51720
RTP Port Base 3000

Derelict

So inside phone to outside SIP?

What NAT rules (both port forwards/rules and outbound NAT) do you have for the VoIP on the WAN that works?

pfrickroll

@Derelict Yes I have the phone and SIP provider is outside. I don't have any rules, it just works on primary WAN when I plug it into the switch.

Derelict

Then it should work on the secondary. What does your VoIP provider have to say?

senseivita

Rules should look somewhat like this:
Screen Shot 2020-10-07 at 17.03.28.png

See the Gateway column? That's what the traffic will be routed through. You set it up in System>Routing>GatewayGroups.

You can create a second rule, identical to that right underneath but without a gateway set so if the traffic doesn't match the rule with the gateway you want, then the second rule would be set to reject/block and the traffic would not pass at all. Load balancing is defined in Gateway Groups, and so is failover if you do wish for the traffic to fail over.

To define the gateway in the rule, click the Advanced button and scroll all the way down. In my experience it's usually the PBX/phone that's misconfigured: disable any NAT "aids" on it and let pfSense handle it. The exception is TURN: if you've been provided a server, leave that on.

pfrickroll

@Derelict I made a ticket but no answer yet.

pfrickroll

@skilledinept Here is what I have.

You can create a second rule, identical to that right underneath but without a gateway set so if the traffic doesn't match the rule with the gateway you want, then the second rule would be set to reject/block and the traffic would not pass at all.

Like this you mean?

senseivita

@pfrickroll Yep, except the action should be set to block/reject.

Think of rules as {yes} or {no} only, the rest of the values in a rule are conditions for the rule to match (and add/remove tags but that's super advanced stuff). They don't necessarily shape the traffic.

So, if the gateway isn't available, traffic won't match the rule; it'll match the next one, which either has no gateway or has a gateway you don't want the traffic to take, and that rule is set to reject the traffic.
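Here's a toy model of that first-match logic in Python, just to make the idea concrete. It is not how pf is implemented; the `Rule` class, the gateway names and the `skip_rules_when_gateway_down` flag are all made up for illustration. The flag is my stand-in for the "Skip rules when gateway is down" option that, as far as I know, lives in System > Advanced > Miscellaneous: without it, a rule whose gateway is down still matches, just without the gateway.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    action: str              # "pass", "block" or "reject"
    gateway: Optional[str]   # None means "follow the routing table"


def evaluate(rules, gateways_up, skip_rules_when_gateway_down=True):
    """Return (action, gateway) from the first rule that applies."""
    for rule in rules:
        if rule.gateway is not None and rule.gateway not in gateways_up:
            if skip_rules_when_gateway_down:
                continue                  # rule behaves as if it were absent
            return rule.action, None      # rule stays, gateway is dropped
        return rule.action, rule.gateway
    return "block", None                  # nothing matched: default deny


# Hypothetical ruleset: pass via WAN1_GW, then a reject rule underneath.
rules = [Rule("pass", "WAN1_GW"), Rule("reject", None)]

print(evaluate(rules, {"WAN1_GW", "WAN2_GW"}))   # ('pass', 'WAN1_GW')
print(evaluate(rules, {"WAN2_GW"}))              # ('reject', None)
print(evaluate(rules, {"WAN2_GW"}, skip_rules_when_gateway_down=False))  # ('pass', None)
```

The last call is the case to watch for: if rules are not skipped when the gateway is down, the traffic never reaches the reject rule.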

senseivita

BTW, if you don't know already;
block means "ignore": some devices hang while waiting for a response that never arrives, which may get you angry users; some OSes (macOS) freeze completely.
reject sends a flat-out "no, you're not allowed". Devices will probably keep retrying anyway, but they won't hang because they're getting an immediate response.

pfrickroll

@skilledinept So, I did block and reject. Didn't fix anything.

senseivita

What I think is happening is that your registration is being kept alive on the connection that recently died, and the new connection isn't recognized quickly enough. In SIP this gets very complex because you deal with things such as re-registration every X seconds, depending on how your phones are configured. There are a lot of settings that can even prevent the phone from registering if there's a call in progress or if another instance is already registered, and if there is one, they decide what happens to the last instance, e.g. kill it, let it coexist, etc. Emphasis on etc. It's a very complicated protocol and it doesn't even handle the audio; that's another nightmare.

In your specific situation, if you are doing a failover gateway, the negate rules (that's what these are called) wouldn't actually do anything. Negate rules are there to keep traffic from exiting elsewhere, but if your gateway group already covers all the exits, you don't have an "elsewhere" to go. You need to select a single gateway in the rule above the negate rule.

Please keep in mind that I'm far from a pfSense expert and I'm purposely avoiding one area that might be relevant because I don't want to misguide you. I'll study up and post it if I find it helpful, though.

I assume you want it to failover, then you'd probably want this option in System > Advanced > Miscellaneous (<firewall's URL>/system_advanced_misc.php):
Screen Shot 2020-10-09 at 13.04.17.png
Although SIP usually gravitates towards UDP (connectionless), it can use TCP as well (TLS, the OPUS codec and port 5061 are all hints it might be using TCP). It couldn't hurt anyway; it'll make switchover faster when a gateway goes down.

Double check your gateway groups and verify you're using failover and not load balancing; you can use both at the same time and in various orders:
Screen Shot 2020-10-09 at 13.12.52.png

Screen Shot 2020-10-09 at 13.16.34.png

In System > Routing > Gateways, click on the little pencil to edit each gateway and add a monitor IP. You can enter public DNS servers here, or any other server that you're sure will always be up, whitelisted and accepting ping packets (ICMP echo requests); you'll need a different one per gateway. At the end of the page, click the Advanced button to reveal the Advanced section and increase the aggressiveness with which the gateway is marked as dead (and thus failover is triggered).

If you need guidance, check this out: https://docs.netgate.com/pfsense/en/latest/routing/gateway-configure.html. It's part of The pfSense Book, which I have to say you should give another shot. I have mad respect for whoever wrote it; it helps you understand a lot of things regardless of whether you're using pfSense at all.

Alternatively if you don't set a gateway per rule you can do it globally in the default gateway section. Same page as before, System > Routing > Gateways. You can select gateway groups in there too.

Screen Shot 2020-10-09 at 13.31.02.png

Lastly, if you are not running a super secure corporate network, you might be able to fix it by enabling UPnP: Services > UPnP & NAT-PMP. Phones actually have a setting for this in the NAT section; it's rare to find one that doesn't. If you use things like Xbox Live, this is great for that too. The downside is that it makes it easy for potential trojan-type malware to open tunnels into your network. If you're careful in your browsing habits though, you have nothing to worry about.

Good luck!

senseivita

I kinda got a little sidetracked; slight ADHD--I get easily distracted, sorry… :)

What I was referring to earlier was the UDP timeouts. A recent conversation got me confused and I couldn't remember if this was for TCP. It helps with tunnels and VoIP, which predominantly use UDP. But if you've been able to place calls already, that's proof the firewall is handling the data streams fine, and if it's the switchover where it hangs, then this may actually worsen things, as it would keep ports open longer than needed, prolonging convergence. If any pfSense pro or moderator could chime in *wavingArmsInTheAir* here, that'd be awesome.
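A rough sketch of why a longer UDP timeout could prolong things: every packet on a flow refreshes the state's idle timer, so if the phone sends keepalives more often than the timeout, the old state never expires on its own. This is only a toy calculation, not pf code. The 30-second keepalive interval and both timeout values are made-up numbers, not pfSense defaults and not this phone's settings.

```python
def state_expiry(packet_times, idle_timeout):
    """Return the time (in seconds) at which a flow's state would expire.

    packet_times: timestamps of packets on the flow.
    idle_timeout: seconds of silence after which the state is removed.
    """
    expires_at = None
    for t in sorted(packet_times):
        if expires_at is not None and t >= expires_at:
            break                      # state already gone; this packet would start a new one
        expires_at = t + idle_timeout  # each packet refreshes the idle timer
    return expires_at


# Hypothetical phone sending a keepalive every 30 s for 5 minutes.
keepalives = range(0, 300, 30)

print(state_expiry(keepalives, idle_timeout=60))  # 330
print(state_expiry(keepalives, idle_timeout=20))  # 20
```

With the 60-second timeout the state outlives every keepalive and only expires after the phone goes quiet; with the 20-second timeout it is gone before the second keepalive, so the next packet has to be evaluated from scratch.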

I think the issue might be server-side: the server fails to update the connection's source address. Trunk providers have a lot of preemptive measures for connection issues, especially if they provide hardware, but it happens.

Here's what I was talking about: https://docs.netgate.com/pfsense/en/latest/recipes/nat-voip-phones.html

Remember that you don't need to open any ports on the WAN interfaces. Whenever you create a rule to allow traffic somewhere (in this case those would be your standard firewall rules), the packets get slapped with a reply-to address and tons of metadata to uniquely identify them, so when they come back they're automatically allowed through. In other firewalls this would be the "related" traffic.

Most issues with SIP come from the fact that it's got headers like HTTP and uses its own reply-tos, and the addresses don't match the ones in layer 3. It's all the way up at layer 7; it shouldn't be messing with IP anyway. … I swear HTTP, SIP and SMTP (mail) are just versions of the same protocol, that's another crazy topic, though.
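To show what I mean by the addresses not matching, here's a small Python sketch that pulls the addresses a phone advertises inside a SIP message and compares them with the source address the server would see after NAT. The INVITE below is invented for illustration (example addresses, a made-up extension and branch), and the two function names are hypothetical; it is not a capture from the Mitel phone and not a real SIP parser.

```python
import re

# Invented sample message: a phone behind NAT advertising its private address.
SAMPLE_INVITE = (
    "INVITE sip:1001@sip.example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP 192.168.1.50:5060;branch=z9hG4bK776asdhds\r\n"
    "Contact: <sip:2001@192.168.1.50:5060>\r\n"
    "Content-Type: application/sdp\r\n"
    "\r\n"
    "v=0\r\n"
    "o=- 0 0 IN IP4 192.168.1.50\r\n"
    "c=IN IP4 192.168.1.50\r\n"
    "m=audio 3000 RTP/AVP 0\r\n"
)

IPV4 = re.compile(r"\d{1,3}(?:\.\d{1,3}){3}")


def advertised_addresses(message):
    """Collect the first IPv4 address found in Via, Contact and SDP c= lines."""
    found = {}
    for line in message.splitlines():
        if line.startswith("c="):
            key = "SDP c="
        elif line.lower().startswith(("via:", "contact:")):
            key = line.split(":", 1)[0]
        else:
            continue
        match = IPV4.search(line)
        if match:
            found[key] = match.group(0)
    return found


def mismatches(message, packet_source):
    """Return the advertised addresses that differ from the layer 3 source."""
    return {k: v for k, v in advertised_addresses(message).items()
            if v != packet_source}


# 203.0.113.10 stands in for the WAN address the provider sees.
print(mismatches(SAMPLE_INVITE, "203.0.113.10"))
# {'Via': '192.168.1.50', 'Contact': '192.168.1.50', 'SDP c=': '192.168.1.50'}
```

The SDP `c=` line is the one that tells the far end where to send audio, so if it points at an address the provider can't reach, you'd expect the signalling to work while the audio only flows one way.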

Anyway, good luck!

pfrickroll

@skilledinept So, here is what I've done. I decided to do a simple physical test. I took the phone from work that I had been testing; it was working fine, switching from WAN1 to WAN2 in under 2 minutes. It didn't switch back to WAN1 when WAN1 came back up, but I was fine with that.

I brought my phone to one of the satellite offices where phones had one-way audio or no audio at all during failover to WAN2. I plugged both phones, mine and the one from that office, straight into the cellular modem without any firewall, and they both worked fine. Then I connected both ISPs to pfSense and plugged both phones into pfSense through a switch. After unplugging WAN 1, the phones took about 2 minutes and were back up on WAN2. Except... my phone worked fine, but the one from that office had the usual audio problems (if I call an extension inside our company there's no audio at all; if I call an outside number I can hear them but no one can hear me). Then I took both phones back to my office, performed the same test and got the same result.

After reading a bit more about SIP, I remoted into the phone that had audio issues and switched SIP from UDP to TCP, and it began working as intended. However, it takes about 15 minutes for the phone to switch from WAN 1 to WAN 2. It also switches back to WAN 1 when it's back up, unlike with UDP, where it remains on WAN 2 until the phone or firewall is rebooted. Both phones have identical configs, but one needs SIP over TCP while the other is fine with SIP over UDP. I called the company we get our service and rent the phones from, and they don't know what to say.

I would like to shorten the time it takes to go from WAN1 to WAN2 and back to WAN1, but I don't know how, or if it's even possible for IP phones using SIP over TCP. There are some options in the phone, but I don't know which timers I have to adjust to speed it up, or if it's possible at all. I might have to look again at the scripts from the other post I made before this one.
SIP Advanced.PNG