SIP INVITE packets dropped after random time (fixed by reboot of pfSense)



  • Does anyone have any idea how to get more debugging information from the NAT translation layer through to the firewall layer? I have set all SIP-related rules to log all actions, and I have packet captures from the WAN port and the LAN port of pfSense. Something is dropping all SIP INVITE packets after a random number of hours, and there is nothing in the logs. I can reproduce the issue very easily by just waiting a couple of days.

    The only solution I have found is to add a firewall rule, apply it, then delete it. This must be done every night or else the next day the sip traffic dies.



  • I have packet captures from the WAN link from when the traffic worked, and from when it didn't. In Wireshark they look exactly the same except for the timestamps.



  • Not a solution but a confirmation of the problem.

    I too have INVITEs being dropped after the firewall has supposedly passed them.

    WAN tcpdump shows packets arriving, firewall rule is logged and shows a PASS for each one, LAN interface on pfsense and on my asterisk server show no trace of the packets.

    Curiously, normal SIP OPTIONS and OK traffic seems to work while all this is going on.

    Like you I would welcome help doing some debugging. Clearing the filters sorts it out for a while then it happens again. Most annoying!

    Jon



  • Can you post all rules on the box?

    pfctl -sr
    pfctl -sn
    


  • I have masked my public IP address with stars; it starts with 70 (70.*.*.*).

    pfctl -sr

    scrub in on xl0 all fragment reassemble
    scrub in on xl1 all fragment reassemble
    anchor "relayd/" all
    block drop in log all label "Default deny rule"
    block drop out log all label "Default deny rule"
    block drop in quick inet6 all
    block drop out quick inet6 all
    block drop quick proto tcp from any port = 0 to any
    block drop quick proto tcp from any to any port = 0
    block drop quick proto udp from any port = 0 to any
    block drop quick proto udp from any to any port = 0
    block drop quick from <snort2c> to any label "Block snort2c hosts"
    block drop quick from any to <snort2c> label "Block snort2c hosts"
    block drop quick from <pfsnortsamout> to any label "Block pfSnortSamOut hosts"
    block drop quick from any to <pfsnortsamin> label "Block pfSnortSamIn hosts"
    block drop in log quick proto carp from (self) to any
    pass quick proto carp all keep state
    pass quick proto pfsync all keep state
    block drop in log quick proto tcp from <sshlockout> to any port = ssh label "sshlockout"
    block drop in log quick proto tcp from <webconfiguratorlockout> to any port = https label "webConfiguratorlockout"
    block drop in quick from <virusprot> to any label "virusprot overload table"
    block drop in on ! xl0 inet from 192.168.100.0/24 to any
    block drop in inet from 192.168.100.1 to any
    block drop in on xl0 inet6 from fe80::250:daff:fecc:fee5 to any
    pass in on xl0 inet proto udp from any port = bootpc to 255.255.255.255 port = bootps keep state label "allow access to DHCP server"
    pass in on xl0 inet proto udp from any port = bootpc to 192.168.100.1 port = bootps keep state label "allow access to DHCP server"
    pass out on xl0 inet proto udp from 192.168.100.1 port = bootps to any port = bootpc keep state label "allow access to DHCP server"
    block drop in log quick on xl1 from <bogons> to any label "block bogon networks from WAN"
    block drop in on ! xl1 inet from 70.*.*.*/29 to any
    block drop in inet from 70.*.*.* to any
    block drop in on xl1 inet6 from fe80::2b0:d0ff:fe72:f90b to any
    block drop in log quick on xl1 inet from 10.0.0.0/8 to any label "block private networks from wan block 10/8"
    block drop in log quick on xl1 inet from 127.0.0.0/8 to any label "block private networks from wan block 127/8"
    block drop in log quick on xl1 inet from 172.16.0.0/12 to any label "block private networks from wan block 172.16/12"
    block drop in log quick on xl1 inet from 192.168.0.0/16 to any label "block private networks from wan block 192.168/16"
    pass in on lo0 all flags S/SA keep state label "pass loopback"
    pass out on lo0 all flags S/SA keep state label "pass loopback"
    pass out all flags S/SA keep state allow-opts label "let out anything from firewall host itself"
    pass out route-to (xl1 70.*.*.*) inet from 70.*.*.* to ! 70.*.*.*/29 flags S/SA keep state allow-opts label "let out anything from firewall host itself"
    pass in quick on xl0 proto tcp from any to (xl0) port = http flags S/SA keep state label "anti-lockout rule"
    pass in quick on xl0 proto tcp from any to (xl0) port = https flags S/SA keep state label "anti-lockout rule"
    pass in quick on xl0 proto tcp from any to (xl0) port = ssh flags S/SA keep state label "anti-lockout rule"
    block drop in quick on xl1 reply-to (xl1 70.*.*.*) inet from 212.117.179.211 to any label "USER_RULE: someone was brute forcing me."
    pass in quick on xl0 inet from 192.168.100.0/24 to any flags S/SA keep state label "USER_RULE: Default LAN -> any"
    pass in quick on xl0 inet proto tcp from any to 192.168.100.5 port = domain flags S/SA keep state label "USER_RULE: NAT internal dns"
    pass in quick on xl0 inet proto udp from any to 192.168.100.5 port = domain keep state label "USER_RULE: NAT internal dns"
    anchor "tftp-proxy/*" all
    anchor "miniupnpd" all

    pfctl -sn

    nat-anchor "natearly/" all
    nat-anchor "natrules/*" all
    nat on xl1 inet from 192.168.100.0/24 to any -> 70.*.*.* port 1024:65535
    rdr-anchor "relayd/" all
    rdr-anchor "tftp-proxy/*" all
    rdr-anchor "miniupnpd" all



  • I don't see any SIP related filter/nat rules. Can you point them out for me please?



  • Ahh, I took those in the middle of dropping and restoring the rules.

    rdr on xl1 inet proto tcp from any to 70.*.*.* port 5060:5080 -> 192.168.100.20
    rdr on xl1 inet proto udp from any to 70.*.*.* port 5060:5080 -> 192.168.100.20



  • Could you please post the full tables? Just the output of the commands mentioned above, without any editing. Thanks.



  • Unedited rules sent as a PM.



  • As I understand it, your SIP gateway (or whatever you call it) is 192.168.100.20. I see rules for both UDP and TCP; when you see this issue with SIP packets, which protocol is in use? Do you see a pause in incoming packets (say, no traffic at all for some time) that would let the state expire? Can you send me a capture anyway?
    Thanks.



  • The SIP traffic comes in over UDP only. The TCP rule was for testing. When the traffic gets dropped there is no network outage; the only thing I have seen stop working is the SIP traffic.



  • Can you post the output of the following, both when you do not have the problem and when you do?

    pfctl -ss | grep 5060



  • I had the same filters exactly when it did work and when it did not work. I have a quick rule that I add and delete to bounce the firewall to fix the issue.



  • The reason I asked about the filter: when troubleshooting things like this, capture everything (even if it is a lot of traffic), then apply whatever filters you need when you open the file. This way you can see events that might indirectly affect SIP traffic, for example packets going through but NAT not happening properly, or CARP switching over, or … Even if these examples look silly to you, trust me, capturing all traffic very often helps you understand what is really going on.
    So, to troubleshoot further I'd like to have two things:
    1. Full packet captures from LAN and WAN for 2 minutes when it is not working, plus the output of

    pfctl -sr
    pfctl -sn
    pfctl -ss

    2. Add/delete a rule to fix the problem, then again: full packet captures from LAN and WAN for 2 minutes when it is working, plus the output of

    pfctl -sr
    pfctl -sn
    pfctl -ss

    It's up to you whether you take this way or not -)
    Cheers!



  • Forgot to ask - is there a reason you use CARP IP to terminate SIP traffic? I do not see many interfaces…



  • There is no particular reason I am using a CARP IP.

    I will make the captures you asked for the next time I see the issue.



  • Is there a way to capture from both lan and wan from the command line on the pfsense router itself?


  • You can capture on both by having two open ssh sessions and doing one capture on each interface in each session.



  • @jimp:

    You can capture on both by having two open ssh sessions and doing one capture on each interface in each session.

    … or running something like

    tcpdump -ni <lan> -s0 -w lan.cap &
    tcpdump -ni <wan> -s0 -w wan.cap
    

    After you are done don't forget to kill the first session.
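For the two-minute captures requested earlier in the thread, the same idea can be wrapped so that both captures stop on their own. This is only a sketch: the function name `capture_both` is made up for illustration, `xl0`/`xl1` are the LAN/WAN interfaces from the rules posted above, and it assumes a Bourne-style shell (start `sh` first if your login shell is tcsh).

```shell
# capture_both LAN_IF WAN_IF SECONDS
# Hypothetical helper: starts one tcpdump per interface in the background,
# waits SECONDS, then stops both so neither capture is left running.
capture_both() {
    lan_if=$1
    wan_if=$2
    secs=$3

    # Same tcpdump invocations as in the post above, one per interface.
    tcpdump -ni "$lan_if" -s0 -w lan.cap &
    lan_pid=$!
    tcpdump -ni "$wan_if" -s0 -w wan.cap &
    wan_pid=$!

    sleep "$secs"

    # Send the default SIGTERM to both captures, then wait for them to exit.
    kill "$lan_pid" "$wan_pid" 2>/dev/null || true
    wait "$lan_pid" "$wan_pid" 2>/dev/null || true
    echo "wrote lan.cap and wan.cap"
}

# Example for this thread (xl0 = LAN, xl1 = WAN), two minutes:
# capture_both xl0 xl1 120
```

Recording the process IDs with `$!` avoids having to find the backgrounded tcpdump afterwards.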


  • That works too, but I've found that separate windows makes it a bit easier logically to follow what's being done.

    Also, you could use "fg" to get back to the backgrounded tcpdump in your example and then ^C it, rather than finding and killing the process.



  • Alright, I have been able to capture the issue. I am working on getting the info ready.



  • @quentusrex:

    Alright, I have been able to capture the issue. I am working on getting the info ready.

    I am sorry, this is what I asked some time ago:
    @Evgeny:

    So, to troubleshoot further I'd like to have two things:
    1. Full packet captures from LAN and WAN for 2 minutes when it is not working, plus the output of

    pfctl -sr
    pfctl -sn
    pfctl -ss

    2. Add/delete a rule to fix the problem, then again: full packet captures from LAN and WAN for 2 minutes when it is working, plus the output of

    pfctl -sr
    pfctl -sn
    pfctl -ss


    You gave me just two captures from the WAN interface (no captures from LAN), 15 minutes apart. It is impossible to say anything from these, except that you are using CARP and there was probably a problem with the active node at that moment (a switchover did not occur, or whatever) -(
    Can you provide the full details (see above), plus the output of

    ifconfig
    


  • After much digging it seems that the problem exists because pfsense now overwrites the outgoing source port for sip traffic. The problem exists when pfsense assigns a new outgoing source port to an existing connection, but before the sip device has reregistered with the remote server. This causes all sip traffic to be sent not to the new udp port, but to the old one. This is only the case when using 'rport' so that the remote sip server sends the traffic to the sip source port rather than to the sip port specified in the registration packet. Using rport only checks for the port at registration time, so any changes between registrations will cause new calls to fail.

    Where can I find information about how long the SIP source port stays assigned to a UDP session?
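One rough way to look into this: `pfctl -st` lists pf's configured state timeouts (including the `udp.*` ones), and the state table can be logged periodically to see when the translated source port for the SIP host changes. The sketch below assumes a Bourne-style shell; the function name `snapshot_states` and the log path in the usage comment are made up for illustration.

```shell
# snapshot_states PATTERN
# Hypothetical helper: reads a state-table dump on stdin, prints a timestamp
# header, then only the lines that contain PATTERN.
snapshot_states() {
    printf '=== %s ===\n' "$(date '+%Y-%m-%d %H:%M:%S')"
    # grep exits non-zero when nothing matches; that is not an error here.
    grep -- "$1" || true
}

# Usage on the firewall (commented out so that this block only defines the helper):
# pfctl -st | grep udp
# while :; do pfctl -ss | snapshot_states 5060; sleep 60; done >> /tmp/sip-states.log
```

Comparing consecutive snapshots in the log should show whether the outgoing port for 192.168.100.20 changed between two registrations.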



  • Do not rely on the state timeout. As I advised by e-mail, use Static Port in NAT->Outbound; this way you can be sure that SIP packets leaving your WAN interface always have source port 5060. Even if the state expires, the remote end 'knows' that it has to communicate with you on port 5060, and you have the inbound NAT->Port Forward plus rules for this port. So it should work.
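A quick way to check that the setting took effect, as a sketch: with Static Port enabled, the outbound NAT rule listed by `pfctl -sn` should carry pf's `static-port` keyword in place of the `port 1024:65535` range seen in the earlier output. The helper name `has_static_port` is made up for illustration, and the exact layout of the generated rule (in particular, whether it names the single host or the whole LAN subnet after `from`) is an assumption.

```shell
# has_static_port SOURCE
# Hypothetical helper: reads `pfctl -sn` output on stdin and succeeds if a nat
# rule whose source ("from SOURCE") matches also contains the static-port keyword.
has_static_port() {
    grep '^nat ' | grep -F -- "from $1 " | grep -q 'static-port'
}

# Usage on the firewall (commented out so that this block only defines the helper):
# pfctl -sn | has_static_port 192.168.100.20 && echo "static-port is set"
```

If the rule was created for the whole LAN, pass `192.168.100.0/24` as the source.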

