Some filtered states can only be killed individually, not in bulk



  • I have a VOIP phone connected via my backup 4G WAN that I'm trying to force back onto the recovered main WAN connection. VOIP phones are stubborn about holding their connections it seems.

    I went to Diagnostics > States and set a filter for the VOIP phone's IP address, which showed 8 states: one in and one out for each of its 4 lines.

    I then hit the 'Kill States' button which should have killed all filtered states, but it left 4 of them there and the phone held its connection to the 4G WAN. Only when I went and clicked each of the remaining states individually by using their trashcan icon was I able to kill the states. After this the phone reconnected to the main WAN as per my firewall rules.

    I'm not sure if this is a bug or not - I would guess so since in my mind 'Kill States' should effectively be the same as clicking all the trash icons. But I'm no dev and have no expectation of this behaviour changing. I haven't tested recently but I believe I've seen similar behaviour when trying 'Reset States' for the entire firewall - i.e. that the phones hold their connections regardless.

    I'm just glad I now have a method to bump phones that is less drastic than rebooting the router, which was my former sledgehammer approach to this issue! Just reporting it in case it is useful to developers or other users.


  • LAYER 8 Netgate



  • Yes looks like a solid candidate. I'll test this after the next release.



  • I've tested this on 244_2 and it still fails to kill 4 of the states. I have to trash the remaining states individually.


  • Rebel Alliance Developer Netgate



  • Ok. I'll test on 245 when released and update.



  • This problem is still dogging me.

    I suspect that this:

    https://redmine.pfsense.org/issues/9270

    could be a dup of this:

    https://redmine.pfsense.org/issues/4674 ??

    I will test when 2.5 comes out.


  • Rebel Alliance Developer Netgate

    No, those are not related.



  • There are a number of posts on this issue. There are several scripts posted on this forum that kill the lingering connections. The problem with the scripts is that they will also kill an active phone conversation. Not sure how to resolve that.



  • @Ximulate said in Some filtered states can only be killed individually, not in bulk:

    There are several scripts posted on this forum that kill the lingering connections.

    Are you referring to scripts like the following:

    pfctl -k 192.168.1.41
    pfctl -k 0.0.0.0/0 -k 192.168.1.41
    

    Because I've tried that as a potential quick workaround for staff when facing this situation, but it doesn't work - the same 'sticky' states seem to be left intact as when using the dialog's 'Kill States' button on a filtered state.

    In fact, even a full state table reset (Diagnostics -> States -> Reset States -> Reset the firewall state table) failed to kill the 'sticky' states in at least some of my tests.

    If you know of another script that does work, please let me know. I can live with killing active phone conversations in many circumstances. As of now the only two workarounds that actually work are:

    1. Manually kill the states via the dialog as outlined earlier in this thread. This scares and confuses non-technical staff way too much, and must be done one IP at a time so it is a bit click intensive and time consuming.
    2. Reboot the router. This is the path we generally take despite the outage, as it is easy to understand and execute. It has the distinct advantage of always resolving the issue.
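
    One possible way to script workaround 1 is to turn the filtered state list into one kill command per source/destination pair, which is roughly what clicking each trash icon does by hand. The sketch below is a dry run only: it prints commands and runs nothing. The function name `state_kill_cmds` is made up for illustration, it assumes IPv4 lines shaped like `pfctl -ss` output (including the parenthesised address shown for NAT'd states), and it has not been tested against the 'sticky' states described in this thread.

```shell
#!/bin/sh
# Dry run: print one "pfctl -k SRC -k DST" command per state line read on stdin.
# Assumed input shape (as printed by `pfctl -ss`, IPv4 only):
#   IFACE PROTO ADDR:PORT [(ADDR:PORT)] ->|<- ADDR:PORT [(ADDR:PORT)] STATE
state_kill_cmds() {
    awk '
    {
        left = ""; right = ""; dir = ""
        for (i = 3; i <= NF; i++) {
            if ($i == "->" || $i == "<-") { dir = $i; continue }
            if ($i ~ /^\(/) continue      # skip the parenthesised NAT address
            if (dir == "" && left == "") { left = $i }
            else if (dir != "" && right == "") { right = $i }
        }
        if (dir == "") next               # not a state line
        sub(/:[0-9]+$/, "", left)         # strip the port numbers
        sub(/:[0-9]+$/, "", right)
        if (dir == "->") { src = left; dst = right }
        else             { src = right; dst = left }
        print "pfctl -k " src " -k " dst
    }' | sort -u
}

# Example (review the output first; nothing is killed by this line):
#   pfctl -ss | grep 192.168.1.41 | state_kill_cmds
```

    The printed list also shows which address pair each state actually uses, so a state whose source is not the phone's LAN address stands out. Piping the reviewed output to `sh` would run the kills, and like the other scripts mentioned here it would drop any active call.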


  • I'm by no means an expert at any of this, just on a quest to solve my own similar situation.

    The Reset States GUI command always worked for me, though it's a bit like using a sledgehammer for a thumbtack. I wonder if the device(s) on your network are reestablishing the connection so quickly that it just appears not to kill the state? The web GUI on the VoIP phone may have a setting that lets you adjust the phone's re-registration time. If you can access it, perhaps increase the time.

    The scripts I've used are run automatically by cron (install the crontab package to easily manage cron jobs). Still, not an ideal solution. You could also schedule a reboot via cron.

    You might look into whether you can disable and then reset the failover WAN interface via a command (not sure if you can).

    Look at your Firewall Optimization Options under System > Advanced > Firewall & NAT. If it's not already set to one of them, try Normal or Aggressive. This changes the state timeouts. Scroll to the bottom, and you can fine-tune those even further.

    In Diagnostics > Command Prompt, run "pfctl -st" to see your actual state timeouts. Changing the above from, say, Normal to Aggressive should reduce the timeouts.

    Hope this helps.

    Here are my state timeouts with Aggressive:
    tcp.first 30s
    tcp.opening 5s
    tcp.established 18000s
    tcp.closing 60s
    tcp.finwait 30s
    tcp.closed 30s
    tcp.tsdiff 10s
    udp.first 60s
    udp.single 30s
    udp.multiple 60s
    icmp.first 20s
    icmp.error 10s
    other.first 60s
    other.single 30s
    other.multiple 60s
    frag 30s
    interval 10s
    adaptive.start 241200 states
    adaptive.end 482400 states
    src.track 0s
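
    To compare only one protocol's timers before and after changing the optimization setting, the output can be filtered by name prefix. This is a small sketch, assuming the two-column "name value" layout listed above; the function name `timeouts_for` is made up for illustration.

```shell
#!/bin/sh
# Print "name seconds" for every timeout whose name starts with the given prefix.
# Reads `pfctl -st` style output on stdin; lines without an "<N>s" value
# (for example the adaptive.* state counts) are skipped.
timeouts_for() {
    awk -v p="$1" 'index($1, p) == 1 && $2 ~ /^[0-9]+s$/ {
        sub(/s$/, "", $2)     # drop the trailing "s" so the value is a plain number
        print $1, $2
    }'
}

# Example:
#   pfctl -st | timeouts_for udp.
```

    Running it once under Normal and once under Aggressive and comparing the two outputs shows which timers actually changed.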

