pfSense periodically drops or misroutes packets



  • running pfsense 2.4.4. system has been stable for years now, but just in the last week users have been complaining that connections would hang on the order of every 20-30 minutes.

    I've traced the issue to packets coming in from an openvpn interface are periodically being lost. I see the packets coming into the firewall on the ovpns3 interface, but all of a sudden they stop being routed out of the LAN interface. I have not checked the other interfaces so they may have gotten misrouted instead of simply dropped. pfctl -s state shows ESTABLISHED connections on both interfaces. I don't see any relevant messages in the filter log for these packets. No other messages that I can see in any of the logs.

    What else can I check to debug this?

    Sample trace on the LAN interface. The responses stop abruptly, the server starts sending retransmissions. Finally after about 80 seconds a packet from the vpn client makes it out - but at this point the application running over ssh (X2Go) has timed out and closed the connection.

    11:59:16.121157 IP spock.cora.nwra.com.ssh > vpn22.cora.nwra.com.56384: Flags [P.], seq 295586:295638, ack 151058, win 1102, options [nop,nop,TS val 167817606 ecr 55251687], length 52
    11:59:16.217220 IP vpn22.cora.nwra.com.56384 > spock.cora.nwra.com.ssh: Flags [P.], seq 151058:151110, ack 295638, win 1424, options [nop,nop,TS val 55256802 ecr 167817606], length 52
    11:59:16.217524 IP spock.cora.nwra.com.ssh > vpn22.cora.nwra.com.56384: Flags [.], ack 151110, win 1102, options [nop,nop,TS val 167817703 ecr 55256802], length 0
    11:59:21.223951 IP spock.cora.nwra.com.ssh > vpn22.cora.nwra.com.56384: Flags [P.], seq 295638:295690, ack 151110, win 1102, options [nop,nop,TS val 167822709 ecr 55256802], length 52
    11:59:21.518343 IP spock.cora.nwra.com.ssh > vpn22.cora.nwra.com.56384: Flags [P.], seq 295638:295690, ack 151110, win 1102, options [nop,nop,TS val 167823004 ecr 55256802], length 52
    11:59:21.813437 IP spock.cora.nwra.com.ssh > vpn22.cora.nwra.com.56384: Flags [P.], seq 295638:295690, ack 151110, win 1102, options [nop,nop,TS val 167823299 ecr 55256802], length 52
    11:59:22.402351 IP spock.cora.nwra.com.ssh > vpn22.cora.nwra.com.56384: Flags [P.], seq 295638:295690, ack 151110, win 1102, options [nop,nop,TS val 167823888 ecr 55256802], length 52
    11:59:23.582407 IP spock.cora.nwra.com.ssh > vpn22.cora.nwra.com.56384: Flags [P.], seq 295638:295690, ack 151110, win 1102, options [nop,nop,TS val 167825068 ecr 55256802], length 52
    11:59:25.938359 IP spock.cora.nwra.com.ssh > vpn22.cora.nwra.com.56384: Flags [P.], seq 295638:295690, ack 151110, win 1102, options [nop,nop,TS val 167827424 ecr 55256802], length 52
    11:59:30.658453 IP spock.cora.nwra.com.ssh > vpn22.cora.nwra.com.56384: Flags [P.], seq 295638:295690, ack 151110, win 1102, options [nop,nop,TS val 167832144 ecr 55256802], length 52
    11:59:40.082418 IP spock.cora.nwra.com.ssh > vpn22.cora.nwra.com.56384: Flags [P.], seq 295638:295690, ack 151110, win 1102, options [nop,nop,TS val 167841568 ecr 55256802], length 52
    11:59:58.930364 IP spock.cora.nwra.com.ssh > vpn22.cora.nwra.com.56384: Flags [P.], seq 295638:295690, ack 151110, win 1102, options [nop,nop,TS val 167860416 ecr 55256802], length 52
    12:00:36.626489 IP spock.cora.nwra.com.ssh > vpn22.cora.nwra.com.56384: Flags [P.], seq 295638:295690, ack 151110, win 1102, options [nop,nop,TS val 167898112 ecr 55256802], length 52
    12:00:36.631278 IP vpn22.cora.nwra.com.56384 > spock.cora.nwra.com.ssh: Flags [P.], seq 151110:151162, ack 295690, win 1424, options [nop,nop,TS val 55337216 ecr 167860416], length 52
    12:00:36.631662 IP spock.cora.nwra.com.ssh > vpn22.cora.nwra.com.56384: Flags [P.], seq 295690:296518, ack 151162, win 1102, options [nop,nop,TS val 167898117 ecr 55337216], length 828
    12:00:36.632521 IP spock.cora.nwra.com.ssh > vpn22.cora.nwra.com.56384: Flags [P.], seq 296518:296602, ack 151162, win 1102, options [nop,nop,TS val 167898118 ecr 55337216], length 84
    12:00:36.700730 IP vpn22.cora.nwra.com.56384 > spock.cora.nwra.com.ssh: Flags [.], ack 295690, win 1424, options [nop,nop,TS val 55337285 ecr 167898112,nop,nop,sack 1 {295638:295690}], length 0
    12:00:36.703210 IP vpn22.cora.nwra.com.56384 > spock.cora.nwra.com.ssh: Flags [R], seq 1390600783, win 0, length 0
    12:00:36.703281 IP vpn22.cora.nwra.com.56384 > spock.cora.nwra.com.ssh: Flags [R], seq 1390600783, win 0, length 0
    

  • Rebel Alliance Global Moderator

    @opoplawski said in pfSense periodically drops or misroutes packets:

    12:00:36.703210 IP vpn22.cora.nwra.com.56384 > spock.cora.nwra.com.ssh: Flags [R], seq 1390600783, win 0, length 0
    12:00:36.703281 IP vpn22.cora.nwra.com.56384 > spock.cora.nwra.com.ssh: Flags [R], seq 1390600783, win 0, length 0

    Sure looks like a RST got sent to me..



  • As I said, X2Go terminated the connection on the client after 30 seconds, so yes at that point the client sends RST. But for 80 seconds none of the client's message make it out.

    This is from an earlier trace run on the vpn interface:

    11:09:42.076670 IP spock.cora.nwra.com.ssh > vpn22.cora.nwra.com.53684: Flags [P.], seq 195034:195086, ack 100162, win 1019, options [nop,nop,TS val 164843562 ecr 52277646], length 52
    11:09:42.149987 IP vpn22.cora.nwra.com.53684 > spock.cora.nwra.com.ssh: Flags [P.], seq 100162:100214, ack 195086, win 1424, options [nop,nop,TS val 52282733 ecr 164843562], length 52
    11:09:42.150243 IP spock.cora.nwra.com.ssh > vpn22.cora.nwra.com.53684: Flags [.], ack 100214, win 1019, options [nop,nop,TS val 164843636 ecr 52282733], length 0
    11:09:47.162553 IP spock.cora.nwra.com.ssh > vpn22.cora.nwra.com.53684: Flags [P.], seq 195086:195138, ack 100214, win 1019, options [nop,nop,TS val 164848648 ecr 52282733], length 52
    11:09:47.310532 IP vpn22.cora.nwra.com.53684 > spock.cora.nwra.com.ssh: Flags [P.], seq 100214:100266, ack 195138, win 1424, options [nop,nop,TS val 52287894 ecr 164848648], length 52
    11:09:47.502898 IP spock.cora.nwra.com.ssh > vpn22.cora.nwra.com.53684: Flags [P.], seq 195086:195138, ack 100214, win 1019, options [nop,nop,TS val 164848989 ecr 52282733], length 52
    11:09:47.586072 IP vpn22.cora.nwra.com.53684 > spock.cora.nwra.com.ssh: Flags [P.], seq 100214:100266, ack 195138, win 1424, options [nop,nop,TS val 52288168 ecr 164848648], length 52
    11:09:47.589506 IP vpn22.cora.nwra.com.53684 > spock.cora.nwra.com.ssh: Flags [.], ack 195138, win 1424, options [nop,nop,TS val 52288173 ecr 164848989,nop,nop,sack 1 {195086:195138}], length 0
    11:09:47.843845 IP spock.cora.nwra.com.ssh > vpn22.cora.nwra.com.53684: Flags [P.], seq 195086:195138, ack 100214, win 1019, options [nop,nop,TS val 164849330 ecr 52282733], length 52
    11:09:48.121031 IP vpn22.cora.nwra.com.53684 > spock.cora.nwra.com.ssh: Flags [.], ack 195138, win 1424, options [nop,nop,TS val 52288704 ecr 164849330,nop,nop,sack 1 {195086:195138}], length 0
    

    It's not terribly obvious with the amount of packets listed here, but spock is retransmitting 195086:195138 while vpn22 is continually ACKing it. Those ACKs don't make it out of the LAN interface and back to spock.



  • Here is a synchronized capture. From the VPN interface:

    13:07:33.858446 IP mani.nwra.com.54134 > spock.cora.nwra.com.ssh: Flags [P.], seq 46049:46101, ack 91444, win 1424, options [nop,nop,TS val 108213657 ecr 258315305], length 52
    13:07:33.859514 IP spock.cora.nwra.com.ssh > mani.nwra.com.54134: Flags [.], ack 46101, win 798, options [nop,nop,TS val 258315348 ecr 108213657], length 0
    13:07:40.766124 IP spock.cora.nwra.com.ssh > mani.nwra.com.54134: Flags [P.], seq 91444:91496, ack 46101, win 798, options [nop,nop,TS val 258322254 ecr 108213657], length 52
    13:07:40.807750 IP mani.nwra.com.54134 > spock.cora.nwra.com.ssh: Flags [P.], seq 46101:46153, ack 91496, win 1424, options [nop,nop,TS val 108220606 ecr 258322254], length 52
    13:07:41.013343 IP spock.cora.nwra.com.ssh > mani.nwra.com.54134: Flags [P.], seq 91444:91496, ack 46101, win 798, options [nop,nop,TS val 258322502 ecr 108213657], length 52
    13:07:41.050140 IP mani.nwra.com.54134 > spock.cora.nwra.com.ssh: Flags [P.], seq 46101:46153, ack 91496, win 1424, options [nop,nop,TS val 108220849 ecr 258322254], length 52
    13:07:41.054111 IP mani.nwra.com.54134 > spock.cora.nwra.com.ssh: Flags [.], ack 91496, win 1424, options [nop,nop,TS val 108220852 ecr 258322502,nop,nop,sack 1 {91444:91496}], length 0
    13:07:41.261255 IP spock.cora.nwra.com.ssh > mani.nwra.com.54134: Flags [P.], seq 91444:91496, ack 46101, win 798, options [nop,nop,TS val 258322750 ecr 108213657], length 52
    13:07:41.301772 IP mani.nwra.com.54134 > spock.cora.nwra.com.ssh: Flags [.], ack 91496, win 1424, options [nop,nop,TS val 108221100 ecr 258322750,nop,nop,sack 1 {91444:91496}], length 0
    13:07:41.758467 IP spock.cora.nwra.com.ssh > mani.nwra.com.54134: Flags [P.], seq 91444:91496, ack 46101, win 798, options [nop,nop,TS val 258323247 ecr 108213657], length 52
    13:07:41.779369 IP mani.nwra.com.54134 > spock.cora.nwra.com.ssh: Flags [P.], seq 46101:46153, ack 91496, win 1424, options [nop,nop,TS val 108221578 ecr 258322750], length 52
    13:07:41.799274 IP mani.nwra.com.54134 > spock.cora.nwra.com.ssh: Flags [.], ack 91496, win 1424, options [nop,nop,TS val 108221598 ecr 258323247,nop,nop,sack 1 {91444:91496}], length 0
    13:07:42.266314 IP mani.nwra.com.54134 > spock.cora.nwra.com.ssh: Flags [P.], seq 46101:46153, ack 91496, win 1424, options [nop,nop,TS val 108222065 ecr 258323247], length 52
    13:07:42.751199 IP spock.cora.nwra.com.ssh > mani.nwra.com.54134: Flags [P.], seq 91444:91496, ack 46101, win 798, options [nop,nop,TS val 258324240 ecr 108213657], length 52
    13:07:42.792072 IP mani.nwra.com.54134 > spock.cora.nwra.com.ssh: Flags [.], ack 91496, win 1424, options [nop,nop,TS val 108222590 ecr 258324240,nop,nop,sack 1 {91444:91496}], length 0
    13:07:43.241245 IP mani.nwra.com.54134 > spock.cora.nwra.com.ssh: Flags [P.], seq 46101:46153, ack 91496, win 1424, options [nop,nop,TS val 108223040 ecr 258324240], length 52
    13:07:44.739248 IP spock.cora.nwra.com.ssh > mani.nwra.com.54134: Flags [P.], seq 91444:91496, ack 46101, win 798, options [nop,nop,TS val 258326228 ecr 108213657], length 52
    13:07:44.779857 IP mani.nwra.com.54134 > spock.cora.nwra.com.ssh: Flags [.], ack 91496, win 1424, options [nop,nop,TS val 108224578 ecr 258326228,nop,nop,sack 1 {91444:91496}], length 0
    13:07:45.189314 IP mani.nwra.com.54134 > spock.cora.nwra.com.ssh: Flags [P.], seq 46101:46153, ack 91496, win 1424, options [nop,nop,TS val 108224988 ecr 258326228], length 52
    13:07:48.719258 IP spock.cora.nwra.com.ssh > mani.nwra.com.54134: Flags [P.], seq 91444:91496, ack 46101, win 798, options [nop,nop,TS val 258330208 ecr 108213657], length 52
    13:07:48.759900 IP mani.nwra.com.54134 > spock.cora.nwra.com.ssh: Flags [.], ack 91496, win 1424, options [nop,nop,TS val 108228558 ecr 258330208,nop,nop,sack 1 {91444:91496}], length 0
    13:07:49.081377 IP mani.nwra.com.54134 > spock.cora.nwra.com.ssh: Flags [P.], seq 46101:46153, ack 91496, win 1424, options [nop,nop,TS val 108228880 ecr 258330208], length 52
    13:07:56.671203 IP spock.cora.nwra.com.ssh > mani.nwra.com.54134: Flags [P.], seq 91444:91496, ack 46101, win 798, options [nop,nop,TS val 258338160 ecr 108213657], length 52
    13:07:56.711914 IP mani.nwra.com.54134 > spock.cora.nwra.com.ssh: Flags [.], ack 91496, win 1424, options [nop,nop,TS val 108236510 ecr 258338160,nop,nop,sack 1 {91444:91496}], length 0
    13:07:56.873296 IP mani.nwra.com.54134 > spock.cora.nwra.com.ssh: Flags [P.], seq 46101:46153, ack 91496, win 1424, options [nop,nop,TS val 108236672 ecr 258338160], length 52
    13:08:12.457246 IP mani.nwra.com.54134 > spock.cora.nwra.com.ssh: Flags [P.], seq 46101:46153, ack 91496, win 1424, options [nop,nop,TS val 108252256 ecr 258338160], length 52
    13:08:12.559240 IP spock.cora.nwra.com.ssh > mani.nwra.com.54134: Flags [P.], seq 91444:91496, ack 46101, win 798, options [nop,nop,TS val 258354048 ecr 108213657], length 52
    13:08:12.599890 IP mani.nwra.com.54134 > spock.cora.nwra.com.ssh: Flags [P.], seq 46153:46309, ack 91496, win 1424, options [nop,nop,TS val 108252398 ecr 258354048,nop,nop,sack 1 {91444:91496}], length 156
    13:08:15.840437 IP mani.nwra.com.54134 > spock.cora.nwra.com.ssh: Flags [P.], seq 46309:46361, ack 91496, win 1424, options [nop,nop,TS val 108255639 ecr 258354048], length 52
    13:08:15.840964 IP mani.nwra.com.54134 > spock.cora.nwra.com.ssh: Flags [P.], seq 46361:46429, ack 91496, win 1424, options [nop,nop,TS val 108255639 ecr 258354048], length 68
    13:08:36.171369 IP mani.nwra.com.54134 > spock.cora.nwra.com.ssh: Flags [FP.], seq 46429:46625, ack 91496, win 1424, options [nop,nop,TS val 108275970 ecr 258354048], length 196
    13:08:43.593439 IP mani.nwra.com.54134 > spock.cora.nwra.com.ssh: Flags [FP.], seq 46101:46625, ack 91496, win 1424, options [nop,nop,TS val 108283392 ecr 258354048], length 524
    13:08:43.593754 IP spock.cora.nwra.com.ssh > mani.nwra.com.54134: Flags [P.], seq 91496:92068, ack 46626, win 819, options [nop,nop,TS val 258385082 ecr 108283392], length 572
    13:08:43.598215 IP spock.cora.nwra.com.ssh > mani.nwra.com.54134: Flags [F.], seq 92068, ack 46626, win 819, options [nop,nop,TS val 258385087 ecr 108283392], length 0
    13:08:43.634439 IP mani.nwra.com.54134 > spock.cora.nwra.com.ssh: Flags [R], seq 2624570332, win 0, length 0
    13:08:43.638893 IP mani.nwra.com.54134 > spock.cora.nwra.com.ssh: Flags [R], seq 2624570332, win 0, length 0
    

    From the internal interface:

    13:07:33.858465 IP mani.nwra.com.54134 > spock.cora.nwra.com.ssh: Flags [P.], seq 46049:46101, ack 91444, win 1424, options [nop,nop,TS val 108213657 ecr 258315305], length 52
    13:07:33.859502 IP spock.cora.nwra.com.ssh > mani.nwra.com.54134: Flags [.], ack 46101, win 798, options [nop,nop,TS val 258315348 ecr 108213657], length 0
    13:07:40.766107 IP spock.cora.nwra.com.ssh > mani.nwra.com.54134: Flags [P.], seq 91444:91496, ack 46101, win 798, options [nop,nop,TS val 258322254 ecr 108213657], length 52
    13:07:41.013315 IP spock.cora.nwra.com.ssh > mani.nwra.com.54134: Flags [P.], seq 91444:91496, ack 46101, win 798, options [nop,nop,TS val 258322502 ecr 108213657], length 52
    13:07:41.261241 IP spock.cora.nwra.com.ssh > mani.nwra.com.54134: Flags [P.], seq 91444:91496, ack 46101, win 798, options [nop,nop,TS val 258322750 ecr 108213657], length 52
    13:07:41.758455 IP spock.cora.nwra.com.ssh > mani.nwra.com.54134: Flags [P.], seq 91444:91496, ack 46101, win 798, options [nop,nop,TS val 258323247 ecr 108213657], length 52
    13:07:42.751177 IP spock.cora.nwra.com.ssh > mani.nwra.com.54134: Flags [P.], seq 91444:91496, ack 46101, win 798, options [nop,nop,TS val 258324240 ecr 108213657], length 52
    13:07:44.739233 IP spock.cora.nwra.com.ssh > mani.nwra.com.54134: Flags [P.], seq 91444:91496, ack 46101, win 798, options [nop,nop,TS val 258326228 ecr 108213657], length 52
    13:07:48.719243 IP spock.cora.nwra.com.ssh > mani.nwra.com.54134: Flags [P.], seq 91444:91496, ack 46101, win 798, options [nop,nop,TS val 258330208 ecr 108213657], length 52
    13:07:56.671188 IP spock.cora.nwra.com.ssh > mani.nwra.com.54134: Flags [P.], seq 91444:91496, ack 46101, win 798, options [nop,nop,TS val 258338160 ecr 108213657], length 52
    13:08:12.559225 IP spock.cora.nwra.com.ssh > mani.nwra.com.54134: Flags [P.], seq 91444:91496, ack 46101, win 798, options [nop,nop,TS val 258354048 ecr 108213657], length 52
    13:08:43.593457 IP mani.nwra.com.54134 > spock.cora.nwra.com.ssh: Flags [FP.], seq 46101:46625, ack 91496, win 1424, options [nop,nop,TS val 108283392 ecr 258354048], length 524
    13:08:43.593740 IP spock.cora.nwra.com.ssh > mani.nwra.com.54134: Flags [P.], seq 91496:92068, ack 46626, win 819, options [nop,nop,TS val 258385082 ecr 108283392], length 572
    13:08:43.598203 IP spock.cora.nwra.com.ssh > mani.nwra.com.54134: Flags [F.], seq 92068, ack 46626, win 819, options [nop,nop,TS val 258385087 ecr 108283392], length 0
    13:08:43.634456 IP mani.nwra.com.54134 > spock.cora.nwra.com.ssh: Flags [R], seq 2624570332, win 0, length 0
    13:08:43.638908 IP mani.nwra.com.54134 > spock.cora.nwra.com.ssh: Flags [R], seq 2624570332, win 0, length 0
    

    Why did the packets from mani stop going out the internal interface? I ran captures on the other interfaces and didn't see them (not expected, but grasping at straws here). Is it significant that next packet to go out the internal interface from mani is a FIN packet? This leads me to believe that something is going wrong in the state table. Though I'm not sure why it's even necessary to keep any state for this at all.



  • I'm also seeing this on long lived tcp connections such as VNC to the internet that do not involve OpenVPN at all.


  • Rebel Alliance Global Moderator

    If your seeing FIN or RST, then those are connections being closed.. Which has nothing to do with the firewall... But yes when those are seen the firewall will close the state..



  • @johnpoz Ignore the final closing - that's done because the client gives up and closes the connection. Look above and see that there are packets that come in from one interface that are not forwarded out the other. Why aren't those packets forwarded? That's why the client keeps sending retransmissions - because they aren't seeing the packets coming from the server.



  • Is there anything that I can enable that would log why the packets are not being forwarded?



  • Here is something odd about the firewall states - this is an IRC connection that keeps timing out:

    igb0 tcp 149.56.134.238:6697 <- 10.10.41.27:36924       ESTABLISHED:ESTABLISHED
       [1386746598 + 182144] wscale 7  [2690438126 + 34432] wscale 7
       age 01:37:33, expires in 23:59:39, 881:669 pkts, 57802:169735 bytes, rule 263
       id: 000000005cc4f261 creatorid: e36a6de1
    igb1 tcp 208.187.183.84:40578 (10.10.41.27:36924) -> 149.56.134.238:6697       ESTABLISHED:ESTABLISHED
       [2690438126 + 34432] wscale 7  [1386746598 + 182144] wscale 7
       age 01:37:33, expires in 23:59:39, 881:669 pkts, 57802:169735 bytes, rule 48
       id: 000000005cc4f262 creatorid: e36a6de1
    

    Looking at the rules:

    @263(1453487662) pass in quick on igb0 inet proto tcp from <dns_int:2> port = domain to <nwra_boulder_pub:1> port = domain flags S/SA keep state label "USER_RULE: Internal DNS Server replies"
    @48(1000005911) pass out route-to (igb1 208.187.183.81) inet from 208.187.183.84 to ! 208.187.183.80/28 flags S/SA keep state allow-opts label "let out anything from firewall host itself"
    

    How is rule 263 associated with the igb0 state? Rule 48 for the outbound connection seems reasonable. I see a number of igb0 states that have odd rule #s associated with them. Perhaps this is related to issues with those states stopping to work periodically (and so the packets no longer make them out the LAN interface).



  • Okay, I think this is because I've been modifying the firewall so rule numbers are changing, but the states don't get modified to reflect that.



  • In an attempt to rule out issues with the pf state code, I created a set of stateless rules for SSH traffic in and out of the firewall via the LAN and OpenVPN interfaces. I verified that establishing an ssh connection did not create an entry in the state table. However, at some point the firewall still stops forwarding packets out the LAN interface.

    Packets on VPN interface:

    16:47:44.612931 IP spock.cora.nwra.com.ssh > vpn6.cora.nwra.com.34640: Flags [.], ack 33826, win 581, options [nop,nop,TS val 530726102 ecr 100216044], length 0
    16:47:49.619397 IP spock.cora.nwra.com.ssh > vpn6.cora.nwra.com.34640: Flags [P.], seq 38630:38682, ack 33826, win 581, options [nop,nop,TS val 530731109 ecr 100216044], length 52
    16:47:49.732568 IP vpn6.cora.nwra.com.34640 > spock.cora.nwra.com.ssh: Flags [P.], seq 33826:33878, ack 38682, win 982, options [nop,nop,TS val 100221164 ecr 530731109], length 52
    16:47:49.935345 IP vpn6.cora.nwra.com.34640 > spock.cora.nwra.com.ssh: Flags [P.], seq 33826:33878, ack 38682, win 982, options [nop,nop,TS val 100221367 ecr 530731109], length 52
    16:47:49.988355 IP spock.cora.nwra.com.ssh > vpn6.cora.nwra.com.34640: Flags [P.], seq 38630:38682, ack 33826, win 581, options [nop,nop,TS val 530731478 ecr 100216044], length 52
    16:47:49.989170 IP vpn6.cora.nwra.com.34640 > spock.cora.nwra.com.ssh: Flags [.], ack 38682, win 982, options [nop,nop,TS val 100221421 ecr 530731478,nop,nop,sack 1 {38630:38682}], length 0
    16:47:50.341554 IP vpn6.cora.nwra.com.34640 > spock.cora.nwra.com.ssh: Flags [P.], seq 33826:33878, ack 38682, win 982, options [nop,nop,TS val 100221773 ecr 530731478], length 52
    

    Packets on LAN interface:

    16:47:44.612683 IP vpn6.cora.nwra.com.34640 > spock.cora.nwra.com.ssh: Flags [P.], seq 33774:33826, ack 38630, win 982, options [nop,nop,TS val 100216044 ecr 530726091], length 52
    16:47:44.612902 IP spock.cora.nwra.com.ssh > vpn6.cora.nwra.com.34640: Flags [.], ack 33826, win 581, options [nop,nop,TS val 530726102 ecr 100216044], length 0
    16:47:49.619367 IP spock.cora.nwra.com.ssh > vpn6.cora.nwra.com.34640: Flags [P.], seq 38630:38682, ack 33826, win 581, options [nop,nop,TS val 530731109 ecr 100216044], length 52
    16:47:49.988324 IP spock.cora.nwra.com.ssh > vpn6.cora.nwra.com.34640: Flags [P.], seq 38630:38682, ack 33826, win 581, options [nop,nop,TS val 530731478 ecr 100216044], length 52
    16:47:50.490315 IP spock.cora.nwra.com.ssh > vpn6.cora.nwra.com.34640: Flags [P.], seq 38630:38682, ack 33826, win 581, options [nop,nop,TS val 530731980 ecr 100216044], length 52
    

    The TCP sequence packet 33826:33878 from vpn6 never makes it out the LAN interface.



  • I've also tried disabling PF scrubbing to no effect.



  • "pfctl -x loud" did not produce any useful output either during the time when this happens.



  • Perhaps spoke too soon about the loud setting - it seems that got reset at some point. I'm now in a state where replies to an outbound TCP connection (LAN -> WAN) to port 6697 are consistently not being passed back through the LAN interface. See the following kernel message, but they don't mean anything to me:

     Dec 12 15:40:35 inferno.cora.nwra.com kernel: pf: BAD state: TCP out wire: 130.185.232.126:6697 208.187.183.84:9693 stack: 130.185.232.126
    Dec 12 15:40:35 inferno.cora.nwra.com kernel: pf: State failure on: 1       | 5
    Dec 12 15:40:43 inferno.cora.nwra.com kernel: pf: BAD state: TCP out wire: 130.185.232.126:6697 208.187.183.84:9693 stack: 130.185.232.126:6697 10.10.41.27:39994 [lo=452368117 high=452397173 win=29200 modulator=0 wscale=7] [lo=2051634067 high=2051663379 win=227 modulator=0 wscale=7] 4:2 SA seq=2067272792 (2067272792) ack=452368117 len=0 ackskew=0 pkts=7:1 dir=in,rev
    Dec 12 15:40:43 inferno.cora.nwra.com kernel: pf: State failure on: 1       | 5
    Dec 12 15:40:52 inferno.cora.nwra.com kernel: pf: OK ICMP 3:1 208.187.183.84 -> 52.21.76.50 state: TCP out wire: 52.21.76.50:443 208.187.183.84:26369 stack: 52.21.76.50:443 10.10.41.27
    Dec 12 15:40:53 inferno.cora.nwra.com kernel: pf: BAD state: TCP out wire: 52.21.76.50:443 208.187.183.84:26369 stack: 52.21.76.50:443 10.10.41.27:52038 [lo=2657418702 high=2657445582 win=29200 modulator=0 wscale=7] [lo=80869102 high=80898414 win=105 modulator=0
    Dec 12 15:40:54 inferno.cora.nwra.com kernel: pf: State failure on:   2     |   6
    Dec 12 15:40:59 inferno.cora.nwra.com kernel: pf: BAD state: TCP out wire: 130.185.232.126:6697 208.187.183.84:9693 stack: 130.185.232.126:6697 10.10.41.27:39994 [lo=452368117 high=452397173 win=29200 modulator=0 wscale=7] [lo=2051634067 high=2051663379 win=227 modulator=0 wscale=7] 4:2 SA seq=2067272792 (2067272792) ack=452368117 len=0 ackskew=0 pkts=7:1 dir=in,rev
    Dec 12 15:40:59 inferno.cora.nwra.com kernel: pf: State failure on: 1       | 5
    Dec 12 15:41:02 inferno.cora.nwra.com kernel: pf_map_addr: selected address 10.10.41.2
    Dec 12 15:41:12 inferno.cora.nwra.com kernel: pf: dropping packet with ip options
    Dec 12 15:41:15 inferno.cora.nwra.com kernel: pf: OK ICMP 3:1 208.187.183.84 -> 130.185.232.126 state: TCP out wire: 130.185.232.126:6697 208.187.183.84:4543 stack: 130.185.232.126:6697 10.10.41.27:39994 [lo=452368116 high=452397173 win=229 modulator=0 wscale=7] [lo=4197729835 high=4197759147 win=227 modulator=0 wscale=7] 4:2 seq=4197729834
    
    # pfctl -s state | grep 6697
    #
    

    which is odd - I would have expected to see a NAT state. I've made the LAN rule stateless in an attempt to debug this.



  • State table was a timing issue - with no packets making into the LAN interface the connection doesn't get established and so the state times out. But it does appear briefly:

    igb1 tcp 208.187.183.84:30452 (10.10.41.27:56532) -> 38.229.70.22:6697       ESTABLISHED:SYN_SENT
    


  • And another weirdness - enabled logging on the LAN stateless rule for 6697:

    pass out log quick on igb0 inet proto tcp from any port = ircs-u to <vulcan7> no state label "USER_RULE: Test stateless rule"
    

    and it supposedly matches some packets (though not as many that come in):

    Dec 12 16:23:03 inferno.cora.nwra.com filterlog: 62,,,1544481968,igb0,match,pass,out,4,0x68,,54,0,0,DF,6,tcp,60,38.229.70.22,10.10.41.27,6697,59600,0,SA,497406726,3635760457,28960,,mss;sackOK;TS;nop;wscale
    

    But it does not appear on a tcpdump of igb0:

    16:22:31.114219 IP vulcan7.cora.nwra.com.59600 > card.freenode.net.ircs-u: Flags [S], seq 3635760456, win 29200, options [mss 1460,sackOK,TS val 95690528 ecr 0,nop,wscale 7], length 0
    16:23:03.210311 IP vulcan7.cora.nwra.com.59600 > card.freenode.net.ircs-u: Flags [S], seq 3635760456, win 29200, options [mss 1460,sackOK,TS val 95722624 ecr 0,nop,wscale 7], length 0
    16:25:31.589412 IP vulcan7.cora.nwra.com.38846 > wolfe.freenode.net.ircs-u: Flags [S], seq 1451269209, win 29200, options [mss 1460,sackOK,TS val 95871003 ecr 0,nop,wscale 7], length 0
    


  • Whatever is going on, it appears to be happening to multiple connections simultaneously. I'm getting reports from multiple users of connection hangs at or about the same time.

    Also, I cannot reproduce this issue on our 2.4.3 firewalls - which I'm not upgrading until I can sort this out.

    : netstat -s -s
    tcp:
            124959 packets sent
                    55620 data packets (8254657 bytes)
                    19 data packets (8604 bytes) retransmitted
                    65556 ack-only packets (0 delayed)
                    3766 control packets
            115960 packets received
                    56267 acks (for 8253673 bytes)
                    3 duplicate acks
                    63381 packets (20226905 bytes) received in-sequence
                    3 completely duplicate packets (0 bytes)
                    1 window update packet
                    33 packets received after close
            1761 connection requests
            300 connection accepts
            2 ignored RSTs in the windows
            2060 connections established (including accepts)
                    1927 times used RTT from hostcache
                    1927 times used RTT variance from hostcache
            2062 connections closed (including 225 drops)
                    1036 connections updated cached RTT on close
                    1036 connections updated cached RTT variance on close
                    2 connections updated cached ssthresh on close
            56202 segments updated rtt (of 55024 attempts)
            35 retransmit timeouts
                    1 connection dropped by rexmit timeout
            18 keepalive timeouts
                    16 keepalive probes sent
                    2 connections dropped by keepalive
            43475 correct ACK header predictions
            57803 correct data packet header predictions
            391 syncache entries added
                    34 retransmitted
                    1 dupsyn
                    300 completed
                    80 reset
                    10 stale
                    1 unreach
            391 cookies sent
            29 hostcache entries added
            13 SACK options (SACK blocks) received
            4 packets with ECN ECT(0) bit set
            5 successful ECN handshakes
    TCP connection count by state:
            2 connections in CLOSED state
            9 connections in LISTEN state
            9 connections in ESTABLISHED state
            1 connection  in TIME_WAIT state
    udp:
            16122957 datagrams received
            14469 with no checksum
            8915 dropped due to no socket
            147302 broadcast/multicast datagrams undelivered
            592 dropped due to full socket buffers
            15966148 delivered
            17339474 datagrams output
    sctp:
            Packet drop statistics:
            Timeouts:
    ip:
            275198040 total packets received
            16589745 packets for this host
            125160728 packets forwarded (46509457 packets fast forwarded)
            23425 packets not forwardable
            17837972 packets sent from this host
            5623 output packets dropped due to no bufs, etc.
            13 output packets discarded due to no route
            1134 output datagrams fragmented
            2845 fragments created
    icmp:
            22945 calls to icmp_error
            Output histogram:
                    echo reply: 175789
                    destination unreachable: 22649
                    time exceeded: 296
            Input histogram:
                    echo reply: 174637
                    destination unreachable: 397
                    echo: 175789
            175789 message responses generated
            ICMP address mask responses are disabled
    igmp:
    ipsec:
    ah:
    esp:
    ipcomp:
    pim:
    carp:
    pfsync:
    arp:
            7591 ARP requests sent
            24083 ARP replies sent
            407985 ARP requests received
            4219 ARP replies received
            412204 ARP packets received
            28325 total packets dropped due to no ARP entry
            1220 ARP entrys timed out
    ip6:
            12717 total packets received
            29 packets sent from this host
            15 output packets discarded due to no route
            Input histogram:
                    hop by hop: 1344
                    UDP: 10618
                    ICMP6: 755
            Mbuf statistics:
                    12 one mbuf
                    12705 one ext mbuf
                    0 two or more ext mbuf
            2 failures of source address selection
            source addresses on a non-outgoing I/F
                    2 addresses scope=0xf
            Source addresses selection rule applied:
                    2 same address
    icmp6:
            Output histogram:
                    neighbor solicitation: 7
                    MLDv2 listener report: 20
            Histogram of error messages to be generated:
    ipsec6:
    rip6:
    
    Status: Enabled for 1 days 00:32:48           Debug: Urgent
    
    Interface Stats for igb0              IPv4             IPv6
      Bytes In                     14744688471          1294985
      Bytes Out                   111339870599                0
      Packets In
        Passed                        59113235                0
        Blocked                           5687             7975
      Packets Out
        Passed                        94128979                0
        Blocked                             54                0
    
    State Table                          Total             Rate
      current entries                     7437
      searches                       372470510         4215.0/s
      inserts                          7112154           80.5/s
      removals                         7104713           80.4/s
    Counters
      match                            7440531           84.2/s
      bad-offset                             0            0.0/s
      fragment                              12            0.0/s
      short                                  0            0.0/s
      normalize                              1            0.0/s
      memory                                 0            0.0/s
      bad-timestamp                          0            0.0/s
      congestion                             0            0.0/s
      ip-option                           1287            0.0/s
      proto-cksum                            0            0.0/s
      state-mismatch                       637            0.0/s
      state-insert                           0            0.0/s
      state-limit                            0            0.0/s
      src-limit                              0            0.0/s
      synproxy                               0            0.0/s
      map-failed                             0            0.0/s
    

    Not sure what other statistics to look at.



  • igb0 interface info

    In/out packets
        59242918/94295700 (13.75 GiB/103.78 GiB)
    In/out packets (pass)
        59242918/94295700 (13.75 GiB/103.78 GiB)
    In/out packets (block)
        13713/54 (1.84 MiB/0 B)
    In/out errors
        0/0
    Collisions
        0
    


  • Okay, the ibg0 interfaces stats may be relevant. Checked again this morning and the number of out packets blocked has increased:

    In/out packets (block)
        20948/146 (2.87 MiB/2 KiB)
    

    And it is slowly increasing. Now the thing is - I don't see any ibg0 outbound blocked packets logged in the filter log. All rules on igb0 that would block should be logged - and the counts for those rules in the rule status are 0. Now to try to correlate that block number with when I see the dropped packets.