Firewall blocking reply packets intermittently



  • Hi All,

    I have 8 pfSense firewalls running on Alix boxes.  They are all running pfSense 1.2.3-RELEASE, except for my VA office, which is running 2.0.2-RELEASE.  I recently created IPsec VPNs between them all to replace a vendor-supplied VPN.  The VPNs have been working great.  I mention them because a new problem appeared after I set them up.

    The problem involves an Apache web server in my main office (NYC).  It doesn't handle a tremendous amount of traffic, but it runs a website that my company uses mostly internally.

    The external IP address is 64.190.10.8 (www.instihire.com).  I have that mapped internally to 192.168.44.12 via a standard virtual IP and 1:1 NAT.  This has worked this way very well for a very long time.

    Since I switched over to my new VPN configuration, my outer offices use the external IP address to contact this web site.  Any traffic that needs to be encrypted is encrypted with SSL, so my thought was that this would be less load on the actual firewalls.

    Several times today, people in outer offices have been unable to connect to the web server.  They will be working along just fine, and it will stop working for a few minutes, then it will resume working again.  The baffling part is that a person sitting next to them will sometimes continue to work just fine.  Sometimes two or three people in the same office will go out at the same time.  A few minutes later it's like nothing happened.

    I looked in the NYC firewall logs and found these entries from the last time people in my VA office were blocked (this is just a sample; there were many more):

    
    Nov 20 16:01:55 192.168.42.2 pf: 200965 rule 236/0(match): block in on vr1: (tos 0x0, ttl 63, id 54825, offset 0, flags [DF], proto TCP (6), length 52) 192.168.44.12.80 > 64.190.58.18.50974: S, cksum 0xdb35 (correct), 626588526:626588526(0) ack 3802417820 win 5840 <mss 1460,nop,nop,sackOK,nop,wscale 7>
    Nov 20 16:01:57 192.168.42.2 pf: 010364 rule 236/0(match): block in on vr1: (tos 0x0, ttl 63, id 37314, offset 0, flags [DF], proto TCP (6), length 52) 192.168.44.12.80 > 64.190.58.18.22039: S, cksum 0xe25e (correct), 676360166:676360166(0) ack 837086667 win 5840 <mss 1460,nop,nop,sackOK,nop,wscale 7>
    Nov 20 16:01:58 192.168.42.2 pf: 402187 rule 236/0(match): block in on vr1: (tos 0x0, ttl 63, id 34966, offset 0, flags [DF], proto TCP (6), length 52) 192.168.44.12.80 > 64.190.58.18.11373: S, cksum 0x94b4 (correct), 668526090:668526090(0) ack 4040175751 win 5840 <mss 1460,nop,nop,sackOK,nop,wscale 7>
    Nov 20 16:01:58 192.168.42.2 pf: 027810 rule 236/0(match): block in on vr1: (tos 0x0, ttl 63, id 6614, offset 0, flags [DF], proto TCP (6), length 52) 192.168.44.12.443 > 64.190.58.18.14833: S, cksum 0x4f20 (correct), 668007838:668007838(0) ack 2668274022 win 5840
    

    64.190.58.18 is the external IP address of the (pfSense) firewall in my VA office.
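
For anyone triaging similar entries, here is a minimal Python sketch that tallies blocked packets by interface, source, and destination. The regular expression is an assumption based only on the lines quoted above; other pfSense versions may format filter logs differently. The names `parse_line` and `summarize_blocks` are made up for illustration, and the sample lines have their TCP options trimmed for brevity. In this older tcpdump-style output, an `S` flag together with an `ack` field indicates a SYN-ACK, that is, a reply from the server.

```python
import re
from collections import Counter

# Assumed layout of the pf log lines quoted in this thread (not a general parser).
LINE_RE = re.compile(
    r"rule (?P<rule>\d+)/\d+\(match\): (?P<action>\w+) (?P<dir>in|out) on (?P<iface>\w+):"
    r".*?proto (?P<proto>\w+) \(\d+\), length \d+\) "
    r"(?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\d+) > "
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<dport>\d+): (?P<flags>[A-Z.]+),"
    r"(?P<rest>.*)"
)


def parse_line(line):
    """Return a dict describing one log line, or None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    rec = m.groupdict()
    rec["sport"] = int(rec["sport"])
    rec["dport"] = int(rec["dport"])
    # SYN flag plus an ack number means SYN-ACK (a server reply).
    rec["is_synack"] = rec["flags"] == "S" and " ack " in rec["rest"]
    return rec


def summarize_blocks(lines):
    """Count blocked packets per (interface, source IP, source port, destination IP)."""
    counts = Counter()
    for line in lines:
        rec = parse_line(line)
        if rec and rec["action"] == "block":
            counts[(rec["iface"], rec["src"], rec["sport"], rec["dst"])] += 1
    return counts


# Sample lines from the post above, with TCP options trimmed.
SAMPLE = [
    "Nov 20 16:01:55 192.168.42.2 pf: 200965 rule 236/0(match): block in on vr1: (tos 0x0, ttl 63, id 54825, offset 0, flags [DF], proto TCP (6), length 52) 192.168.44.12.80 > 64.190.58.18.50974: S, cksum 0xdb35 (correct), 626588526:626588526(0) ack 3802417820 win 5840",
    "Nov 20 16:01:57 192.168.42.2 pf: 010364 rule 236/0(match): block in on vr1: (tos 0x0, ttl 63, id 37314, offset 0, flags [DF], proto TCP (6), length 52) 192.168.44.12.80 > 64.190.58.18.22039: S, cksum 0xe25e (correct), 676360166:676360166(0) ack 837086667 win 5840",
    "Nov 20 16:01:58 192.168.42.2 pf: 402187 rule 236/0(match): block in on vr1: (tos 0x0, ttl 63, id 34966, offset 0, flags [DF], proto TCP (6), length 52) 192.168.44.12.80 > 64.190.58.18.11373: S, cksum 0x94b4 (correct), 668526090:668526090(0) ack 4040175751 win 5840",
    "Nov 20 16:01:58 192.168.42.2 pf: 027810 rule 236/0(match): block in on vr1: (tos 0x0, ttl 63, id 6614, offset 0, flags [DF], proto TCP (6), length 52) 192.168.44.12.443 > 64.190.58.18.14833: S, cksum 0x4f20 (correct), 668007838:668007838(0) ack 2668274022 win 5840",
]

if __name__ == "__main__":
    for key, count in summarize_blocks(SAMPLE).most_common():
        print(count, key)
```

Run against the full log, a tally like this makes it easier to see whether the blocks are confined to one remote office and one rule, or spread across several.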

    I think I'm hitting a connection or rate limit, but I don't recall setting any when I created my firewall rules.

    One thing to note: I raised the size of the state table in my NYC firewall from 10k to 25k, because I found on Monday that I was hitting that ceiling.  That was causing all sorts of confusing behavior, including VPN connection dropouts.  Since I raised that limit, I've been graphing the used state count on each firewall with Cacti.  The NYC max has been 18k today, and the next closest one I've seen is my NJ office at 8k, so I think I have that problem solved.
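
As a quick sanity check on those numbers, the sketch below computes state table utilization from a peak count and a limit. The 80% warning threshold is an arbitrary assumption chosen for illustration, not a pfSense default, and the function names are hypothetical.

```python
def state_table_usage(peak_states, limit):
    """Return peak state count as a fraction of the configured limit."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return peak_states / limit


def near_limit(peak_states, limit, threshold=0.8):
    """True if peak usage reaches the (assumed) warning threshold."""
    return state_table_usage(peak_states, limit) >= threshold


if __name__ == "__main__":
    # NYC figures from the post: 18k peak against the new 25k limit.
    print(f"NYC usage at 25k limit: {state_table_usage(18_000, 25_000):.0%}")
    # The same peak would have exceeded the old 10k limit.
    print("Over old 10k limit:", near_limit(18_000, 10_000))
```

With the figures above, NYC peaks at 72% of the new limit, which leaves some headroom but is worth watching if traffic grows.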

    I'm not sure what to look at in my NYC firewall to diagnose/fix this problem.

    Any help or suggestions would be greatly appreciated.

    Thanks
    Tony Nelson



  • As a test, I'm going to create a VPN between the VA office and the network in NYC with this Apache server to see if that makes any difference.

    Does anyone have any other suggestions on what I might look at?  I really need to get this fixed.

    Thank you again,
    Tony



  • Creating the VPN between the VA office and the NYC network fixed the problem.  I still don't know what caused it in the first place.

