Traffic just stops
-
At the moment I would try swapping the NICs, since you get an error in the logs concerning the NIC and this is the only thing that is common between these systems. I have not heard from anybody else with this problem yet.
-
I will have the techs at my office ensure that I take a Cisco 350 card with old enough FW that it doesn't error (they know which version works, I can't remember) and try it out and let you know :)
Thanks for working with me so much :)
-
I'm reopening this topic, since I'm seeing exactly the same behaviour with the current snapshot (1.2 BETA).
Was there any solution back then?
So again the behaviour I'm seeing is: after some random time pfsense simply stops forwarding (new) packets and I get many firewall blocks of the form:
Jun 1 15:55:50 192.168.6.1 pf: 000017 rule 167/0(match): block in on em2: x.y.z.v.3727 > 85.75.180.33.11206: UDP, length 108
Jun 1 15:55:51 192.168.6.1 pf: 1. 205282 rule 167/0(match): block in on em2: x.y.z.v.3727 > 89.79.20.136.17147: UDP, length 24
Jun 1 15:55:51 192.168.6.1 pf: 000018 rule 167/0(match): block in on em2: x.y.z.v.3727 > 84.252.27.137.9767: UDP, length 24
Jun 1 15:55:51 192.168.6.1 pf: 000019 rule 167/0(match): block in on em2: x.y.z.v.3727 > 83.226.157.41.8833: UDP, length 24
Jun 1 15:55:52 192.168.6.1 pf: 831579 rule 167/0(match): block in on em2: x.y.z.v.3727 > 85.75.180.33.11206: UDP, length 108
where x.y.z.v is my internal IP, em2 is the WAN interface, and the rule that suddenly applies is the default "block all" rule.
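To see at a glance which rule and interface the blocks come from, the syslog lines can be tallied with a small script. This is only a rough sketch that assumes the lines look exactly like the ones quoted above; the function names `parse_pf_line` and `summarize` are made up for illustration, and other pf log formats would need a different pattern.

```python
import re
from collections import Counter

# Pattern derived from the log lines quoted above (an assumption: other
# pf/tcpdump versions may format these lines differently).
LINE_RE = re.compile(
    r"rule (?P<rule>\d+)/\d+\(match\): (?P<action>\w+) (?P<dir>in|out) "
    r"on (?P<iface>\w+): (?P<src>\S+) > (?P<dst>\S+): "
    r"(?P<proto>\w+), length (?P<length>\d+)"
)


def parse_pf_line(line):
    """Return a dict of fields for one pf log line, or None if it does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    record = match.groupdict()
    record["rule"] = int(record["rule"])
    record["length"] = int(record["length"])
    return record


def summarize(lines):
    """Count blocked packets per (rule number, interface, direction)."""
    counts = Counter()
    for line in lines:
        record = parse_pf_line(line)
        if record is not None and record["action"] == "block":
            counts[(record["rule"], record["iface"], record["dir"])] += 1
    return counts
```

Feeding the remote syslog file to `summarize` line by line shows whether every block really comes from the same rule on the same interface during an outage.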
It seems that my packets can pass out, but the related packets which are being sent back are not recognized as related, and hence are blocked. Any already existing or established connection STAYS ALIVE; only new connections cannot be established. Additionally, I do not see any hardware error prior to this happening. The NICs are all Intel.
After some minutes (say 5-10) everything is back to normal, and new connections are possible again. I looked through all the logs and cannot find any hint as to why this happens. First I thought this might be happening because I use a bridge between LAN and WAN, but seeing this topic makes me feel it is something else.
I'm running out of ideas about what else I can test to get this working stably.
Regards
Arno
-
Increase your state table size.
-
Thanks for the quick reply.
That was exactly what I was suspecting myself, since all the symptoms (I made a few more tests) point this way. However, the RRD graph shows a maximum peak of about 2k states, with an average of around 200-600. I used the default 10k states and thought that this would be OK, since the usage is way under the maximum limit.
Anyway, I increased the states to 100k (the machine has 2GB RAM), and so far the effect has not appeared again - however, it's also Friday evening and nobody is working anymore, so the real test will be on Monday. But I have the feeling this solves it. I'll keep you informed.
Best regards
Arno
-
Unfortunately, increasing the states does not seem to do the trick: today the effect reappeared, and I was blocked from the outside for about 10 minutes. I could still check the web interface and saw that I had something around 50 states… :( - I was completely alone in the whole office, hence the low number of states.
I have now updated to the newest snapshot from today around 13:00, in the hope that the problem magically disappears :).
However, having looked at the changes made, I doubt that. So can somebody advise me what to check when the situation occurs again? What kind of commands would help to identify the problem? (I checked the interfaces: they were all up.) I have remote syslogging running; however, nothing looks suspicious to me when it occurs. Again, is there something I should look for?
Best regards
Arno
-
Sounds like hardware glitches of some sort. Maybe try replacing your interfaces with Intel NICs if they are not already. Turn off all unneeded options in the BIOS. Ensure the BIOS is up to date. Turn off plug and play support in the BIOS.
-
Thanks for the quick reply.
However, been there, done that: all NICs are Intel. The BIOS is stripped down to only what is necessary. The machine was actually purchased specifically as a pfsense firewall and is brand new.
Actually, I doubt it is hardware, since from the web interface I can still reach all connected networks (ping). It's just that everything which goes through the filter is suddenly not allowed anymore (remember: active connections STAY ALIVE).
When the effect happened, I even tried deactivating the filtering bridge: no effect (except that, of course, the active connections now broke too). A "reset states" did not change anything either. I still kept getting the "block default rule" messages in the syslog, while at the same time it was possible to log on via ssh and ping in any direction. So to me, the states (i.e. pf itself) look much more like the culprit.
Hmmm.. if I change the maximum number of states, do I have to reboot? (The web interface does not say anything, so I believe it is changed dynamically.)
To check whether pf is still working correctly when it happens, is there a command I can run so that we can draw conclusions later?
Best regards
Arno
-
What you're seeing with that blocked traffic is normal out-of-state traffic being dropped. You'll always see it; you're wrongly associating it with a problem.
Need more info. What kind of Internet connection? What's your WAN config, static, DHCP, PPPoE, …? When it happens, can you access the Internet from pfsense itself (ping google.com or something)?
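One way to collect that information while the outage is happening is to dump a few diagnostics into one timestamped text blob. The sketch below is only an illustration and rests on several assumptions: that a Python 3 interpreter is available on the box (a stock install may not have one, in which case the same commands can simply be typed at the shell), that `pfctl -s info` and `pfctl -s memory` report the state table counters and limits, and that `google.com` is a suitable ping target. The names `run_command` and `snapshot` are made up for this example.

```python
import datetime
import subprocess

# Assumed set of diagnostics; adjust to taste. Nothing here changes any setting.
DEFAULT_COMMANDS = [
    ["pfctl", "-s", "info"],    # state table counters
    ["pfctl", "-s", "memory"],  # configured hard limits
    ["ifconfig", "-a"],         # interface status
    ["ping", "-c", "3", "google.com"],  # can the firewall itself reach out?
]


def run_command(cmd, timeout=30):
    """Run one command and return its combined output, or a failure note."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
        return proc.stdout + proc.stderr
    except (OSError, subprocess.TimeoutExpired) as exc:
        return f"failed: {exc}\n"


def snapshot(commands=None):
    """Return the output of all commands as one timestamped text report."""
    if commands is None:
        commands = DEFAULT_COMMANDS
    stamp = datetime.datetime.now().isoformat(timespec="seconds")
    parts = [f"=== snapshot {stamp} ==="]
    for cmd in commands:
        parts.append(f"--- {' '.join(cmd)} ---")
        parts.append(run_command(cmd))
    return "\n".join(parts)
```

Calling `print(snapshot())` once during an outage and once during normal operation gives two reports that can be compared side by side and posted here.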
-
Hi there,
First the good news: the problem really seems to be gone - at least it has not occurred again since updating to the snapshot mentioned in my last post.
However, the reason why that is so is still in the dark.
Since then I have changed only two things:
- increased the maximum states to 500000
- updated the image.
Regarding states, even at the busiest times I see something like 1000-2000 states - this is still very, very far away from even the standard 10000.
Again here my setup:
Internet routing x.y.z.w/25, GW x.y.z.129 ---- x.y.z.130 (WAN-IF) ---- pfsense (transparent bridge mode) ---- x.y.z.135 (LAN_IF) ---- clients in the range x.y.z.w/25 via DHCP (without the used ones)
I agree that the blocked traffic looks like out-of-state; however, when the situation occurred the GUI showed me far fewer states than the configured limit.
Anyway.. for now the problem is solved and I hope it stays like that. If it recurs, I'll try to give even more details.
Best regards and keep up the good work!
Arno