Lost LAN to Internet connectivity

rnmixon

OK. Does that jibe with the fact that once I was ssh'd into pfSense I could get to outside web sites?

I'm pretty new to pfSense and did not find wget or curl, so I just tried "telnet xxxxxxx.com 80" for google and a couple of other sites I knew.

Thank you again - Richard

cmb

I misread that as you couldn't ping the gateway from that host itself. Since that VM can get out, clearly you have connectivity at the host level. It's obviously routing correctly as well. The next most likely cause is that the NAT or firewall config was broken. Check Diag>Backup/restore, Config History, and see what changed.

rnmixon

Hmm - thought this was a thing of the past, but it happened again today. This time after I tried to do an update to the latest (12/19/2014) snapshot.

The update appeared to go OK. I watched/waited for the package updates for optional packages that had been previously installed (ntopng, darkstat, etc) to complete. Afterwards, no Internet connection.

When I tried to go reapply my LAN interface settings I got a message box with the following message at the top of the page:

Packages are currently being reinstalled in the background.
Do not make changes in the GUI until this is complete.

I finally tried reapplying the interface config (WAN hn0, LAN hn1) using the console. Still no luck.

I restored my VM from this morning's backup and all was well again.

I then went through the snapshot upgrade once again - same exact results.

Thank goodness for backups - but I'm a bit concerned about not being able to upgrade.

Any thoughts or ideas on why this is happening? I made copies of some of the folders (/etc, /var/log, /root) before I last restored - if that info might help.

Thank you - Richard

cmb

Exactly which packages do you have loaded? Does Internet work until the package reinstall finishes, or?

rnmixon

The packages I had loaded were:

AutoConfigBackup
darkstat
ntopng
pfBlocker

I did not notice if the Internet was working before the packages re-installed. I will try to test this sometime later today or tomorrow.

Or should I uninstall the packages before doing the update?

Thanks - Richard

cmb

I figured you had some package installed that would have an impact on Internet connectivity, like maybe Squid. I guess pfblocker could fall into that category, though I'd expect anything it seriously broke would have broken filter reloads, which would have been spewing alerts at you.

@rnmixon:

I did not notice if the Internet was working before the packages re-installed.

Ah, in that case don't worry about it, I thought the way you worded part of it you were stating that things were fine until packages were reinstalled.

No need to do anything beyond the normal upgrade process and let the packages handle themselves.

Try the upgrade again. Once it's booted back up, start a packet capture on WAN with count 0 and all else at defaults. Try to ping out to IPs on the Internet, try to load web pages, attempt a variety of things, then stop the capture. Download the resulting pcap. The summary text may suffice to see something; you can paste that here.

rnmixon

Hi,

I have not had time to try the upgrade again - but will do it as soon as I have a chance and report back.

But in the meantime, we lost Internet connectivity again. Again - no workstations from the LAN can get out, but if I'm ssh'd into pfSense I can ping 8.8.8.8, do DNS resolution and connect to anyone on the LAN.

Based on some experience from a different install over the weekend I tried "pfctl -s nat" and sure enough no output at all for the nat configuration.

I next completely disabled NAT reflection, but that did not do any good.

I then restored a configuration backup from "2014-12-19 12:41:38" that I had made on Friday just after restoring a Hyper-V image and getting the system back working. But this did not fix things.

So I'm not sure what's going on if restoring a working config does not fix things.

I was able to restore from this morning's image backup of the virtual machine again … for the time being all is working fine again.

Does this suggest anything? If not I'll try the 2.2 upgrade as soon as I can, probably in the next day or two.

Thank you - Richard

cmb

How is your outbound NAT configured?

rnmixon

I think this is what you're asking for - so I think the answer is "automatic".


	 <nat><outbound><mode>automatic</mode></outbound> 
		...</nat>

Let me know if you need all of the NAT or other rules.

R

cmb

Yeah should be fine.

When "pfctl -sn" is empty, what do you get for "grep nat /tmp/rules.debug"? Are there any of the disk errors happening around the same time? I'm wondering if somehow it's failing to read the config, or failing to read the raw ruleset, because of the disk error. That's seemed to be cosmetic-only on my Hyper-V systems, but it's possible that's causing the problem. The NAT ruleset being empty is definitely the source of the issue, it's just not clear how it ends up that way. I'm strongly suspecting something specific to your Hyper-V environment like disk reads failing, as if that were a general issue, we'd have hit it internally in our testing and hundreds of people would be on this board griping.
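The two checks above can be run side by side from the console or an ssh session. This is only a rough sketch, assuming a POSIX shell: `count_rules` is a hypothetical helper name, and the pfctl part is skipped on any machine that does not have pfctl installed.

```shell
#!/bin/sh
# Hypothetical helper: count the non-blank lines arriving on stdin.
count_rules() {
    grep -c -v '^[[:space:]]*$'
}

# Compare what pf currently has loaded with what was generated
# in /tmp/rules.debug. Only runs where pfctl is available.
if command -v pfctl >/dev/null 2>&1; then
    loaded=$(pfctl -s nat 2>/dev/null | count_rules)
    generated=$(grep -c 'nat' /tmp/rules.debug 2>/dev/null)
    echo "loaded NAT rules: ${loaded:-0}, lines mentioning nat in rules.debug: ${generated:-0}"
fi
```

If the first number is 0 while the second is not, the ruleset was generated but not loaded; if both are 0, the problem is further back, at the point where the ruleset is built from the config.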

mevans336

I am also seeing this under Hyper-V 3.0 and the 2.2 RC. It seems that every so often, apinger marks the WAN interface as down.


Dec 29 09:14:59	apinger: ALARM: WAN_DHCP(68.67.x.x) *** down ***
Dec 29 09:15:21	apinger: alarm canceled: WAN_DHCP(68.67.x.x) *** down ***
Dec 29 09:20:15	apinger: ALARM: WAN_DHCP(68.67.x.x) *** down ***
Dec 29 09:20:31	apinger: alarm canceled: WAN_DHCP(68.67.x.x) *** down ***
Dec 29 09:35:07	apinger: ALARM: WAN_DHCP(68.67.x.x) *** down ***
Dec 29 09:35:28	apinger: alarm canceled: WAN_DHCP(68.67.x.x) *** down ***
Dec 29 14:38:15	apinger: ALARM: WAN_DHCP(68.67.x.x) *** down ***
Dec 29 14:38:35	apinger: alarm canceled: WAN_DHCP(68.67.x.x) *** down ***
Dec 29 14:39:14	apinger: ALARM: WAN_DHCP(68.67.x.x) *** down ***
Dec 29 14:39:30	apinger: alarm canceled: WAN_DHCP(68.67.x.x) *** down ***
Dec 29 14:52:38	apinger: ALARM: WAN_DHCP(68.67.x.x) *** down ***
Dec 29 14:52:54	apinger: alarm canceled: WAN_DHCP(68.67.x.x) *** down ***
Dec 29 15:12:31	apinger: ALARM: WAN_DHCP(68.67.x.x) *** down ***
Dec 29 15:12:48	apinger: alarm canceled: WAN_DHCP(68.67.x.x) *** down ***

It also seems to be related to traffic or packet load, as the frequency is greatly diminished overnight when I am not using my network.
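To put numbers on that pattern, the alarms can be counted per hour. A minimal sketch, assuming log lines in the same format as the excerpt above; `alarms_per_hour` is a hypothetical helper name, and where the log comes from is left to the caller.

```shell
#!/bin/sh
# Hypothetical helper: read apinger log lines on stdin and print
# "<hour> <count>" for every hour with at least one "down" alarm.
# The "alarm canceled" lines are ignored (the match is case-sensitive).
alarms_per_hour() {
    grep 'apinger: ALARM:' |
        awk '{ split($3, t, ":"); n[t[1]]++ }
             END { for (h in n) print h, n[h] }' |
        sort
}
```

Usage would be along the lines of piping the gateway log into the function, for example `clog /var/log/gateways.log | alarms_per_hour`; that log path and the use of clog are assumptions about the install, not something confirmed in this thread.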

FWIW, I ran Smoothwall under this same Hyper-V config until a few days ago, when I noticed that pfSense 2.2 went RC, and Smoothwall did not experience these issues, so this is something unique to Hyper-V + pfSense or Hyper-V + FreeBSD.

I'm going to disable gateway monitoring and see if that at least masks the underlying issue. Note, state killing on gateway failure is not enabled (the box is checked) so I don't think that's the cause.

I know Hyper-V is probably a low priority, but I am extremely excited to be able to run it with non-legacy NICs, so I'd really like to get this resolved and will help in any way possible. This is perfect for my 1Gbps connection at home (under Hyper-V I can hit 850Mbps, my Atom D2500 couldn't manage more than 500Mbps) and I'd like to start using it in our Hyper-V environment for my business in addition to our physical installations.

If you guys would like me to open a paid support case, I'd be more than happy. I can also provide you with access to pfSense installed in a Hyper-V VM if that would help.