WAN & LAN Connection lost, but can't figure out why.
-
I have two pfSense boxes. The one in my home is connected to a consumer-grade cable modem. The one in my office is connected to a fractional T1. I have an IPsec VPN set up between the two boxes, and for the most part it is very reliable.
Recently (it seems to have started after I installed 1.0.1 on both machines), the office box has been periodically losing its WAN & LAN connections, and at this point I'm at a loss as to why.
This happened once or twice when I was in the office, and it seemed to come back up after a few seconds. It has also happened when I have been out of the office. It seems like when this happens, it just stays down.
I believe this might have something to do with losing the connection at my home, which happens fairly frequently. It seems like when this happens, the VPN connection is cut and makes the office connection die. Oddly, it has no effect on the home connection once service is restored (at home).
I use the connection from my home to initiate the VPN.
Which logs can I look at to start diagnosing the problem?
-
Sorry to reply to myself… but it looks like the whole machine is crashing. The RRD graphs show nothing during the time the machine was down. I thought it was just the connection, but it looks like the whole thing was down.
Also, I logged into the console and fsck shows this:
fsck
** /dev/ad0s1a (NO WRITE)
** Last Mounted on /
** Root file system
** Phase 1 - Check Blocks and Sizes
INCORRECT BLOCK COUNT I=8572936 (4 should be 0)
CORRECT? no
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
UNREF FILE I=8572954 OWNER=root MODE=100644
SIZE=0 MTIME=Dec 29 10:24 2006
CLEAR? no
UNREF FILE I=8572955 OWNER=root MODE=100644
SIZE=0 MTIME=Dec 29 10:24 2006
CLEAR? no
UNREF FILE I=8572956 OWNER=root MODE=100644
SIZE=0 MTIME=Dec 29 10:24 2006
CLEAR? no
UNREF FILE I=8572957 OWNER=root MODE=100644
SIZE=0 MTIME=Dec 29 10:24 2006
CLEAR? no
UNREF FILE I=8572958 OWNER=root MODE=100644
SIZE=0 MTIME=Dec 29 10:24 2006
CLEAR? no
UNREF FILE I=8572960 OWNER=root MODE=100644
SIZE=0 MTIME=Dec 29 10:24 2006
CLEAR? no
UNREF FILE I=8572961 OWNER=root MODE=100644
SIZE=0 MTIME=Dec 29 10:24 2006
CLEAR? no
UNREF FILE I=8572962 OWNER=root MODE=100644
SIZE=0 MTIME=Dec 29 10:24 2006
CLEAR? no
UNREF FILE I=8572963 OWNER=root MODE=100644
SIZE=0 MTIME=Dec 29 10:24 2006
CLEAR? no
UNREF FILE I=8572964 OWNER=root MODE=100644
SIZE=0 MTIME=Dec 29 10:24 2006
CLEAR? no
UNREF FILE I=8572967 OWNER=root MODE=100644
SIZE=0 MTIME=Dec 29 10:24 2006
CLEAR? no
UNREF FILE I=8572968 OWNER=root MODE=100644
SIZE=0 MTIME=Dec 29 10:24 2006
CLEAR? no
UNREF FILE I=8572969 OWNER=root MODE=100644
SIZE=0 MTIME=Dec 29 10:24 2006
CLEAR? no
UNREF FILE I=8572970 OWNER=root MODE=100644
SIZE=0 MTIME=Dec 29 10:24 2006
CLEAR? no
UNREF FILE I=8572971 OWNER=root MODE=100644
SIZE=0 MTIME=Dec 29 10:24 2006
CLEAR? no
UNREF FILE I=8572985 OWNER=root MODE=100644
SIZE=0 MTIME=Dec 29 10:23 2006
CLEAR? no
UNREF FILE I=8572986 OWNER=root MODE=100644
SIZE=0 MTIME=Dec 29 10:23 2006
CLEAR? no
UNREF FILE I=8572987 OWNER=root MODE=100644
SIZE=0 MTIME=Dec 29 10:23 2006
CLEAR? no
UNREF FILE I=8572988 OWNER=root MODE=100644
SIZE=0 MTIME=Dec 29 10:23 2006
CLEAR? no
UNREF FILE I=8572989 OWNER=root MODE=100644
SIZE=0 MTIME=Dec 29 10:23 2006
CLEAR? no
** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN SUPERBLK
SALVAGE? no
SUMMARY INFORMATION BAD
SALVAGE? no
BLK(S) MISSING IN BIT MAPS
SALVAGE? no
3233 files, 35432 used, 37301987 free (899 frags, 4662636 blocks, 0.0% fragmentation)
Seems like I have a bad, or failing, HD... no? It's brand new, too.
-
Can anybody help? This happened again during the middle of the night. I can't even begin to diagnose where the problem is since the machine is completely down and when I reboot it, the logs are cleared. What am I doing wrong?
-
Sounds like FreeBSD is panicking. Check your hardware.
-
Which component should I check? I can't get any useful information from the logs since they're overwritten when I reboot. How can I tell it not to do that? Shouldn't kernel panics be in /var/log/system.log? It only goes back to the last boot.
-
Set up a remote syslog server. Search the forum for more info.
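Before relying on a full syslog daemon, it can help to confirm that the firewall's log packets reach the receiving machine at all. Here is a minimal sketch of a throwaway UDP listener for that purpose. The function names are made up for illustration, and it assumes the sender uses plain BSD-style syslog over UDP with a leading <PRI> field. Binding to port 514 normally needs root, and the real syslogd must be stopped first so the port is free.

```python
import socket


def parse_priority(datagram: bytes):
    """Split a BSD-style syslog datagram into (facility, severity, message).

    Returns (None, None, text) when there is no leading <PRI> field.
    """
    text = datagram.decode("utf-8", errors="replace")
    if text.startswith("<") and ">" in text[:6]:
        end = text.index(">")
        digits = text[1:end]
        if digits.isdigit():
            pri = int(digits)
            # PRI = facility * 8 + severity
            return pri // 8, pri % 8, text[end + 1:]
    return None, None, text


def listen(host="0.0.0.0", port=514):
    """Print every UDP datagram received on the given port, forever."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    while True:
        data, (peer, _peer_port) = sock.recvfrom(4096)
        facility, severity, message = parse_priority(data)
        print(f"{peer} facility={facility} severity={severity} {message.rstrip()}")
```

Calling listen() and then generating some log traffic on the firewall should print one line per message. If nothing appears, the packets are probably being dropped before they reach this machine, which points at the forwarding or firewall rules rather than the syslog daemon.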
-
http://ultimatebootcd.com/ has some tests to run and check hardware.
-
It seems to be happening more frequently now, and I'm not sure if it's software related or some kind of hardware problem.
This happened again last night. When I came into the office this morning, I checked the console and the machine was completely locked up. There was a warning at the bottom of the screen that said "Warning…pseudo-random number generator for IPSEC..." Unfortunately, I was not able to write down the exact wording of the error before someone else rebooted the machine.
At this point, I've turned off the VPN. I will see what happens.
-
This is getting really frustrating since this continues to happen on a frequent, but random basis. VPN has been off for a while now.
I've tried setting up a remote syslog, but it doesn't appear to be working:
1. Enabled remote syslog of all items on work pfsense box and sent to my home FreeBSD 6.2 server.
2. Opened up and NATed UDP 514 to home server.
3. Killed syslogd on the home server and restarted it with "/usr/sbin/syslogd -a [work pfsense external IP address]".
I get nothing. What am I doing wrong?
The thing that kills me is that every time this happens, both the WAN and LAN connections are dead… I can't even get to the pfSense box when I get back to work. I basically have to pull the power and let it reboot.
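One way to narrow down where the messages get lost, sketched here as an assumption rather than a known fix, is to send a hand-built test datagram toward the home server and see whether it shows up in the log there. The function names and message text below are invented for illustration; the only real convention used is the syslog <PRI> prefix (facility * 8 + severity).

```python
import socket


def build_test_message(tag="probe", text="remote syslog test", facility=1, severity=6):
    """Build a minimal BSD-style syslog datagram: <PRI>tag: text."""
    pri = facility * 8 + severity  # facility 1 (user), severity 6 (info) -> 14
    return f"<{pri}>{tag}: {text}".encode("ascii")


def send_test(dest_host, dest_port=514):
    """Send one test datagram over UDP and return the number of bytes sent."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        return sock.sendto(build_test_message(), (dest_host, dest_port))
    finally:
        sock.close()
```

Running send_test with the home connection's public address from a machine behind the office firewall exercises the same path the real log traffic takes. UDP gives no delivery confirmation, so the result has to be checked on the receiving side.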
-
Can you try different hardware? Maybe FreeBSD has some hiccups with the hardware you are using right now.
-
Not really. I specifically purchased this machine to be a pfSense firewall!
I've tested the hardware and it checks out OK. What I don't know is whether there are some incompatibilities with pfSense. It is an eMachines, a barebones Walmart special. Again, I only wanted a firewall machine!
I have the syslog working now via Kiwi… so we'll see what happens the next time it dies.
-
Checking for BIOS updates sometimes helps with compatibility in edge cases too.