Can someone help me troubleshoot this simple setup?
-
When the system appears to be hung, can you get onto the console and ping anything?
I don't have a monitor or keyboard hooked up to it, but it looks like I should probably do that :) So no, I have not tried that yet. I just try to access the web interface and ping it from a client computer inside the network, with no luck. Funny part is it will send out emails when it sees a connection drop though! Happened while I was out of town the past 2 days.
-
Could be a NIC driver issue. I don't like using those Marvell Yukon ones. Perhaps someone else who runs them could comment.
-
I certainly hope someone chimes in because I'm as noob to pfSense/BSD as it gets. =\ Any direction/instruction would be great.
-
Which interfaces are you using with which NICs?
I have faced lockup problems with Marvell Yukon NICs though that was using the msk driver not the sk driver.
It was only a problem when the NIC/driver was under high load, when testing throughput. If I limited the speed to 100Mbps by inserting an old switch it was rock solid.
This is pure speculation at this point. ::)
Steve
-
You have a variety of types of interfaces: 2 x sk, 1 x sis and 1 x vr. What are your interface assignments? (e.g. LAN is vr0, WAN is sk1, …)
When the system appears to be hung, can you get onto the console and ping anything?
I don't have a monitor or keyboard hooked up to it, but it looks like I should probably do that :)
Yes you should.
Funny part is it will send out emails when it sees a connection drop though!
What will send out emails? (pfSense system? a computer on the LAN? a computer on the OPT1 interface?) And what computer acts as the SMTP server?
Any direction/instruction would be great.
The already given suggestion would be a good start: @podilarius:
When the system appears to be hung, can you get onto the console and ping anything?
When the pfSense box appears to be hung can you ssh into it over one of the other interfaces? or access the web GUI over one of the other interfaces? It would be helpful to be able to distinguish between the box being hung (not responding to shell commands) and one of the interfaces being hung (not responding to incoming frames).
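A minimal sketch of that box-versus-interface check, for anyone who wants to script it. It assumes SSH (port 22) and the HTTPS web GUI (port 443) are enabled and allowed by the firewall rules on each side, which may not be true on your setup. The function names are made up for illustration, and each address has to be probed from a machine that can reach it by its own path (a LAN client for the LAN address, something like a mobile broadband link for a WAN address).

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def probe(targets, ports=(22, 443)):
    """targets maps an interface label to the address to test on that side.

    Ports 22 and 443 are assumptions (SSH and HTTPS web GUI).
    """
    return {name: any(port_open(host, p) for p in ports)
            for name, host in targets.items()}


def classify(reachable):
    """reachable maps an interface label to True/False (did a probe succeed?)."""
    if not reachable:
        return "no probes"
    down = sorted(name for name, ok in reachable.items() if not ok)
    if not down:
        return "all interfaces respond"
    if len(down) < len(reachable):
        return "box alive, interface(s) hung: " + ", ".join(down)
    return "nothing responds: box may be hung (check the console)"
```

For example, `classify({"LAN": False, "WAN": True})` points at a hung LAN interface rather than a hung box. The results can also be filled in by hand from separate machines and passed straight to `classify`.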
-
You have a variety of types of interfaces: 2 x sk, 1 x sis and 1 x vr. What are your interface assignments? (e.g. LAN is vr0, WAN is sk1, …)
"CenturyLink" is sis0
"Charter" is vr0
"LAN" is sk0
sis0 or vr0 is an onboard NIC
One thing I noticed is that the Marvell cards are gigabit, but the link is only allowing 10/100 connections even though my equipment is all gigabit.
When the system appears to be hung, can you get onto the console and ping anything?
I will check that next time it hangs
Funny part is it will send out emails when it sees a connection drop though!
What will send out emails? (pfSense system? a computer on the LAN? a computer on the OPT1 interface?) And what computer acts as the SMTP server?
It sends out to an external SMTP server with an email service I have elsewhere. I believe it's the mailreport package and the pfSense system itself, because I get my daily report from mailreport and also get notifications from the pfSense system: "Gateways status could not be determined, considering all as up/active." and "MONITOR: CENTURYLINKGW has high latency, removing from routing group"
Any direction/instruction would be great.
The already given suggestion would be a good start: @podilarius:
When the system appears to be hung, can you get onto the console and ping anything?
When the pfSense box appears to be hung can you ssh into it over one of the other interfaces? or access the web GUI over one of the other interfaces? It would be helpful to be able to distinguish between the box being hung (not responding to shell commands) and one of the interfaces being hung (not responding to incoming frames).
I did not even think about trying to log in to my pfSense box from an outside internet source, and I do have that option as I have a 3G mobile broadband connection too. (I try to be well equipped haha!) I will also try that next time.
-
Which interfaces are you using with which NICs?
I have faced lockup problems with Marvell Yukon NICs though that was using the msk driver not the sk driver.
It was only a problem when the NIC/driver was under high load, when testing throughput. If I limited the speed to 100Mbps by inserting an old switch it was rock solid.
This is pure speculation at this point. ::)
Steve
It does seem to happen more frequently when I have 4 games going on my PC at once while downloading torrents and doing speed tests. Pretty much anything that would soak up all the bandwidth possible.
-
I would guess that you are suffering from the LAN interface freezing. When you hook up a console or try to log in from the WAN side you will find out.
It's interesting that you are only getting a 100Mbps connection on your sk0 interface. Faulty cable? Fussy switch? Exactly which chip does the D-Link NIC use?
You haven't said what speed your WAN connections are, but I'm guessing they are probably less than 100Mbps, in which case you may be better off using the sk0 interface as one of your WAN connections instead.
Steve
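One way to confirm what speed sk0 actually negotiated is to read the `media:` line that `ifconfig` prints for the interface. The sketch below only parses the autoselect form of that line; the sample text in the usage note is illustrative, not captured from this system, and Python is not part of a stock pfSense install, so the output may need to be pasted into a script on another machine.

```python
import re
import subprocess

# Matches e.g. "media: Ethernet autoselect (100baseTX <full-duplex>)".
# Only the autoselect form is handled; a manually forced media setting
# is printed differently and will return None here.
MEDIA_RE = re.compile(r"media:\s+Ethernet\s+autoselect\s+\(([^\s)]+)")


def negotiated_media(ifconfig_output):
    """Return the negotiated media type (e.g. '100baseTX'), or None."""
    match = MEDIA_RE.search(ifconfig_output)
    return match.group(1) if match else None


def read_ifconfig(interface):
    """Run `ifconfig <interface>` and return its text output."""
    result = subprocess.run(["ifconfig", interface],
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Given text containing `media: Ethernet autoselect (100baseTX <full-duplex>)`, `negotiated_media` returns `100baseTX`; a gigabit link would be expected to show `1000baseT` instead.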
-
I'm 10d/3u on cable and 8d/1u on DSL. It had the problem a couple of times today, and all I have to do is unplug the LAN Cat5 cable from the pfSense box and plug it back in and it works again. I've been too addicted to Skyrim today to actually try and troubleshoot it haha. =\ Today is my vacation..
-
I just switched the LAN interface to sis0 and put the CenturyLink connection on sk0. We will see how it performs today. :) Wish me luck!
If it is a driver issue, how do I switch the driver, and which one will work?
-
You can't just switch the driver; the sk driver is the correct one for your card. However, it may have been updated since the release of FreeBSD 8.1, which is what pfSense 2.0 is based on. You may be able to compile a newer driver with bug fixes. That said, the driver has been fairly stable for a while; I've been using it without problems on Marvell 88E8001-based NICs.
There are some tuning options you can try with the driver.
Steve
-
Can you elaborate on the tuning options?
-
After switching LAN to the other adapter I have not been disconnected once yet :) Going on a good 12+ hours so far.
-
So it probably was the NIC locking up in some way. Presumably you are using the Marvell (sk) NIC for one of your WAN interfaces; are you checking that it hasn't locked up?
In my testing the msk driver would usually show a 'watchdog timeout' in the system logs when it locked.
Steve
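A rough sketch of scanning log text for that message. The exact log line format is an assumption (a device name such as `msk0` followed by `: watchdog timeout`), and whether the sk driver logs the same message is not confirmed in this thread. The function takes lines that have already been read from the system log, however you choose to export them.

```python
import re

# Assumed pattern: "<device><unit>: watchdog timeout", e.g. "msk0: watchdog timeout".
WATCHDOG_RE = re.compile(r"\b([a-z]+\d+): watchdog timeout", re.IGNORECASE)


def watchdog_events(lines):
    """Count 'watchdog timeout' messages per device name in the given log lines."""
    counts = {}
    for line in lines:
        match = WATCHDOG_RE.search(line)
        if match:
            device = match.group(1)
            counts[device] = counts.get(device, 0) + 1
    return counts
```

A non-empty result for the interface that keeps dropping would support the driver lockup theory; an empty one does not rule it out.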
-
Actually the connection I've had on that interface has been going up and down non-stop; it's been down 90% of the time. WTF. I'm going to try the 2nd Marvell adapter I have in this thing. I have 2 of them, for a total of 4 NICs. I just never mentioned the 4th since it's not being used for anything.
-
Hmm,
Well, the fact that it has been going up and down, whilst not good, does indicate that if it's a problem with the NIC or driver, it is at least able to recover. In my own experience a locked card never recovers until the interface is brought down and back up.
Do you have any idea if that WAN connection was stable before? It's Sunday, so perhaps your ISP is doing maintenance?
Can you see which Marvell IC is on that card?
Steve
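For reference, bringing an interface down and back up from a shell is done with `ifconfig <interface> down` followed by `ifconfig <interface> up`. A small sketch that wraps those two commands is below. It assumes root access on the box, the helper names and the 2-second pause are arbitrary choices for illustration, and it bypasses whatever pfSense itself does when an interface is reconfigured, so treat it as a diagnostic aid only.

```python
import subprocess
import time


def bounce_commands(interface):
    """Return the two ifconfig commands that take an interface down, then up."""
    return [["ifconfig", interface, "down"], ["ifconfig", interface, "up"]]


def bounce(interface, run=subprocess.run, pause=2.0):
    """Bring the interface down, wait briefly, then bring it back up.

    `run` is injectable so the function can be exercised without touching
    a real interface; `pause` is an arbitrary settling delay in seconds.
    """
    down, up = bounce_commands(interface)
    run(down, check=True)
    time.sleep(pause)
    run(up, check=True)
```

Calling `bounce("sk0")` as root would be the software equivalent of unplugging and replugging the cable, if the lockup is of the kind that only clears on a down/up cycle.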
-
Just wanted to close out this thread by saying I ended up swapping out both of the NICs. Their chipset markings are: 88E8001-LKJ1 AJ476A.2 0714 A4P TW (a Marvell of some kind). Hardware version: B2
Now everything works fine, apart from dealing with HAVP and Squid :)