Problems with 2.0 on CybertronPC Quantum QJA1221
-
I have used pfSense for almost 2 years now on an old Dell desktop as a firewall appliance for a medium-sized non-profit organization (this build worked great). We have a 16/2 cable WAN connection and a bonded T1 pair (for our 80-line VoIP system), which I use as a backup. Three months ago I purchased a CybertronPC Quantum QJA1221 (specs: http://www.cybertronpc.com/Customkititems~kc~SVQJA1221~Group~BUSINESS~Cat~SERVERS.htm) and loaded version 2.0 RC2 i386, then tried 2.0 RC3 amd64. I updated both builds several times, with no packages installed. This is the problem I continue to have:
The unit will run fine for anywhere from a few hours to about a week, then it becomes unresponsive to pings, no longer routes traffic or assigns addresses, and I am unable to access the web interface. On the console everything appears fine: no errors are listed, and the unit can be rebooted from the keyboard. Once rebooted, the unit picks right back up and works normally until it crashes again.
Please, does anyone have a clue why this could be happening?
Does this happen with i386 builds AND amd64 builds?
The unit will run fine for anywhere from a few hours to about a week, then it becomes unresponsive to pings,
What sort of pings? ping by hostname? ping by IP address? ping from system on LAN to pfSense? ping from system on LAN to internet?
When this happens, please do the following on the pfSense console and post the response here:
-
ping internet host by name
-
ping internet host by IP address
-
ping LAN host by name
-
ping LAN host by IP address
-
vmstat -i ; sleep 5; vmstat -i
-
netstat -m
(don't wait more than about 5 seconds for the pings.)
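The checklist above could be collected in one pass with a small script. This is only a sketch: the function names `run_step` and `collect_diag`, the log path, and the host names in the usage note are made up for illustration, and `ping -t 5` assumes FreeBSD's ping, where `-t` is an overall timeout in seconds (on Linux `-t` means TTL).

```sh
#!/bin/sh
# Hypothetical helper, not part of pfSense. Appends labelled output to $LOG.
LOG="${LOG:-/tmp/lan-hang-diag.txt}"

# Run one command, record its output under a label, and never abort the
# whole collection if that command fails or times out.
run_step() {
    label="$1"; shift
    {
        echo "=== $label ==="
        "$@" 2>&1 || echo "(command failed or timed out)"
        echo
    } >> "$LOG"
}

# Arguments: internet host name, internet IP, LAN host name, LAN IP.
collect_diag() {
    # FreeBSD ping: -c is the packet count, -t the overall timeout in seconds.
    run_step "ping internet host by name" ping -c 3 -t 5 "$1"
    run_step "ping internet host by IP"   ping -c 3 -t 5 "$2"
    run_step "ping LAN host by name"      ping -c 3 -t 5 "$3"
    run_step "ping LAN host by IP"        ping -c 3 -t 5 "$4"
    run_step "interrupt counts (first)"   vmstat -i
    sleep 5
    run_step "interrupt counts (second)"  vmstat -i
    run_step "network buffer usage"       netstat -m
}
```

Usage during the hang might look like `collect_diag example.com 192.0.2.1 mydesktop 192.168.1.50` (all four values are placeholders), after which the log file can be posted here.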
-
-
I am pinging the pfSense LAN interface IP address from my desktop. And yes, this happens on both i386 and amd64 builds.
-
I keep getting this System Log entry:
dhcpd: parse_option_buffer: malformed option vendor-class. <unknown>(code 1027): code tag at end of buffer - missing length field.
It happens over and over again, several times a minute, and it is the only entry. Not sure if this helps.
-
Okay, it just happened… I was able to ping both WAN hostnames and WAN IPs from the console during the "crash". However, I was not able to ping any LAN device or address. I didn't know what you meant by vmstat -i ; sleep 5 ; vmstat -i or netstat -m... are those commands to run in the shell? From these results it looks to me like an issue with the LAN interface. Is this correct?
-
-
Does look like an issue with your LAN interface.
The
vmstat -i ; sleep 5; vmstat -i
netstat -m
are shell commands. The first displays interrupt counts by IRQ, sleeps 5 seconds, then displays the interrupt counts again (so it may show whether a device is still interrupting). The second displays various counters related to network buffers, so it may reveal a resource exhaustion problem. Since you can get ping responses on your WAN interface, you are unlikely to have a network buffer exhaustion problem. I'm still interested in the vmstat output; I'm not so interested in the netstat output, though there would be no harm in collecting it.
Suggestion: swap your WAN and LAN interfaces and see if it's the WAN interface that is "gummed up" the next time the problem occurs.
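To make the two interrupt snapshots easier to compare, something like the following could be used. It is a rough sketch, not a pfSense tool: the function name `irq_delta` is made up, and it assumes the usual `vmstat -i` layout in which the last two columns of each row are the total and the rate.

```sh
#!/bin/sh
# Hypothetical helper: print how much each interrupt counter grew between
# two saved "vmstat -i" outputs (first argument = earlier file, second = later).
irq_delta() {
    awk '
        $1 == "interrupt" { next }          # skip the header row
        NF < 3            { next }          # skip blank or short lines
        {
            total = $(NF - 1)               # second-to-last column is the total
            name = $1                       # everything before it is the name
            for (i = 2; i <= NF - 2; i++) name = name " " $i
        }
        NR == FNR      { before[name] = total; next }   # first file: remember
        name in before { printf "%s %d\n", name, total - before[name] }
    ' "$1" "$2"
}

# Example use on the pfSense console:
#   vmstat -i > /tmp/irq1; sleep 5; vmstat -i > /tmp/irq2
#   irq_delta /tmp/irq1 /tmp/irq2
```

A delta of 0 on the row for the LAN NIC while hosts are still sending it traffic would suggest that interface has stopped interrupting; a growing count would point elsewhere.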
It appears your motherboard has two Realtek (re) interfaces. You mention three interfaces. What are their FreeBSD names? (e.g. WAN1:re0, LAN:re1, WAN2:sk0). You will be able to get this information from the Interfaces -> (assign interfaces) page in the web GUI.
-
I have re0 (LAN), re1 (WAN1), and re2 (WAN2). The third interface is a PCI Gigabit card from Trendnet (assuming they use Realtek chips). Last night I switched the LAN interface to re2 (the PCI card) just to see if this fixes it. It hasn't crashed yet; I will keep you posted.
Still getting this error in the System log though:
dhcpd: parse_option_buffer: malformed option vendor-class. <unknown>(code 1027): code tag at end of buffer - missing length field.
This error happens several times per minute.
-
Probably some machine is generating badly formed DHCP requests. You might need to do a packet trace to find out the IP address or MAC address of the offender. It's a pity the message doesn't report that.
However, if I recall correctly, there have been problems with enabling some of the NIC offload options on some types of NICs (including Realtek) and this might be relevant to your problem. Under System -> Advanced click on Networking tab, scroll down to Network Interfaces and check the three offload options are disabled. I don't know what else needs to be done for that to take effect but a reboot should do it.
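The offload state can also be checked from the shell, since `ifconfig` lists the enabled capabilities on its `options=` line. A minimal sketch, assuming that usual FreeBSD output format; the function name `offload_flags` is made up for illustration, and a given driver may not support every flag.

```sh
#!/bin/sh
# Hypothetical helper: read "ifconfig IF" output on stdin and print any
# checksum/TSO/LRO offload flags that are currently enabled, one per line.
# Prints nothing (and returns non-zero) when none of them are enabled.
offload_flags() {
    sed -n 's/.*options=[0-9a-f]*<\(.*\)>.*/\1/p' |
        tr ',' '\n' |
        grep -E '^(RXCSUM|TXCSUM|TSO4|TSO6|LRO)$'
}

# Example use (re0 is the interface name from this thread):
#   ifconfig re0 | offload_flags
# To switch them off until the next reboot, for a quick test:
#   ifconfig re0 -rxcsum -txcsum -tso -lro
```

If this still prints flags after the GUI settings have been applied and the box rebooted, the settings probably did not take effect on that interface.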
-
Okay, no interface crashes since I swapped the WAN and LAN connections; however, I will give it a little more time before I jump up and down with excitement. I disabled the checksum option and rebooted, but am still getting that error over and over again: dhcpd: parse_option_buffer: malformed option vendor-class. <unknown>(code 1027): code tag at end of buffer - missing length field. I don't care so much, though, as long as it stays up.
-
. . . am still getting that error over and over again: dhcpd: parse_option_buffer: malformed option vendor-class. <unknown>(code 1027): code tag at end of buffer - missing length field. I don't care so much though as long as it stays up.
Suggestion: on a console or SSH session to pfSense, give the shell command
# tcpdump -i IF -c 10 -e -vvv udp port 67
(replace IF with the FreeBSD name of the interface running the DHCP server). This should provide at least the MAC address of the system(s) sending the offending message.
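If the capture is noisy, the source MAC addresses can be tallied instead of read by eye. The sketch below assumes the usual `tcpdump -e` link-level header line, where the second field is the source MAC and the line contains the word "ethertype"; the function name `dhcp_src_macs` and the interface `re2` in the usage note are assumptions for illustration.

```sh
#!/bin/sh
# Hypothetical helper: read "tcpdump -e" output on stdin and print
# "count source-MAC" pairs, busiest sender first.
dhcp_src_macs() {
    awk '/ethertype/ { print $2 }' |   # field 2 of a header line = source MAC
        sort | uniq -c | sort -rn |
        awk '{ print $1, $2 }'         # normalise the padding added by uniq -c
}

# Example use (re2 stands in for the DHCP server interface):
#   tcpdump -n -e -i re2 -c 50 udp port 67 | dhcp_src_macs
```

The pfSense box's own MAC will show up too, because its DHCP replies also use port 67; a client MAC with an unusually high count is the likely source of the malformed requests.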