Successful Install on Watchguard Firebox X700!
-
For benchmarking use a setup like this:
host1-----pfSense1------(ipsec)----bench-pfsense----host2
- host1 and host2 have to be able to generate enough traffic to keep the ipsec encryption busy (more traffic than it can actually handle)
- pfSense1 has to be faster than the bench-pfsense or you will measure the wrong machine
- only use crossover cables between all the machines to reduce other factors like switches or loaded networks
Once you have set this up, use tools like netio or iperf to pump traffic from host2 to host1 and modify the hardware of your bench-pfsense. You can also play around with different encryption algorithms, as some are faster and some are slower.
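A rough sketch of the traffic-generation step with iperf, assuming the classic iperf client/server flags (-s, -c, -t, -i, -P). The address 192.0.2.10, the durations and the function names are placeholders and not part of the setup described above. With DRY_RUN=1 the commands are only printed.

```shell
#!/bin/sh
# Sketch only: host address, durations and function names are hypothetical.

# Print the command instead of executing it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# On host1: listen for incoming test traffic.
iperf_server() {
    run iperf -s
}

# On host2: push traffic to host1 for $2 seconds using $3 parallel streams,
# printing an interim report every 10 seconds.
iperf_client() {
    host="$1"
    seconds="${2:-60}"
    streams="${3:-4}"
    run iperf -c "$host" -t "$seconds" -i 10 -P "$streams"
}

# Example (on host2, with host1 reachable at the placeholder address):
#   iperf_client 192.0.2.10 60 4
```

Repeat the client run after each hardware or encryption change on the bench-pfsense so the numbers stay comparable.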
-
After reading this topic I also revived 2 old WG Firebox X series boxes I had lying around. Install went very smoothly, and everything seemed to be working very well.
Until I found out the box runs very unstable…
Network throughput is very unpredictable, sometimes up to 6Mbit but mostly about 512kbit where it should be close to 100Mbit (other 100Mbit network devices connected through cat5e).
Network traffic often comes to a complete stop, either for a few seconds or until I reboot the box, and strangely enough I can trigger this in a few ways, like 'trying to open the webgui from the WAN side of the firewall' or 'starting a large download'. Once this occurs, the box stops answering all network traffic; I can't even ping it anymore. Most of the time, once I stop the download or close the browser that is trying to open the webgui, the box starts answering ping requests again right away.
Like in the other Watchguard threads on this forum, this seems to be related to the "kernel: re0: watchdog timeout" error that shows up in the logs at the time the symptoms occur.
From what I've read, it has to do with hardware issues concerning the network cards?
I've searched the forums and google, and found a few 'solutions' suggested by other people with the same issues:
- disable ACPI using 'echo hint.acpi.0.disabled="1" >> /boot/loader.conf' -> didn't help
- enable device polling -> after this, once the watchdog issue occurs the firewall always stops answering network traffic until a power cycle, quite annoying
- throw out the NICs, replace them with NICs using another chipset -> unfortunately, on this Watchguard hardware that is quite difficult, since these are 6 onboard realtek based nics
- disable "plug and play OS" in the BIOS -> unfortunately the watchguard mobo doesn't have a keyboard connector, so I can't get into the BIOS
- use the SMP kernel instead of the uniprocessor kernel -> I'm running the embedded kernel because the firebox only has serial input/output instead of vga/kb
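For anyone repeating the ACPI test: the plain 'echo ... >> /boot/loader.conf' appends another copy of the line every time it is run. A small sketch that only adds the hint when it is missing; add_loader_line is a made-up helper name, not something shipped with pfSense.

```shell
#!/bin/sh
# Sketch only: add_loader_line is a hypothetical helper, not part of pfSense.

# Append a line to a loader.conf-style file unless it is already present.
add_loader_line() {
    file="$1"
    line="$2"
    # -q quiet, -x whole-line match, -F fixed string (no regex)
    if ! grep -qxF -e "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example (as root on the firewall):
#   add_loader_line /boot/loader.conf 'hint.acpi.0.disabled="1"'
```

The hint only takes effect after a reboot, since loader.conf is read at boot time.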
I have three questions actually:
- How do I switch to the SMP kernel without losing the serial console? How can I do this from within pfsense, without reinstalling the entire device?
- Callout to the other people using watchguard hardware for pfsense: does this watchdog timeout error occur on all fireboxes with pfsense?
- Does anybody have any other suggestions I can try?
(That's 4 questions actually :) )
This is a big problem since the box is just unusable now, and I'd really like to get pfsense on it.
Small Update: This occurs on both fireboxes with pfsense on them, in completely different environments, connected to different switches, different cabling.
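To compare boxes, it may help to count how often each interface logs the error. A sketch that works on a plain-text copy of the system log; the helper name and the idea of exporting the log to a text file first are assumptions, and only the "reN: watchdog timeout" message text comes from the logs quoted above.

```shell
#!/bin/sh
# Sketch only: the helper name and the plain-text log copy are assumptions.

# Count "watchdog timeout" lines per re interface in a text log file.
# Prints one line per interface, e.g. "re0 3".
count_watchdog_timeouts() {
    logfile="$1"
    # Keep only the interface name from matching lines, then tally them.
    sed -n 's/.*\(re[0-9][0-9]*\): watchdog timeout.*/\1/p' "$logfile" |
        sort |
        uniq -c |
        awk '{ print $2, $1 }'
}

# Example:
#   count_watchdog_timeouts /tmp/system-log.txt
```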
-
Right now I have 3 X700s running and so far have not had a problem. I used to get the watchdog timeout errors when a box was connected to any Netgear hub or switch (tried 4 different ones). Once I removed the Netgear switches (plugging directly into the cable modem and an HP ProCurve switch) the errors were gone. However, even when I was getting the watchdog timeout errors, they never caused any problems (currently my 3rd X700 is still plugged into a Netgear switch and continually gets those errors). So maybe if we find some differences between your install and mine it will help you narrow down the problem.
Did you do a full install onto a HDD or embedded on CF?
What switches are you using?
-
I'm having these issues on 2 firebox units, let's call them firebox 1 and firebox 2 for now.
Firebox 1 is a full install on a 1GB CF card, done using a usb cf reader and a laptop with the livecd, installed through option 99, with the embedded kernel selected at kernel selection. I plugged the CF card into the firebox after the install.
It is connected to a cisco catalyst 3550 on one of the FastEthernet 100Mbit ports.
Firebox 2 is a full install on a 2.5" hard drive, installed the same way.
This one is connected to a cheap 16 port table switch but I can't remember the brand right now (I'll try to get back on that later tonight).
Both units have the lcdd process installed to show cpu and memory stats on the firebox lcd display, installed as described in http://forum.pfsense.org/index.php/topic,7920.msg46356.html#msg46356 . Would this be causing an issue?
Strange thing is I ran firebox1 on my home network for about 3 hours while installing and configuring it, and did not notice these errors at that time. However, I can't say for sure they weren't there, maybe I just didn't notice them since I didn't run any traffic through the box.
Thanks a lot for helping me on this !
UPDATE: Testing stuff is a bit difficult since both boxes are at remote locations (which is exactly why this is such an annoying problem :) ) but I'll try to drive over there tomorrow and see if the problem persists when I plug my laptop straight into firebox 2 instead of going through the desktop switch. We should be able to confirm/rule out a switch problem then.
-
I am also running lcdd, but not the way "ridnhard19" did it (check my post a few down: http://forum.pfsense.org/index.php/topic,7920.msg46902.html#msg46902).
Is your 1GB CF card a microdrive or regular flash mem?
Looks like we installed them the same way, did you have to change the partition information in /etc/fstab? If so did you also change your swap partition to the correct drive?
I was thinking the watchdog error might have been from just using cheap switches, but the cisco rules that out.
I don't have many other ideas. 2 of mine have been up for over a month running a couple of ipsec tunnels, carp, squid, squidguard, and handling quite a bit of traffic (about 3-4 voip calls, 4-5 terminal server sessions, large file transfers, and some web browsing all at the same time) and they don't seem to slow down at all.
-
The CF card is a normal CF card, not a microdrive. I know this is not recommended and I plan to change this over time. The reason I installed it like this is that I originally thought picking the 'embedded kernel' during install also meant having a read-only filesystem like on the embedded images. I found out after installation that it does not, but didn't bother reinstalling yet. Figured it'd be a nice test to see how long the CF card lasts. (CF cards are cheap nowadays anyway, this one was 6 euro)
On the partition information: yes, I had to change it after install, and I also changed the swap partition info on firebox 2.
The correct setting on firebox 2 (with the 2.5" hard drive) was /dev/ad2s1a for the root fs and /dev/ad2s1b for the swap.
Firebox 1 needed /dev/ad0s1a since it is running from a CF card.
Firebox 1 however doesn't have a swap partition since I manually removed that during install. (I figured using swap on a CF card would be really overdoing it :) )
So far I can't really think of anything we did differently. Tomorrow evening I'm going over to firebox 2 to test whether the problem also occurs using just a network cable, without being connected to the switch. I'll post an update while testing it.
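For others doing the same install, a sketch of the fstab change described above: it prints a copy of an fstab file with one disk name swapped for another so the result can be checked before it replaces /etc/fstab. retarget_fstab is a made-up helper, and the ad0 to ad2 direction in the example is only an illustration; the disk name written by the installer depends on the machine used for the install.

```shell
#!/bin/sh
# Sketch only: retarget_fstab is a hypothetical helper; review the output
# before replacing the real /etc/fstab.

# Print the contents of fstab file $3 with disk $1 replaced by disk $2,
# e.g. ad0 -> ad2. Only /dev/<disk>s<N>... style entries are touched.
retarget_fstab() {
    old="$1"
    new="$2"
    file="$3"
    sed "s|/dev/${old}s\([0-9]\)|/dev/${new}s\1|g" "$file"
}

# Example:
#   retarget_fstab ad0 ad2 /etc/fstab > /tmp/fstab.new
```

Writing to a separate file first avoids ending up with an unbootable box if the disk name turns out to be wrong.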
-
I'm on an x500 with a 2.5" laptop drive, full install, not the embedded kernel. I also have the LCD driver.
I've had the watchdog timeouts, on the re1 LAN interface only… about 4 or 5 times I think; sometimes it's able to recover, other times it required a hard reboot. I can pretty confidently say this problem only occurred while I was clicking around in the webgui configurator, under minimal network load. I haven't had them happen when not logged into the gui, even when maxing out my dual wan bandwidth for hours at a time with lots of states, torrents etc.
When it happens I can't help but get the (probably incorrect) feeling that the http server or php is running away with the cpu or network card or something... I thought about trying a tweak I read about on these forums that increases the priority of the http server, something that is supposedly adjustable in the upcoming 1.3.
-
Ok, some more debugging info from the aforementioned firebox 1, but this one is a tricky one, I have absolutely no logical explanation for it:
First a little network schematic:
LAN (re1) <--- 172.20.2.1/24 ---> Firebox 1 <--- WAN (re0) using public x.x.x.x/28 subnet ---> SWITCH <---> The internet <---> Laptop at remote location
                                                                                               |_> linux server in same /28 subnet
(I hope this is clear, the linux server is connected to the same switch as the WAN port of the firebox, the /28 subnet consists of public internet ip addresses)
Yesterday I was doing some tests on when the watchdog errors occur, from my laptop at the remote location.
pfsense Webgui is running on HTTPS, port 443 so I opened up the HTTPS port and icmp ping replies on the WAN side of the firewall.
I'm continuously running a ping from the laptop to the WAN IP of the firebox.
As soon as I told firefox on my laptop to connect to the firebox WAN ip through https, firefox showed me the http authentication dialog; I filled in the fields and pressed OK, and the firebox WAN ip immediately stopped responding to ping requests.
Firefox tries to load the page but stays blank (since the firebox obviously stopped sending data). As soon as I press the STOP button in firefox, the firebox WAN ip address starts replying to ping requests again.
So far you'd think the http process is causing the problem, but now it starts to get really strange:
I open up an SSH session to the linux box in the same public /28 internet range as the firebox, and tunnel an https connection through the SSH connection.
In other words, I mapped a tcp port on my laptop to one on the linux server so that from the firebox's point of view my requests to open up the web interface come from the linux server in its own WAN subnet.
And now, opening up the web interface works perfectly right away. I see the pfsense interface, can click around in all menus, don't get any timeouts and don't get any watchdog errors in the logs… The public WAN ip also keeps responding to the still running ping requests.
So I tried to open the web interface from my laptop over the internet through the real public IP again, and again the watchdog errors occur and the firebox stops responding to network traffic...
I'm not making any sense of this...
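For anyone who wants to repeat the tunnel test, one way to set it up is ssh local port forwarding (-L). This is a sketch of that approach and may not be exactly what was used above; the login, the addresses and the local port 8443 are placeholders. With DRY_RUN=1 the command is only printed.

```shell
#!/bin/sh
# Sketch only: login, addresses and ports are placeholders.

# Print the command instead of executing it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Forward a local port through the linux server to the firewall's HTTPS port.
# $1 = login on the linux server (user@host)
# $2 = firewall WAN ip
# $3 = local port to listen on (default 8443)
open_webgui_tunnel() {
    jump="$1"
    wan_ip="$2"
    local_port="${3:-8443}"
    # -N: no remote command, just keep the forwarding open
    run ssh -N -L "${local_port}:${wan_ip}:443" "$jump"
}

# Example, then browse to https://localhost:8443/ on the laptop:
#   open_webgui_tunnel admin@linux.example.net 203.0.113.2
```

With this in place the firewall sees the https connection coming from the linux server in its own WAN subnet, which matches the working case described above.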
-
Thanks for the help trying to solve this, but I'm afraid I'm running out of time to fix this.
I've replaced both fireboxes with 'normal' PCs, exported and imported the config, and all problems have disappeared.
I'm not giving up on the watchguard hardware yet, but I don't have the time to keep looking for a solution right now.
-
ugh.. just had to add a few rules to the firewall and it ended up being a multi-reboot network outage due to watchdog timeout freezeups.. box had been up for a couple weeks, transferring tens of gigs of data per day, but soon as I need to poke around the webgui, the lan interface decides to puke all over itself
-
Hey Valhalla1,
This is what I've been battling with for a number of weeks now. There is an issue with the driver support in FreeBSD for the RealTek 8139C+ chip used for the network cards in this box. It's been a real headache trying to figure out. I'll be pulling mine out of the main network and setting it off to the side soon, but I have been trying a number of things.
What I've done so far:
- Updated pfSense to use the 6.3 RELEASE of FreeBSD - didn't help
- Updated pfSense to use the 6-Current (latest 6.3) drivers for the realtek network controller (re)
- Disabled ACPI - I had thought this fixed it a while ago but I was fooled
I'm out of ideas.
-
well I appreciate the efforts, I got into pfsense really due to seeing this original post and just had to get me one of these pretty red boxes. managed to snag one on ebay for $51.. a fantastic value considering I paid twice that a few years ago for a 486 cpu soekris with half the network interfaces. fwiw it works great as long as I don't need to use the webgui a lot. also I (usually) don't have to reboot pfsense when it happens if I hit stop on the browser, close the tab and wait a few seconds
are the realtek drivers in freebsd 7 any different from the 6.3 RELEASE drivers?
jmcentire said he stopped getting the watchdog timeouts after switching to hp procurve switches. mine is connected to just a 24 port d-link unmanaged switch on the lan interface I get the timeouts on. maybe I'll pick up one on the cheap, wanted to play with vlans anyway
-
My timeouts went away when I upgraded my switch to one of the higher end netgear managed ones, and to a cisco one on my other box. I've got 3 of these running now. It's sad: one is a v60 that I paid over 10k for in 2002; the others are an x1000 and an x700 I picked up off ebay for under 200 each.
I'm working on doing a full install on one now, using cf plus a laptop drive for the partitions that get written to a lot, like /var/log and swap.
-
Hmm, you guys bring up a good point. I have an SMC managed switch which had always seemed to work well. I just changed the port configuration on the switch for the ports the firebox is plugged into, disabling autonegotiation for that set of ports and manually setting them to 100 full duplex.
I hope this will do the trick.
-
Well, I tried manually setting the port speeds to 100 full-duplex in pfsense and on the SMC switch and got the same watchdog timeouts. I'm going to try a netgear switch I've got lying around and see if that does anything. I'll post back the results of that.
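For a quick test from the shell, forcing the media type could look roughly like this, using the FreeBSD ifconfig media syntax. The function names are made up, re0 in the example is a placeholder, and a setting made this way is only temporary (it is not stored in the pfSense config), so treat it as a test aid. With DRY_RUN=1 the commands are only printed.

```shell
#!/bin/sh
# Sketch only: function names and the interface in the example are placeholders.

# Print the command instead of executing it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Force interface $1 to 100 Mbit full duplex.
force_100_full() {
    run ifconfig "$1" media 100baseTX mediaopt full-duplex
}

# Put interface $1 back on autonegotiation.
restore_autoselect() {
    run ifconfig "$1" media autoselect
}

# Example (as root on the firewall):
#   force_100_full re0
```

The switch port has to be set to the same speed and duplex, otherwise the mismatch itself will cause errors.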
-
Hello, would it be possible to snap some pictures of the inside of the x700? I'd like to see the layout of the board, clearance, and such. Thanks.
Nevermind, I found a high res picture on another site.
-
Anyway, if anyone else is interested in these watchguards, the X500, X700, X1000, and X2500 are all the same hardware, they just have different licenses to allow higher throughput.
It looks like the mini pci slot could be used for a wifi card, so you could choose if you wanted a vpn card or wifi card in that slot. Granted, both have to be recognized by pfsense.
-
any updates on that SMC switch and the watchdog timeouts? I'm looking to buy a managed switch that works with these watchguards without the timeouts, preferably as cheap as I can get away with.
there are reports in this thread that some cisco switches had the errors, but someone else said theirs don't on cisco switches (what models?)
also netgear has been reported with the problems, but not a higher end model which got rid of the errors (which higher end model?)
my d-link causes them.. hp procurve has been mentioned as error free but not sure which models (any managed procurve switch?)
looking for something I can find under $200 on ebay preferably.. what about a dell 2716 'webmanaged' switch? those are cheap but support vlan, link aggregation, port mirroring..
and back to a different subject, the VPN accelerator card. anyone tried 1.3 ALPHA ALPHA pfsense on their watchguard boxes? I would but mine is in production use at the moment. maybe since it's freebsd 7 it might add support for the vpn card in these watchguards?
-
The watchguard seems to have a bunch of Realtek NICs onboard.
That's where I'd expect the troubles to start…
The Dell 5224 or 5324 switches basically are rebranded SMCs.
They can be found on eBay regularly if you don't mind the additional ports.
-
I'm using a couple of Netgear GSM712s I bought off ebay. I've tested my watchguard v60 and x700s on them and had no timeout problems. I also used them through my cisco 3548xl with no problems. I do have Dell 5324s at work, which is where these firewalls are headed in the end, but I haven't had a chance to test them on the Dells.