Very very strange problems. [SOLVED]

  • After having connection issues with AT&T DSL, I decided to reduce the fault points by building an open source firewall to replace my 10-year-old wireless router.  I am using…

    An old Dell Pentium 1 GHz with 512 MB RAM

    2 Intel PRO/100 NICs (brand new from plastic-wrapped boxes, purchased just for the project)

    An old 13 GB hard drive

    A Netgear WGA311 wireless card (used, purchased for the project)

    An old 16-port Netgear hub (not a switch, just a hub)

    About a week and a half ago I installed the most recent version of m0n0wall on this box, and everything worked great.  Wireless was spectacular, the internet connection was solid, and all around I was very pleased.  Then yesterday, after finding pfSense, I decided to give it a shot since it has a TON of features I really wanted that weren't in m0n0wall.

    Here is the sequence of events leading up to now.  I installed pfSense last night, and for the first hour and a half or so everything was going great.  I was playing with all of the features, I installed several packages to try out, and I even set up squid.  I figured if I broke something I could just wipe the drive and reinstall, since it's such a quick procedure.

    Anyway, I think at one point I may have installed a package that logged traffic or something, and it ended up using all of my hard disk.  I know this because one time when I logged in, pfSense was showing 100% disk usage on the main screen.  I figured no problem, I would just reinstall.
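    A disk filling up like this can be spotted before it takes the box down.  What follows is only a minimal sketch, assuming Python 3 is available on whatever machine does the checking (a stock pfSense install may not include it).  The function names and the 90% threshold are made up for illustration and are not part of pfSense.

```python
import shutil


def disk_percent_used(path="/"):
    """Return the percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    return 100.0 * usage.used / usage.total


def is_nearly_full(path="/", threshold=90.0):
    """Return True when usage is at or above `threshold` percent.

    The default threshold is an arbitrary example value.
    """
    return disk_percent_used(path) >= threshold


if __name__ == "__main__":
    pct = disk_percent_used("/")
    print(f"/ is {pct:.1f}% full")
    if is_nearly_full("/"):
        print("Warning: disk is nearly full; check for runaway log files.")
```

    Run from cron or by hand, something like this would have flagged the logging package long before the dashboard hit 100%.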

    I then formatted the drive during the install and installed again.  This is where the strangeness starts.  pfSense ran for a while, but then all of a sudden I couldn't access the web interface and my internet connection dropped.  I went over to the console, and everything looked fine; top showed only marginal CPU usage.  I rebooted, and again everything chugged along for a while before the web interface died and the internet connection dropped.  I then noticed that pfSense would die every time I started a large file transfer on my internal network.  In fact, after a third reboot pfSense ran for a bit while there was no major traffic on the network, so I started a large file transfer to see if it would kill pfSense, and it did.  One of the times I went to the console I saw the following message repeated over the whole screen....

    discard frame w/o leading ethernet header (number gibberish here)

    At this point I was tired, so I switched to my old wireless router and went to bed.  I started on round two with pfSense today.  I did another fresh install and the basic configuration, and a few minutes after the reboot I couldn't access the web interface and my internet connection dropped.  I tried it again with the exact same results.  Now remember, this doesn't happen the instant I finish configuring it.  It takes a few minutes, so it's not like I'm making a firewall rule and locking myself out or anything.  I configure it, it works for a few minutes, then it explodes.  When I go to the console, everything looks fine.  The first couple of times I believe a member of the family was transferring a show between TiVos, so that may have killed it, but a little later, once that transfer was done, I tried again and pfSense still died without any real traffic on the network.  I'm completely stumped.

    The basic config I'm doing is the following, as best I can remember....

    1. At the console I assign ports by letting it auto-detect them as I plug network cables in, assign the LAN address, and turn DHCP on.

    2. Go to the web GUI, change the admin password, and set up the WAN with my PPPoE information.

    3. Add the wireless card to the interfaces, bridge it to the LAN, set it as an access point, set it to use WPA+WPA2, add my key for WPA2, and set authorization to open authorization.  I also set that one option that protects wireless G packets from B packets; I don't know what it is, but it sounded like a good idea.

    4. Go into the advanced setup and enable SSH so I can view the console from my computer.

    5. Set a firewall rule that allows all traffic from any source on the WLAN, and another that allows all traffic from any source on the LAN.

    6. Enable DynDNS and put in my account info.

    I tried that general config three times, but the web GUI just kept dying and my internet connection kept dropping.  I would reboot the box, get back into the web interface, and have internet access for a minute or two, and then it would die again.  I reinstalled and reconfigured, the same thing happened, and I did it all again, for a total of 3 reformats and reinstalls.

    Finally I reinstalled m0n0wall, and m0n0wall is chugging along like a champ as I type this.  I even have several HUGE files transferring right now, maxing out that poor 16-port hub to try to replicate the problem, but whatever it is, m0n0wall doesn't seem to be affected at all.

    The only hunch I have comes from noticing that pfSense always puts the LAN interface in promiscuous mode.  Because I'm using a hub and not a switch, perhaps pfSense is trying to read or process all the network traffic that it's seeing on the LAN interface, and somehow that's killing it.  However, even if that is the case, it's still strange, because at the console top shows marginal CPU usage even after I've lost the web GUI and internet connection.

    I was a network administrator for 4 years about 10 years ago, so while I'm no expert now, I'm capable, and this situation has me completely stumped.  Any help would be greatly appreciated, because pfSense gets my nerd hormones raging and I'd really love to use it.

  • Are your NICs fxps? There is a known problem in pfSense 1.2.3: the fxp driver mistakenly thinks some types of fxp cards have hardware checksum offload capability.

    I suggest you disable hardware checksum offloading from the web GUI (System -> Advanced functions, scroll down to Hardware Options).

    Have you checked the state of your 13 GB drive? It's probably well past its normal expected life.
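    To confirm that the checksum setting actually took effect, one option is to look at the options line that FreeBSD's ifconfig prints for the interface, where RXCSUM and TXCSUM appear while offload is enabled.  The sketch below is in Python and only parses text you hand it.  The function names are hypothetical, and the two sample strings are illustrative, not captured from the box in this thread.

```python
import re


def interface_options(ifconfig_text):
    """Return the set of names listed in an 'options=<hex><A,B,C>' line."""
    match = re.search(r"options=[0-9a-fA-F]+<([^>]*)>", ifconfig_text)
    if not match:
        return set()
    return {name for name in match.group(1).split(",") if name}


def checksum_offload_enabled(ifconfig_text):
    """Return True if receive or transmit checksum offload is listed."""
    return bool(interface_options(ifconfig_text) & {"RXCSUM", "TXCSUM"})


# Illustrative samples only (assumed layout of FreeBSD ifconfig output).
SAMPLE_ON = (
    "fxp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500\n"
    "\toptions=b<RXCSUM,TXCSUM,VLAN_MTU>\n"
)
SAMPLE_OFF = (
    "fxp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500\n"
    "\toptions=8<VLAN_MTU>\n"
)

if __name__ == "__main__":
    for label, text in (("before", SAMPLE_ON), ("after", SAMPLE_OFF)):
        state = "enabled" in "enabled" if checksum_offload_enabled(text) else False
        print(label, "offload enabled" if state else "offload disabled")
```

    On the firewall itself the text would come from running ifconfig fxp0 at the console or over SSH, rather than from the sample strings.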

  • Yep, the NICs show up as fxp0 and fxp1.  Thanks for the tip about the potential hardware problem and the setting to fix it.  I'll give that a shot and report back.  I did think of the hard drive as a potential problem, but when I went to the console I was able to browse around the file system with no problems, and in my experience if you can access the computer and there's no clicking, you can usually trust that the drive is working.  However, I will try replacing it if nothing else works.  I have an old 10 GB drive I could swap in.  :)

  • So far so good.  It looks like the checksum thing was it.  Thanks so much for your help, Bob.  I really appreciate it!

    My luck with these things never ceases to amaze me.  The main reason I bought the Intel PRO/100s was their practically legendary status in the *nix world, and then when I go to use them they turn out to be one of the broken ones. lol
