BETA BLOWS, WANT TO DOWNGRADE ASAP



  • I have no idea what is going on, but since I upgraded to the snapshot, my pfSense keeps freezing up and losing the internet connection. I am using a LAGG for my LAN (two 10/100 NICs) and can still access my server, but it shows that the internet interface has lost connection. This is fixed once you reboot. Also, my captive portal keeps crashing or reporting a crash on the main page; funny thing is, I'm not even using the captive portal at all… None of these issues exist in the stable release. Is there any way to go back to the stable release without reinstalling, as I am not in the same city as my pfSense box and only have remote access?

    Thanks again, I hope that the RC is much better than the BETA.

    Best wishes,
    Matt



  • Just an idea for a workaround: could I set up a reboot every night at, let's say, 5 am? This would keep the internet interface up and running, at least until this moves out of beta to RC.

    PS: just a little more about my setup: I had two NICs for the internet at one point; one WAN has been disabled as I no longer have two services. I did switch NICs to see if that was the problem, but it still seems to lose internet :( In short, it looks to be software related.

    UPDATE: is 51% CPU normal on an i386 dual core, P4 3 GHz? I am using traffic shaping, as well as limiters, with a PPPoE server.
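
For anyone weighing the same stopgap: a nightly reboot boils down to one cron entry. The sketch below is an illustration under assumptions, not an official pfSense procedure. `add_nightly_reboot` is a made-up helper name, the entry uses the FreeBSD system-crontab layout (with a user column), and on pfSense the system crontab may be regenerated from the saved configuration, so a cron package or GUI setting is probably the more durable place for it.

```shell
#!/bin/sh
# Hypothetical helper: append a "reboot at 05:00 every day" entry to the
# crontab file given as $1 (system-crontab layout: min hour dom mon dow user cmd).
add_nightly_reboot() {
    printf '%s\n' '0 5 * * * root /sbin/shutdown -r now' >> "$1"
}

# Try it against a scratch file first, not the live crontab:
#   add_nightly_reboot /tmp/crontab.test && cat /tmp/crontab.test
```

A scheduled reboot only hides the symptom; the WAN drop itself still needs a real diagnosis.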



  • It doesn't "blow", it's running numerous multi-million dollar businesses' networks with perfect uptime. Might FreeBSD 8.3 have some driver regression that impacts your NICs, possible. Might there be some weird combination of things you're using that's broken, possible. Might something else have gotten screwed up on your network and you're blaming it on the firewall when it isn't, very possible. A majority of the "I upgraded my firewall and it broke X" reports that we encounter with customers actually have 0 relation to the firewall.

    There is no way to go back short of reinstalling and restoring your pre-upgrade config.

    The info here is so vague it's impossible to suggest anything.



  • @cmb:

    It doesn't "blow", it's running numerous multi-million dollar businesses' networks with perfect uptime. Might FreeBSD 8.3 have some driver regression that impacts your NICs, possible. Might there be some weird combination of things you're using that's broken, possible. Might something else have gotten screwed up on your network and you're blaming it on the firewall when it isn't, very possible. A majority of the "I upgraded my firewall and it broke X" reports that we encounter with customers actually have 0 relation to the firewall.

    There is no way to go back short of reinstalling and restoring your pre-upgrade config.

    The info here is so vague it's impossible to suggest anything.

    I have googled, and some people are having the same issues as I. I do agree with the lack of information, so I will post my logs and maybe that will help. I think you're right about drivers, because I saw in the logs before I cleared them that the internet interface was "DOWN" and then back up at random, so I don't know. As far as the claims that pfSense runs million dollar networks, I wouldn't doubt that, I do doubt that they would touch "BETA" for production. I have used BETA in the past; it's just that this time around there seem to be a lot more issues which, as you said before, could be related to the latest release of BSD and its drivers. I do believe the NICs are Realtek.

    Thanks again,
    Matt

    PS: is there a script that could install updates without human intervention?



  • @mcrook:

    I have googled, and some people are having the same issues as I

    And what are those? You can't even explain the issues in your post, much less actually find something specific enough to know it's the same. The symptoms of issues people report with a firewall are relatively limited. People claim all the time here they have "the same issues" when really all they have are the same symptoms and they're incapable of actually troubleshooting to the real cause, which many times isn't related to the firewall at all.

    @mcrook:

    As far as the claims that pfSense runs million dollar networks, I wouldn't doubt that, I do doubt that they would touch "BETA" for production

    You'd be way wrong. I'm on such a network right now where we strictly use 2.1 and have for months. If the Internet is down, the entire business is down. Other significantly larger companies have been using 2.1 for years for their IPv6 networks. People have trusted our beta releases for a long, long time, and by working with us as commercial support customers, can very safely take advantage of new features they need and do so in a 100% stable manner in their network. We've had 2.1 running for 1.5-2 years on all our production networks and have had 0 firewall-induced outages.



  • Well I stand corrected sir  ;D

    I will report back with logs, but I just did a simple google for "pfsense 2.1 no internet reboot" and lots of stuff comes up. That's how I know it might be related to drivers, since 2.0 worked just fine. I am not trying to bash pfSense in any way; it's a great product by far and has a great community to help support it. I am just expressing my feelings on the latest beta, and it may or may not be a pfSense firewall issue; it could be a BSD issue. I have heard Realtek NICs are a pain in the @$$ sometimes.

    Thanks for everything,
    Matt


  • Rebel Alliance Global Moderator

    Not trying to bash?  Strange way to title a post then

    "BETA BLOWS,"

    You have given NOTHING to help you with..  You state snap - which one did you install?  What hardware are you on?

    You state it "keeps crashing or reporting a crash on the main page" - And can you post this info?  Whenever pfsense reports a crash, you can view the details of it..



  • "BETA BLOWS"

    Here let me correct this for you.

    No beta does not blow.  It sucks.  :-*  LOL, just kidding.

    But really, it's beta.  What did you expect.  Let me guess.  You went from "stable" release to beta with no backup or backout plan.  That's on you.

    But people here are pretty good and when provided with the right info can probably get you squared away.



  • I know beta has its risks, as I said before, I have run the beta and RC for 2.0 and was very pleased.
    The issues I am having with 2.1 may or may not be related to pfSense or BSD for that matter.

    As for bashing, it's only a beta, it's gonna have bashing and banging, and some bugs until it even gets close to RC. I am thankful however it's not Alpha :D

    Yes yes, I did the classic upgrade with no backup  :o

    What and where can I get the information that will help me, help you, help me?

    I looked over the logs and don't see anything that could cause these problems. I think at this point, we can all agree that I should have made a backup, and just need to move forward with what I have and figure out the issues.



  • I've run pfSense beta in production in the past knowing that there is a small risk. There is always a risk at every stage of development that something might break, though the risk decreases at each stage of course. Having said that, it is my responsibility to check what changes have been made on the pfSense redmine, then evaluate whether those changes affect my setup and whether it's worth the risk to upgrade, bearing in mind my system is in production.

    I've seen some interesting features available in 2.1 that aren't in 2.0 and I'm willing to take the risk. I have a small network that consists of over 50 users, loads of application servers all on top of ESXi hosts, and remote locations running pfSense with VPN tunnels back to my pfSense VM on my ESXi hosts. If I decide to upgrade my environment to 2.1 and it breaks, as frustrating as it is, it's my fault. :0)

    It all depends on how urgently you need a feature that's available in the beta but not in the stable versions of pfSense.

    If it ain't broken, stick with stable; otherwise keep tabs on redmine to see what changes are applied in the snapshots, and if something breaks don't post inflammatory headings; it doesn't solve your problem.

    And the general consensus is try to use Intel NICs if possible.



  • @mcrook:

    I will report back with logs, but I just did a simple google for "pfsense 2.1 no internet reboot" and lots of stuff comes up.

    Really?  ::) Replace "pfsense 2.1" with any other software or hardware product /vendor and you will also find "lots of stuff". I'm sorry but your post could be used as the poster child of "what not to post in a support forum".



  • @mcrook:

    Thanks again, I hope that the RC is much better than the BETA.

    Best wishes,
    Matt

    There is almost no information or log reports for anyone to attempt to even look at your problem.

    2.1 is BETA and may have some issues, especially with FreeBSD drivers or interfacing with other packages. Apart from a few issues with packages (one of them BETA as well) which have been addressed, 2.1 is running fine on our home network.

    We standardize on Asus motherboards and Intel dual-port NICs, and this cuts down tremendously on any other issues we may have had in the past.



  • How would I go about getting the logs? It hasn't crashed since, but the internet just stops working… the other side, lan side (lagg) is fine?

    Do you think a factory reset is in order or what? I have changed nics and same thing?

    By the way, telling me what I did wrong isn't helping either. I know I screwed up, I just want to move forward. If I put out a bounty for support, would there be any takers?

    Thanks.

    Best wishes,
    Matt

    $100 okay?
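
On the "how would I go about getting the logs" question, one rough way to pull out just the link flaps is to filter the system log for the kernel's link-state messages. This is a sketch under assumptions: `link_events` is a made-up helper, the `kernel: <if>: link state changed to UP/DOWN` wording is the usual FreeBSD message but may differ on a given build, and the `clog /var/log/system.log` usage in the comment assumes the circular log format used by pfSense 2.x.

```shell
#!/bin/sh
# Hypothetical helper: read log text on stdin and keep only the link-state
# changes for the interface named in $1 (for example re0).
link_events() {
    grep -E "kernel: $1: link state changed to (UP|DOWN)"
}

# Assumed usage on the firewall (clog reads pfSense 2.x circular logs):
#   clog /var/log/system.log | link_events re0
```

Timestamps on the matching lines show whether the WAN NIC is really dropping link or whether the outage is happening further upstream.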



  • @mcrook:

    the internet just stops working.

    Please post the output of pfSense shell command:

    ```
    /etc/rc.banner ; ifconfig
    ```



  • I don't have physical access to the server, could this command be run from the web GUI?

    Thank you for your help :)

    Best wishes,
    Matt



  • Go to /exec.php in the WebGUI, then enter the command there and execute it.



  • @mcrook:

    I don't have physical access to the server

    SSH to pfSense from Linux/Unix system.
    Putty to pfSense from Windows.

    Or use Diagnostics -> Command Prompt in the pfSense web GUI, type the command in the Command box and click on the Execute button (essentially what gderf suggested).



  • I used the Web UI and now the Web UI is frozen lol

    Guess putty would have been a better choice.

    lol

    Will keep you posted



  • we're running pfsense 2.1 Beta on our Network for MANY MANY months in our Multi-million dollar
    company. other than the brief time my boss was a moron and switched us to Cisco. that came
    back to bite him in the a** and has since been fired and right back to pfsense we went..

    we standardized on Supermicro Servers with Dual Port PCI-e Gig-E intel Nics.

    we have AT last count 60 of these pfsense servers in production in Colocation as well as Warehouses
    and our offices.

    at our warehouses/colo sites, we run at pretty close to 75% utilization of Gig-E bandwidth.

    Downtime???? what downtime? 0… nada... even on EARLY 2.1 Snapshots.... (other than the brief 2 month stint my boss
    did with Cisco but that wasnt a pfsense issue)

    we have a HUGE mix of VOIP and Data... and lots of servers spread out...



  • @SunCatalyst:

    other than the brief time my boss was a moron and switched us to Cisco. that came
    back to bite him in the a** and has since been fired and right back to pfsense we went..

    So much for the old "nobody ever got fired for buying Cisco" mantra.  ;D

    What came back to bite you? Email response fine if you prefer not posting publicly (cmb at pfsense dot org), I'd like to know even if it's not something I can share.



  • Oh right make public accusations but only share in private.

    If not backed up in public.  It did not happen.  Or the cause was actually something else and blamed on Cisco.

    I want to know too.



  • @NOYB:

    Or the cause was actually something else and blamed on Cisco.

    Cisco does have its strengths and weaknesses, but based on human nature I find it a bit hard to believe that a "multi-million dollar" company's IT manager would be fired for choosing "enterprise-grade" Cisco gear (unless the company has a very tech-savvy management that really understands the issues involved, which usually means that said company is itself in telecoms or IT).



  • what we ran into when we switched to Cisco…

    no unbound, no radius,  and other packages which we run on pfsense, plus UniFi controller
    software for the Wireless Access Points in some of our offices, warehouses.
    our Links were congested to start with at 75% link usage and it went to
    almost 90%. (which in turn forced him to order another Gig-E drop to everywhere)

    which in turn forced him to spend LOTS more money on servers to run services on, which
    in turn took up more rack space, more man hours to deploy , time to send techs to every
    place we have routers in. etc etc.

    as far as Cisco hardware itself, it works great, BUT the incurred EXTRA costs every
    month surely didn't help on top of the ALREADY crazy amount of money he dropped on Cisco
    hardware... and then the servers. when he ordered the Cisco routers, he didn't order
    ones rated for passing multi Gig-E worth of traffic, and that caused problems of its own
    (heard some of the purchases WEREN'T approved and he ordered this stuff anyway)

    all in all , management was pissed we had some downtime during the what should
    have been a 6 hour maint window per site to cut over (on different days according to when
    our utilization was at the lowest) and in some cases it was BEYOND 24 hours....
    (boss was shipping hardware that HADNT been config'd to places, and techs didnt realize
    what happened until they tried to cut over) , CEO found out what was going on and
    they called him in the office and it was game over... think it was the combined mess
    that ultimately got him fired....



  • Sounds more like a planning, process, procedure, and MANAGEMENT issue to me.


  • Netgate Administrator

    Exactly. Switching significant parts of your network infrastructure is probably going to cause problems no matter what two things you're switching between. You can minimise those problems by careful planning and testing, something it sounds like this guy didn't do (or not carefully enough anyway).

    Steve

    Oh and this thread probably wouldn't be attracting nearly as much attention had it been titled:
    BETA BLOWS ON MY HARDWARE, WANT TO DOWNGRADE ASAP

    Or even better.

    Beta is not working well on my hardware, is it possible to downgrade?  ;D



  • Steve, he didn't listen to reason and test in the lab before deploying… EVERYONE in the dept is glad he is gone.
    things have changed tremendously for the better after he was fired.

    i could have fixed all the messes with the Cisco hardware, but it was easier to cut back over to a system that works and pull the other hardware.
    we're currently looking at 10GE for places that need more than 1 Gig-E drop. 2 times Gig-E seems to be more expensive than 10GE.
    and we're looking at which 10GE adapters are supported and work well in FreeBSD, and then we'll order and test extensively in the lab.
    all of our stuff is on extensively tested Supermicro Xeon servers (2 different models) with onboard Intel NICs. unfortunately NOT 10GE.

    back to the subject... Downgrade... just backup your config and reinstall... takes me all of about 10 minutes to
    have a working system from the time the CD goes in the drive til i have a working config.



  • Here is the output finally from that command:

    rl0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=3808<VLAN_MTU,WOL_UCAST,WOL_MCAST,WOL_MAGIC>
            ether 00:04:e2:06:65:1d
            inet6 fe80::204:e2ff:fe06:651d%rl0 prefixlen 64 scopeid 0x7
            nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
            media: Ethernet autoselect (100baseTX <full-duplex>)
            status: active
    rl1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=3808<VLAN_MTU,WOL_UCAST,WOL_MCAST,WOL_MAGIC>
            ether 00:04:e2:06:65:1d
            inet6 fe80::2e0:29ff:fe94:cb6a%rl1 prefixlen 64 scopeid 0x8
            nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
            media: Ethernet autoselect (100baseTX <full-duplex>)
            status: active
    re0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC>
            ether d8:5d:4c:d0:74:c9
            inet6 fe80::da5d:4cff:fed0:74c9%re0 prefixlen 64 scopeid 0x9
            inet 75.157.237.26 netmask 0xffffff00 broadcast 75.157.237.255
            nd6 options=1<PERFORMNUD>
            media: Ethernet autoselect (1000baseT <full-duplex>)
            status: active
    re1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC>
            ether d8:5d:4c:d0:76:ad
            inet6 fe80::da5d:4cff:fed0:76ad%re1 prefixlen 64 scopeid 0xa
            nd6 options=1<PERFORMNUD>
            media: Ethernet autoselect (none)
            status: no carrier
    fxp0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=4219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC,VLAN_HWTSO>
            ether 00:07:e9:bc:61:42
            media: Ethernet autoselect (none)
            status: no carrier
    enc0: flags=0<> metric 0 mtu 1536
    pfsync0: flags=0<> metric 0 mtu 1460
            syncpeer: 224.0.0.240 maxupd: 128 syncok: 1
    lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
            options=3<RXCSUM,TXCSUM>
            inet 127.0.0.1 netmask 0xff000000
            inet6 ::1 prefixlen 128
            inet6 fe80::1%lo0 prefixlen 64 scopeid 0xe
            nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
    pflog0: flags=100<PROMISC> metric 0 mtu 33200
    lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=3808<VLAN_MTU,WOL_UCAST,WOL_MCAST,WOL_MAGIC>
            ether 00:04:e2:06:65:1d
            inet6 fe80::204:e2ff:fe06:651d%lagg0 prefixlen 64 scopeid 0x10
            inet 192.168.25.17 netmask 0xfffffc00 broadcast 192.168.27.255
            nd6 options=1<PERFORMNUD>
            media: Ethernet autoselect
            status: active
            laggproto lacp
            laggport: rl1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
            laggport: rl0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
    poes10: flags=8890<POINTOPOINT,NOARP,SIMPLEX,MULTICAST> metric 0 mtu 1500
    poes11: flags=8890<POINTOPOINT,NOARP,SIMPLEX,MULTICAST> metric 0 mtu 1500
    poes12: flags=8890<POINTOPOINT,NOARP,SIMPLEX,MULTICAST> metric 0 mtu 1500
    poes13: flags=8890<POINTOPOINT,NOARP,SIMPLEX,MULTICAST> metric 0 mtu 1500
    poes14: flags=8890<POINTOPOINT,NOARP,SIMPLEX,MULTICAST> metric 0 mtu 1500
    poes15: flags=8890<POINTOPOINT,NOARP,SIMPLEX,MULTICAST> metric 0 mtu 1500
    poes16: flags=8890<POINTOPOINT,NOARP,SIMPLEX,MULTICAST> metric 0 mtu 1500
    poes17: flags=8890<POINTOPOINT,NOARP,SIMPLEX,MULTICAST> metric 0 mtu 1500
    poes18: flags=8890<POINTOPOINT,NOARP,SIMPLEX,MULTICAST> metric 0 mtu 1500
    poes19: flags=8890<POINTOPOINT,NOARP,SIMPLEX,MULTICAST> metric 0 mtu 1500
    poes110: flags=8890<POINTOPOINT,NOARP,SIMPLEX,MULTICAST> metric 0 mtu 1500
    poes111: flags=8890<POINTOPOINT,NOARP,SIMPLEX,MULTICAST> metric 0 mtu 1500
    poes112: flags=8890<POINTOPOINT,NOARP,SIMPLEX,MULTICAST> metric 0 mtu 1500
    poes113: flags=8890<POINTOPOINT,NOARP,SIMPLEX,MULTICAST> metric 0 mtu 1500
    poes114: flags=8890<POINTOPOINT,NOARP,SIMPLEX,MULTICAST> metric 0 mtu 1500
    poes115: flags=8890<POINTOPOINT,NOARP,SIMPLEX,MULTICAST> metric 0 mtu 1500
    poes116: flags=8890<POINTOPOINT,NOARP,SIMPLEX,MULTICAST> metric 0 mtu 1500
    poes117: flags=8890<POINTOPOINT,NOARP,SIMPLEX,MULTICAST> metric 0 mtu 1500

    Thank you for your help!



  • BUMP

    Any ideas?


  • Netgate Administrator

    Nothing jumps out. You are running quite a few PPPoE connections though, it's possible you are testing this further than other users.
    Unfortunately your ifconfig output is so long it has obscured the output of /etc/rc.banner. If you could run just that part and paste the output here that might show something.

    Steve



  • @mcrook:

    BUMP

    Any ideas?

    I'm also using re* NICs and had a problem when going from 2.0RC3 to 2.0REL.  So for a long time I was running 2.0RC3.  I figured it would work itself out in a newer release.  Once 2.02REL came out I tried it again and I had the same issue where the WAN interface wouldn't work with DHCP.  What fixed it for me was manually setting the WAN interface to force 100baseTX full duplex.  I'm not saying that's your problem since it appears you are getting an IP address (not sure if you're using static).  Wouldn't hurt to give it a try.
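
Forcing the media as suggested above corresponds to a single `ifconfig` call on FreeBSD. A minimal sketch, assuming the WAN is `re0` as in the posted output; `force_100_full` and the `DRY_RUN` switch are made up for illustration. A change made this way does not survive a reboot (the interface's speed and duplex setting in the web GUI should be the persistent route), and forcing the WAN media on a box you can only reach remotely could cut you off, so print the command before running it.

```shell
#!/bin/sh
# Hypothetical helper: force interface $1 to 100 Mbit/s full duplex.
# With DRY_RUN=1 the command is printed instead of executed.
force_100_full() {
    cmd="ifconfig $1 media 100baseTX mediaopt full-duplex"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$cmd"
    else
        $cmd
    fi
}

# Review first, then run for real as root:
#   DRY_RUN=1 force_100_full re0
#   force_100_full re0
```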

