[Solved] pfsense is not making sense
-
[2.4.1-RELEASE][admin@pfsense.telebyte]/root: ps -aux
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
root 11 200.0 0.0 0 32 - RL 16:59 2659:43.17 [idle]
root 0 0.0 0.0 0 208 - DLs 16:59 0:00.19 [kernel]
root 1 0.0 0.0 5024 908 - ILs 16:59 0:00.01 /sbin/init --
root 2 0.0 0.0 0 16 - DL 16:59 0:00.00 [crypto]
root 3 0.0 0.0 0 16 - DL 16:59 0:00.00 [crypto retur
root 4 0.0 0.0 0 32 - DL 16:59 0:00.01 [cam]
root 5 0.0 0.0 0 16 - DL 16:59 0:00.01 [soaiod1]
root 6 0.0 0.0 0 16 - DL 16:59 0:00.01 [soaiod2]
root 7 0.0 0.0 0 16 - DL 16:59 0:00.01 [soaiod3]
root 8 0.0 0.0 0 16 - DL 16:59 0:00.01 [soaiod4]
root 9 0.0 0.0 0 16 - DL 16:59 0:00.00 [sctp_iterato
root 10 0.0 0.0 0 16 - DL 16:59 0:00.00 [audit]
root 12 0.0 0.0 0 272 - WL 16:59 4:41.33 [intr]
root 13 0.0 0.0 0 32 - DL 16:59 0:00.00 [ng_queue]
root 14 0.0 0.0 0 48 - DL 16:59 0:00.01 [geom]
root 15 0.0 0.0 0 256 - DL 16:59 2:36.05 [usb]
root 16 0.0 0.0 0 16 - DL 16:59 0:24.10 [pf purge]
root 17 0.0 0.0 0 16 - DL 16:59 0:13.27 [rand_harvest
root 18 0.0 0.0 0 16 - DL 16:59 0:02.78 [acpi_thermal
root 19 0.0 0.0 0 16 - DL 16:59 0:00.32 [acpi_cooling
root 20 0.0 0.0 0 16 - DL 16:59 0:00.07 [enc_daemon0]
root 21 0.0 0.0 0 48 - DL 16:59 0:04.35 [pagedaemon]
root 22 0.0 0.0 0 16 - DL 16:59 0:00.00 [vmdaemon]
root 23 0.0 0.0 0 16 - DL 16:59 0:00.00 [pagezero]
root 24 0.0 0.0 0 16 - DL 16:59 0:00.40 [bufspacedaem
root 25 0.0 0.0 0 32 - DL 16:59 0:02.04 [bufdaemon]
root 26 0.0 0.0 0 16 - DL 16:59 0:00.38 [vnlru]
root 27 0.0 0.0 0 16 - DL 16:59 0:07.44 [syncer]
root 60 0.0 0.0 0 16 - DL 16:59 0:00.08 [md0]
root 300 0.0 0.7 282676 29264 - Ss 16:59 0:02.47 php-fpm: mast
root 338 0.0 0.1 19436 4400 - INs 16:59 0:00.02 /usr/local/sb
root 340 0.0 0.1 19436 4216 - IN 16:59 0:00.00 check_reload_
root 353 0.0 0.1 9556 5516 - Ss 16:59 0:00.04 /sbin/devd -q
root 4772 0.0 0.1 19324 3196 - Ss 17:00 0:00.37 /usr/local/sb
root 5504 0.0 0.1 13084 2776 - IN 00:01 0:00.00 /bin/sh /etc/
root 5543 0.0 0.0 6172 1928 - IN 00:01 0:00.00 sleep 81230
root 7987 0.0 0.2 20348 6116 - Ss 16:59 0:10.19 /usr/local/sb
root 8940 0.0 0.1 12696 2392 - Ss 16:59 0:06.17 /usr/local/sb
root 12193 0.0 0.2 53488 6968 - Ss 16:59 0:00.00 /usr/sbin/ssh
root 12368 0.0 0.1 10580 2180 - Is 16:59 0:00.00 /usr/local/sb
root 14985 0.0 0.1 15076 2384 - Is 16:59 0:11.32 /usr/local/bi
root 19768 0.0 0.1 13084 2844 - IN 13:29 0:01.18 /bin/sh /var/
root 33534 0.0 0.0 8224 2004 - Is 17:00 0:00.00 /usr/local/bi
root 33889 0.0 0.0 8224 2020 - I 17:00 0:00.03 minicron: hel
root 34129 0.0 0.0 8224 2004 - Is 17:00 0:00.00 /usr/local/bi
root 34552 0.0 0.0 8224 2016 - I 17:00 0:00.00 minicron: hel
root 34737 0.0 0.0 8224 2004 - Is 17:00 0:00.00 /usr/local/bi
root 35020 0.0 0.0 8224 2016 - I 17:00 0:00.00 minicron: hel
root 37355 0.0 0.0 6172 1928 - IN 15:39 0:00.00 sleep 60
root 37366 0.0 0.2 78836 8140 - Ss 15:39 0:00.03 sshd: admin@p
root 48169 0.0 0.2 25416 6724 - Is 17:00 0:00.00 nginx: master
root 48399 0.0 0.2 27464 7768 - I 17:00 0:00.59 nginx: worker
root 48521 0.0 0.2 27464 8188 - I 17:00 0:01.90 nginx: worker
root 48884 0.0 0.1 12496 2368 - Is 17:00 0:00.50 /usr/sbin/cro
root 49416 0.0 0.3 24604 12424 - Ss 17:00 0:04.41 /usr/local/sb
root 60609 0.0 0.7 282676 29268 - I 15:37 0:00.00 php-fpm: pool
root 65254 0.0 0.1 10368 2088 - Ss 17:00 0:11.20 /usr/sbin/pow
root 70050 0.0 0.1 10580 2308 - Ss 17:00 0:00.00 /usr/local/sb
root 71912 0.0 0.0 10288 2012 - Is 13:37 0:00.00 /usr/local/sb
dhcpd 74470 0.0 0.2 16648 7836 - Ss 15:22 0:00.06 /usr/local/sb
root 78540 0.0 0.2 41504 7588 - I 13:34 0:00.00 /usr/local/sb
root 78860 0.0 0.2 52880 9108 - Ss 13:34 0:01.14 /usr/local/sb
unbound 79886 0.0 0.8 64468 33648 - Ss 09:58 0:17.38 /usr/local/sb
root 80737 0.0 0.1 10472 2532 - Ss 17:00 0:09.21 /usr/sbin/sys
root 68908 0.0 0.1 39432 2836 v0 Is 17:00 0:00.01 login [pam] (
root 70053 0.0 0.1 13084 2924 v0 I 17:00 0:00.00 -sh (sh)
root 70341 0.0 0.1 13084 2800 v0 I+ 17:00 0:00.00 /bin/sh /etc/
root 69122 0.0 0.1 10388 2128 v1 Is+ 17:00 0:00.00 /usr/libexec/
root 69382 0.0 0.1 10388 2128 v2 Is+ 17:00 0:00.00 /usr/libexec/
root 69546 0.0 0.1 10388 2128 v3 Is+ 17:00 0:00.00 /usr/libexec/
root 69647 0.0 0.1 10388 2128 v4 Is+ 17:00 0:00.00 /usr/libexec/
root 69652 0.0 0.1 10388 2128 v5 Is+ 17:00 0:00.00 /usr/libexec/
root 69953 0.0 0.1 10388 2128 v6 Is+ 17:00 0:00.00 /usr/libexec/
root 70040 0.0 0.1 10388 2128 v7 Is+ 17:00 0:00.00 /usr/libexec/
root 37841 0.0 0.1 13084 2800 0 Ss 15:39 0:00.00 /bin/sh /etc/
root 40476 0.0 0.1 13392 3632 0 S 15:39 0:00.01 /bin/tcsh
root 42749 0.0 0.1 21104 2716 0 R+ 15:39 0:00.00 ps -aux -
The "idle" process is using way too much processor… (kidding)
Don't see anything odd. I'd reinstall and test again.
-
haha tech humor. I'm going to hold off a reinstall for now since it's not a show stopper, but I have a feeling that may be the only option. I'll have to find a good time to get it done.
Thanks for the help.
Raffi
-
Yeah - I'd wait for a good time. It could take seconds or perhaps minutes to hit the "default settings" button in the console.
Might work as well as a fresh install.
-
lol good idea, I'll try that first.
Have you had any experience with a reinstall when an issue came up? I wonder if restoring my config on a fresh install would also "restore" the issue? I guess, I'll only know by trying.
-
Likely so. I've noticed that when I screw up my settings, save them and then restore them, they are still screwed up. Maybe it's just me.
-
It turns out it's not my settings. A factory reset didn't help either. Is a factory reset the same as a fresh install? Could there still be some files that are corrupt or not quite right?
I'm beginning to think it could be due to the jump from 2.3.x to 2.4.0. I think that's also when the FreeBSD version changed to 11? I won't know for sure until I try a fresh install of 2.3.x and see if that fixes it or not.
-
I'd try a fresh install before I blamed the new version. I think that even a factory reset could leave some stray code, depending on what's been done to it.
-
I'll have to wait for a time when the office is nearly empty before I do a fresh install. I may not be able to get that done for a while since I won't be in the office again till Tuesday. I guess the bit of good news is that it looks like it's not my settings. If it is due to some bit of bad/left over code, doing a fresh install of 2.4.1 will hopefully take care of that. I could run a test right after the install. Then, restore my latest config and it should get me back up and running, hopefully without issues. We shall see… but that is the game plan for now.
-
I just happened to be searching around tonight as I'm embarking on my own pfsense installation.
Your description seems to somewhat match that of this video on YouTube: https://www.youtube.com/watch?v=v2rK5F461aM
He upgraded the processor and problems went away. You may be underpowered since you turned a bunch of stuff on.
Roveer
-
Since then, the network topology has not changed. I have installed pfsense OS updates along the way, Snort, squid (with cache and AV), and pfblocker. I have been running speed tests recently and my upload is consistently fine. The issue is with my download speeds. I can't get above ~97 Mbps.
Snort, Squid, ClamAV and pfBlockerNG mean you have turned your pfSense into a full UTM device, and this on a small Atom-based board at 1.6 GHz, so it could really be that you simply don't have enough horsepower.
He upgraded the processor and problems went away. You may be under powered since you turned a bunch of stuff on.
It could also be that the memory system gets saturated: too little RAM, or RAM that is too slow.
-
I'm in alignment with roveer's post, your box is underpowered.
Per the pfSense hardware requirements page (https://www.pfsense.org/products/#requirements), for your bandwidth you should be running:
"No less than a modern Intel or AMD CPU clocked at 2.0 GHz. Server class hardware with PCI-e network adapters, or newer desktop hardware with PCI-e network adapters."
I would also double your RAM at a minimum.
-
His box may technically be underpowered, but it is not showing any unusual load.
@OP: Run "ps -aux" while you're doing a speedtest. We need to see what's using CPU, if any, under load.
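To make the output of that command easier to read during a speed test, here is a minimal Python sketch that sorts `ps -aux` text by the %CPU column. It assumes the 11-column layout shown in the listing at the top of the thread; the helper names `top_cpu` and `busiest_now` are made up for illustration. Note that `[idle]` is the kernel idle thread, so a high value there means the CPU is mostly unused.

```python
import subprocess


def top_cpu(ps_output, n=5):
    """Return the n busiest (pcpu, pid, command) rows from `ps -aux` text."""
    rows = []
    for line in ps_output.splitlines()[1:]:   # skip the header row
        fields = line.split(None, 10)         # COMMAND may contain spaces
        if len(fields) < 11:
            continue                          # not a full process row
        try:
            rows.append((float(fields[2]), int(fields[1]), fields[10]))
        except ValueError:
            continue                          # skip rows that do not parse
    return sorted(rows, reverse=True)[:n]


def busiest_now(n=5):
    """Run `ps -aux` once and return the n busiest rows (hypothetical helper)."""
    out = subprocess.run(["ps", "-aux"], capture_output=True,
                         text=True, check=True).stdout
    return top_cpu(out, n)
```

Calling `busiest_now()` a few times while the speed test is running would show whether anything other than `[idle]` climbs to the top.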
-
on a small Atom based board with 1.6GHz so it could really be that you are not right sorted with enough horse power.
Geez, guys! The Celeron 1017U is an Ivy Bridge generation notebook CPU, not a small-time old-school Atom.
"No less than a modern Intel or AMD CPU clocked at 2.0 GHz. Server class hardware with PCI-e network adapters, or newer desktop hardware with PCI-e network adapters."
What for? That recommendation is really old-school; even the pfSense hardware doesn't match it ;) Not even their own SG-2440 would match that description, and it is described as running IDS and proxies just fine. I agree with Harvy: the screens don't show high CPU load, and if the box were that underpowered you'd see it in the 5 or 15m load values. The Celeron is a dual core, so a load of 2 would still be acceptable at peaks.
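The rule of thumb in that last sentence (load average compared against the number of cores) can be sketched in Python as follows. The threshold of 1.0 per core and the function names are assumptions made for illustration, not anything pfSense itself reports.

```python
import os


def load_per_core(load_avg, cores):
    """Normalise a load average by the number of CPU cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return load_avg / cores


def is_overloaded(load_avg, cores, threshold=1.0):
    """True when the per-core load exceeds the (assumed) threshold."""
    return load_per_core(load_avg, cores) > threshold


def current_loads():
    """Per-core 1, 5 and 15 minute load averages of the local machine (Unix only)."""
    cores = os.cpu_count() or 1
    return tuple(load_per_core(v, cores) for v in os.getloadavg())
```

With these assumptions a load of 2 on a dual core gives exactly 1.0 per core, which is the borderline case described above.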
-
Thanks for the replies. I wish it were as simple as my hardware being underpowered. I have no beast under the hood, but I have several points to squash that argument.
1. My CPU load has never been maxed out even under the heaviest of use.
2. My CPU load is almost always sitting close to 0% usage. The biggest load is probably me accessing the GUI/graphs.
3. The idle process uses most of the processor.
4. I disabled all the mentioned services which are known to be a burden and still have the issue.
5. I did a factory reset and still had the issue.
6. I have 4GB of newish laptop RAM. It is not fully utilized.
7. There is no use and never has been any use of swap space.
I did not have this issue when I originally ran the system on 2.3.x, so I'm beginning to think it could be due to the jump to 2.4.x. It could also be that I have a botched install which happened somewhere along the way. I'm pretty sure the factory reset simply restores a config file with all the defaults from a fresh install. It's not re-imaging the partition from a recovery partition. I realized this when I saw my custom WPAD files still in the /usr/local/www/ directory even after the factory reset. I deleted those files as well just to be sure they had no part in the problem, but this made me think: if those files were untouched, what if a potentially corrupted file was also untouched? I think the only thing that makes sense at this point is a fresh install. I'll keep you all posted.
Thanks.
-
It will be interesting to see what a fresh install does.
-
It sounds like you have a bad Network Card, maybe not necessarily bad, but not a good supported driver. HAVP and Squid will kill your network speeds if you have a bad or unsupported driver.
-
I noticed you said that Windows shows a 1Gb connection but what does the speed show as connected in pfSense? Also, anything in the logs? I've seen where it flaps so that every couple of seconds the link goes down for a couple of milliseconds and comes back causing issues like this. Doubtful since it is a VM but just an idea. Since it is a VM, how about just building a second VM and swapping over for a few minutes to test?
-
Thanks for the replies.
scottdam, a bad NIC/driver could also be a possible reason. I will only know for sure if I do a fresh install. I may also have to try going back to a fresh install of 2.3.x if it is a driver issue with 2.4.0. I did disable all the packages such as squid, snort and pfblocker. That didn't help.
Stewart, both the interfaces are 1 Gb. pfSense only shows the WAN as "Media 1000baseT <full-duplex,master>" under Status > Interfaces. It doesn't show that same line for the LAN, but I do know it's gigabit. Plus, if they weren't I wouldn't have gotten 120 Mbps when connecting a non-native PC to the network on the same exact cabling. What should I look for in the logs specifically? I don't see anything indicating a dropped connection on the system tab. Would it be there or elsewhere? Do I have to change the verbose mode of the logging to see it maybe? Right now it's set to the default.
I'm not running a VM, I have it running on actual hardware.
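For comparing what each interface reports, here is a small Python sketch that pulls the speed and flags out of a media string such as the "1000baseT <full-duplex,master>" line quoted above. The regular expression and the `parse_media` name are assumptions for illustration; a missing media line (as on the LAN here) simply yields `None`.

```python
import re

# Matches e.g. "1000baseT <full-duplex,master>" or "100baseTX <full-duplex>"
MEDIA_RE = re.compile(r"(\d+)base[\w-]*\s*<([^>]*)>")


def parse_media(text):
    """Return (speed_mbps, [flags]) from a media string, or None if absent."""
    m = MEDIA_RE.search(text)
    if m is None:
        return None
    speed = int(m.group(1))
    flags = [f.strip() for f in m.group(2).split(",") if f.strip()]
    return speed, flags


print(parse_media("Media 1000baseT <full-duplex,master>"))
```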
-
Alright… so I spent several hours on this again last night while the office was quiet.
Here is what I'm 100% sure of now... it is the pfsense box. How did I come to that conclusion? In addition to everything else, my last resort was to disconnect the WAN/LAN cable from pfsense and plug it into my old Netgear which it replaced. With the same exact network topology/IP's, I was getting the full 120 Mbps down. I plugged the pfsense box back in and was getting over 100 Mbps, but still not a solid and consistent 120 Mbps I should be getting.
What did I do before plugging in the Netgear? I did a fresh install of 2.4.1. I was still not getting full down speeds. I then decided to do a fresh install of 2.3.5, but still no luck. I then swapped out the one NIC I was a little wary of, my USB 3.0 to GbE adapter. I plugged in a brand new one, but still no solution. I know... not the best NIC to be using, but I have no real choice on this box I'm running. Besides, that NIC was giving me my full download speed at one point, so I don't believe that is the issue.
During all these trials mentioned above, I was using factory default settings with no additional packages installed. The only thing I did was configure the WAN and LAN IP's. The same IP's I've been using forever.
I have one last idea which I will try hopefully tonight. I have hardware checksum offloading enabled. I'm almost certain my USB 3.0 NIC is not up to par for that feature, and according to the pfsense book that feature is broken in some NICs and will cause problems with corrupted packets and throughput. I'm suspecting I have both problems. So I'm gonna try to disable that, cross my fingers, and then reboot the box.
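One way to confirm whether offloading is actually active on an interface before and after changing that setting is to look at the capability flags that FreeBSD's `ifconfig` prints on its `options=` line. The Python sketch below only parses such text. The sample line and the interface name `ue0` are illustrative assumptions, not output from this box.

```python
import re

# FreeBSD capability names related to hardware offloading
OFFLOAD_FLAGS = {"RXCSUM", "TXCSUM", "TSO4", "TSO6", "LRO"}


def enabled_offloads(ifconfig_text):
    """Return the offload-related flags found on the options= line."""
    m = re.search(r"options=[0-9a-fA-F]+<([^>]*)>", ifconfig_text)
    if m is None:
        return set()
    return OFFLOAD_FLAGS & set(m.group(1).split(","))


# Illustrative sample only (assumed, not captured from the thread's hardware)
sample = (
    "ue0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500\n"
    "\toptions=8000b<RXCSUM,TXCSUM,VLAN_MTU,LINKSTATE>\n"
)
print(sorted(enabled_offloads(sample)))
```

An empty result after the reboot would suggest the checksum flags are really off for that interface.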
-
Generally it's a bad idea to use USB NICs; I've seen no one have any luck with this crap. If it's impossible to install a PCIe Intel card but you have one embedded NIC, then use VLANs and a VLAN-capable switch; otherwise you will need a different hardware setup to make things work as desired.
-
Yea, the USB NIC is not intended to be used the way I'm using it on a firewall. I was hoping I could get away with it. I thought I did for a while, but maybe I was wrong. I'm not giving up on it just yet though. Call me stubborn, but I really want to be able to make use of this tiny PC that was collecting dust, especially since it has the same footprint as my old Netgear so it fits right in. The VLAN approach might be a good solution if it does turn out to be a bad NIC. That could also be a good excuse to justify purchasing a managed switch. I do have another PC laying around with actual PCIe slots. It has a MUCH bigger desktop footprint. If this becomes a real big issue, I may end up switching over to that.
-
After all this you're using USB NICS LULZ. You can't compare a Netgear (with no USB NIC) to a PFSense with a USB NIC. That's like tying one hand behind the back of the PFSense.
The USB NIC is your problem. Even IF it worked well on a prior version there's no way I'd put my life in the hands of a USB anything (besides a keyboard and mouse or my phone chargers LOL).
You're getting 100 Mb/s so who cares about the 20 Mb/s…? Sure it's annoying BUT are you ever hitting all 100 Mb/s being consumed on your network? Is Internet "slow" because you're missing 20 Mb/s...? It seems like you're missing 20% of your bandwidth but if you're only consuming say... 50 Mb/s you actually have 50% utilization you're not even using so you're not even missing the 20 Mb/s / 20%.
I'd stop the madness and do something more productive like drink beer :P
-
The 20 Mbps is what raised the red flag. I'm not losing sleep over the 20 Mbps because like you said, I'm not even using close to full bandwidth. At this point I want to understand why I'm losing it, not because I actually need it. This issue has helped me learn a lot about pfsense (I'm a newbie). The education is well worth the cost of 20 (unused) Mbps and a few forum posts.
By the way, disabling the hardware checksum offloading didn't help either. I know everyone on these forums hates the USB NICs, but hatred alone wouldn't hold up in court. I'm still trying to understand how to definitively diagnose if it is a NIC issue. There must be dropped/corrupted packets if that's the case. The attached packet graphs are not clear to me. Is the WAN inpass supposed to be close to LAN outpass? The WAN to LAN would be my downstream. It looks like some packets are not making it out onto the LAN. For example, in the average, I have 983.73 pps coming into the WAN but only 915.54 pps making it out of the LAN. That's roughly 7% loss? Are there other factors such as packets not allowed due to filtering? Or would those go under the in/out block category and have nothing to do with it?
Thanks all for the help.
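The "roughly 7%" figure in that post can be checked with a few lines of Python. This is only the arithmetic on the two quoted averages; it assumes every packet arriving on the WAN was meant to leave on the LAN, which is not strictly true (some traffic is addressed to the firewall itself or is blocked), so the result is better read as an upper bound on loss than as a measured drop rate.

```python
def pps_shortfall(wan_in_pps, lan_out_pps):
    """Return (missing packets per second, shortfall as a percentage of WAN in)."""
    missing = wan_in_pps - lan_out_pps
    return missing, 100.0 * missing / wan_in_pps


missing, pct = pps_shortfall(983.73, 915.54)
print(f"{missing:.2f} pps unaccounted for, {pct:.2f}% of WAN in")
```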
-
@raffi30:
During all these trials mentioned above, I was using factory default settings with no additional packages installed. The only thing I did was configure the WAN and LAN IP's. The same IP's I've been using forever.
Did you restore your settings as part of the factory default? Or did you go into the UI and manually create default settings for this test?
I am very paranoid because of issues I've had in the past doing pfSense upgrades (since 1.2.3). I input my settings from scratch after each upgrade. Why? Paranoia, and I don't have issues after upgrades.
So I'd be keen to understand if you did minimal settings manually to run the tests or upload your previous settings prior to testing.
-
Hi tim.mcmanus,
Sorry for not being clear on my post. The settings were factory default because I did a fresh install of 2.4.1. All I did after the fresh install was configure WAN and LAN IP's. I then ran my test again and found no difference. So I then repeated the same process of fresh install with 2.3.5, configured IP's, and ran the test. Neither made any difference.
After all the tests not making any difference, I decided I might as well upgrade to 2.4.1 again and restore all my settings. I haven't had any new issues with it. I did have to reconfigure my WPAD file (as expected) and also my snort disable.conf SID management file (unexpected).
Raffi
-
Finally got it solved! There were a number of issues, some of which I'm still dealing with.
I ended up replacing my setup with an unused Dell desktop with PCIe slots. The hardware is slightly better than my tiny Lenovo box, so no harm there. I installed two PCIe EXPI9301CT Intel NIC's. Did a fresh install, restored my config and was back up and running. After that I ran another test and I was getting 150 Mbps down and 50 Mbps up!! I'm pretty sure we're paying for 120/40, so I can't complain about those numbers. So as many suspected, I am now pretty convinced the issue was with the USB NIC. The other hint is the fact that on the dashboard both LAN and WAN are showing as 1000baseT <full-duplex>. On my old Lenovo setup with the USB NIC as my LAN, the LAN did not show that information at all on the dashboard or under the interface info.
After solving that, I still had sub 100 Mbps speed on some PC's. In some cases it turned out to be bad cabling, in another case a bad switch, so by going through it on a case by case basis, I'm slowly getting my network up to speed, no pun intended.
Thanks for all the responses and help!
Raffi