GUI almost stalling (minutes) on ALIX with Gateway Groups set up?
-
I've been experimenting with pfSense 2.2 on a Netgate ALIX 2d2 with a pair of wifi cards (EMP-8602Plus S) for a "road warrior" setup. This is also my first time experimenting with multiple WAN interfaces, so it's quite possible my configuration isn't right.
The problem described below has been consistent across the Dec 20, Dec 23, and Dec 29 snapshots of 2.2.
One NIC is WAN, one is LAN.
One Wifi is WANWifi, one is LOCALWifi. LOCALWifi has a captive portal set, vouchers only.
Once I put WAN and WANWifi into a Gateway Group, with WAN on a higher-priority tier than WANWifi (I want hard lines to take precedence over wifi hotspots, like a hotel or cafe hotspot), and set up firewall rules that use that Gateway Group in the Advanced options of the rules, the GUI became very, very slow. Literally minutes can pass before a page refreshes or another pfSense GUI page loads.
Eventually, I even had the GUI crash entirely, returning only an error message, though routing was still working.
Disabling the WAN interface and forcing WANWifi to be "always up" (disabling monitoring) didn't help.
Is this normal for gateway groups and captive portal on an ALIX with 2.2-RC? There are no packages installed yet, either.
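The tier precedence described above can be sketched roughly as follows. This is only an illustration of the intended failover behaviour, not pfSense's actual code: the function name, the tier numbers (1 for WAN, 2 for WANWifi) and the "up" set standing in for gateway monitoring results are all assumptions made for the example.

```python
def active_gateways(tiers, up):
    """Pick the gateways a tiered group would use.

    tiers: dict of gateway name -> tier number (assumed: 1 = most preferred).
    up:    set of gateway names that monitoring currently reports as online.
    """
    # Only gateways that are currently up are candidates.
    candidates = {name: tier for name, tier in tiers.items() if name in up}
    if not candidates:
        return []  # every gateway in the group is down
    # Use every online gateway that sits on the most preferred tier.
    best = min(candidates.values())
    return sorted(name for name, tier in candidates.items() if tier == best)


# Hypothetical group matching the setup in this post.
group = {"WAN": 1, "WANWifi": 2}

print(active_gateways(group, {"WAN", "WANWifi"}))  # wired link wins
print(active_gateways(group, {"WANWifi"}))         # fail over to the hotspot
```

With both links up, only the wired WAN is selected; WANWifi is used only once WAN is reported down.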
-
The ALIX does take some time to process a failover/failback event when a WAN is detected down or up, but that is about 1 minute in total to regenerate the rules, restart the OpenVPN links that need it, reconfigure DNS…
The nearest I have to your description is my home system, an Alix 2D13 with a WiFi card operating as an AP:
- WAN1 wired to internet
- WAN2 is a cable to a TP-Link device that has a 3G dongle in it and connects to 3G mobile data network
- LAN for the occasional wired home device
Normally I do not have the 3G dongle connected, so WAN2 sits waiting to get DHCP from the TP-Link 3G device. In current snapshots up to and including 1-Jan-2015, gateway monitoring has an issue when a WAN is physically available but still waiting for DHCP:
https://redmine.pfsense.org/issues/4094
I have fixed that on my system with pull request:
https://github.com/pfsense/pfsense/pull/1414
That might help your case also? It needs some review by others in case it causes other side effects, or there might be a better way of fixing Redmine 4094.
But my system does not use lots of CPU all the time. I do not have Captive Portal.
-
What is the error message in the logs or from the GUI when it doesn't work?
I haven't noticed anything unusual on my ALIX running 2.2 aside from an occasional panic.
Something you might try is adding this to /boot/loader.conf.local and then rebooting:
hint.ata.0.mode=PIO4
kern.cam.ada.write_cache=0
Those will (1) disable DMA, and (2) disable write caching. Both were already off on 2.1.x and before, but the sysctl OIDs have changed since then. They may or may not even be necessary these days. As far as I can tell, my ALIX is already using PIO4 without any extra settings in place.
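Since a typo in /boot/loader.conf.local is easy to miss, here is a minimal Python sketch that checks a copy of the file for the two tunables above, each of which needs to be on its own line. It is only an illustration: the function names and the simplified parsing (one name=value per line, # comments, optional double quotes) are assumptions made for this sketch, not anything shipped with pfSense.

```python
# The two tunables suggested in this thread and their expected values.
EXPECTED = {
    "hint.ata.0.mode": "PIO4",
    "kern.cam.ada.write_cache": "0",
}


def parse_loader_conf(text):
    """Parse loader.conf-style text into a dict (simplified: no includes)."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip().strip('"')
    return settings


def missing_settings(text, expected=EXPECTED):
    """Return the names in `expected` that are absent or have another value."""
    found = parse_loader_conf(text)
    return sorted(n for n, v in expected.items() if found.get(n) != v)


sample = "hint.ata.0.mode=PIO4\nkern.cam.ada.write_cache=0\n"
print(missing_settings(sample))  # prints []
```

It can be run on any machine with Python 3 against a copy of the file; a non-empty result lists the tunables that are missing or mistyped.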