XG-7100 HA setup questions
Hello. XG-7100 HA user here with a couple of questions for the experienced users out there. I am excited to use the capabilities of this hardware while trying not to overburden it or make it do too much, so it remains fast! My main problem is that I can't easily do an "evolutionary" type of deployment where I experiment with and tune settings under real-time load, because this system is for a venue that goes from nearly "zero to 100" just a few days a year. I've got to "get it right" (or at least mostly) the first time, so I'm asking for help getting the best setup I can before its big test later this year.
The core network of this non-profit has three internet connections, currently on separate gateway devices and distributed around the venue by fiber and Ethernet switches into 9 VLANs. There are no traditional "servers" on the network, just NAS devices and workstations. Most days of the year there are a handful (2-5) of users on the network, but there are many more during the "main event", which happens once per year for about a week.
During that event there are about 20 computer users on the venue administrative VLAN, about 25 users on a special VLAN for electronic ticketing (cloud based), 12 ATMs on their own VLAN, about 3,000 to 4,000 WiFi users (vendors) on two VLANs set up to handle vendor credit card transactions (no cell coverage at this location), a limited "public" WiFi connection available at specified locations, and a couple of VLANs reserved for special needs that change from day to day. All WiFi access is bandwidth limited per device and excessive users/abusers are banned from the system. There are 37 WiFi access points which support VLAN-based SSIDs. We see peak crowds at 45,000 to 55,000 people on some days, so during these crazy busy times there is lots of activity and stuff has "got to work".
My objective is to balance the available internet bandwidth from the sources to the VLANs in order of priority with automated fail over if there's an internet or hardware outage and provide dynamic allocation of bandwidth resources based on usage and availability. The VLAN priority is (Default WAN in parentheses):
Ticketing (WAN 1, bandwidth reservation 10M/10M)
Administration (WAN 1, bandwidth reservation 15M/15M)
ATM (WAN 1, bandwidth reservation 3M/3M)
Vendor WiFi (Browser Capable Devices) (WAN 2 and 3)
Vendor WiFi (Dumb devices with no UI) (WAN 2 and 3)
Special VLANs (WAN 1, 2 and/or 3)
Public WiFi (Max bandwidth for public VLAN 20M/5M) (WAN 2)
The WiFi management system does a pretty good job handling/controlling those devices so I don't need to worry about that with the Netgate device. The main reason for the Netgate gear is to monitor/manage the WANs and VLANs, provide basic firewall and traffic shaping and hopefully with the HA capability make any provider failures nearly seamless.
My specific questions:
--> My plan is to have the XG-7100 provide DHCP and DNS for each of the VLANs so it knows everything about each device connection and can adjust quickly as needed for loss of a connection or hardware. I think this is the best option, but are there suggestions from the experienced?
--> 3 WAN configuration --- I have had trouble getting this to work the way I want it to for failover and have reset to the default Netgate configuration a few times after screwing things up. Luckily I have some time to "perfect" this setup as best I can, so ... The default Netgate configuration allocates 1 WAN (VLAN 4091) and ports 2-8 as LAN (VLAN 4092) in a switch configuration. When adding WANs I typically remove the odd numbered ports 3 and 5 from the LAN switch and assign them as gateways, then put them into a gateway group. I seem to run into problems when I do this: the additional WAN connections are not recognized, sometimes don't pass traffic when they should, or don't fail over reliably. Suggestions please?
--> Traffic/Bandwidth Shaping - I want to provide bandwidth limiting/traffic shaping per VLAN based on priority (queue?). The top tier VLANs need to have full access at all times to the minimum bandwidth reserved for them, regardless of which internet connections are available. Other (leftover) bandwidth would be available for use by all other devices. In the event of an internet outage of one or more connections, the lower tier VLANs can dwindle down to virtually no bandwidth. Also, WAN 3 is a "high cost" connection that I would like to minimize use of. I was planning to use limiters, but have not been able to test this much because of the WAN problems mentioned above. I've read about different problems/limitations based on how this capability is implemented, but a lot of what I was reading was based on older software. So what is the best way to implement multi-WAN control on current software?
--> Logging/Monitoring - Although I would like to be able to see more extensive real time stats and logging of traffic, I do not want the XG-7100 to have to use resources to write log entries. Suggestions on how best to do this hardware and software-wise? Anything from an external PC, Linux box, Raspberry Pi, or something like that getting traffic from a mirrored port on the XG-7100 or from one of my managed switches? Suggestions for data analysis software?
--> High Availability - With the above subjects in mind, please pass along any trouble areas/"gotchas" that might be brought on by the use of the HA capability. I have experience with HA systems from another vendor and there are certain things that need to be accounted for in the setup, so is there anything I should watch out for?
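To make the traffic shaping goal above concrete, here is a rough sketch in Python of the policy being asked for. It is only a model of the intent, not how pfSense limiters or the shaper work internally. The reservation figures come from the priority list; the total capacities (100 and 30 Mbit/s) and the equal split of leftover bandwidth are assumptions made up for illustration.

```python
# Reserved minimums in Mbit/s, in priority order (figures from the priority list).
RESERVED = [("Ticketing", 10), ("Administration", 15), ("ATM", 3)]

# VLANs that only get leftover bandwidth (names abbreviated for the example).
BEST_EFFORT = ["Vendor WiFi (browser)", "Vendor WiFi (dumb)", "Special", "Public WiFi"]


def allocate(capacity, reserved, best_effort):
    """Grant reserved minimums in priority order, then split what is left.

    capacity    -- total usable bandwidth in Mbit/s (hypothetical value)
    reserved    -- list of (vlan_name, minimum_mbps), highest priority first
    best_effort -- list of vlan names that share the leftover equally
                   (equal sharing is an assumption, not a requirement)
    """
    shares = {}
    remaining = capacity
    for name, minimum in reserved:
        # A reservation can only be met if capacity is still available.
        grant = min(minimum, remaining)
        shares[name] = grant
        remaining -= grant
    per_vlan = remaining / len(best_effort) if best_effort else 0.0
    for name in best_effort:
        shares[name] = per_vlan
    return shares


if __name__ == "__main__":
    # All links up (assumed 100 Mbit/s total) versus a degraded state (assumed 30).
    for capacity in (100, 30):
        print(capacity, allocate(capacity, RESERVED, BEST_EFFORT))
```

The reservations add up to 28M, so in this model any surviving capacity below 28 Mbit/s means the lowest-priority reservation (ATM) is the first one that cannot be met in full, and the best-effort VLANs drop to zero.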
I also have similar questions that I need help with. Help is highly appreciated.
Thanks in advance.
I would prioritize your configuration issues from layer 1 up and ask one specific question at a time. There is no reason for someone to answer how to configure 10 VLANs, for instance. Ask about 1 and apply it to the rest.
Once you get outside of the XG-7100-specific issues (like setting up VLANs), you would probably be best-suited moving later questions to the appropriate categories (like traffic shaping, etc).
It would take an hour or two for someone to answer all of that at once. That is almost certainly not going to happen, which is probably why you have yet to receive a response.
The advice I am looking for most is getting a multi (3) WAN setup configured so it properly fails over (and restores after repair) if there's an outage. I have made a few attempts so far trying different configurations and I have been successful getting 2 WANs to fail over properly, but have had trouble with 3. This is probably more of a pfSense configuration issue rather than a hardware issue, so this is probably not the correct forum for the question.
Most of the other network configuration is already up and running, just on different (separate) gateway devices. I'm trying to merge all the resources together to allow for better "situational awareness" fail over capabilities.
3 is the same as 2. You just make a gateway group with Tier 1, 2, and 3 gateways.
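As a rough illustration of how tiers in a gateway group behave, here is a simplified model in Python. It is not pfSense code, and the gateway names WAN1, WAN2 and WAN3 are placeholders. The idea is that traffic uses the online members of the lowest-numbered tier, and only falls to the next tier when every gateway in the tiers above it is down.

```python
from typing import Dict, List


def active_gateways(tiers: Dict[str, int], online: Dict[str, bool]) -> List[str]:
    """Return the gateways that would carry traffic in this simplified model.

    tiers  -- gateway name -> tier number (1 is most preferred)
    online -- gateway name -> True if monitoring considers it up
    """
    # Keep only the gateways that are currently up.
    up = {name: tier for name, tier in tiers.items() if online.get(name, False)}
    if not up:
        return []
    # The lowest-numbered tier with at least one live member wins.
    best = min(up.values())
    return sorted(name for name, tier in up.items() if tier == best)


# Hypothetical three-WAN failover group, one gateway per tier.
group = {"WAN1": 1, "WAN2": 2, "WAN3": 3}

if __name__ == "__main__":
    print(active_gateways(group, {"WAN1": True, "WAN2": True, "WAN3": True}))
    print(active_gateways(group, {"WAN1": False, "WAN2": True, "WAN3": True}))
    print(active_gateways(group, {"WAN1": False, "WAN2": False, "WAN3": True}))
```

In this model, when WAN1 comes back it is selected again automatically because it is the lowest tier that is up. Putting the "high cost" WAN on the highest-numbered tier keeps it idle until both of the others are down. Two gateways placed on the same tier would both be returned, which corresponds to sharing load between them rather than pure failover.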