CPU load on Firebox (X550e)
I thought I would start a new thread to ask another question. I am not sure it belongs here, but I will ask it anyway and ask for forgiveness if I am mistaken.
In reading the posts regarding the Fireboxes, it seems the CPU is taxed when routing packets from LAN to WAN. In the case of just nat'ing and nothing else, I do not understand why: is it just a Firebox hardware thing, or is that the norm for all routers/firewalls? Or did I misunderstand what I read? To me, nat'ing is I/O bound, not CPU bound.
I would expect there to be some load, but from what I was reading it gets up there (75%+). I have been in the IT industry 20+ years and never really came across this CPU load "issue"; mind you, I have always dealt with Cisco/IBM/F5 stuff at a distance, but closely enough to understand it, or so I thought 'til now :o).
Which input/output subsystem would you expect to be the limiting factor?
The X550e has four Gigabit NICs which are, unfortunately, all PCI and all on the same bus. That can be a limiting factor under some circumstances.
This guy has some great test results that demonstrate how CPU bound it really is:
Thanks for the link. I will read it and digest it.
I would think that the NIC subsystems would be the limiting factor. From a quick glance, the table in the link shows the CPU is doing a lot of work. Here is my view of what is happening (at a high level):
Packet comes in on the LAN side, box does the nat'ing, sends it out on the WAN side
Packet comes in on the WAN side, box does the nat'ing, sends it out on the LAN side
If the firewall is active then an additional check is made to ensure no rules are violated
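To make the per-packet work in the steps above a little more concrete, here is a minimal sketch in Python. It is not how pfSense or the Firebox firmware implements NAT; the `Packet` and `Nat` names, the port range, and the blocked-port rule are all made up for illustration. It only shows that every packet needs a table lookup, a header rewrite, and (optionally) a rule check, all of which are CPU work before checksums and buffer copies are even counted.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    # Hypothetical, heavily simplified packet header.
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


class Nat:
    """Toy port-translating NAT table (illustrative only)."""

    def __init__(self, wan_ip, first_port=40000):
        self.wan_ip = wan_ip
        self.next_port = first_port
        self.out_map = {}  # (lan_ip, lan_port) -> wan_port
        self.in_map = {}   # wan_port -> (lan_ip, lan_port)

    def outbound(self, pkt):
        # LAN -> WAN: look up (or create) a mapping, rewrite the source.
        key = (pkt.src_ip, pkt.src_port)
        if key not in self.out_map:
            self.out_map[key] = self.next_port
            self.in_map[self.next_port] = key
            self.next_port += 1
        return replace(pkt, src_ip=self.wan_ip, src_port=self.out_map[key])

    def inbound(self, pkt):
        # WAN -> LAN: reverse lookup, rewrite the destination.
        key = self.in_map.get(pkt.dst_port)
        if key is None:
            return None  # no mapping, so the packet is dropped
        return replace(pkt, dst_ip=key[0], dst_port=key[1])


def allowed(pkt, blocked_dst_ports=frozenset()):
    # Stand-in for the firewall rule check (assumed rule: block by port).
    return pkt.dst_port not in blocked_dst_ports


nat = Nat("203.0.113.1")
request = Packet("192.168.1.10", 5000, "198.51.100.7", 80)
if allowed(request):
    sent = nat.outbound(request)
    reply = Packet("198.51.100.7", 80, sent.src_ip, sent.src_port)
    print(nat.inbound(reply))
```

A real stack also has to update IP and TCP/UDP checksums and track connection state for every packet, so the actual cost per packet is higher than this sketch suggests.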
I know data is being transferred in memory (probably a few times for one packet) during the above actions, from NIC buffers and so on, but I just cannot fathom why the CPU is working so hard. I also know context switches are happening and resource management is going on.
I assumed (apparently wrongly) that routing/nat'ing was not a CPU intensive activity. I would have guessed the CPU would be waiting on memory or NIC I/O. Still puzzling to me. Given that the CPU is doing a fair amount of work, I can see it is not waiting all that much.
I am still intrigued…
I'm sure people have done plenty of work on this subject; there's probably more reading material than you could get through in a reasonable time!
It's interesting to look at other hardware. Many 'hardware' firewalls use a dedicated ASIC/FPGA to speed up routing duties, and many newer system-on-chip devices have dedicated hardware for this as well. In pfSense we do everything in software, and since it's built on FreeBSD there is little hope of seeing support for obscure hardware acceleration devices. Also, the FreeBSD Marvell driver used here does not seem to perform as well as the equivalent Linux driver used by Watchguard.
Even so, if you look at those figures you'll see that on the X550e the CPU is not the limiting factor. It's just not possible to read and write the data across the PCI bus fast enough. I did spend some time trying to work out the maximum theoretical throughput between two devices sharing the same PCI bus but failed to reach any useful conclusion. The maximum bus speed is (33MHz x 32bit) 1056 million bps, ~1Gbps. The data path depends on reading the data in, processing it, and writing it out again, and the read and the write cannot happen on the shared bus simultaneously. Thus I expect the maximum speed, given an infinitely fast CPU, to be about half of that, ~528Mbps. That seems to line up with the results above, but I've seen other figures stating more than that. Perhaps the bus speed is >33MHz or wider than 32bit? I'm unsure how to find out.
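The back-of-envelope estimate above can be written out as a few lines of Python. This is only a sketch of the arithmetic in the post: the function names are made up, the 33MHz / 32bit figures are the ones assumed above, and the "two bus crossings per forwarded packet" model ignores bus arbitration, descriptor traffic, and other overhead, so real throughput would be lower.

```python
PCI_CLOCK_HZ = 33_000_000  # clock assumed in the post
PCI_WIDTH_BITS = 32        # bus width assumed in the post


def pci_peak_bps(clock_hz=PCI_CLOCK_HZ, width_bits=PCI_WIDTH_BITS):
    """Theoretical peak transfer rate of a parallel PCI bus, in bits/s."""
    return clock_hz * width_bits


def forwarding_ceiling_bps(bus_bps, crossings=2):
    """Ceiling when every forwarded bit crosses the shared bus `crossings`
    times (once in from the receiving NIC, once out to the sending NIC)."""
    return bus_bps / crossings


peak = pci_peak_bps()
print(f"peak bus rate:      {peak / 1e6:.0f} Mbps")
print(f"forwarding ceiling: {forwarding_ceiling_bps(peak) / 1e6:.0f} Mbps")

# The "what if the bus is faster or wider" question from the post:
for clock_hz, width_bits in [(33_000_000, 64), (66_000_000, 32), (66_000_000, 64)]:
    bps = pci_peak_bps(clock_hz, width_bits)
    print(f"{clock_hz // 1_000_000}MHz x {width_bits}bit: "
          f"{forwarding_ceiling_bps(bps) / 1e6:.0f} Mbps ceiling")
```

With the assumed 33MHz x 32bit bus this gives 1056 Mbps peak and a 528 Mbps forwarding ceiling; a 64bit or 66MHz bus would double those numbers, which would be one possible explanation for higher figures seen elsewhere.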
Slightly off topic, but all the calculations you ever see on this state 133MB/s, calculated as 4 bytes (32bit) x 33MHz. However, that seems to completely ignore the fact that the Mega in 33MHz is 1x10^6, whereas the Mega in 133MBps is 2^20. ???
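One way to check the unit question above is to compute the byte rate once and express it with both meanings of "mega". A small sketch, assuming the nominal PCI clock is 33.33MHz (a 30ns cycle) rather than a flat 33MHz; the helper name is made up:

```python
MB = 10**6    # decimal megabyte
MIB = 2**20   # binary megabyte (mebibyte)


def pci_bytes_per_second(clock_hz, width_bytes=4):
    """Peak bytes per second for a bus moving `width_bytes` per clock."""
    return clock_hz * width_bytes


for label, clock_hz in [("33 MHz", 33_000_000), ("33.33 MHz", 33_333_333)]:
    rate = pci_bytes_per_second(clock_hz)
    print(f"{label}: {rate / MB:.2f} MB/s (10^6), {rate / MIB:.2f} MiB/s (2^20)")
```

On these assumptions the commonly quoted 133 figure falls out of 4 bytes x 33.33MHz using decimal megabytes; expressed in 2^20-byte units the same rate is about 127.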
If you come up with some definitive numbers on this I'd love to know. ;)