IPFS behind pfSense with NAT - Poor performance/too many connections
-
I am running pfSense 2.5.2 on decent(ish) commodity hardware: a 4-core Celeron J3160 with 8 GB RAM and an NVMe SSD.
This has performed well for me - I have a 1Gb fiber internet connection and on most of my wired LAN I get close to 1Gb/s throughput when I run a speed test. For once in my life, I've not had my two internet-addicted teenagers and teleworking wife complain about the internet.
However, lately I've been trying to run IPFS behind my firewall and I'm experiencing problems which appear to be correlated with the IPFS server opening large numbers of connections. Others have posted before in this forum about issues running BitTorrent behind a pfSense NAT firewall, and my experience is similar.
Most of the time, performance is perfectly fine, but every so often the whole network seems to slow down. This is 100% correlated with a substantial rise in the size of the state table on the firewall (from around 5000 states to ~30,000 states).
However, so far as I can see, this is nowhere near the maximum number of states which should be supported on my hardware. Memory and CPU utilization appear healthy, and the total throughput is well under 1Gb/s.
The problem is annoyingly transient. It seems to fix itself as the state table drains.
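To pin down how closely the slowdowns track the state table, the state count could be sampled at intervals and compared against the times the network feels slow. The sketch below is only that: it assumes the count is read from the "current entries" line of `pfctl -si`, that Python is available wherever the output ends up, and the function names, the sample text and the threshold are all made up for illustration.

```python
import re
import subprocess
from typing import List, Tuple


def parse_state_count(pfctl_si_output: str) -> int:
    """Extract the 'current entries' value from `pfctl -si` output."""
    match = re.search(r"^\s*current entries\s+(\d+)", pfctl_si_output, re.MULTILINE)
    if match is None:
        raise ValueError("no 'current entries' line found")
    return int(match.group(1))


def sample_once() -> int:
    """Run `pfctl -si` locally (needs root on the firewall) and return the state count."""
    result = subprocess.run(["pfctl", "-si"], capture_output=True, text=True, check=True)
    return parse_state_count(result.stdout)


def find_spikes(samples: List[Tuple[int, int]], threshold: int) -> List[Tuple[int, int]]:
    """Return the (timestamp, state_count) samples whose count exceeds threshold."""
    return [(ts, count) for ts, count in samples if count > threshold]


# Illustrative text only, shaped like the State Table section of `pfctl -si`.
SAMPLE = """\
State Table                          Total             Rate
  current entries                    29874
  searches                       123456789         1234.5/s
"""

print(parse_state_count(SAMPLE))
```

Calling `sample_once()` from a loop with a timestamp and feeding the results to `find_spikes` with a threshold somewhere between the quiet level (~5000) and the bad level (~30,000) would give a list of times to line up against the slowdowns.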
Any advice on solutions, or on the steps I should take for further diagnosis, would be greatly appreciated.
Jon
-
My current working theory is that the performance degradation is actually to do with the number of blocked packets caused by IPFS peers attempting to connect.
I have enabled the 'static port' option on the outbound NAT rule per: https://docs.netgate.com/pfsense/en/latest/nat/outbound.html#static-port as I gather this may help with STUN. I also made sure that port forwarding is in place for both TCP and UDP.
This has greatly reduced the number of IPFS-related packets being blocked.
However, I'm still seeing quite a lot of blocked packets logged with 'TCP:SA', 'TCP:RA' or 'TCP:PA' in the protocol column, which I would have expected my forwarding rule to pass.
Accordingly, I set the 'any Flags' option in the forwarding rule, but this does not seem to make any difference. Am I misunderstanding what this option does?
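For reference when reading those log entries: the letters after 'TCP:' are TCP flag abbreviations (S = SYN, A = ACK, R = RST, P = PSH, F = FIN). A small sketch that decodes them follows; the function names are hypothetical. It also checks whether a packet is a bare SYN, on the assumption that the rule uses pf's default 'S/SA' flag match, under which only a SYN without ACK creates a new state. None of the three combinations above is a bare SYN, so one possible reading (not confirmed here) is that these are packets which no longer match any state rather than new connection attempts.

```python
from typing import List

# Single-letter flag abbreviations as they appear in pf-style log entries.
TCP_FLAG_NAMES = {
    "F": "FIN", "S": "SYN", "R": "RST", "P": "PSH",
    "A": "ACK", "U": "URG", "E": "ECE", "W": "CWR",
}


def decode_flags(label: str) -> List[str]:
    """Turn a log label such as 'TCP:SA' into ['SYN', 'ACK']."""
    proto, _, flags = label.partition(":")
    if proto.upper() != "TCP":
        raise ValueError(f"not a TCP label: {label!r}")
    unknown = [c for c in flags if c not in TCP_FLAG_NAMES]
    if unknown:
        raise ValueError(f"unknown flag letters in {label!r}: {unknown}")
    return [TCP_FLAG_NAMES[c] for c in flags]


def is_bare_syn(label: str) -> bool:
    """True if SYN is set and ACK is not, i.e. the packet would match 'flags S/SA'."""
    names = set(decode_flags(label))
    return "SYN" in names and "ACK" not in names


for label in ("TCP:S", "TCP:SA", "TCP:RA", "TCP:PA"):
    print(label, decode_flags(label), is_bare_syn(label))
```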
-
Gosh. This is dispiriting.
About 3 days of fiddling about trying to resolve this issue.
Actions taken:
- Much messing about with rules
- Set the firewall optimization option to aggressive
- Set even more aggressive adaptive timeouts
- Upgraded to 2.6
Observations:
- Periodic spikes in the size of the state table and the number of destination addresses, associated with spikes in dropped packets
- This correlates with a transient spike in CPU utilization to 12% (one virtual CPU)
- System has plenty of free memory
- System has plenty of free mbufs
The conclusion: just don't use a pf-based firewall for NAT of peer-to-peer applications where large numbers of connections may be opened simultaneously. Something about the handling of the state table seems to be single-threaded and will likely not scale no matter how much hardware you throw at it.
-
I take it all back. This is nothing to do with NAT. After purchasing a second static IP address and creating a set of completely stateless rules (a story in itself), I find that the problem still occurs. Rapidly opening many connections to many destinations causes huge spikes in packet loss and latency, which correlate only with a rise in system CPU usage. That rise is only from a base level of ~1% utilization up to a maximum of 7%, though, so it should not itself be causing the issue.
I can't find any other metric that is even remotely stressed.