2.4.5 High latency and packet loss, not in a vm

SteveITS

@jwj said in 2.4.5 High latency and packet loss, not in a vm:

Accommodations made to set repositories to the 2.4.4 versions make it a reasonable option.

Does that repo/branch choice also affect packages update/installation?

A Former User

Yeah, there are two drop-down menu choices under System->Update->System Update and System->Update->Update Settings.

The base OS/pfSense and the package repo should be correct. As always, back up your configuration, make a snapshot if you're in a virtual env, and have a plan to recover if you end up FUBAR.

It is too bad the download link for 2.4.4-p3 has not been restored. You can open a ticket and ask (nicely :) for one even if you do not own Netgate HW or have a support contract.

SteveITS

@jwj said in 2.4.5 High latency and packet loss, not in a vm:

System->Update->Update Settings.

Thanks. I got around to testing and this affects what package updates are detected, e.g. Suricata 4.1.7 vs 5.x. So that's good to know. Would be handy if they left the previous version there all the time (and/or had a warning on the package page if you're checking the wrong repo for your version) but nice it's there now.

Yamabushi

So any updates on this issue? I've been checking in here regularly. Three days have elapsed since the last post in this thread. My apologies if I have missed something, but are there any solid mitigations or upcoming updates to address this?

getcom

@Yamabushi said in 2.4.5 High latency and packet loss, not in a vm:

So any updates on this issue? I've been checking in here regularly. Three days have elapsed since the last post in this thread. My apologies if I have missed something, but are there any solid mitigations or upcoming updates to address this?

No, the root cause is still unknown. Netgate cannot reproduce this issue which means the test conditions are different to the affected systems.
At the moment all my systems are back on 2.4.4-p3. I wiped the disks with dd and reinstalled the system from scratch. After the basic installation I set the repository to the previous version to avoid installing packages from the 2.4.5 release.
Additionally I switched to ZFS.
After that I restored the backup, which does not contain any package information, and then manually installed the needed packages.
Now all systems are back to normal working condition.
I wanted to run some more tests on spare hardware (an original Netgate system) to get an idea of what the root cause is. But these are strange times and not everything is running as expected, which means that I did not find a time slot for that... I assume that I'm not alone...

Yamabushi

Thank you for your prompt and detailed response! I guess I will have to continue to wait and see what happens. Thank you, again!

stephenw10

If any of you have a test system that is hitting this and you can allow us to access it please open a ticket so we can set something up: https://go.netgate.com/
I've tried all sorts of things here to replicate it and it just stubbornly behaves perfectly.

Steve

Krisbe

@stephenw10
Done!

A Former User

@stephenw10 Ticket submitted. As per Murphy's law, my power is out at the moment.

stephenw10

Thanks guys. Hopefully we can get some data there.

Steve

A Former User

I was doing some thinking about this issue last night at 3am.

I know I hit it (on a VM) and I was thinking "What have I changed from the defaults that maybe some other users have also?" and I figured maybe

net.isr.dispatch = deferred

I know I set that to try and get a PPPoE performance increase. Have others who are hitting this bug set that too?
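For anyone wanting to compare this setting across affected boxes, here is a minimal Python sketch that parses the value and flags whether it differs from the stock FreeBSD default, which is assumed here to be `direct`. The function names are made up for illustration, and `read_dispatch()` only works on the firewall itself, where the `sysctl` binary is available.

```python
import subprocess

# netisr dispatch policies on FreeBSD; "direct" is assumed to be the default.
VALID_POLICIES = {"direct", "hybrid", "deferred"}
DEFAULT_POLICY = "direct"


def parse_dispatch(line: str) -> str:
    """Extract the policy from 'net.isr.dispatch: deferred',
    'net.isr.dispatch = deferred', or a bare 'deferred'."""
    value = line.strip().replace("=", ":").split(":")[-1].strip()
    if value not in VALID_POLICIES:
        raise ValueError(f"unexpected net.isr.dispatch value: {value!r}")
    return value


def is_non_default(line: str) -> bool:
    """True if the parsed policy differs from the assumed default."""
    return parse_dispatch(line) != DEFAULT_POLICY


def read_dispatch() -> str:
    """Query the live value; run this on the firewall itself."""
    result = subprocess.run(
        ["sysctl", "net.isr.dispatch"],
        capture_output=True, text=True, check=True,
    )
    return parse_dispatch(result.stdout)
```

Calling `is_non_default("net.isr.dispatch = deferred")` on the value quoted above reports a non-default setting, so it is a candidate worth comparing between systems.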

stephenw10

No, net.isr.dispatch = deferred does not appear to be common to systems hitting this. Good thought though.

Steve

Rico

Hmmm, could someone with a test system hitting this issue maybe share their config.xml so we can try with swarm intelligence?

-Rico

A Former User

@Rico Already shared config and other information with Netgate. @stephenw10 has been immensely helpful coordinating that.

q54e3w

@stephenw10 said in 2.4.5 High latency and packet loss, not in a vm:

https://go.netgate.com/

Just opened a support ticket with my config.xml attached, INC-49525.
Not a virtual instance, X11SDV Xeon-D 2100 series motherboard, 16GB RAM.

wernsting

Had the same issue yesterday when I upgraded. Have since reverted to 2.4.4-p3 and the issue disappeared completely.

I run it on an Eglobal Braswell Fanless Mini PC AES-NI Intel N3160/J3160 Quad Core Pfsense Computer Server 4K 2HDMI 2LAN(RJ-45) 300M Wifi.

A Former User

@wernsting Do you have any large aliases or huge lists of IPs in any firewall rules? Have you modified the max table entries (and if so, to what)?
Do you use PPPoE?

q54e3w

@muppet can you define "large"? One man's "large" is another man's "small"! :-) 1000? 10000? 1000000?

A Former User

From my experiments, it's total entries, not individual table size, that counts.

At 100000 and up the issue is very noticeable. The bogonsv6 table alone is 100k and a bit. At 200000 and up, filter reloads can basically freeze the system (unresponsive GUI and packet loss) even with powerful HW. On my Supermicro 5018D-FN4T (XG-1541) it becomes unresponsive for minutes at around 300000 total table entries if the filters are reloaded.

Max table entries isn't relevant, other than that you can prevent too many entries from loading if you set it small. In FreeBSD 11.3-STABLE it was hard limited to 65k. Netgate submitted a patch to make it tunable.
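To make the "total entries, not individual table size" point concrete, here is a small Python sketch that sums entry counts across all tables and compares the total against the rough figures from this post. The thresholds are just those observations, not official limits, and the table names and sizes in the example are invented for illustration.

```python
# Rough thresholds taken from the observations in this post, not official limits.
NOTICEABLE_TOTAL = 100_000   # issue becomes very noticeable
SEVERE_TOTAL = 200_000       # filter reloads can basically freeze the system


def total_entries(tables: dict[str, int]) -> int:
    """Sum entries across all tables; the total is what matters here."""
    return sum(tables.values())


def classify(total: int) -> str:
    """Map a total entry count onto the rough bands described above."""
    if total >= SEVERE_TOTAL:
        return "severe"
    if total >= NOTICEABLE_TOTAL:
        return "noticeable"
    return "ok"


# Hypothetical table sizes: no single table is huge apart from bogonsv6,
# yet the combined total lands in the "severe" band.
example_tables = {
    "bogonsv6": 100_500,
    "blocklist_a": 60_000,
    "blocklist_b": 45_000,
}

if __name__ == "__main__":
    total = total_entries(example_tables)
    print(total, classify(total))
```

With these made-up numbers, several moderately sized lists on top of bogonsv6 are enough to cross the 200000 mark, even though none of the added lists looks large on its own.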

I would be interested in knowing why that 65k hard limit showed up in 11.3?

wernsting

Hi,

I'm just a small household that suffers my nerdiness, which hated the ISP-provided crapware, so no, nothing like that. My setup is hardly configured beyond the base installation :)

Cheers,