Problem WAN



  • Hi everyone,
    I have a problem with my network: when throughput reaches 80-100 Mb/s, the WAN gateway drops out and packets are lost.
    The WAN is connected directly to the motherboard's network port.
    This is the log:

    Nov 18 02:21:46	php-fpm	41201	/rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed its IP. Reloading endpoints that may use GW_WAN.
    Nov 18 02:21:46	php-fpm	41201	/rc.openvpn: Gateway, none 'available' for inet6, use the first one configured. 'RW_VPN_VPNV6'
    Nov 18 02:21:45	check_reload_status		Reloading filter
    Nov 18 02:21:45	check_reload_status		Restarting OpenVPN tunnels/interfaces
    Nov 18 02:21:45	check_reload_status		Restarting ipsec tunnels
    Nov 18 02:21:45	check_reload_status		updating dyndns GW_WAN
    Nov 18 02:21:45	rc.gateway_alarm	86592	>>> Gateway alarm: GW_WAN (Addr:192.168.1.254 Alarm:0 RTT:1.112ms RTTsd:9.905ms Loss:0%)
    Nov 18 02:21:36	php-fpm	83692	/rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed its IP. Reloading endpoints that may use GW_WAN.
    Nov 18 02:21:36	php-fpm	83692	/rc.openvpn: Gateway, none 'available' for inet6, use the first one configured. 'RW_VPN_VPNV6'
    Nov 18 02:21:35	check_reload_status		Reloading filter
    Nov 18 02:21:35	check_reload_status		Restarting OpenVPN tunnels/interfaces
    Nov 18 02:21:35	check_reload_status		Restarting ipsec tunnels
    Nov 18 02:21:35	check_reload_status		updating dyndns GW_WAN
    Nov 18 02:21:35	rc.gateway_alarm	19334	>>> Gateway alarm: GW_WAN (Addr:192.168.1.254 Alarm:1 RTT:528.535ms RTTsd:1814.689ms Loss:7%)
    Nov 18 02:20:45	php-fpm	22501	/rc.newwanip: rc.newwanip: on (IP address: 192.168.1.250) (interface: WAN[wan]) (real interface: re1).
    Nov 18 02:20:45	php-fpm	22501	/rc.newwanip: rc.newwanip: Info: starting on re1.
    Nov 18 02:20:45	php-fpm	22501	/rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed its IP. Reloading endpoints that may use GW_WAN.
    Nov 18 02:20:45	php-fpm	22501	/rc.openvpn: Gateway, none 'available' for inet6, use the first one configured. 'RW_VPN_VPNV6'
    Nov 18 02:20:44	check_reload_status		rc.newwanip starting re1
    Nov 18 02:20:44	php-fpm	75269	/rc.linkup: Hotplug event detected for WAN(wan) static IP (192.168.1.250 )
    Nov 18 02:20:44	check_reload_status		Reloading filter
    Nov 18 02:20:44	check_reload_status		Restarting OpenVPN tunnels/interfaces
    Nov 18 02:20:44	check_reload_status		Restarting ipsec tunnels
    Nov 18 02:20:44	check_reload_status		updating dyndns GW_WAN
    Nov 18 02:20:44	rc.gateway_alarm	57938	>>> Gateway alarm: GW_WAN (Addr:192.168.1.254 Alarm:1 RTT:627.326ms RTTsd:1956.185ms Loss:19%)
    Nov 18 02:20:43	kernel		re1: link state changed to UP
    Nov 18 02:20:43	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:43	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:43	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:43	check_reload_status		Linkup starting re1
    Nov 18 02:20:43	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:43	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:43	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:43	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:43	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:43	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:43	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:43	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:43	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:43	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:43	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:43	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:43	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:42	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:42	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:42	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:42	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:42	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:42	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:42	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:42	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:42	php-fpm	75269	/rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed its IP. Reloading endpoints that may use GW_WAN.
    Nov 18 02:20:42	php-fpm	75269	/rc.openvpn: Gateway, none 'available' for inet6, use the first one configured. 'RW_VPN_VPNV6'
    Nov 18 02:20:42	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:41	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:41	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:41	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:41	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:41	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:41	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:41	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:41	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:41	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:41	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:41	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:41	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:41	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:41	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:41	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:41	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:41	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:41	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:41	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:41	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:41	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:41	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:41	check_reload_status		Reloading filter
    Nov 18 02:20:41	check_reload_status		Restarting OpenVPN tunnels/interfaces
    Nov 18 02:20:41	check_reload_status		Restarting ipsec tunnels
    Nov 18 02:20:41	check_reload_status		updating dyndns GW_WAN
    Nov 18 02:20:41	rc.gateway_alarm	12562	>>> Gateway alarm: GW_WAN (Addr:192.168.1.254 Alarm:1 RTT:4.780ms RTTsd:43.061ms Loss:22%)
    Nov 18 02:20:40	check_reload_status		Reloading filter
    Nov 18 02:20:40	php-fpm	83692	/rc.linkup: Hotplug event detected for WAN(wan) static IP (192.168.1.250 )
    Nov 18 02:20:39	kernel		re1: link state changed to DOWN
    Nov 18 02:20:39	kernel		re1: watchdog timeout
    Nov 18 02:20:39	check_reload_status		Linkup starting re1
    Nov 18 02:20:34	check_reload_status		Reloading filter
    Nov 18 02:20:34	php-fpm	75269	/rc.newwanip: rc.newwanip: on (IP address: 192.168.1.250) (interface: WAN[wan]) (real interface: re1).
    Nov 18 02:20:34	php-fpm	75269	/rc.newwanip: rc.newwanip: Info: starting on re1.
    Nov 18 02:20:33	check_reload_status		Reloading filter
    Nov 18 02:20:33	check_reload_status		rc.newwanip starting re1
    Nov 18 02:20:33	php-fpm	22501	/rc.linkup: Hotplug event detected for WAN(wan) static IP (192.168.1.250 )
    Nov 18 02:20:32	kernel		re1: link state changed to UP
    Nov 18 02:20:32	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:32	check_reload_status		Linkup starting re1
    Nov 18 02:20:32	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:31	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:31	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:31	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:31	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:30	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:30	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:29	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:29	kernel		arpresolve: can't allocate llinfo for 192.168.1.254 on re1
    Nov 18 02:20:29	check_reload_status		Reloading filter
    Nov 18 02:20:29	php-fpm	75269	/rc.linkup: Hotplug event detected for WAN(wan) static IP (192.168.1.250 )
    Nov 18 02:20:28	kernel		re1: link state changed to DOWN
    Nov 18 02:20:28	kernel		re1: watchdog timeout
    Nov 18 02:20:28	check_reload_status		Linkup starting re1
    

    I can't work out what is causing it. I have replaced several cables, but the result does not change.



  • What is your CPU utilization when this occurs? Are you using dedicated on-chip encryption or is the CPU doing all of the encryption?



  • @tim-mcmanus
    About 30% on average, and encryption is handled by AES-NI.



  • How many cores in the system?

    I think you're experiencing a resource issue, and I am trying to determine where. Since you're at 30%, it could mean that you've saturated one core in a 4-core system, and that could be where the resource issue is.

    You could connect to the console, run something like top, and recreate the issue to see whether you can identify the root cause there.



  • @tim-mcmanus
    I have already tried several times to reproduce the problem. Even with the connection held constant at 100 Mb/s, it seems to happen at random. As for resources, I have already tried disabling many services to free some up.


  • Netgate Administrator

    Yes, if your system is multi-core/multi-CPU then 30% total could be 100% on one core.
    Run at the command line:
    top -aSH

    That will show per-thread usage, including kernel threads, so a single thread saturating one core will stand out.

    If you only have one WAN you could set "Disable Gateway Monitoring Action" by editing the gateway in System > Routing.

    You will still see the high latency, but it won't trigger the same reloads of the filter and VPN tunnels. It's worth doing at least as a test.

    Steve
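
The point about 30% total load hiding a saturated core comes down to simple arithmetic: one fully busy core on an N-core machine shows up as roughly 100/N percent of the total. A minimal shell sketch of that calculation follows; the function name `one_core_share` is made up for illustration, and it assumes the other cores are completely idle.

```shell
# Total CPU percentage reported when exactly one core is fully busy
# and all remaining cores are idle, for a machine with $1 cores.
# (one_core_share is an illustrative name, not a system utility.)
one_core_share() {
    awk -v n="$1" 'BEGIN { printf "%.1f\n", 100 / n }'
}

one_core_share 4   # prints 25.0
one_core_share 6   # prints 16.7
```

So a 30% total reading is compatible with one pegged core on either a 4-core or a 6-core system, which is why a per-thread view is the more useful check. On FreeBSD, `sysctl -n hw.ncpu` reports the core count to feed into the function.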



  • @stephenw10
    Thank you, I will run a test and let you know. A little while ago I read about another user who had a problem similar to mine and solved it by updating the Realtek drivers.

    EDIT: I use an AMD FX-6300 at 4.2 GHz.
    EDIT2: I think Snort might be saturating one or more cores, since it runs on both WAN and LAN.


  • Netgate Administrator

    That's unlikely at 100Mbps with that CPU. But the top output will show it.

    Steve



  • I just ran a test that triggered the problem. As I thought, Snort uses a fair amount of CPU, but each of the six cores stays at least 36% idle.
    Could it be a problem with the Realtek drivers?


  • Netgate Administrator

    It probably is. Looking at your logs again:

    Nov 18 02:20:28	kernel		re1: watchdog timeout
    

    That is almost certainly the issue. The only thing you can try there (other than swapping out the NICs) is the alternative Realtek driver:
    https://forum.netgate.com/topic/135850/official-realtek-driver-binary-1-95-for-2-4-4-release

    Steve
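
Counting how often the driver watchdog fires, and on which interface, is a quick way to confirm that the drops line up with `watchdog timeout` events. The sketch below does that for log text on standard input; `count_watchdog_timeouts` and `sample_log` are illustrative names, and the sample lines are taken from the log at the top of this thread (with tabs written as spaces).

```shell
# Count "watchdog timeout" events per interface in log text read from stdin.
# An interface is recognised as a field such as "re1:" (letters, digits, colon).
count_watchdog_timeouts() {
    awk '/watchdog timeout/ {
        for (i = 1; i <= NF; i++)
            if ($i ~ /^[a-z]+[0-9]+:$/) {
                sub(/:$/, "", $i)      # strip the trailing colon
                count[$i]++
            }
    }
    END { for (nic in count) print nic, count[nic] }'
}

# Three kernel lines from the posted log: two timeouts and one link-state change.
sample_log='Nov 18 02:20:39 kernel re1: watchdog timeout
Nov 18 02:20:39 kernel re1: link state changed to DOWN
Nov 18 02:20:28 kernel re1: watchdog timeout'

printf '%s\n' "$sample_log" | count_watchdog_timeouts   # prints "re1 2"
```

On the firewall, something like `dmesg | count_watchdog_timeouts` should work as well when run under `sh`, since the function only looks for an interface name followed by `watchdog timeout`.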



  • @stephenw10
    Okay, I'll try that then.
    One last thing: what can I use to place the file in /boot/kernel?
    I uploaded the file to /tmp through the web interface, but I do not know how to move it to the right location.


  • Netgate Administrator

    You can copy it to there from the command line.

    You can use SCP to access the filesystem directly over SSH. WinSCP for example, if you're running Windows.

    Steve
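
The copy and the permission change can also be done entirely from the command line, without WinSCP. A minimal sketch, assuming the downloaded module is named `if_re.ko` and was uploaded to `/tmp` as described above; `install_kmod` is a hypothetical helper name, and any loader configuration the linked thread calls for is not covered here.

```shell
# Copy a kernel module file into a target directory and add the execute
# bit for owner, group and others.
# (install_kmod is an illustrative helper, not a pfSense or FreeBSD command.)
install_kmod() {
    src="$1"
    dest_dir="$2"
    cp "$src" "$dest_dir/" || return 1
    chmod a+x "$dest_dir/$(basename "$src")"
}

# Assumed invocation on the firewall, run as root under sh:
#   install_kmod /tmp/if_re.ko /boot/kernel
```

Taking the destination directory as a parameter makes it possible to try the helper against a scratch directory before touching `/boot/kernel`.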



  • @stephenw10
    I managed to move the file. Can I use WinSCP to change the permissions as well?
    In the attached image, what should I select?


  • Netgate Administrator

    It needs to be executable. Check the three X boxes there.

    Steve



  • @stephenw10
    I think I was able to load the driver.
    Here is the output:

    Shell Output - kldstat
    Id Refs Address            Size     Name
     1   12 0xffffffff80200000 2d9a7d0  kernel
     2    1 0xffffffff82f9c000 7d2c0    if_re.ko
     3    1 0xffffffff83711000 10a0     cpuctl.ko
     4    1 0xffffffff83713000 72b8     aesni.ko
     5    1 0xffffffff8371b000 11a0     amdtemp.ko
     6    1 0xffffffff8371d000 648      amdsmn.ko
    

    I'm testing


  • Netgate Administrator

    Yup, looks like it. You should also see the version in the boot log against your NICs if you check.

    Steve
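
Checking for the module can be scripted rather than read off by eye. The sketch below tests whether a given file name appears in the last column of `kldstat`-style output; `module_loaded` and `sample_kldstat` are illustrative names, and the sample rows are copied from the output posted above.

```shell
# Exit 0 if the file name in $1 appears in the last (Name) column of
# kldstat-style output read from stdin, otherwise exit 1.
module_loaded() {
    awk -v name="$1" '$NF == name { found = 1 } END { exit !found }'
}

# First three rows of the kldstat output from this thread.
sample_kldstat=' 1   12 0xffffffff80200000 2d9a7d0  kernel
 2    1 0xffffffff82f9c000 7d2c0    if_re.ko
 3    1 0xffffffff83711000 10a0     cpuctl.ko'

if printf '%s\n' "$sample_kldstat" | module_loaded if_re.ko; then
    echo "if_re.ko is loaded"
fi
```

On the firewall the equivalent check would be `kldstat | module_loaded if_re.ko`, run under `sh`. This only shows that a module file of that name is loaded; the driver version still has to be read from the boot log, as noted above.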



  • @stephenw10
    After three hours of constant downloading at 100 Mb/s there was no loss of connection. I hope it stays that way.
    I will check again after the next restart.
    Thanks again for the help.

    EDIT: I can confirm that replacing the Realtek drivers included in pfSense solved the problem.