WAN-link "randomly" disconnects. pfSense 2.1



  • Hi!

    I guess some/most of you are getting tired of these kinds of topics. I apologize in advance, but this is a problem for me and I do not have the knowledge to fix it myself.

    The reason the topic says "randomly" is that it may not be very random. My internet connection seems to go down sometimes when I start a new torrent download. These (single) files are a few GB in size.
    The weird thing is that everything worked perfectly for quite some time. The problem then appeared out of thin air a few months back. I was hoping it would magically go away by itself… :)

    When it disconnects I cannot reach anything on the internet, and my friends are disconnected from my TeamSpeak server (maybe that isn't too weird).

    Interestingly though, according to the pfSense dashboard I sometimes lose my WAN IP entirely, and sometimes I keep it but still have no internet connection. As you may have gathered, I can still reach my pfSense box on 192.168.1.1 while disconnected. I stay disconnected for around a minute.

    I don't know how to troubleshoot this properly. I imagine monitoring packet loss would be a start - but I can't find any tools in pfSense that can help me with this. Ping only gives me 10 pings. Packet capture doesn't seem to monitor packet loss.
    Sadly though, I am not sure it would tell me much anyway.
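
One way to watch packet loss continuously, sketched here as an illustration rather than a pfSense feature: run a small script on a LAN machine that pings a host once per second and prints the loss over a rolling window. The function names (`ping_once`, `loss_percent`, `monitor`) are made up for this example, and the `-c 1` flag assumes a Unix-style `ping`; Windows uses `-n` instead.

```python
import subprocess
import time
from collections import deque


def ping_once(host, timeout_s=2.0):
    """Send one ICMP echo via the system `ping`; True if a reply came back.

    Assumption: a Unix-style ping that accepts `-c 1` (Linux, FreeBSD, macOS).
    """
    try:
        result = subprocess.run(
            ["ping", "-c", "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def loss_percent(results):
    """Percentage of failed probes in an iterable of True/False results."""
    results = list(results)
    if not results:
        return 0.0
    failed = sum(1 for ok in results if not ok)
    return 100.0 * failed / len(results)


def monitor(host, window=60, interval_s=1.0):
    """Ping `host` forever and print the loss over the last `window` probes."""
    recent = deque(maxlen=window)
    while True:
        recent.append(ping_once(host))
        stamp = time.strftime("%H:%M:%S")
        print(f"{stamp} loss over last {len(recent)} probes: "
              f"{loss_percent(recent):.0f}%", flush=True)
        time.sleep(interval_s)


# Example (runs until interrupted): monitor("192.168.1.1")
```

Running one copy against the pfSense LAN address and another against the WAN gateway would show whether the loss is on the LAN side or beyond the box.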

    If I go to "Status - Gateways" during a disconnect, "loss" is 40-70% - otherwise 0%.

    I also found a log from there:

    
    Jan 20 19:19:13	apinger: ALARM: WAN_DHCP(X.X.X.X) *** down ***
    Jan 20 19:20:40	apinger: alarm canceled: WAN_DHCP(X.X.X.X) *** down ***
    Jan 20 19:20:42	apinger: SIGHUP received, reloading configuration.
    Jan 20 19:33:28	apinger: ALARM: WAN_DHCP(X.X.X.X) *** down ***
    Jan 20 19:35:03	apinger: alarm canceled: WAN_DHCP(X.X.X.X) *** down ***
    Jan 20 19:35:04	apinger: SIGHUP received, reloading configuration.
    Jan 20 20:51:32	apinger: ALARM: WAN_DHCP(X.X.X.X) *** down ***
    Jan 20 20:52:58	apinger: alarm canceled: WAN_DHCP(X.X.X.X) *** down ***
    Jan 20 20:53:00	apinger: SIGHUP received, reloading configuration.
    
    

    As you can see, my connection went down two times this evening. The first time was while initiating a new torrent download. The second time I had a steady connection for 10GB of data at 9.6 MiB/s. (I'm on a 100/100 Mbit/s connection)
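
The apinger lines above can be turned into outage durations by pairing each "ALARM … down" with the following "alarm canceled". The sketch below does that for the quoted log; `outages` and `LINE_RE` are names invented for this example, and the year 2000 is only a placeholder because the log lines carry no year. Note that the quoted log contains three alarm/cancel pairs.

```python
import re
from datetime import datetime

# Matches the two apinger event lines shown in the log; other lines are skipped.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2})\s+apinger: "
    r"(?P<event>ALARM|alarm canceled): (?P<gw>\S+) \*\*\* down \*\*\*"
)

SAMPLE = """
Jan 20 19:19:13 apinger: ALARM: WAN_DHCP(X.X.X.X) *** down ***
Jan 20 19:20:40 apinger: alarm canceled: WAN_DHCP(X.X.X.X) *** down ***
Jan 20 19:20:42 apinger: SIGHUP received, reloading configuration.
Jan 20 19:33:28 apinger: ALARM: WAN_DHCP(X.X.X.X) *** down ***
Jan 20 19:35:03 apinger: alarm canceled: WAN_DHCP(X.X.X.X) *** down ***
Jan 20 19:35:04 apinger: SIGHUP received, reloading configuration.
Jan 20 20:51:32 apinger: ALARM: WAN_DHCP(X.X.X.X) *** down ***
Jan 20 20:52:58 apinger: alarm canceled: WAN_DHCP(X.X.X.X) *** down ***
Jan 20 20:53:00 apinger: SIGHUP received, reloading configuration.
"""


def outages(log_text):
    """Return (gateway, start_time, seconds_down) for each ALARM/cancel pair."""
    down_since = {}
    found = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        # Placeholder year: the syslog timestamp has none.
        when = datetime.strptime("2000 " + m["ts"], "%Y %b %d %H:%M:%S")
        gw = m["gw"]
        if m["event"] == "ALARM":
            down_since[gw] = when
        elif gw in down_since:
            start = down_since.pop(gw)
            seconds = int((when - start).total_seconds())
            found.append((gw, start.strftime("%H:%M:%S"), seconds))
    return found


for gw, start, seconds in outages(SAMPLE):
    print(f"{gw} down at {start} for {seconds} s")
```

For this log the outages last 87, 95 and 86 seconds, in that order.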

    From the DHCP log:

    
    Jan 20 20:51:38	dhclient[10209]: DHCPREQUEST on em0 to X.X.X.X port 67
    Jan 20 20:51:39	dhclient[10209]: DHCPREQUEST on em0 to X.X.X.X port 67
    Jan 20 20:51:40	dhclient[10209]: DHCPREQUEST on em0 to X.X.X.X port 67
    Jan 20 20:51:43	dhclient[10209]: DHCPREQUEST on em0 to X.X.X.X port 67
    Jan 20 20:51:48	dhclient[10209]: DHCPREQUEST on em0 to X.X.X.X port 67
    Jan 20 20:51:56	dhclient[3034]: connection closed
    Jan 20 20:51:56	dhclient[3034]: exiting.
    Jan 20 20:51:58	dhclient: PREINIT
    Jan 20 20:51:58	dhclient: Starting delete_old_states()
    Jan 20 20:51:58	dhclient: Comparing IPs: Old: New:
    Jan 20 20:51:58	dhclient[6288]: DHCPREQUEST on em0 to 255.255.255.255 port 67
    Jan 20 20:52:00	dhclient[6288]: DHCPREQUEST on em0 to 255.255.255.255 port 67
    Jan 20 20:52:05	dhclient[6288]: DHCPREQUEST on em0 to 255.255.255.255 port 67
    Jan 20 20:52:16	dhclient[6288]: DHCPDISCOVER on em0 to 255.255.255.255 port 67 interval 1
    Jan 20 20:52:17	dhclient[6288]: DHCPDISCOVER on em0 to 255.255.255.255 port 67 interval 1
    Jan 20 20:52:18	dhclient[6288]: DHCPDISCOVER on em0 to 255.255.255.255 port 67 interval 1
    Jan 20 20:52:19	dhclient[6288]: DHCPDISCOVER on em0 to 255.255.255.255 port 67 interval 1
    Jan 20 20:52:20	dhclient[6288]: DHCPDISCOVER on em0 to 255.255.255.255 port 67 interval 1
    Jan 20 20:52:21	dhclient[6288]: DHCPDISCOVER on em0 to 255.255.255.255 port 67 interval 1
    Jan 20 20:52:22	dhclient[6288]: DHCPDISCOVER on em0 to 255.255.255.255 port 67 interval 1
    Jan 20 20:52:23	dhclient[6288]: DHCPDISCOVER on em0 to 255.255.255.255 port 67 interval 2
    Jan 20 20:52:25	dhclient[6288]: DHCPDISCOVER on em0 to 255.255.255.255 port 67 interval 4
    Jan 20 20:52:29	dhclient[6288]: DHCPDISCOVER on em0 to 255.255.255.255 port 67 interval 6
    Jan 20 20:52:35	dhclient[6288]: DHCPDISCOVER on em0 to 255.255.255.255 port 67 interval 13
    Jan 20 20:52:36	dhclient[6564]: connection closed
    Jan 20 20:52:36	dhclient[6564]: exiting.
    Jan 20 20:52:40	dhcpd: Internet Systems Consortium DHCP Server 4.2.5-P1
    Jan 20 20:52:40	dhcpd: Copyright 2004-2013 Internet Systems Consortium.
    

    I have also attached a few RRD graphs.

    I have tried replacing the WAN network card. I had one lying around. It could be that one is also toast - but the behavior has been the same with both NICs. I'm not ruling anything out though.

    Can you tell me what is wrong?
    If you need more information, please do tell.

    Thanks!












  • Thank you for your reply!

    I just tried

    
    kern.ipc.nmbclusters="131072"
    hw.em.num_queues=1
    
    

    Should I also disable flow control?

    I also edited /etc/rc.newwanip accordingly.

    Then I rebooted.

    Initially I tried to reproduce the issue by triggering some heavy downloads. No disconnects as of yet, although it can work fine for a few days and then die. Most often, if I can't reproduce the issue, I most likely won't have any disconnects that day. Once I have started disconnecting, all new downloads after that are likely to disconnect me again. So I generally wait until the next day and then everything is fine.

    That may seem natural ("I don't do anything heavy - I don't disconnect"), but I have a feeling some days are worse than others, even when my download pattern remains the same.

    Anyway, I'm keeping my fingers crossed that the matter has been resolved. I will be back in a few days when I know for sure. Regardless, I'm very grateful for whatever help I get. Thank you webroy!


  • Netgate Administrator

    The fact that this occurs when you initiate a torrent download implies that it's caused by a large number of connections.
    Often this can cause the system to run out of mbufs; that would be indicated on the dashboard and in the system logs.
    The tuning options given above should help with that.
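
As a rough illustration of how to check mbuf headroom, the sketch below parses the "mbuf clusters in use" line that FreeBSD's `netstat -m` prints and reports usage against the configured maximum (the `kern.ipc.nmbclusters` limit). The sample text is made up and is not output from the poster's system; the line format is assumed from FreeBSD's `netstat -m` and may differ between versions, and `mbuf_cluster_usage` is a name invented for this example.

```python
import re

# Assumed line format: "<current>/<cache>/<total>/<max> mbuf clusters in use (...)"
CLUSTER_RE = re.compile(
    r"^\s*(\d+)/(\d+)/(\d+)/(\d+) mbuf clusters in use "
    r"\(current/cache/total/max\)",
    re.MULTILINE,
)

# Illustrative numbers only, not taken from the thread.
SAMPLE_NETSTAT_M = """
5125/2375/7500 mbufs in use (current/cache/total)
4100/2300/6400/25600 mbuf clusters in use (current/cache/total/max)
"""


def mbuf_cluster_usage(netstat_m_output):
    """Return cluster counters and percent of max, or None if no line matches."""
    m = CLUSTER_RE.search(netstat_m_output)
    if m is None:
        return None
    current, cache, total, maximum = map(int, m.groups())
    return {
        "current": current,
        "cache": cache,
        "total": total,
        "max": maximum,
        "percent_of_max": 100.0 * total / maximum,
    }


usage = mbuf_cluster_usage(SAMPLE_NETSTAT_M)
print(f"{usage['total']}/{usage['max']} clusters "
      f"({usage['percent_of_max']:.1f}% of max)")
```

If the total sits close to the maximum while a torrent is starting, that would support the mbuf exhaustion theory; a low figure would point elsewhere.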

    What hardware are you running? What sort of WAN connection do you have? How is it connected?

    Steve



  • Of course, finding the root of the problem is the best thing… the next best might be to just reboot. That's what I do ;)
    I too had problems with the system disconnecting from time to time; the WAN connection is lost.
    So I set up an automatic reboot of the pfSense box.
    https://forum.pfsense.org/index.php/topic,71335.0.html

    And if you need a script to automatically reboot a cable modem (if that is what you have?), take a look here:
    (check the code - Motorola and Cisco modems)
    https://forum.pfsense.org/index.php/topic,69879.msg381954.html#msg381954



  • LOL at auto reboot scripts.. what are we dealing with here, IIS?



  • @stephenw10:

    The fact that this occurs when you initiate a torrent download implies that it's caused by a large number of connections.
    Often this can cause the system to run out of mbufs; that would be indicated on the dashboard and in the system logs.
    The tuning options given above should help with that.

    What hardware are you running? What sort of WAN connection do you have? How is it connected?

    Steve

    I'm running on a dual-core CPU of some kind with 2GB RAM. I guess it's a bit overkill but it's what I had :)
    It's connected to my city network, so it's an RJ45 cable (TP?) running directly from my wall to the pfSense box.

    After the changes I haven't experienced any disconnects yet. It can work fine for a few days, though, so I don't know for sure. Looks promising!

    Of course, finding the root of the problem is the best thing… the next best might be to just reboot. That's what I do ;)
    I too had problems with the system disconnecting from time to time; the WAN connection is lost.
    So I set up an automatic reboot of the pfSense box.
    https://forum.pfsense.org/index.php/topic,71335.0.html

    And if you need a script to automatically reboot a cable modem (if that is what you have?), take a look here:
    (check the code - Motorola and Cisco modems)
    https://forum.pfsense.org/index.php/topic,69879.msg381954.html#msg381954

    I have tried rebooting. It doesn't help very much (or at all).

    Thank you for your replies!


  • Netgate Administrator

    @Damned:

    I have tried rebooting. It doesn't help very much (or at all).

    If it's a problem that can be solved by tuning the NIC options then I would expect that rebooting the machine would at least temporarily resolve it (until it runs out of resources again). If the WAN does not come back up after rebooting then I might suspect something at the ISP end objecting to your torrenting.

    Steve

