Multi-WAN Issue with one Link always



  • Hi again. I have a weird issue. I am using the latest build of pfSense, 2.0.1-RELEASE (i386), on a dual-core PC with 4 GB RAM and 3 NICs: two WAN NICs on the same ISP (one 20 MB ADSL and one 4 MB LL) and one NIC for the LAN.
    I am using the two WAN NICs in a load-balanced setup (both in the same tier), and the gateway in the LAN firewall rule is the tier group (meaning I set it up correctly) ;)
    The group trigger level is Member Down.
    Neither of the two WAN gateways is the default, and each has its monitor IP set to a different IP in the ISP's farm. Everything works perfectly. I gave the ADSL weight 2 and the LL weight 1.
    As I said, everything works great: if a link goes down, all traffic shifts to the other link, and both links are used according to their weights. But every few hours the ADSL link is marked offline. When I ping any IP from pfSense through the ADSL interface, it replies with no problem, and I can see traffic going through the ADSL interface, but the gateway still shows offline. A traceroute always uses the LL (after the ADSL goes offline, of course), and showmyip sticks with the LL IP; before the ADSL goes down it alternates round-robin.
    To make this clear: neither the ADSL modem nor the ADSL link actually went down. To fix it I have to change the monitor IP to any other IP and let it gather data; within 2 seconds the gateway is online again and load balancing resumes.
    The logs show nothing unusual: it pings the monitor IP, sees some delays, and after a while marks the gateway as down. But again, I confirm the link is not down, and the monitor IP has no protection that would block me (as a matter of fact, I tried lots of IPs such as 8.8.8.8, 8.8.4.4 and others, and always hit this issue).

    My question: has anyone else faced this issue? I am still investigating and will update you ASAP.

    PS: I wonder why it only happens with the ADSL (which is the main link according to the weights).
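For readers following along, here is a rough sketch of what weights 2 and 1 mean for how connections are spread. It is a simplified round-robin model written for illustration only, not pfSense's actual implementation; the gateway names `ADSL` and `LL` and the helper functions are assumptions.

```python
from collections import Counter
from itertools import cycle, islice


def build_pool(weights):
    """Repeat each gateway name `weight` times (simplified weighted pool)."""
    pool = []
    for name, weight in weights.items():
        pool.extend([name] * weight)
    return pool


def assign_connections(weights, n_connections):
    """Hand out new connections by cycling through the weighted pool."""
    pool = build_pool(weights)
    return list(islice(cycle(pool), n_connections))


# Hypothetical names; the weights are the ones mentioned in the post.
weights = {"ADSL": 2, "LL": 1}
assignments = assign_connections(weights, 9)
shares = Counter(assignments)
print(shares)
```

With these weights, roughly two out of every three new connections land on the ADSL, which is why losing that gateway is so visible.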



  • Hmm, so far I haven't found any clue, but I will keep looking.



  • Did you check the RRD graphs to see why the link went down? Is it packet loss or high latency? You can increase the thresholds for both on each gateway.

    Are you sure that your modem isn't disconnecting after some idle time?

    Can you try with your modem as the monitor IP?



  • Thanks for the reply.
    Yes, I tried all that. The modem did not go idle or down, and I monitor the modem's connectivity with PRTG (from another machine connected to another port on the modem). The latency sometimes gets high, but I am not triggering on latency; the group only triggers on Member Down. Anyway, I have NOW set the monitor IP to the modem's LAN interface, which must be online all the time (just for testing). I will update here.



  • @MedoZero:

    (…) Anyway, I have NOW set the monitor IP to the modem's LAN interface, which must be online all the time (just for testing). I will update here.

    That would be the next thing to check, to see whether it is pfSense or something else.



  • Well, it seems to be related to the RTT, and to latency for sure. After I changed the monitor IP to the modem's internal IP, the gateway did not go offline for 2 days, even with heavy load on the line. The RTT is less than 1 ms, of course. So I think the problem is high latency when the monitor IP is an outside IP at the ISP (or Google DNS, or any other outside IP).
    According to the RRD Quality graph, the RTT reached 800-1000 ms on the ADSL line. So I think I can set the latency thresholds to 1500 ms and 2000 ms and change the Trigger Level to High Latency.

    What do you think?
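To picture what raising the thresholds would change, here is a minimal sketch of a two-level latency check. It only models the idea of a warning level and a down level; it is not how pfSense's monitoring is implemented. The 1500/2000 ms pair and the RTT samples (under 1 ms, 800 ms, 1000 ms) come from the post above, while the lower 200/500 ms pair and all names are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Thresholds:
    latency_warn_ms: float  # above this the gateway is flagged as slow
    latency_down_ms: float  # above this the gateway is treated as down


def classify(avg_rtt_ms, thresholds):
    """Map an average RTT to a status using the two latency levels."""
    if avg_rtt_ms >= thresholds.latency_down_ms:
        return "down"
    if avg_rtt_ms >= thresholds.latency_warn_ms:
        return "warning"
    return "online"


low = Thresholds(latency_warn_ms=200, latency_down_ms=500)       # assumed, illustrative
raised = Thresholds(latency_warn_ms=1500, latency_down_ms=2000)  # values proposed in the post

samples_ms = [0.8, 800, 1000]  # modem LAN RTT, then the RTT range seen in the RRD graph
with_low = [classify(rtt, low) for rtt in samples_ms]
with_raised = [classify(rtt, raised) for rtt in samples_ms]
print(with_low)
print(with_raised)
```

Under the lower pair, the 800-1000 ms samples count as down even though the link still passes traffic; under the raised pair they stay online.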



  • With latency that high, you probably have some kind of problem, unless you have a satellite connection. Maybe pick a closer monitor IP if you're using something that's on the other side of the planet. Otherwise, yeah, just increasing the latency thresholds will work.



  • Yes, right. I changed the monitor IP to another IP, which sometimes gets stuck in Gathering data. I am fairly sure it is related to RTT. Anyway, I will test a few things and update here, to see whether this can be closed as an ADSL high-latency issue.

    Thanks



  • OK, this topic can be closed. It seems to be nothing related to pfSense ;)



  • OK, a minor update.
    I think it is related to both my link and pfSense.
    When the link shows offline (red on the main page) and I know it is not, I just go to the routing page, open the ADSL gateway and press Save without changing anything: same IPs, same monitor IP, everything the same. Then, when I press Apply changes, the gateway comes back online.
    My fix would be a cron job that checks the interface every 10 minutes.
    So what do you think?
    My guess is that it may be something related to ifstat?
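As a starting point for the cron idea, here is a small sketch of an independent check that could run every 10 minutes: ping through the ADSL source address and compare the result with what the gateway status shows. It only builds the command and parses a ping summary; what to do on a mismatch is left out, and nothing here calls into pfSense. The function names, the 20% loss limit, the addresses and the sample output are all made up for illustration, and the `-S` source-address flag is the one from FreeBSD's `ping`, so check it on your system.

```python
import re


def build_ping_command(source_ip, target_ip, count=3):
    """Command line for a ping sourced from a given local address (FreeBSD-style -S)."""
    return ["ping", "-c", str(count), "-S", source_ip, target_ip]


def parse_ping_summary(output):
    """Extract packet loss (%) and average RTT (ms) from a ping summary."""
    loss = re.search(r"([\d.]+)% packet loss", output)
    rtt = re.search(r"= ([\d.]+)/([\d.]+)/([\d.]+)", output)  # min/avg/max
    return {
        "loss_pct": float(loss.group(1)) if loss else None,
        "avg_rtt_ms": float(rtt.group(2)) if rtt else None,
    }


def link_looks_up(summary, max_loss_pct=20.0):
    """Treat the link as up if replies came back with acceptable loss."""
    return summary["loss_pct"] is not None and summary["loss_pct"] < max_loss_pct


# Made-up sample output using documentation addresses, not a real capture.
SAMPLE = """--- 192.0.2.1 ping statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 810.100/905.300/1000.500/77.700 ms
"""

command = build_ping_command("198.51.100.2", "192.0.2.1")
summary = parse_ping_summary(SAMPLE)
print(command, summary, link_looks_up(summary))
```

In this sample the link answers every ping but with an average RTT around 900 ms, which matches the situation described in the thread: reachable, just slow.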



  • Just to confirm this: I think pfSense does not recheck the gateways once they go offline.
    I ran into this again at another site with the same design. One of the links there now shows Gathering data even though the link itself is working fine; it just had high RTT for some periods of time.



  • The gateways are constantly checked whether online or not, and are brought back online as soon as they're up and functioning again.

