PPPOE Dropping When WAN Upload Bandwidth Is At Maximum!



  • I'm currently having some issues with my PPPoE Zen ADSL connections dropping. My setup is as follows:

    2.0-BETA4 built on Tue Oct 5 22:38:40 EDT 2010 FreeBSD 8.1-RELEASE-p1
    2x ADSL PPPOE Connections via Zen Internet (WAN, OPT1)
    1x LAN Connection (LAN)

    My pfSense box is very happy through the day and night when there is just the usual user activity on the network: web browsing, downloading, email, web hosting, MS RDP use by external users, FTP etc.

    The issue comes when I send an email shot to 4000 of our customers at night. This maxes out the WAN upload bandwidth, and after about 30 minutes the WAN connection drops. That also messes up the OPT1 ADSL connection, even though it still shows as online on the home page. When the WAN connection drops, its IP address disappears from the home page. The OPT1 IP address is still shown, but no one can connect in remotely and all the websites we host are inaccessible.

    This issue only seems to happen when the pfSense box's WAN connection is maxed out by the email shot. The only way to get the 2x ADSL connections back online and open again is to restart the router.

    Does anyone have any ideas on what the cause and fix might be?

    Cheers,

    Andy
    andy.hughes@info-trader.co.uk
    07944988702



  • These are the log entries when the connections go down:

    Oct 9 16:50:00 check_reload_status: check_reload_status is starting.
    Oct 9 16:50:39 apinger: ALARM: WAN(62.3.82.19) *** loss ***
    Oct 9 16:50:40 check_reload_status: Rewriting resolv.conf
    Oct 9 16:50:47 check_reload_status: Rewriting resolv.conf
    Oct 9 16:50:49 check_reload_status: reloading filter
    Oct 9 16:50:50 php: : Could not find gateway for interface(wan).
    Oct 9 16:50:50 php: : Could not find gateway for interface(wan).
    Oct 9 16:50:51 apinger: ALARM: WAN(62.3.82.19) *** down ***
    Oct 9 16:51:01 check_reload_status: reloading filter
    Oct 9 16:51:02 php: : Could not find gateway for interface(wan).
    Oct 9 16:51:02 php: : Could not find gateway for interface(wan).



  • How many concurrent connections are used to send the email? How big are the emails? Do they all go to the one server?

    Have you talked with your ISP to see if anything happens to their equipment when you do a bulk mailout?

    Have you talked with your email service provider to see if anything happens to their equipment when you do a bulk mailout?

    Depending on how your bulk mailout happens, you may be placing extraordinary demands on some of the equipment upstream of your mailer system. If your mailer opens hundreds (or more) concurrent TCP connections and the mail message is "large enough" then there may not be enough available bandwidth for apinger to verify that your upstream gateway is still operational because data for all the email connections is queued up in front of apinger's data.
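
    To put rough numbers on that queueing effect, here is a back-of-envelope sketch in Python. The 800 kbit/s uplink, the 100 concurrent connections and the 64 KiB queued per connection are all assumed figures for illustration, not values reported in this thread, and the simple first-in-first-out queue is a simplification of what really happens.

```python
def queue_delay_seconds(queued_bytes, uplink_bits_per_second):
    """Time a newly queued packet waits behind `queued_bytes` on a FIFO uplink."""
    return (queued_bytes * 8) / uplink_bits_per_second


# Assumed, illustrative figures (not taken from the thread):
connections = 100                   # concurrent SMTP connections
queued_per_connection = 64 * 1024   # bytes waiting per connection
uplink = 800_000                    # ADSL upload speed in bits per second

delay = queue_delay_seconds(connections * queued_per_connection, uplink)
print(f"A ping queued behind that data waits about {delay:.1f} s")
```

    With those assumptions a probe would sit in the queue for over a minute, so a gateway monitor could plausibly treat it as lost even though the link itself is still up.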

    @senate014:

    The only way to get the 2x ADSL connections back online and open again is to restart the router.

    What else have you tried? Disable/enable the two "WAN" interfaces on pfSense?



  • Hi Wallabybob,

    Thanks for responding!  :) The mail out goes to around 4000 contacts from a contact database (ACT!). Each email varies in size from 80k to 600k, which is obviously a lot when it all adds up. I've been through nearly every scenario, e.g. ISP issues, Exchange Server issues etc., but then I keep remembering that pfSense 1.2 never skipped a beat using the same ISP and setup.

    Any ideas?  ???



  • @senate014:

    but then I keep remembering that pfSense 1.2 never skipped a beat using the same ISP and setup.

    pfSense 2.0 has a different FreeBSD kernel from previous versions and other mechanisms may have changed as well. It's not always useful to extrapolate from the behaviour of previous versions.

    @senate014:

    Any ideas?  ???

    See my previous post for some ideas. Additional suggestions follow:

    Have a look at the RRD graphs at the time of the mailouts. Maybe there is a spike then in the usage of some resource, even to exhaustion. pfSense will probably buffer some TCP data for each connection. If all available memory is used for that buffering then apinger may not be able to do its job.

    If the mailout application has a way of reducing the maximum number of concurrent connections or otherwise restricting its bandwidth demands, invoke that mechanism.
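
    If the mailer has no such setting but the sending step can be scripted, one common way to cap bandwidth is a token bucket. The following is only a sketch of that general technique in Python; `TokenBucket`, `send_throttled` and the `send` callback are hypothetical names for illustration and not part of pfSense, Exchange or ACT!.

```python
import time


class TokenBucket:
    """Allow `rate_bytes_per_s` on average, with bursts up to `capacity_bytes`."""

    def __init__(self, rate_bytes_per_s, capacity_bytes, clock=time.monotonic):
        self.rate = float(rate_bytes_per_s)
        self.capacity = float(capacity_bytes)
        self.tokens = float(capacity_bytes)  # start with a full bucket
        self.clock = clock
        self.last = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last call, up to capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def wait_time(self, nbytes):
        """Reserve `nbytes` and return how many seconds to wait before sending."""
        self._refill()
        self.tokens -= nbytes
        if self.tokens >= 0:
            return 0.0
        # The bucket is in debt; wait until the refill pays it back.
        return -self.tokens / self.rate


def send_throttled(messages, send, bucket, sleep=time.sleep):
    """Call the (hypothetical) `send` for each message, pausing to honour the bucket."""
    for msg in messages:
        sleep(bucket.wait_time(len(msg)))
        send(msg)
```

    A bucket created with, say, `TokenBucket(rate_bytes_per_s=50_000, capacity_bytes=600_000)` would hold the mailout to roughly 400 kbit/s on average. Both numbers are assumptions and would need to be chosen to leave headroom below the real upload speed.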

    Use traffic shaping to reserve some bandwidth for apinger OR to limit the bandwidth used by the mailout. If I recall correctly there is a HOWTO for the pfSense 2.0 traffic shaper in this forum. I haven't used the traffic shaper. It is apparently different from that in previous versions and a number of people seem to have found it hard to configure so you might not want to take this step unless you are confident about what you are doing.

