Netgate Discussion Forum

    Inbound WAN traffic stops every three hours

      cfitz

      For the past several days we've had a problem where inbound traffic stops on the WAN interface every three hours. When it happens, I disable the interface and then re-enable it, which makes everything fine for three more hours. There is only an Exchange server with OWA behind this interface, so the traffic should be coming in on ports 443 and 25. We have had this setup for years with no problems. When this started, I upgraded from 2.0.1 to 2.0.3. Other than that, no changes have been made. Any ideas?

      This is on a Dell PowerEdge 1850 with an Intel PRO/1000MT NIC. pfSense is running in VMware ESXi 4.1.0.

        craigduff

        Very interesting. I remember this issue happening to me, and I seem to remember changing the Cat 5 cable. Have you also rebooted ESX? You could also try installing or upgrading Open VM Tools from the packages.

        Kind Regards,
        Craig

          cfitz

          Good ideas, thanks! I'm going to change the cables out and see what happens at 4:50 CST (my next estimated outage). If there is still an issue, I'll take down the ESXi box and install the tools.

            craigduff

            Well, I believe you could install Open VM Tools whilst it's live, but that's your call. Why not upgrade ESX as well? 5.1 is amazing, by the way!

            Kind Regards,
            Craig

              cfitz

              Since yesterday, I've changed the cables, installed VM tools and rebooted the ESX server. Still no luck. I called our ISP (Comcast Business) and baffled them. They suggested changing the IP, which also means changing our MX record. At least it's the weekend again.

                craigduff

                Just a thought: could your state table be getting overloaded by the amount of data it's sending and receiving? That would be related to NAT.

                Kind Regards,
                Craig

                  cfitz

                  Spent most of the day trying things with this and just got to the end of the three-hour window again, with unfortunate results. I did get a look at the state table size while it was occurring, and it says 464/98000. I also tried resetting the state table, but it didn't help. Had to reset the interface again to get it going.

                  One of the things I did today was build a new VM from a 2.0.3 OVA. I exported the settings and then imported them into the new VM. Did some tweaking to get it working smoothly, but I still have the same three-hour outage. Also switched ports on the Comcast modem.

                  Next step is assigning a new IP to the interface and changing my MX record.

                    craigduff

                    I don't think it should come to that. Nothing to do with hardware, you think? It's so odd!

                    Kind Regards,
                    Craig

                      craigduff

                      What about drivers? Do you have any USB devices attached?

                      Kind Regards,
                      Craig

                        wallabybob

                        Please post an extract from the system log from a couple of minutes before the WAN link goes down to about 5 minutes after. Most recent entries in the system log can be found in Status -> System Logs. The complete system log can be displayed by the pfSense shell command
                        ```
                        clog /var/log/system.log
                        ```

                        Please also post the output of the pfSense shell command
                        ```
                        /etc/rc.banner
                        ```
                        to show what sort of NICs you are using and their configuration.
                          cfitz

                          I went ahead and changed my MX record and moved to a different IP. No change in results. There are no USB devices. It's really a pretty basic setup. A single ESXi server with two Intel dual port NICs. pfSense is the only VM on the box. Below is the rc.banner result:

                          *** Welcome to pfSense 2.0.3-RELEASE-pfSense (i386) on pfsense ***

                            LAN (lan)                -> em0        -> 192.168.1.100
                            WAN (wan)                -> em1        -> 74.XX.XX.117
                            OPT1 (opt1)              -> em2        -> 74.XX.XX.115
                            OPT2 (opt2)              -> em3        -> 74.XX.XX.116

                          WAN is used like a typical WAN and has no issues. OPT2 is not used. OPT1 is used only for Exchange mail and is the one I'm having problems with.

                          The last outage was at about 20:43:00. Here is the system log from around that time. (Note: 192.168.1.84 is a server unrelated to Exchange.)

                          May 31 20:30:40 pfsense kernel: arp: 192.168.1.84 moved from 00:19:b9:f9:b7:2a to 00:19:b9:f9:b7:29 on em0
                          May 31 20:30:40 pfsense kernel: arp: 192.168.1.84 moved from 00:19:b9:f9:b7:29 to 00:19:b9:f9:b7:2a on em0
                          May 31 20:36:09 pfsense kernel: arp: 192.168.1.84 moved from 00:19:b9:f9:b7:2a to 00:19:b9:f9:b7:29 on em0
                          May 31 20:36:09 pfsense kernel: arp: 192.168.1.84 moved from 00:19:b9:f9:b7:29 to 00:19:b9:f9:b7:2a on em0
                          May 31 20:41:37 pfsense kernel: arp: 192.168.1.84 moved from 00:19:b9:f9:b7:2a to 00:19:b9:f9:b7:29 on em0
                          May 31 20:41:37 pfsense kernel: arp: 192.168.1.84 moved from 00:19:b9:f9:b7:29 to 00:19:b9:f9:b7:2a on em0
                          May 31 20:46:56 pfsense kernel: arp: 192.168.1.84 moved from 00:19:b9:f9:b7:2a to 00:19:b9:f9:b7:29 on em0
                          May 31 20:46:56 pfsense kernel: arp: 192.168.1.84 moved from 00:19:b9:f9:b7:29 to 00:19:b9:f9:b7:2a on em0
                          May 31 20:48:09 pfsense kernel: arp: 192.168.1.84 moved from 00:19:b9:f9:b7:2a to 00:19:b9:f9:b7:29 on em0
                          May 31 20:48:09 pfsense kernel: arp: 192.168.1.84 moved from 00:19:b9:f9:b7:29 to 00:19:b9:f9:b7:2a on em0
                          May 31 20:53:37 pfsense kernel: arp: 192.168.1.84 moved from 00:19:b9:f9:b7:2a to 00:19:b9:f9:b7:29 on em0
                          May 31 20:53:37 pfsense kernel: arp: 192.168.1.84 moved from 00:19:b9:f9:b7:29 to 00:19:b9:f9:b7:2a on em0
                          May 31 21:00:09 pfsense kernel: arp: 192.168.1.84 moved from 00:19:b9:f9:b7:2a to 00:19:b9:f9:b7:29 on em0
                          May 31 21:00:09 pfsense kernel: arp: 192.168.1.84 moved from 00:19:b9:f9:b7:29 to 00:19:b9:f9:b7:2a on em0

                          Thanks for the help!

                            wallabybob

                            @cfitz:

                            Below is the rc.banner result:

                            *** Welcome to pfSense 2.0.3-RELEASE-pfSense (i386) on pfsense ***

                              LAN (lan)                -> em0        -> 192.168.1.100
                              WAN (wan)                -> em1        -> 74.XX.XX.117
                              OPT1 (opt1)              -> em2        -> 74.XX.XX.115
                              OPT2 (opt2)              -> em3        -> 74.XX.XX.116

                            WAN, OPT1 and OPT2 are on the same subnet? bridged?

                            @cfitz:

                            The last outage was at about 20:43:00. Here is the system log from around that time. (Note 192.168.1.84 is an unrelated server to Exchange) -

                            May 31 20:30:40 pfsense kernel: arp: 192.168.1.84 moved from 00:19:b9:f9:b7:2a to 00:19:b9:f9:b7:29 on em0
                            May 31 20:30:40 pfsense kernel: arp: 192.168.1.84 moved from 00:19:b9:f9:b7:29 to 00:19:b9:f9:b7:2a on em0

                            So what is going on causing 192.168.1.84 to wander from one interface to another then back again?

                            This MIGHT be related to your problem.
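
To see how often an address is flapping, the "moved from ... to ..." lines in the system log can be tallied with a short script. The following is only a minimal sketch in Python: it assumes the log has been saved as plain text (for example from the `clog` output) in the format shown in the excerpt above, and the function name `count_arp_flaps` is made up for illustration.

```python
import re
from collections import Counter

# Matches kernel lines in the format shown in this thread, e.g.
#   May 31 20:30:40 pfsense kernel: arp: 192.168.1.84 moved from
#   00:19:b9:f9:b7:2a to 00:19:b9:f9:b7:29 on em0
ARP_MOVE = re.compile(
    r"arp: (?P<ip>\S+) moved from (?P<old>[0-9A-Fa-f:]{17}) "
    r"to (?P<new>[0-9A-Fa-f:]{17}) on (?P<iface>\S+)"
)


def count_arp_flaps(lines):
    """Count ARP moves per (IP, interface, unordered pair of MACs)."""
    flaps = Counter()
    for line in lines:
        match = ARP_MOVE.search(line)
        if match:
            macs = frozenset((match["old"].lower(), match["new"].lower()))
            flaps[(match["ip"], match["iface"], macs)] += 1
    return flaps


# Two lines copied from the log excerpt in this thread.
SAMPLE = [
    "May 31 20:30:40 pfsense kernel: arp: 192.168.1.84 moved from "
    "00:19:b9:f9:b7:2a to 00:19:b9:f9:b7:29 on em0",
    "May 31 20:30:40 pfsense kernel: arp: 192.168.1.84 moved from "
    "00:19:b9:f9:b7:29 to 00:19:b9:f9:b7:2a on em0",
]

if __name__ == "__main__":
    for (ip, iface, macs), moves in count_arp_flaps(SAMPLE).items():
        print(f"{ip} on {iface}: {moves} moves between {' and '.join(sorted(macs))}")
```

An address that keeps bouncing between the same two MACs, as 192.168.1.84 does here, shows up as a single entry with a high count.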

                            You described your problem as "inbound WAN traffic stops every three hours". What evidence led you to that conclusion? Perhaps your "WAN traffic" is getting to its intended destination but the responses are going astray.

                            Have you looked in the VMware logs around the time the traffic stops? Any events reported from the NICs? Maybe your WAN link is going down and VMware is not reporting it to pfSense.

                            Do you have a WAN gateway with monitoring enabled? If so, are the "outages" visible on the appropriate Status -> RRD Graphs, Quality tab?
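
One way to gather that kind of evidence is to probe the forwarded ports from a machine outside the network and log a timestamp for each attempt, so the start and end of every outage are on record. This is only a rough sketch in Python: the host name `mail.example.com` is a placeholder, the function names are made up for illustration, and ports 25 and 443 are taken from the original post.

```python
import socket
import time
from datetime import datetime


def port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def probe_forever(host, ports=(25, 443), interval=60):
    """Print one timestamped status line per interval until interrupted."""
    while True:
        stamp = datetime.now().isoformat(timespec="seconds")
        status = ", ".join(
            f"{port}={'up' if port_open(host, port) else 'DOWN'}" for port in ports
        )
        print(f"{stamp} {host} {status}", flush=True)
        time.sleep(interval)


if __name__ == "__main__":
    # Placeholder host: replace with the public name or IP being tested.
    probe_forever("mail.example.com")
```

This only shows whether a TCP handshake completes from the outside. It does not by itself distinguish between packets never arriving and replies being sent out the wrong way, so a packet capture on the firewall is still worth having alongside it.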

                              cfitz

                              Well, it's gone almost 20 hours without an outage now. I ended up changing the MX record again to point to 74.XX.XX.117, which put all of my traffic on the WAN interface and allowed me to remove the OPT1 and OPT2 interfaces. With it working correctly now, the only explanations I can think of are a problem with that port on the NIC or the possible wandering problem wallabybob mentioned.

                              In regard to wallabybob's questions in the previous post -

                              The interfaces are using three external addresses provided by our ISP. I was using them mainly to sort web traffic to multiple web servers. We consolidated web servers a while back, so that was no longer needed. The only problem I can foresee now with consolidation is that we may one day need to route 443 traffic to multiple servers. Surely there is a way; I just don't know it yet.

                              The three-hour traffic stop was on OPT1. It only received traffic on ports 25 and 443, which were port forwarded directly to an Exchange server. Incoming traffic would stop, but I could still send mail out.

                              I have not been able to find any suspects in the VMware system logs.

                              I was going to grab the monitoring data, but it doesn't appear to hold more than one day's worth of data.

                              Thanks, wallabybob and craigduff. Hopefully some of this will save someone else from nine days of checking in every three hours.

                                craigduff

                                Good luck mate.

                                Kind Regards,
                                Craig

                                Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.