Bizarre Webserver Blockage
-
I'm asking here only because the issue is cleared by rebooting our pfSense router (2.3.4-RELEASE-p1 (i386) on a Dell PE 1850). Please forgive the length.
On March 15, 2018 our Internet service went down because of an ISP equipment failure at the entrance to our office park. When it came back up all was well.
Note that nothing was changed on any of our servers or infrastructure in our office.
Then, starting at 10:30 pm, I began getting alerts 4-6 times a day that my web sites were not responding. I eventually determined that rebooting the pfSense router would temporarily fix the problem.
The bizarre part is that while the websites weren't reachable from the Internet, I could VPN into our network and connect to the router to reboot.
The ISP replaced our modem "just because" and found a bad connection in their line coming into the office. That didn't fix anything. I bought a new 64-bit computer to replace the pfSense router, and nothing changed. The ISP has come back several times, each time finding something else wrong in their cables or equipment between here and their office, but the problem persists.
The only change is that since April 8 the outages only occur on the half hour (but not the same half hour each time, e.g. 7:30 pm, 1:30 pm, 3:30 pm, 4:40 am, 8:30 pm), and usually only twice a day. On May 1-3 we had a glorious 50 hours between outages, but since yesterday it has gone out 3 times.
The ISP has no clue, and I can find nothing in the router or web server logs that corresponds to these times (except the log entries from my reboots).
My reason to keep involving the ISP is that for 10+ years (pfSense since 2009) we never had this issue until after their outage on March 15. If it is something in our equipment, that is a really big coincidence, and one that is leaving no clue.
Also note that I haven't rebooted the web server to correct the issue, just the router.
Does anybody have any clue as to what is causing this issue?
Any suggestions as to what to look for in the logs, and where?
My best idea so far is to accelerate moving all our web services to Amazon. One site down and 5 to go… Thanks for whatever help you can give.
Dave
-
Many details are left out. I'd guess the web servers are on private addresses that are NATed to public addresses on the firewall. Is the firewall on the same block as the servers? What kind of VIPs are you using? My first thought is that the ISP has a configuration error and is advertising your block somewhere else. Try running a traceroute while the servers are unreachable and see how far the traffic gets. Another thought: are you using CARP while the ISP is using VRRP?
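To make the traceroute suggestion repeatable, here is a minimal Python sketch that runs traceroute from an outside host and reports the last hop that answered. It assumes a Unix-style `traceroute` binary is on the PATH (Windows uses `tracert` with different flags), and the function names `run_traceroute` and `last_responding_hop` are made up for illustration. Comparing the last responding hop during an outage with the one seen while things work should show whether traffic dies inside the ISP or at your own edge.

```python
import ipaddress
import subprocess
from typing import Optional, Tuple


def run_traceroute(host: str, max_hops: int = 20) -> str:
    """Run a numeric traceroute and return its raw text output.

    Assumes a Unix-style traceroute: -n skips DNS lookups, -m caps the
    hop count, -w sets the per-probe wait in seconds.
    """
    result = subprocess.run(
        ["traceroute", "-n", "-w", "2", "-m", str(max_hops), host],
        capture_output=True,
        text=True,
        timeout=180,
        check=False,
    )
    return result.stdout


def last_responding_hop(traceroute_output: str) -> Optional[Tuple[int, str]]:
    """Return (hop_number, address) of the last hop that replied, or None."""
    last = None
    for line in traceroute_output.splitlines():
        parts = line.split()
        # Hop lines start with the hop number; the header line does not.
        if len(parts) < 2 or not parts[0].isdigit():
            continue
        # Take the first token on the line that parses as an IP address.
        # A hop that shows only "* * *" has none and is skipped.
        for token in parts[1:]:
            try:
                ipaddress.ip_address(token)
            except ValueError:
                continue
            last = (int(parts[0]), token)
            break
    return last
```

For example, `last_responding_hop(run_traceroute("203.0.113.10"))` would give the furthest hop reached, where `203.0.113.10` is a placeholder for the public address of the web server. Run it once while the sites are up to get a baseline, then again during an outage.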
-
I am guessing you have multiple IPs and you're having a problem with the VIP you're forwarding.
I would allow ICMP on your VIP. Can you ping the VIP when your website goes down? You say you can get into pfSense while the problem is happening; my guess is that's via a different IP. If so, run a packet capture to see whether traffic is hitting pfSense for the IP and port you're trying to forward.
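One way to test the "different IP" guess is to probe both addresses at the same moment from a machine outside the network. The sketch below is only an illustration: `tcp_reachable` and `compare_targets` are hypothetical helper names, and the addresses and ports in the usage note are placeholders, not values from this thread.

```python
import socket
from typing import Dict, Tuple


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


def compare_targets(
    targets: Dict[str, Tuple[str, int]], timeout: float = 3.0
) -> Dict[str, bool]:
    """Probe each named (host, port) pair and report which ones answer."""
    return {
        name: tcp_reachable(host, port, timeout)
        for name, (host, port) in targets.items()
    }
```

A call such as `compare_targets({"web_vip": ("203.0.113.10", 80), "vpn_ip": ("203.0.113.2", 443)})` uses placeholder addresses for the forwarded VIP and the address the VPN connects to. If the VPN address answers during an outage while the VIP does not, the problem is specific to that VIP, and a packet capture on the pfSense WAN interface is the next step.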