Troubleshooting help needed
-
The box works fine for 8-10 days, and then at a certain point I can no longer reach my WAN. No ping, no SSH, no Internet via LAN - nothing, but the box is alive. After a cold restart everything is OK for another 8-10 days, and so on…
How are you sure the box is alive? Can someone at the remote site test it via the LAN? See if the web GUI is available on the LAN, or whether the box at least responds to pings on the LAN NIC.
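A minimal sketch of that LAN-side check, assuming someone on site can run Python on a machine plugged into the LAN. The function name `is_reachable` is made up for illustration, and the address and port in the usage note are placeholders, not values from this thread.

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises an OSError subclass on refusal or timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `is_reachable("192.168.1.1", 443)` would probe the web GUI if the firewall's LAN address were 192.168.1.1 and the GUI listened on HTTPS; substitute the real values. A False result while the power LED is on is consistent with a hang, though it does not prove one.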
If the WAN goes down, there should be something logged about that when it goes down. Can you explain more about the logs, i.e. after the WAN disappears, is normal activity still logged? Or is there a complete gap, with nothing logged until the cold restart? The latter would point to a hardware problem, a complete lock-up.
You might consider running memtest for a few days, and if that passes, running a CPU loading program for a few days.
PS - Explain your power setup please. Is it really a DC-DC converter? 12V * 5A = 60W, not 120W.
-
Are you able to communicate with whatever device is connected to the WAN port of the pfSense machine, and verify that the interface is functional? Is there a chance that your Internet connection is PPPoE or similar, and that restarting the box is simply logging it back in?
-
Hi charliem,
Thank you for your reply!
My box is located in my home country, in my mother's apartment. When the problem occurred I assumed the electricity had gone out, but I called my mother, she checked, and the box was powered on but not responding. The next time the problem appears, I'll ask her to try logging in to the box via the web interface.
Regarding the logging: there is a complete gap until the cold restart takes place. That's why I'm looking for help, because there is nothing logged to give me a clue.
Regarding the power: the DC/DC converter circuit is located in the case. It's DC/DC because there is an AC/DC adapter (laptop type) connected to the box. That's how the case is designed; I've never changed anything.
Thank you for your reply as well. My ISP is providing my internet connection via PPPoE, but they are quite reliable, never had any problems in the past.
Regards,
Nick
-
Regarding the logging: there is a complete gap until the cold restart takes place. That's why I'm looking for help, because there is nothing logged to give me a clue.
If no normal stuff is logged until restart, then it's almost certain the board is locked up.
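One way to check for such a gap is to scan a saved copy of the system log for the longest silence between consecutive entries. The sketch below assumes classic BSD-style syslog lines that start with a timestamp such as `Mar  3 14:22:01`. That format, the function name `largest_gap`, and the sample lines are all assumptions for illustration, and other log formats would need a different parser.

```python
from datetime import datetime, timedelta


def largest_gap(lines, year):
    """Return (gap, before, after) for the longest silence between log lines.

    Syslog timestamps carry no year, so the caller supplies one.
    """
    stamps = []
    for line in lines:
        try:
            stamps.append(
                datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
            )
        except ValueError:
            continue  # skip lines that do not start with a timestamp
    best = (timedelta(0), None, None)
    for before, after in zip(stamps, stamps[1:]):
        if after - before > best[0]:
            best = (after - before, before, after)
    return best


# Hypothetical log excerpt, not taken from the machine in this thread.
sample = [
    "Mar  3 14:22:01 fw kernel: example entry",
    "Mar  3 14:25:01 fw cron: example entry",
    "Mar  4 09:00:00 fw kernel: example boot message",
]
gap, before, after = largest_gap(sample, year=2015)
print(f"longest silence: {gap} (from {before} to {after})")
```

A silence that ends exactly at the boot messages, with routine periodic entries missing in between, matches the lock-up pattern described above.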
Regarding the power: the DC/DC converter circuit is located in the case. It's DC/DC because there is an AC/DC adapter (laptop type) connected to the box. That's how the case is designed; I've never changed anything.
I'd be suspicious of your DC-DC power supply, and/or heat buildup in the case. I have no experience with DC -> ATX adapters. Can you run it temporarily with a standard ATX power supply? Also, perhaps try running it with the case open?
-
Hi charliem,
I can confirm that there is a log gap after the system hangs.
Regarding the DC-DC converter, I have installed a 12 cm fan in the box and activated the thermal monitoring function in pfSense. All temperatures are around 36 degrees, so I'm guessing it's not a thermal issue.
The question is: How can I troubleshoot my problem remotely via SSH or WebGUI, since I'm away from my box?
Thanks for all the attention!
Regards,
Nikolay
-
I can confirm that there is a log gap after the system hangs.
…
The question is: How can I troubleshoot my problem remotely via SSH or WebGUI, since I'm away from my box?
Well, I'm not sure what you expect to monitor or troubleshoot remotely. We've already established that it's probably a hardware issue, one that locks the machine suddenly and stops any further activity.
When I'm faced with a hardware problem, I start replacing hardware, and I'd start with your DC-DC converter and your external 12V power supply. As you say, you've been dealing with this since February, and it's not a configuration issue.
-
Thanks charliem, as soon as I get home I'll start hardware troubleshooting.
BTW, the RAM module I'm using is not listed by my motherboard manufacturer, but it's the same brand (Kingston) and I think it's even better. Could that be the root cause of my problem?
I'm using: KHX1600C9D3B1K2 - only one 2 GB stick. It runs at 1600 MHz, and my motherboard does support 1600 MHz.
The manufacturer has listed: KVR1333D3N9/4G, which runs at 1333 MHz.
The brand is the same, so I'm guessing the quality of the modules will be the same.
Thanks.
Regards,
Nick
-
I'd also try with a new PSU.
My current pfSense has also been working fine with a DC>ATX converter for a year now, but I had problems in the past with a different DC>ATX PSU model. You should just grab a PSU from an old PC for testing.
-
Dear all,
It's been a while since my last post, but I wanted to make sure I've solved the problem.
The system hang was caused by a faulty PSU. Thanks a lot for all your suggestions.
I've replaced the AC-DC power brick with an FSP 12V 12.5A 150W unit and the DC-DC PSU with a MiniBox 12V 160W picoPSU-160-XT.
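As a quick sanity check on the ratings quoted in this thread, power is voltage times current. The helper name `watts` is only illustrative, and the snippet says nothing about what the board actually draws, which was never measured here.

```python
def watts(volts: float, amps: float) -> float:
    """DC power in watts from voltage and current."""
    return volts * amps


old_figures = watts(12, 5)     # the 12V * 5A questioned earlier in the thread
new_brick = watts(12, 12.5)    # the FSP brick's 12V 12.5A rating
PICO_RATING_W = 160            # picoPSU-160-XT nominal rating from the post

print(f"12V * 5A    = {old_figures:g} W")
print(f"12V * 12.5A = {new_brick:g} W")
# The lower of the two ratings bounds what the pair can deliver.
print(f"limiting rating: {min(new_brick, PICO_RATING_W):g} W")
```

On these numbers the new brick's rating matches its label, and the brick rather than the picoPSU is the lower-rated part of the pair.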
Now the system is quite stable.
Good luck to all of you and all the best,
Nick
-
Now the system is quite stable.
If not, it might be worth setting up an APC UPS in front of the pfSense box.