Crash on Print?!
-
Hi,
I realize how crazy this sounds, but I am starting to suspect that communication with my printer is crashing my pfSense firewall.
I'm running 2.3.2-release-p1 on a Soekris net6501. It's a simple home network: a cable modem, a separate WiFi access point, and a bunch of hardwired machines. It is generally extremely stable.
However, I have noticed a very odd pattern. I have a new Brother MFC printer, connected via Ethernet, which I print to from my MacBook over WiFi. Over the past two months, printing to it has had a more than 25% chance of crashing the firewall hard. Specifically, the red error light comes on on the Soekris and the machine is locked up and cannot be debugged over serial. A reboot fixes it, of course, and it only seems to happen RIGHT after I print. I don't print too often, but it is frequent enough that I have noticed.
I am willing to troubleshoot, but not sure where to begin as the GUI logs only show signs of the reboot, and nothing is logged (at least that I can detect) about the origin of the crash.
Any thoughts or ideas on where I should look?
Tom
-
Not enough information, seriously. Start by posting your network topology, what is where and how they are connected to each other. If it happens (which we don't know yet because you seem to think we have a crystal ball) that your printer is in the same network segment as the printing client system the traffic won't even hit pfSense during a print operation.
-
Thanks.
The network is a single contiguous segment.
Cable Modem
|
|--> Soekris net6501 running pfSense (WAN)
     LAN Interface
     |
     |---> Unmanaged GigE Switch
           |              |
     Brother MFC    Netgear R6300 WAP
                          *
                          *
                          ****> MacBook Air
There are a few other devices hanging off the GigE Switch and on the wireless network, but I've never had any issues until switching to the new printer a few months ago.
The firewall and the printer are definitely on the same segment. I realize nothing should be visible from the fw when traffic is moving from the WAP to the printer, but I was wondering if there weren't some strange Apple broadcast packets that were somehow causing the issue.
I realize how ridiculous this sounds and that it may be coincidental or point to another issue. I am certainly open to other troubleshooting suggestions or ideas.
-
There is no output at all over serial before it stops responding?
It's entirely possible it's somehow tripping a hardware issue at the time, but given your diagram, printer traffic should not be hitting the firewall at all anyhow. So unless it's from broadcast or multicast traffic (a loop, perhaps?) that happens when the Mac tries to print, I don't see how it could affect the firewall.
-
My thought was broadcast / multicast as well.
I am getting no crash dump, no error logs, and nothing on serial. The Soekris has the red error light lit and needs a hard reset.
I am just curious whether 1. anyone has seen anything like this with pfSense or with Soekris hardware, and 2. anyone has thoughts on how I might troubleshoot / diagnose further.
Tom
-
I have seen similar issues on Soekris hardware. It's nearly always the hardware.
-
I would look at possible grounding issues and other AC power related issues. If nothing is found, then it would be time to fire up a packet capture during a print operation, using a separate machine that can capture all of the traffic up to the point of the crash.
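Once such a capture exists, a quick first pass is to see how much of it is broadcast or multicast, since that is the only print-related traffic that should reach the firewall. A rough sketch, assuming the capture was saved as a classic pcap file (for example with tcpdump -w) with microsecond timestamps and an Ethernet link type; the function names here are made up for illustration:

```python
import struct

def classify_dst(mac: bytes) -> str:
    """Classify an Ethernet destination address."""
    if mac == b"\xff" * 6:
        return "broadcast"
    # The least significant bit of the first octet marks a group address.
    if mac[0] & 0x01:
        return "multicast"
    return "unicast"

def count_frames(pcap_bytes: bytes) -> dict:
    """Count frames by destination type in a classic pcap byte string.

    Assumes the classic (non-pcapng) format with link type 1 (Ethernet).
    """
    if len(pcap_bytes) < 24:
        raise ValueError("too short to be a pcap file")
    magic = struct.unpack("<I", pcap_bytes[:4])[0]
    if magic == 0xA1B2C3D4:
        endian = "<"   # file written little-endian
    elif magic == 0xD4C3B2A1:
        endian = ">"   # file written big-endian
    else:
        raise ValueError("not a classic microsecond pcap file")
    linktype = struct.unpack(endian + "I", pcap_bytes[20:24])[0]
    if linktype != 1:
        raise ValueError("expected Ethernet link type")

    counts = {"broadcast": 0, "multicast": 0, "unicast": 0}
    offset = 24  # skip the global header
    while offset + 16 <= len(pcap_bytes):
        # Per-packet header: ts_sec, ts_usec, captured length, original length.
        _, _, incl_len, _ = struct.unpack(
            endian + "IIII", pcap_bytes[offset:offset + 16])
        offset += 16
        frame = pcap_bytes[offset:offset + incl_len]
        offset += incl_len
        if len(frame) >= 6:
            counts[classify_dst(frame[:6])] += 1
    return counts
```

Reading the file with open(path, "rb").read() and passing the bytes to count_frames gives the three totals. A sudden jump in the broadcast or multicast count right before the capture ends would support the broadcast theory; a flat count would point back at hardware.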
-
I've been inspired by the board to investigate alternate ideas, particularly based on the feedback about Soekris reliability. I had always considered the devices to be highly reliable, but am now seeing quite a few reported issues, particularly around the thermal package. I have the device in a large closet on a high shelf, which should be OK, but the feedback got me digging. Well, it appears that the stock case and heat sink are NOT up to the job, as the CPU core is currently running around 79C with little / no load.
I now expect the printer IS causing the issue, due to heating the closet, not some strange broadcast packets!!!
Incidentally, does anyone know how to override the Tj Max setting - the coretemp module is unable to read the CPU ID and sets the Tj Max to 100; for the net6501-50 it should be 90.
Tom
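A note on what the Tj Max difference would mean for the 79C figure: Intel's on-die sensor reports how far the core is below Tj Max, and drivers of the coretemp kind turn that into a temperature by subtracting the readout from whatever Tj Max they assume. If that holds for this board (an assumption, not something confirmed in this thread), a reading taken with Tj Max = 100 can be rescaled by hand. The helper name below is made up for illustration:

```python
def rescale_reading(reported_c: int, assumed_tjmax_c: int, actual_tjmax_c: int) -> int:
    """Re-derive a core temperature after correcting the Tj Max value.

    Assumes the driver computed: reported = assumed_tjmax - sensor_delta.
    """
    sensor_delta = assumed_tjmax_c - reported_c   # degrees below Tj Max
    return actual_tjmax_c - sensor_delta

# 79C reported with Tj Max assumed to be 100, corrected to 90:
print(rescale_reading(79, 100, 90))  # 69
```

So under that assumption the core would really be around 69C, but only 21 degrees below its limit either way, and that margin is the number that matters for throttling or shutdown.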