2.3 Lockup with no packages
-
Not sure if it "makes sense" ;) but it does seem a reasonable correlation.
Looks like you have the same driver messages about watchdog timeout happening first, then the rest of the code/system is detecting the link bounce and causing things to get wrapped around the axle.
Yet 2.2.6 works fine, so a place to look is any changes to the em driver between the two.
Is there anything in the pfSense release notes? Maybe one of the pfSense folks knows off the top of their head.
-
The missing table issue has not always been my issue. I just forgot to clean it out on this one. But I think something similar has happened where there is some "issue" and php-fpm just starts hammering the hardware to the point of overload. In my case, overloading the NIC.
-
Apr 22 11:42:28 php-fpm 21408 /rc.linkup: Hotplug event detected for LAN(lan) static IP (192.168.3.179 ) Apr 22 11:42:27 kernel em2: link state changed to UP Apr 22 11:42:27 check_reload_status Linkup starting em2 Apr 22 11:42:25 php-fpm 21408 /rc.linkup: Hotplug event detected for LAN(lan) static IP (192.168.3.179 ) Apr 22 11:42:24 check_reload_status Linkup starting em2 Apr 22 11:42:24 kernel em2: link state changed to DOWN Apr 22 11:42:24 kernel em2: Watchdog timeout -- resetting
Let us talk about this entries here please;
- Hotplug event detected for LAN(lan) static IP (192.168.3.179)
Who is rejecting here the cable or is it broken or loose it only the connection? Or did anything is not sitting
or fitting right in the plug or jack? - em2: link state changed to DOWN
Ok if your WAN port is set to the status Down it could be that the watchdog timer is the timing out
and close that link or try it to renew or restart it again. - (192.168.3.179)
Is it normal to have a private IP at the WAN port or is that the LAN port who is creeping down here?
According to the WatchDog timer I was meaning this one talked about here:
WtachDog Timer threadHere is another em0 thread that is similiar to yours here, or am I wrong now?
em0: thread - Hotplug event detected for LAN(lan) static IP (192.168.3.179)
-
Frank,
The "hotplug" event is the result from "watchdog timeout – resetting". The watchdog timeout message is coming from the device driver, because the code thinks the Transmit processing is hung. When it detects that error condition, it does a reset on the device, which looks like someone unplugged and replugged the cable (the timestamps are in reverse order; it got me too for a bit).Good find on the em0 thread; that has potential for workaround.
OP, if you have an extra switch, could you try putting it inbetween your pfSense box and your LAN connection? This would be a workaround, just to get data. Or add a switch between the WAN and whatever it connects to.
The watchdog thread you found, that is a hardware watchdog, different than what is happening here. A hardware watchdog is more general use: if a process starts taking 100% CPU, the hardware watchdog can't be acessed in time, so the whole box will reset (and likely provide a core file). Similar idea, but not directly related to the OP issue.
-
What if I just turn off the watchdog in the BIOS?
Also I'm not following what adding a switch would do? I don't thin my next hop is having a failure since I have multiple pfSense boxes connected to various hardware. The odds of them all failing is is very slight.
-
The watchdog that is from the em driver is a software watchdog, not a hardware watchdog and has nothing to do with the BIOS. The code has a timer that runs, and periodically checks the status of the hardware. The error message means the software thinks the transmit side is stuck; outgoing packets are queued up, but the queue is not getting processed, so it backed up. One side of the queue is the software sending a packet, the kernel checks to see if there is room on the queue, puts the packet on the queue if there is or drops it if there is not. The other side of the queue is the NIC hardware: that pulls items off the queue and puts them on the wire. After a packet is on the wire the hardware typically generates an interrupt, the driver code marks that place in the queue as freed and life is good. If the hardware does not pull packets off the queue, the queue will back up and the transmit side grinds to a halt. A typical reason that the hardware won't pull packets from the queue is because it thinks the link is down. Lots of stuff happens automatically in the hardware for you, at the phy level on a NIC.
That is what we have happening; that is the reason why the driver is doing the "watchdog timeout – resetting" to clear the hardware.
Now this may be a false failure: something else could be happening that is preventing the hardware from generating the interrupt or the driver code from processing it.
Adding a switch on either or both sides: it gives a data point. The pfSense interface plugged into a switch instead of where it is now should give you confidence that the link is not physcially bouncing on you. Think of a typical configuration: cable modem to WAN of the pfSense box, pfSense LAN to a swtich or wireless router. If the ISP does something that causes your cable modem to bounce the WAN link, you won't realize it, but the pfSense box will see it.
Again, since I don't know your configuration or the hardware either side of the pfSense box is hooked to, I'm trying to make educated guesses, based on what I've done/seen in the past. It's possible that something changed in the driver between 2.2.6 and 2.3 (2.2.6 is FreeBSD 10.1, 2.3 is FBSD 10.3) that is triggering this, but I don't know.
If you can, get the output of
ifconfig -aSome of the things you can find by googling "freebsd em driver watchdog timeout" talk about disabling things like TSO. ifconfig will let us know what options are available/enabled on your hardware, then it's just figuring out how to turn them off in pfSense (probably on the interface configuration page). TSO is a likely candidate for causing problems.
-
@mer:
The watchdog that is from the em driver is a software watchdog, not a hardware watchdog and has nothing to do with the BIOS.
Yes, Frank's confusing this with something completely different and entirely unrelated.
This is an apparent em driver regression between FreeBSD 10.1 and 10.3. Or maybe an OS-level change related to it. One thing I'm curious about is whether having netmap enabled is the source of an edge case issue with some em cards. That's maybe the most significant difference in our kernel and stock FreeBSD 10.3 in that regard, so it's worth a shot. I compiled a kernel minus netmap. Not something I want to publicly link at this point, but SoloIT I PMed you a link and instructions if you can please try that.
-
Here is the ifconfig if it helps:
em0: flags=8843 <up,broadcast,running,simplex,multicast>metric 0 mtu 1500 options=209b <rxcsum,txcsum,vlan_mtu,vlan_hwtagging,vlan_hwcsum,wol_magic>ether 00:30:18:xx:xx:xx inet6 fe80::xxx:xxx:xxx:xxx%em0 prefixlen 64 scopeid 0x3 inet xxx.xxx.xxx.xxx netmask 0xffffff00 broadcast xxx.xxx.xxx.xxx nd6 options=21 <performnud,auto_linklocal>media: Ethernet autoselect (1000baseT <full-duplex>) status: active em1: flags=8843 <up,broadcast,running,simplex,multicast>metric 0 mtu 1500 options=209b <rxcsum,txcsum,vlan_mtu,vlan_hwtagging,vlan_hwcsum,wol_magic>ether 00:30:18:xx:xx:xx inet6 fe80::xxx:xxx:xxx:xxx%em1 prefixlen 64 scopeid 0x4 inet xxx.xxx.xxx.xxx netmask 0xfffffff0 broadcast xxx.xxx.xxx.xxx nd6 options=21 <performnud,auto_linklocal>media: Ethernet autoselect (1000baseT <full-duplex>) status: active em2: flags=8843 <up,broadcast,running,simplex,multicast>metric 0 mtu 1500 options=209b <rxcsum,txcsum,vlan_mtu,vlan_hwtagging,vlan_hwcsum,wol_magic>ether 00:30:18:xx:xx:xx inet6 fe80::xxx:xxx:xxx:xxx%em2 prefixlen 64 scopeid 0x5 inet xxx.xxx.xxx.xxx netmask 0xffffff00 broadcast xxx.xxx.xxx.xxx nd6 options=21 <performnud,auto_linklocal>media: Ethernet autoselect (1000baseT <full-duplex>) status: active</full-duplex></performnud,auto_linklocal></rxcsum,txcsum,vlan_mtu,vlan_hwtagging,vlan_hwcsum,wol_magic></up,broadcast,running,simplex,multicast></full-duplex></performnud,auto_linklocal></rxcsum,txcsum,vlan_mtu,vlan_hwtagging,vlan_hwcsum,wol_magic></up,broadcast,running,simplex,multicast></full-duplex></performnud,auto_linklocal></rxcsum,txcsum,vlan_mtu,vlan_hwtagging,vlan_hwcsum,wol_magic></up,broadcast,running,simplex,multicast>
Sorry for the delay.
-
FYI, I have another piece of hardware that was suffering the problem much worse using igb interface.
-
FYI, I have another piece of hardware that was suffering the problem much worse using igb interface.
I'd take cmb up on the offer of the kernel he built without the NETMAP option; it should have removed the NETMAP option completely from the kernel. The em and igb drivers should be from the same code base so it is possible they share a problem.
-
Firewall box with em NIC and modified kernel is up and running. It's seems to be routing traffic as normal. I should know by morning how it's holding up. In the past it has crashed during my remote back which moves a lot of data.
I will keep everyone posted. Again thank you all for your help.
-
13.5 hours and no crashes. I did have odd stuff in my system.log. Any ideas?
Apr 26 06:54:52 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:52 [error] 27529#0: *371 open() "/usr/local/www/redir/cgi-bin/ajaxmail" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /redir/cgi-bin/ajaxmail HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:52 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:52 [error] 27529#0: *370 open() "/usr/local/www/fcgi-bin/performance.fcgi" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /fcgi-bin/performance.fcgi HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:52 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:52 [error] 27529#0: *369 open() "/usr/local/www/fcgi-bin/dispatch.fcgi" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /fcgi-bin/dispatch.fcgi HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:52 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:52 [error] 27529#0: *368 open() "/usr/local/www/das/cgi-bin/session.cgi" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /das/cgi-bin/session.cgi HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:52 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:52 [error] 27529#0: *367 open() "/usr/local/www/wingame.pl" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /wingame.pl HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:52 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:52 [error] 27529#0: *366 open() "/usr/local/www/webscr" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /webscr HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:52 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:52 [error] 27529#0: *365 open() "/usr/local/www/webproc" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /webproc HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:52 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:52 [error] 27529#0: *364 open() "/usr/local/www/verify.cgi" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /verify.cgi HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:52 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:52 [error] 27529#0: *363 open() "/usr/local/www/traffic/process.fcgi" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /traffic/process.fcgi HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:52 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:52 [error] 27529#0: *362 open() "/usr/local/www/top/out" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /top/out HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:52 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:52 [error] 27529#0: *361 open() "/usr/local/www/tjcgi1" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /tjcgi1 HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:52 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:52 [error] 27529#0: *360 open() "/usr/local/www/te/o.cgi" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /te/o.cgi HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27529#0: *359 open() "/usr/local/www/start" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /start HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27529#0: *358 open() "/usr/local/www/sse.dll" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /sse.dll HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27529#0: *357 open() "/usr/local/www/spcnweb" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /spcnweb HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27529#0: *356 open() "/usr/local/www/search.cgi" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /search.cgi HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27529#0: *355 open() "/usr/local/www/rshop.pl" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /rshop.pl HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27529#0: *354 open() "/usr/local/www/readmsg" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /readmsg HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27529#0: *353 open() "/usr/local/www/rbaccess/rbunxcgi" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /rbaccess/rbunxcgi HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27529#0: *352 open() "/usr/local/www/rbaccess/rbcgi3m01" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /rbaccess/rbcgi3m01 HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27529#0: *351 open() "/usr/local/www/passremind" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /passremind HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27529#0: *350 open() "/usr/local/www/out.cgi" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /out.cgi HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27529#0: *349 open() "/usr/local/www/openwebmail/openwebmail-main.pl" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /openwebmail/openwebmail-main.pl HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27529#0: *348 open() "/usr/local/www/navega" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /navega HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27529#0: *347 open() "/usr/local/www/msglist" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /msglist HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27529#0: *346 open() "/usr/local/www/mainsrch" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /mainsrch HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27529#0: *345 open() "/usr/local/www/mainmenu.cgi" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /mainmenu.cgi HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27529#0: *344 open() "/usr/local/www/logout" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /logout HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27529#0: *343 open() "/usr/local/www/logout" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /logout HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27529#0: *342 open() "/usr/local/www/login" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /login HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27529#0: *341 open() "/usr/local/www/login.cgi" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /login.cgi HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27529#0: *340 open() "/usr/local/www/link" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /link HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27254#0: *339 open() "/usr/local/www/krcgistart" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /krcgistart HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27254#0: *338 open() "/usr/local/www/krcgi" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /krcgi HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27254#0: *337 open() "/usr/local/www/index" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /index HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27254#0: *336 open() "/usr/local/www/index.cgi" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /index.cgi HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27254#0: *335 open() "/usr/local/www/ib/301_start.pl" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /ib/301_start.pl HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27254#0: *334 open() "/usr/local/www/hslogin.cgi" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /hslogin.cgi HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27254#0: *333 open() "/usr/local/www/hotspotlogin.cgi" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /hotspotlogin.cgi HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27254#0: *332 open() "/usr/local/www/getattach" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /getattach HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27254#0: *331 open() "/usr/local/www/frame_html" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /frame_html HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27254#0: *330 open() "/usr/local/www/findweather/hdfForecast" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /findweather/hdfForecast HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27254#0: *329 open() "/usr/local/www/findweather/getForecast" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /findweather/getForecast HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27254#0: *328 open() "/usr/local/www/fg.cgi" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /fg.cgi HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27254#0: *327 open() "/usr/local/www/crtr/out.cgi" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /crtr/out.cgi HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27254#0: *326 open() "/usr/local/www/clicks.cgi" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /clicks.cgi HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27254#0: *325 open() "/usr/local/www/click.cgi" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /click.cgi HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27254#0: *324 open() "/usr/local/www/br5.cgi" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /br5.cgi HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27254#0: *323 open() "/usr/local/www/bp_revision.cgi" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /bp_revision.cgi HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27254#0: *322 open() "/usr/local/www/bbs/postshow.pl" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /bbs/postshow.pl HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27254#0: *321 open() "/usr/local/www/bbs/postlist.pl" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /bbs/postlist.pl HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27254#0: *320 open() "/usr/local/www/auth" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /auth HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27254#0: *319 open() "/usr/local/www/atx/out.cgi" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /atx/out.cgi HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27254#0: *318 open() "/usr/local/www/atc/out.cgi" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /atc/out.cgi HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27254#0: *317 open() "/usr/local/www/at3/out.cgi" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /at3/out.cgi HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27254#0: *316 open() "/usr/local/www/arr/index.shtml" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /arr/index.shtml HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27254#0: *315 open() "/usr/local/www/ajaxmail" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /ajaxmail HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27529#0: *314 open() "/usr/local/www/a2/out.cgi" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /a2/out.cgi HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27529#0: *313 open() "/usr/local/www/cgi-bin/webproc" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /cgi-bin/webproc?getpage=/../../etc/passwd&var:language=en_us&var:page=* HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:51 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:51 [error] 27529#0: *312 open() "/usr/local/www/rom-0" failed (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /rom-0 HTTP/1.1", host: "10.10.1.1" Apr 26 06:54:25 bhamfirewall.warrenmfg.local nginx: 2016/04/26 06:54:25 [error] 27529#0: *311 "/usr/local/www/HNAP1/index.html" is not found (2: No such file or directory), client: 10.10.1.177, server: , request: "GET /HNAP1/ HTTP/1.1", host: "10.10.1.1"
-
Well, that's good news on the modified kernel. Are 10.10.1.1 and 10.10.1.177 IPs in your network? If you ssh/console to the box, ls /usr/local/www are any of those files/directories present? Should they be?
-
10.10.1.177 seems to be infected with some Malware i guess :o
-
10.10.1.1 is the firewall and 10.10.1.177 is client. I'm so tired of looking at the log files, I total glossed over the accessing IP. I'll look into it and see if there is any malware on the machine.
I did not look for all the files/folders, but the dozen or so I checked do not exist.
Thanks for the second set of eyes.
-
That 10.10.1.177 indeed looks to be compromised, or possibly running a vulnerability assessment tool but that list seems to match known malware. Something on that machine is scanning looking for common vulnerable pages.
-
Thanks I am looking into that client. It's used by a part time person, so it explains why I've not seen it before.
Good news, the patch is still holding up. No crashes. I'm planning to roll the patch to the other firewalls tonight.
This weekend I'll also change back to my other hardware platform for the one odd location using the igb NIC rather than then em that most of my firewalls use.
-
I did update all my other firewall's with the patch yesterday evening.
This morning, the first test box I had installed the patch on encountered the same watchdog error:Apr 27 01:24:22 bhamfirewall kernel: em2: Watchdog timeout -- resetting Apr 27 01:24:22 bhamfirewall kernel: em2: link state changed to DOWN Apr 27 01:24:22 bhamfirewall check_reload_status: Linkup starting em2 Apr 27 01:24:23 bhamfirewall php-fpm[13740]: /rc.linkup: Hotplug event detected for LAN(lan) static IP (10.10.1.1 ) Apr 27 01:24:25 bhamfirewall kernel: em2: link state changed to UP Apr 27 01:24:25 bhamfirewall check_reload_status: Linkup starting em2 Apr 27 01:24:26 bhamfirewall php-fpm[13740]: /rc.linkup: Hotplug event detected for LAN(lan) static IP (10.10.1.1 ) Apr 27 01:24:26 bhamfirewall check_reload_status: rc.newwanip starting em2 Apr 27 01:24:27 bhamfirewall php-fpm[13740]: /rc.newwanip: rc.newwanip: Info: starting on em2. Apr 27 01:24:27 bhamfirewall php-fpm[13740]: /rc.newwanip: rc.newwanip: on (IP address: 10.10.1.1) (interface: LAN[lan]) (real interface: em2). Apr 27 01:24:27 bhamfirewall check_reload_status: Reloading filter Apr 27 01:24:28 bhamfirewall xinetd[11505]: Starting reconfiguration Apr 27 01:24:28 bhamfirewall xinetd[11505]: Swapping defaults Apr 27 01:24:28 bhamfirewall xinetd[11505]: readjusting service 6969-udp Apr 27 01:24:28 bhamfirewall xinetd[11505]: Reconfigured: new=0 old=1 dropped=0 (services) Apr 27 01:25:46 bhamfirewall kernel: em2: Watchdog timeout -- resetting Apr 27 01:25:46 bhamfirewall kernel: em2: link state changed to DOWN Apr 27 01:25:46 bhamfirewall check_reload_status: Linkup starting em2 Apr 27 01:25:47 bhamfirewall php-fpm[13740]: /rc.linkup: Hotplug event detected for LAN(lan) static IP (10.10.1.1 ) Apr 27 01:25:49 bhamfirewall check_reload_status: Linkup starting em2 Apr 27 01:25:49 bhamfirewall kernel: em2: link state changed to UP Apr 27 01:25:50 bhamfirewall php-fpm[13740]: /rc.linkup: Hotplug event detected for LAN(lan) static IP (10.10.1.1 ) Apr 27 01:25:50 bhamfirewall check_reload_status: rc.newwanip starting em2 Apr 27 01:25:51 bhamfirewall php-fpm[13740]: /rc.newwanip: rc.newwanip: Info: starting on em2. Apr 27 01:25:51 bhamfirewall php-fpm[13740]: /rc.newwanip: rc.newwanip: on (IP address: 10.10.1.1) (interface: LAN[lan]) (real interface: em2). Apr 27 01:25:51 bhamfirewall check_reload_status: Reloading filter Apr 27 01:25:52 bhamfirewall xinetd[11505]: Starting reconfiguration Apr 27 01:25:52 bhamfirewall xinetd[11505]: Swapping defaults Apr 27 01:25:52 bhamfirewall xinetd[11505]: readjusting service 6969-udp Apr 27 01:25:52 bhamfirewall xinetd[11505]: Reconfigured: new=0 old=1 dropped=0 (services)
Above is the system.log from when it started. There is nothing before. Once it starts, this set of error messages just keeps repeating until I reset the firewall. Since it was the LAN port acting up, I could still ping the firewall from the WAN and establish an OpenVPN connection. Also, my IPSec tunnels remained running.
I was really hoping the patch fixed it the problem as it was running for over a day and typically locked up before then. Suggestions are appreciated.
-
Anything happening at about that timeframe, backups, remote transfer of lots of data? One of the other users of the modified kernel reported a problem after awhile, during a transfer of large amount of data.
In all the other cases of you having an issue, was the WAN side still alive but the LAN side dead//stuck
-
Yes, a lot of data was being moved as a back to a remote site was in progress.
In the other cases, the port that locks up is random. I've had both LAN and WAN lock up. I don't see any pattern to which port locks.