PFsense die without a trace



  • hi,
    My setup is a very generic 2 WAN(failover) with 1LAN in a pentium box with a 20 GB HD.

    i am currently running 1.2-RC3  and recently i have experienced about 4 times that Pfsense box connected switch just blinked like crazy and internet just stop working.  i had to reboot the box and everything will come back normal.  before i reboot, I tried to see things on the console, it is just 'locked'

    how do i trace back on what happened? is there any log entry? where to start trouble shoot?
    ?????????????????? :'( :'( :'( :o :o :o :( :( :(



  • All I can say is I had this behaviour before but with only one WAN and 3 LAN ports so I think we can rule out dual WAN as a source.

    The device (a PC with a PIII 866 with 1.2-RC3 built on Sun Dec 9 12:15:38 EST 2007) didn't respond to webGUI requests on HTTPS and SSH nor would it pass traffic.
    But it still did work as DHCP server replying to requests! There is no console attached.

    I was able to power it down by ACPI power button and reboot it. It's running fine again since with no visual trace in the logs.
    All I could find were some RRD graphs logging nothing in that time leaving a gap.

    WAN is a PPPoE client on a german ADSL line, FWIW

    ![2007-12-27 status_rrd_graph_img.php1.png](/public/imported_attachments/1/2007-12-27 status_rrd_graph_img.php1.png)
    ![2007-12-27 status_rrd_graph_img.php1.png_thumb](/public/imported_attachments/1/2007-12-27 status_rrd_graph_img.php1.png_thumb)
    ![2007-12-27 status_rrd_graph_img.php2.png](/public/imported_attachments/1/2007-12-27 status_rrd_graph_img.php2.png)
    ![2007-12-27 status_rrd_graph_img.php2.png_thumb](/public/imported_attachments/1/2007-12-27 status_rrd_graph_img.php2.png_thumb)
    ![2007-12-27 status_rrd_graph_img.php3.png](/public/imported_attachments/1/2007-12-27 status_rrd_graph_img.php3.png)
    ![2007-12-27 status_rrd_graph_img.php3.png_thumb](/public/imported_attachments/1/2007-12-27 status_rrd_graph_img.php3.png_thumb)
    ![2007-12-27 status_rrd_graph_img.php4.png](/public/imported_attachments/1/2007-12-27 status_rrd_graph_img.php4.png)
    ![2007-12-27 status_rrd_graph_img.php4.png_thumb](/public/imported_attachments/1/2007-12-27 status_rrd_graph_img.php4.png_thumb)



  • If the problem persists you might enable remote syslogging to another host so that we can capture what is in the logs before the reboot.



  • i would be more than happy open the remote login, but the problem is.. it's production, means all of my users will be standing outside outside my room and 'wondering' around… that means i had to reboot it as soon as i declare it is not accessible.



  • I think you are missing the point.  Remote logging will transmit the log in real time to anther host so if pfSense does mess up again you will be able to see what happened on the second machine that you are logging to.



  • @wlee618:

    i would be more than happy open the remote login

    Umm remote syslogging is NOT a remote login. Like sullrich said it provides a way to have a real time copy of the system log sent to another machine.



  • It gladly never occured in the office or at customer's sites, only at home.
    I am restructuring this ATM and don't have the time and spare machine to setup a remote syslog server. I am gonna do this once I am finished.



  • OK, it happened again.
    My kids called and complained about lost INet but this time I was in the office and still could log in remotely.

    Here's the syslog output:

    Jan 15 17:54:42 dhcpd: send_packet: Invalid argument
      Jan 15 17:55:28 sshd[68414]: Accepted keyboard-interactive/pam for root from 192.168.100.34 port 4819 ssh2
      Jan 15 17:54:38 dhcpd: send_packet: Invalid argument
      Jan 15 17:46:04 dhcpd: send_packet: Invalid argument
      Jan 15 17:46:04 dhcpd: uid lease 192.168.2.88 for client 00:30:05:4f:0c:a9 is duplicate on 192.168.2/24
      Jan 15 17:45:56 dhcpd: send_packet: Invalid argument
      Jan 15 17:45:56 dhcpd: uid lease 192.168.2.88 for client 00:30:05:4f:0c:a9 is duplicate on 192.168.2/24
      Jan 15 17:45:52 dhcpd: send_packet: Invalid argument
      Jan 15 17:45:52 dhcpd: uid lease 192.168.2.88 for client 00:30:05:4f:0c:a9 is duplicate on 192.168.2/24
      Jan 15 17:45:50 dhcpd: uid lease 192.168.2.88 for client 00:30:05:4f:0c:a9 is duplicate on 192.168.2/24
      Jan 15 17:39:04 last message repeated 2 times
      Jan 15 17:31:09 last message repeated 8 times
      Jan 15 17:25:01 last message repeated 2 times
      Jan 15 17:23:24 dhcpd: send_packet: Invalid argument
      Jan 15 17:23:21 dhcpd: send_packet: Invalid argument
      Jan 15 17:22:59 dhcpd: uid lease 192.168.2.89 for client 00:30:05:f8:89:ab is duplicate on 192.168.2/24
      Jan 15 17:09:14 dhcpd: send_packet: Invalid argument
      Jan 15 17:09:14 dhcpd: uid lease 192.168.2.88 for client 00:30:05:4f:0c:a9 is duplicate on 192.168.2/24
      Jan 15 17:09:06 dhcpd: send_packet: Invalid argument
      Jan 15 17:09:06 dhcpd: uid lease 192.168.2.88 for client 00:30:05:4f:0c:a9 is duplicate on 192.168.2/24
      Jan 15 17:09:02 dhcpd: send_packet: Invalid argument
      Jan 15 17:09:02 dhcpd: uid lease 192.168.2.88 for client 00:30:05:4f:0c:a9 is duplicate on 192.168.2/24
      Jan 15 17:08:59 dhcpd: uid lease 192.168.2.88 for client 00:30:05:4f:0c:a9 is duplicate on 192.168.2/24
      Jan 15 14:36:03 php: : Not a valid interface action ""
      Jan 15 14:36:03 php: : Processing -
      Jan 15 14:36:03 php: : Not a valid interface action ""
      Jan 15 14:36:03 php: : Processing start -
      Jan 15 14:36:03 php: : Hotplug event detected for fxp1 but ignoring since interface is not set for DHCP
      Jan 15 14:36:03 php: : Processing fxp1 - start
      Jan 15 14:36:01 check_reload_status: rc.linkup starting
      Jan 15 14:36:00 kernel: fxp1: link state changed to UP
      Jan 15 14:35:58 kernel: fxp1: link state changed to DOWN
      Jan 15 13:15:29 dnsmasq[22971]: reading /var/dhcpd/var/db/dhcpd.leases

    Looks like my LAN network card (fxp1) had/has a dropout. That's one of those Intel Dual NICs.
    The DHCP duplicates might be related to switching a new device from pool DHCP to a static DHCP lease. But the reference to the pool was still there…

    Anything else you'd like to see, Scott?



  • Fix your duplicate leases, for one: dhcpd: uid lease 192.168.2.88 for client 00:30:05:4f:0c:a9 is duplicate on 192.168.2/24



  • @sullrich:

    Fix your duplicate leases, for one: dhcpd: uid lease 192.168.2.88 for client 00:30:05:4f:0c:a9 is duplicate on 192.168.2/24

    Yeah, as noted before, these arrive when switching from dynamic to static lease.
    At:  Diagnostics | DHCP leases
    I added the MAC to a static lease by hitting '+' next to 'Lease Type'. If I don't manually delete the MAC from /var/dhcpd/var/db/dhcpd.leases(~) afterwards it gets mentioned as duplicate…

    Could you please automate that? !!!



  • Nothing really stands out otherwise from the logs.  And yeah, you are right.  It should clear out that entry, please open a cvstrac bug report and include as much information as possible.  Thanks :)



  • Done.
    http://cvstrac.pfsense.com/tktview?tn=1592

    Scott, what does this mean?

    
      Jan 15 14:36:03    php: : Not a valid interface action ""
      Jan 15 14:36:03    php: : Processing -
      Jan 15 14:36:03    php: : Not a valid interface action ""
      Jan 15 14:36:03    php: : Processing start -
    
    

Log in to reply