Losing connection



  • Hi All,

    I have a problem that I can't get my head around. I've been working on it for weeks but can't find the cause. The machine has been rebooted several times in the past and did not show this behavior.

    The situation is as follows:

    Running latest version

    1 physical server;

    • 1 virtual server with pfsense 10.0.0.1
    • 1 virtual server for management 10.0.0.2

    1 physical server;

    • 1 virtual server with AD + storage 192.168.1.12
    • 1 virtual server with exchange 192.168.1.13

    VLANS: 10,11,30,40,50,60,70,80,90,100

    pfsense running DHCP server(s) for all VLANS;
    VLAN 10: 10.0.0.x
    VLAN 11: 192.168.1.x
    VLAN 30: 192.168.30.x
    VLAN 40: 192.168.40.x
    etc.

    More information for the troubling VLAN;
    Subnet: 192.168.1.0
    Subnetmask: 255.255.255.0
    range from: 192.168.1.40
    range to: 192.168.1.180

    DNS 192.168.1.12
    gateway: 192.168.1.1

    This configuration has been running for almost a year. There was a power outage at the facility, and now we are seeing the following weird symptom:

    About three PCs in the 192.168.1.x range lose their internet connection after 20 to 30 minutes. When this happens and I look at the ARP table, I see the following line:

    VLAN11 192.168.1.77 d4:be:d9:c1:c4:0c PC06 Expires in 1198 seconds vlan

    When I remove this ARP entry, the connection works again. But when the connection is lost again and I look at the ARP table, it shows the same 1198 seconds. When I remove the entry again, it works again.

    Can anyone help me to resolve this issue?
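
The ARP line quoted above can be read more closely with a small script. This is only a sketch: it assumes the column layout shown in the post, and it assumes the FreeBSD default ARP lifetime of 1200 seconds (`net.link.ether.inet.max_age`), which should be checked on the actual firewall. The names `parse_arp_line` and `ASSUMED_MAX_AGE_S` are made up for this example.

```python
import re

# Assumption: FreeBSD's default ARP entry lifetime (net.link.ether.inet.max_age).
# Verify the value on the firewall before relying on the computed age.
ASSUMED_MAX_AGE_S = 1200

# Matches a line laid out like the one quoted in the post:
# interface, IP address, MAC address, hostname, "Expires in N seconds".
ARP_LINE = re.compile(
    r"^(?P<iface>\S+)\s+"
    r"(?P<ip>\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"(?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2})\s+"
    r"(?P<host>\S+)\s+"
    r"Expires in (?P<expires>\d+) seconds",
    re.IGNORECASE,
)


def parse_arp_line(line):
    """Return the fields of one ARP table line, or None if it does not match."""
    match = ARP_LINE.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    entry["expires"] = int(entry["expires"])
    # Seconds since the entry was last learned or refreshed, under the
    # assumed lifetime above.
    entry["age"] = ASSUMED_MAX_AGE_S - entry["expires"]
    return entry


entry = parse_arp_line(
    "VLAN11 192.168.1.77 d4:be:d9:c1:c4:0c PC06 Expires in 1198 seconds vlan"
)
print(entry)
```

Under that assumption, "Expires in 1198 seconds" would mean the entry was learned or refreshed about 2 seconds before the table was viewed. Seeing the same value each time the connection drops may hint that something on the network keeps re-announcing that address, but that is an inference, not something the output proves.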


  • LAYER 8 Netgate

    Is that the proper MAC address for that IP address?

    (Why is pfSense doing DHCP if you have AD?)



  • @derelict

    The MAC is indeed the proper one for that IP address. Today I decided to rebuild the entire 192.168.1.x range and its VLAN, but I am still getting the same behavior.

    I have pfSense managing DHCP because the AD will be migrated out of the network this year.


  • LAYER 8 Netgate

    Run a packet capture to see exactly what's going on I guess.

    For instance, when it stops working, before you touch anything, start a packet capture on LAN for 192.168.1.77 on VLAN11. Set it for like 1000000 packets.

    Ping from the firewall to 192.168.1.77, let it fail. Run your fix, ping from the firewall again, let it succeed. Stop the capture and see what you have.

    Diagnostics > Packet Capture

    Really sounds like some kind of IP address conflict. Are there any ARP moves logged in the system log?
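
The question about ARP moves can also be checked against an exported copy of the system log. The sketch below assumes the FreeBSD kernel message format `arp: <ip> moved from <mac> to <mac> on <interface>`; the sample lines, the interface name, and the second MAC address are invented for illustration and do not come from this network. `find_arp_moves` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Assumed message format: "arp: <ip> moved from <old mac> to <new mac> on <iface>".
MOVE = re.compile(
    r"arp: (?P<ip>\S+) moved from (?P<old>[0-9a-f:]{17}) "
    r"to (?P<new>[0-9a-f:]{17}) on (?P<iface>\S+)",
    re.IGNORECASE,
)


def find_arp_moves(lines):
    """Map each IP address to the set of MAC addresses it moved between."""
    moves = defaultdict(set)
    for line in lines:
        match = MOVE.search(line)
        if match:
            moves[match["ip"]].update(
                {match["old"].lower(), match["new"].lower()}
            )
    return dict(moves)


# Hypothetical log lines, made up for this example only.
sample_log = [
    "kernel: arp: 192.168.1.77 moved from d4:be:d9:c1:c4:0c to 02:00:00:00:00:01 on vlan11",
    "kernel: arp: 192.168.1.77 moved from 02:00:00:00:00:01 to d4:be:d9:c1:c4:0c on vlan11",
    "dhcpd: some unrelated message",
]

moves = find_arp_moves(sample_log)
for ip, macs in moves.items():
    print(ip, "seen with", sorted(macs))
```

An IP address that shows up with more than one MAC address is a candidate for the kind of address conflict suggested above.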



  • @derelict Thank you for pointing us in the right direction. We eventually found that the switch had a different ARP entry than pfSense. We followed that discrepancy through the network and solved the issue.

    Thank you for the advice.
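
For anyone chasing a similar mismatch, the comparison between the firewall's and the switch's ARP tables can be sketched as follows. The thread does not say how the tables were exported or what the switch showed, so the dictionaries below are hypothetical placeholders (only the PC06 address pair is taken from the thread), and `normalize_mac` and `arp_mismatches` are names made up for this example.

```python
def normalize_mac(mac):
    """Convert a MAC in any common notation to lowercase colon-separated form."""
    digits = "".join(ch for ch in mac.lower() if ch in "0123456789abcdef")
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def arp_mismatches(firewall, switch):
    """Return {ip: (firewall_mac, switch_mac)} for IPs whose MACs disagree."""
    mismatches = {}
    # Only IPs present in both tables can be compared.
    for ip in sorted(firewall.keys() & switch.keys()):
        fw_mac = normalize_mac(firewall[ip])
        sw_mac = normalize_mac(switch[ip])
        if fw_mac != sw_mac:
            mismatches[ip] = (fw_mac, sw_mac)
    return mismatches


# Hypothetical table contents; the switch-side values are invented.
firewall_arp = {
    "192.168.1.77": "d4:be:d9:c1:c4:0c",
    "192.168.1.12": "02:00:00:00:00:12",
}
switch_arp = {
    "192.168.1.77": "0200.0000.0001",     # dotted notation used by some switches
    "192.168.1.12": "02-00-00-00-00-12",  # dash notation
}

mismatches = arp_mismatches(firewall_arp, switch_arp)
for ip, (fw_mac, sw_mac) in mismatches.items():
    print(f"{ip}: firewall has {fw_mac}, switch has {sw_mac}")
```

Normalizing the MAC notation first avoids false mismatches when the two devices print addresses in different formats.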