pfsense LAN stops working



  • Hi everyone,

    So I have had this recurring problem for the past few weeks that I cannot find a solution to.

    After a few minutes of uptime, the LAN interface suddenly stops working and I cannot ping the router from within the pfSense shell. I checked the logs and found this occurring every few seconds:

    Feb 28 15:38:14 kernel arp: 88.88.254.224 moved from 84:b5:41:1d:58:46 to ac:b5:7d:dd:42:2b on re0
    Feb 28 15:38:45 kernel arp: 88.88.254.221 moved from ac:b5:7d:dd:42:2b to 40:98:ad:52:64:6f on re0
    Feb 28 15:39:13 kernel arp: 88.88.90.7 moved from ac:b5:7d:dd:42:2b to 88:ae:07:ce:aa:a0 on re0
    Feb 28 15:39:24 kernel arp: 88.88.254.224 moved from ac:b5:7d:dd:42:2b to 84:b5:41:1d:58:46 on re0
    Feb 28 15:39:31 kernel arp: 88.88.254.241 moved from ac:b5:7d:dd:42:2b to 88:ae:07:12:ab:f7 on re0
    Feb 28 15:39:33 kernel arp: 88.88.90.20 moved from ac:b5:7d:dd:42:2b to 80:b0:3d:34:7a:17 on re0
    Feb 28 15:40:15 kernel arp: 88.88.254.221 moved from ac:b5:7d:dd:42:2b to 40:98:ad:52:64:6f on re0
    Feb 28 15:40:29 kernel arp: 88.88.254.224 moved from ac:b5:7d:dd:42:2b to 84:b5:41:1d:58:46 on re0
    Feb 28 15:40:30 kernel arp: 88.88.254.219 moved from ac:b5:7d:dd:42:2b to 50:bc:96:a6:5b:23 on re0
    Feb 28 15:40:32 kernel arp: 88.88.254.219 moved from 50:bc:96:a6:5b:23 to ac:b5:7d:dd:42:2b on re0
    Feb 28 15:40:33 kernel arp: 88.88.254.219 moved from ac:b5:7d:dd:42:2b to 50:bc:96:a6:5b:23 on re0
    Feb 28 15:40:43 kernel arp: 88.88.90.7 moved from ac:b5:7d:dd:42:2b to 88:ae:07:ce:aa:a0 on re0
    Feb 28 15:41:01 kernel arp: 88.88.254.241 moved from ac:b5:7d:dd:42:2b to 88:ae:07:12:ab:f7 on re0
    Feb 28 15:41:03 kernel arp: 88.88.90.20 moved from ac:b5:7d:dd:42:2b to 80:b0:3d:34:7a:17 on re0
    Feb 28 15:41:06 kernel arp: 88.88.254.224 moved from ac:b5:7d:dd:42:2b to 84:b5:41:1d:58:46 on re0
    Feb 28 15:41:35 kernel arp: 88.88.254.221 moved from ac:b5:7d:dd:42:2b to 40:98:ad:52:64:6f on re0
    Feb 28 15:41:38 kernel arp: 88.88.254.224 moved from ac:b5:7d:dd:42:2b to 84:b5:41:1d:58:46 on re0
    Feb 28 15:41:45 kernel arp: 88.88.254.221 moved from ac:b5:7d:dd:42:2b to 40:98:ad:52:64:6f on re0
    Feb 28 15:42:02 kernel arp: 88.88.254.219 moved from ac:b5:7d:dd:42:2b to 50:bc:96:a6:5b:23 on re0
    Feb 28 15:42:08 kernel arp: 88.88.254.224 moved from ac:b5:7d:dd:42:2b to 84:b5:41:1d:58:46 on re0
    Feb 28 15:42:13 kernel arp: 88.88.90.7 moved from ac:b5:7d:dd:42:2b to 88:ae:07:ce:aa:a0 on re0
    Feb 28 15:42:31 kernel arp: 88.88.254.241 moved from ac:b5:7d:dd:42:2b to 88:ae:07:12:ab:f7 on re0
    Feb 28 15:42:33 kernel arp: 88.88.90.20 moved from ac:b5:7d:dd:42:2b to 80:b0:3d:34:7a:17 on re0

    I have tried googling it and I cannot find an exact solution. Some suggested ARP poisoning, which from what I understand is an attempt to access pfSense with a static IP.

    My pfSense box is configured with a captive portal. All users who are MAC filtered can access the internet without issues, but those who use vouchers are getting a message from their devices that says the internet is not available.

    I have exhausted my resources and I am truly hoping someone can point me in the right direction.

    I currently have a Realtek NIC and a TP-Link router setup. Only the TP-Link routers are configured with static IPs so I can access them.

    [screenshot]

    In the screenshot I provided, I am trying to ping the connected routers but I cannot get any response or error. It just sits there.

    I hope someone can help point out what I am doing wrong. I can provide more logs but I do not know which ones are useful.


  • LAYER 8 Global Moderator

    Where exactly are these 88.88 addresses?

    88.88.90 and 88.88.254?

    Are you using this 88.88 address space on your local network??



  • Yes. Sorry for failing to bring up my setup:

    WAN
    192.168.0.1 upstream gateway
    DHCP

    LAN
    88.88.88.88/16
    Range - 88.88.8.1 ~ 88.88.254.254

    Routers
    88.88.88.16/16
    88.88.88.17/16

    Do you think the device with that MAC is causing the failure? I have checked all the MACs on my routers and they do not match the MACs in my log. Is it possible for me to just block those MACs?

    Thanks.


  • LAYER 8 Global Moderator

    @kengo said in pfsense LAN stops working:

    LAN
    88.88.88.88/16

    Yeah that is just utterly completely BORKED!!

    You cannot just pull public IP space out of you-know-where and use it on your local network. That space is owned by:

    inetnum: 88.88.0.0 - 88.91.255.255
    descr: Telenor Business Solutions AS
    address: NORWAY

    Clearly that is not YOU ;)

    Also by the way
    88.88.88.88/16

    is not a network address; that would be a HOST address. The network would be 88.88.0.0/16.
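To make the host-versus-network distinction concrete, here is a minimal sketch using Python's standard `ipaddress` module. The addresses are the ones from this thread; the helper name `describe` is only for illustration. Note that `is_private` covers the RFC 1918 ranges plus a few other reserved blocks, so treat it as a rough check rather than a strict RFC 1918 test.

```python
import ipaddress

def describe(cidr):
    """Split an address/prefix into its host part, its network, and private status."""
    iface = ipaddress.ip_interface(cidr)
    return {
        "host": str(iface.ip),
        "network": str(iface.network),
        # True when the address given is a host inside the network,
        # False when it is the network address itself.
        "is_host_address": iface.ip != iface.network.network_address,
        # RFC 1918 space (and a few other reserved ranges) reports True here.
        "is_private": iface.ip.is_private,
    }

for cidr in ("88.88.88.88/16", "192.168.88.88/16", "192.168.88.0/24"):
    print(cidr, describe(cidr))
```

Under these assumptions, 88.88.88.88/16 comes back as a host inside 88.88.0.0/16 and is not private, while 192.168.88.88/16 is a host inside 192.168.0.0/16 and is private.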

    There is plenty of space in rfc1918 for you to use - there is ZERO reason to just pull some public IP space and try to use it locally. As to IPs moving around to different MACs: you either have duplicate IPs or some device answering for IPs that are not its own.

    What is this device?
    ac:b5:7d:dd:42:2b

    That is a liteon mac - What are you using for your wireless AP?

    So the first thing I would do is use rfc1918 space for your network vs someone else's public IP space. Then track down that device where you have the MAC changing around. Possibly it is some sort of wireless router/AP that is using its own MAC vs the client's MAC, and then when the device connects to a different AP its normal MAC is seen by pfSense, so these devices are flipping back and forth?

    It is hard to say without a better understanding of how your infrastructure is configured, what hardware is in play, etc. But yeah, flapping like that is never a good sign.

    Could be you have more than 1 DHCP server running and handing out duplicate IPs? I would need more info to guess at what the problem is, but the first thing I would do is track down exactly what device that MAC is, that Liteon MAC. After you correct the IP space being used.



  • Oh my! Thanks for correcting me on that! I never knew that using that IP range would cause an issue! I will switch to 192.168.88.x/16 then.

    I am unsure which device that MAC is coming from, but I will fix the IP range first and post back the results.

    By the way, the wireless APs I am using are 3 TP-Link 840n units. I checked the MACs of all my wireless APs and none of them has that MAC address. The NIC I have is a Realtek gigabit. I have an Intel 8492mt dual gigabit on the way that I am waiting to put in the machine.

    Again, thanks soooooo much for pointing that out!


  • LAYER 8 Global Moderator

    And again, 192.168.88.x/16 is not a NETWORK; that is a host address.

    192.168.0.0/16 would be the network...

    How many devices do you have? A /16 is freaking huge - do you have some 65k clients on your network? Or do you want to use 192.168.88.0/24, which would be a network address, but with space for only 254 clients?

    Work out how many devices are on your network at any one time and size the network appropriately: /24, /23, /22. Anything larger than a /22 is a lot of devices to be on the same broadcast domain, and on wireless, without proper filtering of multicast and broadcast, it would be a slow mess. If you have that many devices you should probably segment them into multiple broadcast domains vs using such a large mask and putting them all on the same L2.
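For a rough sense of the sizes involved, this small sketch counts usable host addresses for the masks mentioned above. It uses Python's standard `ipaddress` module; the function name `usable_hosts` and the 192.168.88.0 base address are assumptions for illustration, and the subtraction of two addresses only makes sense for prefixes up to /30.

```python
import ipaddress

def usable_hosts(prefix_len):
    """Usable IPv4 hosts in a subnet, excluding the network and broadcast addresses."""
    # strict=False lets the base address be masked down to the real network address.
    net = ipaddress.ip_network(f"192.168.88.0/{prefix_len}", strict=False)
    return net.num_addresses - 2

for prefix_len in (24, 23, 22, 16):
    print(f"/{prefix_len}: {usable_hosts(prefix_len)} usable hosts")
```

Comparing the expected number of simultaneous clients against these figures is one way to pick the smallest mask that still fits.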



  • Hi John,

    So I went ahead and changed the network DHCP range to 192.168.8.1 - 192.168.254.254/16 and the host to 192.168.88.88/16.

    I am still having the issue.

    When I checked the arp table,

    ac:b5:7d:dd:42:2b

    the MAC address is showing to 192.168.88.10,13,15,16,17 which are the IP addresses of the wireless AP

    0_1551442545392_arptable.jpg

    I don't know what to make of it. I tried connecting my Galaxy phone and when I looked into the arp table, my device showed but it was associated to the ac:b5:7d:dd:42:2b MAC address.

    I have tp-link 840n wireless AP.

    I am really lost at the moment. Issuing a PING command from the shell to any of these routers will result in 100% packet loss and still no response.


  • LAYER 8 Global Moderator

    Well, you're going to need to get with your AP vendor's support - normal APs don't do that.

    And yeah, that is going to cause you all kinds of problems for sure, especially if some other AP passes the actual client's MAC. Do you have your AP in some sort of client bridge/repeater mode with a wireless uplink or something?

    Not related to your problem - but why are you hooked on this /16 mask? You have 65K some clients?


  • Netgate Administrator

    Yeah, this looks a lot like something running as a wireless repeater.



  • Hi everyone, so after a week of tinkering I am still having this problem, but it happens much less than before. However, the system logs are still filled with the "kernel arp moved ac:b5:7d:dd:42:2b" message. I have installed arpwatch and it shows the same thing in my system logs: "flip flop ac:b5:7d:dd:42:2b".

    this is now my setup:

    changed IP of lan to 192.168.88.88/16
    dhcp range to 192.168.99.1 ~ 192.168.254.254

    changed LAN realtek NIC to dual gigabit intel NIC

    temporarily reduced to 2 wireless APs, Archer AC750 (192.168.88.17, 192.168.88.18). I was previously using TP-Link WR840N v6 wireless APs, but I changed to the TP-Link Archer AC750 to see if it would resolve the issue, and it didn't.

    I tried putting the MAC ac:b5:7d:dd:42:2b into a static ARP table and then making an IP filter in my captive portal (0 up/down) to see if it would help. But the problem persists.

    I still cannot find the source of the problem.

    I am now thinking of changing my current CPU hardware from a Core2Quad Q6600 to a spare i5 660 that I have, because I really cannot find any other solution.

    What other logs should I be looking at so I can share them? I have been stuck on this for 2 months now.

    thank you again so much for your continued support.


  • LAYER 8 Global Moderator

    You understand this has ZERO to do with pfsense right!!! ZERO!!

    Draw up how you have these APs connected and configured... If you're running 1 AP in repeater mode and another in normal mode, and then you have clients switching between them, then yes, the MACs are going to be flipping all over the place.

    Or you could have clients changing up their MACs to bypass your captive portal?

    Let's see more logs, and how you have these APs configured, and how you have everything wired.

    Feb 28 15:41:06 kernel arp: 88.88.254.224 moved from ac:b5:7d:dd:42:2b to 84:b5:41:1d:58:46 on re0
    Feb 28 15:41:35 kernel arp: 88.88.254.221 moved from ac:b5:7d:dd:42:2b to 40:98:ad:52:64:6f on re0

    This is pfSense telling you that some IP out on the network, .224, that used to be on MAC 42:2b changed to 58:46.
    Hey, btw, IP .221, which also used to be on 42:2b, changed to 64:6f.

    So .224 is first ac:b5:7d, which is a Liteon device.. Then all of a sudden it's a Samsung (84:b5:41)?
    So .221 is first a Liteon, then it's an Apple (40:98:ad)?

    Then each changes back to a Liteon device?

    That has ZERO to do with pfSense... You could put in a freaking supercomputer to run pfSense on and you are still going to see the same issue. What device is ac:b5:7d:dd:42:2b?
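One way to spot a MAC like that in a long log is to count how many different IPs each MAC was seen holding. The following is a minimal sketch, assuming the log lines have the same "moved from ... to ... on ..." shape as the ones quoted in this thread; the names `MOVED`, `ips_per_mac`, and `suspects` are made up for illustration, and the sample lines are copied from the original post.

```python
import re
from collections import defaultdict

# Matches the kernel message format shown in the logs above.
MOVED = re.compile(
    r"arp: (?P<ip>\S+) moved from (?P<old>\S+) to (?P<new>\S+) on (?P<ifname>\S+)"
)

def ips_per_mac(lines):
    """Map each MAC to the set of IPs it held in any 'moved' message."""
    seen = defaultdict(set)
    for line in lines:
        m = MOVED.search(line)
        if m:
            seen[m["old"]].add(m["ip"])
            seen[m["new"]].add(m["ip"])
    return seen

def suspects(lines):
    """MACs seen with more than one IP, the one with the most IPs first."""
    multi = [(mac, sorted(ips)) for mac, ips in ips_per_mac(lines).items() if len(ips) > 1]
    return sorted(multi, key=lambda item: -len(item[1]))

SAMPLE = [
    "Feb 28 15:41:06 kernel arp: 88.88.254.224 moved from ac:b5:7d:dd:42:2b to 84:b5:41:1d:58:46 on re0",
    "Feb 28 15:41:35 kernel arp: 88.88.254.221 moved from ac:b5:7d:dd:42:2b to 40:98:ad:52:64:6f on re0",
    "Feb 28 15:42:02 kernel arp: 88.88.254.219 moved from ac:b5:7d:dd:42:2b to 50:bc:96:a6:5b:23 on re0",
]

for mac, ips in suspects(SAMPLE):
    print(mac, "was seen on", ", ".join(ips))
```

A single MAC showing up behind many client IPs is consistent with one device standing in for its clients, though the count alone does not say which device it is.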

    Let's see the new logs.

    Do you have your APs set up like this?

    0_1552473686289_repeature.png

    So pfSense at first sees 99.10 on MAC AA:BB, and if the client moves to an AP that actually passes the client's MAC vs its own MAC, pfSense then sees 99.10 on XX:YY.

    Then if it moved back, the IP would go back to MAC AA:BB. This is exactly what you were seeing in your logs the first time. So please post the logs you're seeing now, and the MAC addresses of all the AP devices (the MAC should be on the bottom of them, etc.), and how you have them connected and configured.
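The roaming scenario described above can be replayed with a toy model of the router's ARP cache. This is only a sketch of the idea, not how pfSense or any AP is implemented: the MAC values, the `"repeater"`/`"normal"` mode labels, and the function names are all hypothetical.

```python
REPEATER_MAC = "aa:bb:00:00:00:01"  # hypothetical MAC of an AP running in repeater mode
CLIENT_MAC = "02:00:00:00:00:10"    # hypothetical MAC of the roaming client
CLIENT_IP = "192.168.99.10"

def mac_seen_by_router(ap_mode):
    """A repeater-mode AP presents its own MAC; a normal AP passes the client's MAC."""
    return REPEATER_MAC if ap_mode == "repeater" else CLIENT_MAC

def replay(roaming_path):
    """Update a toy ARP cache as the client roams and collect 'moved' messages."""
    arp_cache, events = {}, []
    for ap_mode in roaming_path:
        mac = mac_seen_by_router(ap_mode)
        old = arp_cache.get(CLIENT_IP)
        if old is not None and old != mac:
            events.append(f"arp: {CLIENT_IP} moved from {old} to {mac}")
        arp_cache[CLIENT_IP] = mac
    return events

# Client starts on the repeater, roams to the normal AP, then roams back.
for line in replay(["repeater", "normal", "repeater"]):
    print(line)
```

Each roam between the two kinds of AP produces one "moved" message, which is the same back-and-forth pattern as in the logs.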

    You even stated that the MAC address was your AP:

    the MAC address is showing to 192.168.88.10,13,15,16,17 which are the IP addresses of the wireless AP

    So how in the hell do you think changing what you run pfSense on has anything to do with it?



  • hi @johnpoz

    First of all, I would like to thank you for your continued support and patience. Second of all, I would like to apologize if I frustrated you in any way. I am also not sure what to do now, so I hope you can continue to bear with me for a little longer.

    I made this quick diagram using MS Paint. It was the only app I had, so please bear with it.

    0_1552480087356_sorry_mspaint_setup.png

    I will paste a portion of the logs I got because they just repeat over and over:

    Mar 13 20:16:28 arpwatch report: pausing (cdepth 3)
    Mar 13 20:16:28 arpwatch flip flop 192.168.88.88 ac:b5:7d:dd:42:2b (0:1b:21:32:d2:29)
    Mar 13 20:16:28 arpwatch report: pausing (cdepth 3)
    Mar 13 20:16:28 arpwatch flip flop 192.168.88.88 0:1b:21:32:d2:29 (ac:b5:7d:dd:42:2b)
    Mar 13 20:16:28 arpwatch report: pausing (cdepth 3)
    Mar 13 20:16:28 arpwatch flip flop 192.168.88.88 ac:b5:7d:dd:42:2b (0:1b:21:32:d2:29)
    Mar 13 20:16:28 arpwatch report: pausing (cdepth 3)
    Mar 13 20:16:28 arpwatch flip flop 192.168.88.88 0:1b:21:32:d2:29 (ac:b5:7d:dd:42:2b)
    Mar 13 20:16:28 arpwatch report: pausing (cdepth 3)
    Mar 13 20:16:28 arpwatch flip flop 192.168.88.88 ac:b5:7d:dd:42:2b (0:1b:21:32:d2:29)
    Mar 13 20:16:28 arpwatch report: pausing (cdepth 3)
    Mar 13 20:16:28 arpwatch flip flop 192.168.88.88 0:1b:21:32:d2:29 (ac:b5:7d:dd:42:2b)
    Mar 13 20:16:28 arpwatch report: pausing (cdepth 3)
    Mar 13 20:16:28 arpwatch flip flop 192.168.88.88 ac:b5:7d:dd:42:2b (0:1b:21:32:d2:29)
    Mar 13 20:16:28 arpwatch report: pausing (cdepth 3)
    Mar 13 20:16:28 arpwatch flip flop 192.168.88.88 0:1b:21:32:d2:29 (ac:b5:7d:dd:42:2b)
    Mar 13 20:16:28 arpwatch report: pausing (cdepth 3)
    Mar 13 20:16:28 arpwatch flip flop 192.168.88.88 ac:b5:7d:dd:42:2b (0:1b:21:32:d2:29)
    Mar 13 20:16:28 arpwatch report: pausing (cdepth 3)
    Mar 13 20:16:28 arpwatch flip flop 192.168.88.88 0:1b:21:32:d2:29 (ac:b5:7d:dd:42:2b)
    Mar 13 20:16:28 arpwatch report: pausing (cdepth 3)
    Mar 13 20:16:28 arpwatch flip flop 192.168.88.88 ac:b5:7d:dd:42:2b (0:1b:21:32:d2:29)
    Mar 13 20:16:28 arpwatch report: pausing (cdepth 3)
    Mar 13 20:16:28 arpwatch flip flop 192.168.88.88 0:1b:21:32:d2:29 (ac:b5:7d:dd:42:2b)
    Mar 13 20:16:28 arpwatch report: pausing (cdepth 3)
    Mar 13 20:16:28 arpwatch flip flop 192.168.88.88 ac:b5:7d:dd:42:2b (0:1b:21:32:d2:29)
    Mar 13 20:16:28 arpwatch report: pausing (cdepth 3)
    Mar 13 20:16:28 arpwatch flip flop 192.168.88.88 0:1b:21:32:d2:29 (ac:b5:7d:dd:42:2b)
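A short sketch for summarising flip-flop lines like the ones above: it groups the two MACs in each line under the IP they are fighting over. It assumes the line shape shown in this paste (an IP, a MAC, and a second MAC in parentheses) and does not rely on which of the two is the older one. The names `FLIP`, `normalize`, and `contested_ips` are made up for illustration.

```python
import re
from collections import defaultdict

# IP, then one MAC, then a second MAC in parentheses, as in the lines above.
FLIP = re.compile(
    r"flip flop (?P<ip>\S+) (?P<mac_a>[0-9a-fA-F:]+) \((?P<mac_b>[0-9a-fA-F:]+)\)"
)

def normalize(mac):
    """Zero-pad each octet, since the log prints 0:1b:21 rather than 00:1b:21."""
    return ":".join(part.zfill(2) for part in mac.lower().split(":"))

def contested_ips(lines):
    """Return {ip: sorted MACs} for every IP claimed by more than one MAC."""
    claims = defaultdict(set)
    for line in lines:
        m = FLIP.search(line)
        if m:
            claims[m["ip"]].update({normalize(m["mac_a"]), normalize(m["mac_b"])})
    return {ip: sorted(macs) for ip, macs in claims.items() if len(macs) > 1}

SAMPLE = [
    "Mar 13 20:16:28 arpwatch flip flop 192.168.88.88 ac:b5:7d:dd:42:2b (0:1b:21:32:d2:29)",
    "Mar 13 20:16:28 arpwatch report: pausing (cdepth 3)",
    "Mar 13 20:16:28 arpwatch flip flop 192.168.88.88 0:1b:21:32:d2:29 (ac:b5:7d:dd:42:2b)",
]

print(contested_ips(SAMPLE))
```

On this sample the only contested IP is 192.168.88.88, which in this thread is the pfSense LAN address, so the lines suggest a second MAC is answering for the firewall's own IP.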

    I really wish to understand where I have gone wrong so I can correct it. If you say that this is not within pfSense and is within the AP setup I have, please let me know which of these AP configurations are wrong:

    Wireless AP

    1. set static ip : 192.168.88.10 / 192.168.88.15 / 192.168.88.18
    2. turn off dhcp server
    3. set dhcp gateway to 192.168.88.88 (pfsense lan)
    4. wireless configuration set to no password (disabled security)

    Again, I would like to express my sincere appreciation for all the support you are providing me.

    Quick update:

    Here's what I found inside the ARP table:

    0_1552482519283_arptable1.png
    0_1552482781709_arptable2.png
    It's showing different devices using the same MAC, listed under Liteon Technology.

    What should I do with my wireless APs? Would replacing the wireless APs fix it? I am asking because I switched from the WR840N v6 to the Archer C2 v1.


  • LAYER 8 Global Moderator

    @kengo said in pfsense LAN stops working:

    ac:b5:7d:dd:42:2b

    What device has this MAC?


  • Netgate Administrator

    Looks like it's 192.168.88.18. I suggest that AP is not configured correctly.
    Try turning it off and see if that removes the problem.

    Steve

