[solved] Can't ping WANGW address



  • hi, I have a pfsense appliance (one of these: http://linitx.com/product/12647) and a fairly standard home setup with the WAN interface connected to an ISP supplied router using a static /30 (the .241 and .242 addresses shown below) and the LAN/OPT1 interfaces on private address space.

    <–ISP ---> [ISP WAN IP] ROUTER [8x.xx.xx.241] <–---> [8x.xx.xx.242] PFSENSE [192.168.21.254] <–----> [192.168.21.0/24] LAN

    On pfsense, the WAN IP is the 8x.xx.xx.242 and the WANGW address is 8x.xx.xx.241.  Its LAN IP is 192.168.21.254.  After setting the appliance up, the dashboard shows the WAN and LAN interfaces as up and with the correct IP addresses.  If I SSH to the pfsense box, I cannot ping or telnet to the router on the .241 address.  No traffic makes it from the LAN or the pfsense appliance out to the Internet.  In the logs I see a number of entries like: "arpresolve: can't allocate llinfo for 8x.xx.xx.242"

    Points to note:

    • I've added pass rules for all traffic on both the LAN and the WAN interfaces just to prove it isn't a firewalling issue

    • My current firewall has been running in this setup for a number of years, and continues to work using the same network config and the same cables, so there is no issue with the ISP router or the cabling.

    • I've tried both patch and crossover cables between the router and the pfsense appliance

    • I've tried 2 of these appliances now and have the same problem on both, so it's unlikely to be a hardware issue with the appliance itself, unless there's some component fault but that would likely make it unusable for everyone that has one

    • I've switched the WAN and OPT1 interfaces around (assigned devices) and the issue continues to affect the WAN interface only.. i.e. I can communicate across the OPT1 interface regardless of the device it is assigned

    • I've tried disabling hardware checksum offloading according to http://doc.pfsense.org/index.php/Lost_Traffic_/_Packets_Disappear but it made no difference.


      Any ideas at all before I reluctantly return these appliances and look elsewhere for a firewall??

      Many thanks,



  • Please post the output of the following pfSense shell commands:

    ```
    /etc/rc.banner
    ifconfig
    netstat -r -n
    ping -c 5 8x.xx.xx.241
    ```

    Something appears to be wrong with your configuration: it is apparently sending ARP requests for 8x.xx.xx.242 (supposedly the IP address of your pfSense WAN interface).
    
    But if you are unwilling to make public your IP addresses you could send the output to me in a PM.


  • hi, many thanks for the reply.

    I've pm'd you the output from the commands on the pfsense appliance, and also the equivalent commands from the current firewall (IPCop 1.4.21 / Linux 2.4.31).

    Regards,



  • Thanks for the information.

    The WAN interface has static IP and is UP and RUNNING. The default route is to x.x.x.241 and has a use count of 400. The use count in the subnet of the WAN interface was 6. The ping reports

    ping: sendto: No buffer space available

    which might indicate the NIC transmit ring was full, that is the NIC transmitter had stopped but ifconfig reported Status Active with flags UP and RUNNING.

    Have you correctly typed the IP address of the router and corresponding subnet into the pfSense configuration? Digit or octet order transposition perhaps?

    If that checks out I suggest you reboot pfSense (to start with a "clean slate") and then take a packet capture on the WAN interface while you attempt a

    ```
    ping -c 5
    ```

    Does the ISP router have any diagnostic facilities? What does it report if you attempt a ping to the pfSense WAN IP address?


  • hi, thanks again for taking a look at this.

    The IP address of the router (WANGW) is definitely correct.. it can be seen as the address of the default gateway in the IPv4 part of the routing table that I sent in the pm.  Along with the WAN address of the firewall, they form the 2 usable addresses of that 8x.xx.xx.240/30 subnet, the broadcast being the .243
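The /30 arithmetic described above can be checked with a short Python sketch. The real prefix is masked in this thread, so `203.0.113.240/30` from the documentation range is used as a stand-in (an assumption); only the last octet is meant to mirror the .240/.241/.242/.243 layout.

```python
import ipaddress

# Stand-in prefix (assumption): the thread's real 8x.xx.xx.240/30 is masked.
net = ipaddress.ip_network("203.0.113.240/30")

# hosts() skips the network and broadcast addresses of the subnet.
usable = [str(host) for host in net.hosts()]

print("network:  ", net.network_address)
print("usable:   ", usable)
print("broadcast:", net.broadcast_address)
```

A /30 leaves exactly two host addresses, which is why the router and the firewall WAN interface take up the whole subnet.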

    I swapped the firewalls over again to take the capture, but I didn't keep the output, and obviously have swapped my old firewall back in again to be able to make this reply! :(  But interestingly, it did show both the ICMP request reaching the router on ..241 and the ICMP reply coming back to the WAN interface on ..242.  Despite this, the shell still reported 100% packet loss on the ping, but silently this time.. no mention of the "No buffer space available".

    Not sure about the router diagnostics.. it's a pretty limited command set from the shell interface and certainly nothing that I recognise (i.e. not BSD/Linux of any kind) but the packet capture seems to show that it's a problem somewhere on the pfsense box.

    Best wishes.



  • @darrend:

    The IP address of the router (WANGW) is definitely correct.. it can be seen as the address of the default gateway in the IPv4 part of the routing table that I sent in the pm.  Along with the WAN address of the firewall, they form the 2 usable addresses of that 8x.xx.xx.240/30 subnet, the broadcast being the .243

    Sorry, I meant that the pfSense WAN IP address and subnet matched that configured in the ISP router.

    @darrend:

    I swapped the firewalls over again to take the capture, but I didn't keep the output, and obviously have swapped my old firewall back in again to be able to make this reply! :(  But interestingly, it did show both the ICMP request reaching the router on ..241 and the ICMP reply coming back to the WAN interface on ..242.  Despite this, the shell still reported 100% packet loss on the ping, but silently this time.. no mention of the "No buffer space available".

    Pity you didn't keep the capture output - the details might have reported something significant. To be more precise, the pfSense packet capture shows packets given to the pfSense driver, it can't show the arrival of packets at another system. That piece of pedantry doesn't seem applicable here though since the packet capture reports a response.

    @darrend:

    Not sure about the router diagnostics.. it's a pretty limited command set from the shell interface and certainly nothing that I recognise (i.e. not BSD/Linux of any kind) but the packet capture seems to show that it's a problem somewhere on the pfsense box.

    OK, let's explore that possibility in a little more detail. Please send me the output from the pfSense shell command `tcpdump -i vr2 -vvv -e`, started before a `ping -c 5` directed at the ISP router. Terminate the packet capture after the ping terminates. Then collect the output of the pfSense shell command

    ```
    arp -a -n
    ```


  • hi.. back at work now.. bit less time to spend on this hence the delay replying.

    Sorry, I meant that the pfSense WAN IP address and subnet matched that configured in the ISP router.

    ah.. yes it does :)

    The tcpdump (with added -n to prevent the DNS resolution) shows lots of traffic leaving and arriving at the WAN interface.  It seems that traffic is getting out and back in as normal - not just to the router, but to anywhere externally - but it simply dies on the WAN interface on its return.  The ping commands still show no output, the interface stats show no packets having arrived on that interface.  The arp command shows the IP address of the router with the correct MAC address, and similar for machines on the LAN.

    I tried switching over to the other pfsense appliance and duplicated all the results from the first one.  Starting to smell like some kind of driver issue for the nic if you ask me.

    Samples of the tcpdump and arp output pm'd to you.

    Cheers!



  • @darrend:

    Samples of the tcpdump and arp output pm'd to you.

    I haven't received it.

    I wonder if your "ISP router" bonds to the first MAC address it talks to. Some cable modems are reported to do something like this. It is reported that powering off the modem for 30 seconds or more "resets" it so it will talk to the next MAC address that talks with it.



  • gah! my browser session timed out before I posted the pm and I didn't notice the error.. will repost in a sec.

    The router doesn't bond to a MAC address - it will talk to anything connected to it.  I've replaced network cards on my current firewall with no problems and as you'll see from the tcpdump sample, it's happily moving packets in both directions from and to the WAN interface on the pfsense box.  They just don't go anywhere after that.

    (edit.. to keep as much public as poss and out of the PM I sent)

    The tcpdump shows traffic correctly being routed.. mostly to the upstream DNS server and back, but the packets then just die at the WAN interface.  The stats for that interface show almost no packets going in or out of the interface and the fact that it can't see its own responses seems to show it's not a firewall issue.

    Odd looking DNS requests for things like "0.pool.ntp.org.mydomain.org" are caused because the requestor never sees the response to the "0.pool.ntp.org" request, so it then appends the search domain and tries that instead.  As the dump shows, all responses are reaching their target and getting back again before dying on the WAN interface.
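The search-domain behaviour described above can be illustrated with a simplified model of how a stub resolver builds its list of names to try. This is only a sketch of the usual `search`/`ndots` logic from resolv.conf, not the resolver's actual code; `candidate_names` is a hypothetical helper and `ndots=1` is assumed as the common default.

```python
def candidate_names(name, search_domains, ndots=1):
    """Return the names a stub resolver would try, in order (simplified model)."""
    if name.endswith("."):
        # A trailing dot marks the name as fully qualified: no search list.
        return [name]
    searched = [f"{name}.{domain}" for domain in search_domains]
    if name.count(".") >= ndots:
        # Enough dots: try the name as typed first, then the search list.
        return [name] + searched
    # Too few dots: try the search list first, then the bare name.
    return searched + [name]


print(candidate_names("0.pool.ntp.org", ["mydomain.org"]))
```

When the reply to the first name never reaches the requester, it moves on to the second candidate, which is where the odd looking query comes from.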

    It looks to me like the pfsense box adamantly refuses to move traffic between 2 interfaces configured with different subnet sizes.  I've previously had this same box moving traffic OK from the OPT1 to the LAN and back where both are configured with 192.168.x.0/24 nets.  Remember too, that I have tried this on 2 different pfsense appliances, and with the interfaces assigned to different devices on both boxes - pretty much ruling out any hardware issue.

    All very confusing, but I think the upshot is that they have to be returned as simply unfit for purpose before it becomes too late to do so :(

    Cheers!



  • @darrend:

    The tcpdump shows traffic correctly being routed.. mostly to the upstream DNS server and back, but the packets then just die at the WAN interface.

    As best I can tell from the tcpdump the incoming packets are incorrectly addressed. Here's why. In FreeBSD NICs are usually initialised to accept incoming traffic targeted at their MAC address. Everything else is ignored by the NIC (except packets targeted at the broadcast MAC address).

    Your trace shows incoming packets destined to MAC address x:x:x:2b:fa:aa (eth2 on your IPCOP system) but the trace shows outgoing packets (e.g. timestamp 01:35:16.511095) from MAC address x:x:x:2a:15:c1, which doesn't appear to be in the configuration information you previously sent me. Have you quietly changed the hardware?

    Has your "ISP router" "bonded" to the "previous firewall" as I discussed in a previous reply? In FreeBSD the ARP timer appears to be at least 1200 seconds. I don't know if that is a common value but it suggests you should wait quite a while for your "ISP Router" to notice the MAC address of the firewall has changed. (Or you could help it along by powering it off, waiting say 30 seconds then powering it on after you have switched over the machines. OR you could try setting the MAC address into the pfSense configuration but that could lead to further confusion in the future.)

    By default, tcpdump sets the NIC into "promiscuous" mode where it accepts all incoming frames regardless of destination MAC address. Hence the arriving traffic appears in the trace even though it would normally be ignored by the NIC.

    But why doesn't the incoming traffic get processed by pfSense when the interface is in promiscuous mode because tcpdump is running?
    The outgoing traffic shows IP checksums of 0 and tcpdump reports that as bad checksum. Did you have hardware checksumming enabled on that interface? (Disable all hardware checksums on System -> Advanced, Networking tab.)
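The MAC check described in this post can be sketched as a small Python scan over `tcpdump -e` output. It flags inbound frames whose destination MAC is neither the local interface nor broadcast, which are the frames a NIC would normally ignore outside promiscuous mode. The sample lines, the `02:00:00` MAC prefixes, the router MAC and the second timestamp are made up for illustration; only the trailing octets and the 01:35:16.511095 timestamp come from the thread. `misaddressed_frames` is a hypothetical helper.

```python
import re

# Link-level header as printed by `tcpdump -e`:
#   HH:MM:SS.ffffff src_mac > dst_mac, ethertype ...
LINE = re.compile(
    r"^(?P<ts>\S+) (?P<src>[0-9a-f:]{17}) > (?P<dst>[0-9a-f:]{17}),"
)
BROADCAST = "ff:ff:ff:ff:ff:ff"


def misaddressed_frames(lines, local_mac):
    """Return (timestamp, dst_mac) for inbound frames not addressed to us.

    Simplification: multicast destinations are not treated specially.
    """
    hits = []
    for line in lines:
        match = LINE.match(line)
        if not match:
            continue  # not a line with a link-level header
        src, dst = match["src"], match["dst"]
        if src != local_mac and dst not in (local_mac, BROADCAST):
            hits.append((match["ts"], dst))
    return hits


# Made-up sample data (assumptions, see lead-in).
PFSENSE = "02:00:00:2a:15:c1"
IPCOP = "02:00:00:2b:fa:aa"
ROUTER = "02:00:00:00:00:01"
sample = [
    f"01:35:16.511095 {PFSENSE} > {ROUTER}, ethertype IPv4 (0x0800), length 98: ...",
    f"01:35:16.512300 {ROUTER} > {IPCOP}, ethertype IPv4 (0x0800), length 98: ...",
]

print(misaddressed_frames(sample, PFSENSE))
```

Any hit means a neighbour is still sending to a MAC address that is no longer on the wire, which points at its ARP cache rather than at the firewall.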



  • Your trace shows incoming packets destined to MAC address x:x:x:2b:fa:aa (eth2 on your IPCOP system) but the trace shows outgoing packets (e.g. timestamp 01:35:16.511095) from MAC address x:x:x:2a:15:c1 which doesn't appear to be in the configuration information you previously sent me. Have you quietly changed the hardware?

    interesting.. so what is coming in to the pfsense is actually getting routed with a MAC address set from the router's arp table/cache, and not the actual address of the interface that sent the request?  I hadn't noticed that MAC address was actually the IPCop's.  The x:x:x:2a:15:c1 is the pfsense box that I did the tcpdumps on, but not the one I sent you the config data from.. apologies, but I have been trying lots of other things during this period, including swapping out both of these identical appliances that I have.

    The outgoing traffic shows IP checksums of 0 and tcpdump reports that as bad checksum. Did you have hardware checksumming enabled on that interface? (Disable all hardware checksums on System -> Advanced, Networking tab.)

    yeah, I noticed the checksum comments in the output too.. when I googled it, the suggestion was that tcpdump effectively reported the checksum prior to the hardware offloading occurring.  I therefore tried both with and without hardware checksumming, with and without fragmentation clearing, and pretty much every combination of options on that networking tab of the advanced page.

    Your theory about the "bonding" or caching on the router is of course plausible now because of this MAC address that belongs to the IPCop showing up (great spot!)  Before, I thought that too improbable because I've changed hardware behind the router more than once with no obvious issues of this nature, and because packets were actually getting back to the pfsense from the router.  I'm going to investigate this fully now..

    Cheers!



  • Thanks wallabybob.. you da man  8)

    Correct all along.. router caching the MAC address from the IPCop in its arp table/cache, so clearing it made it work.  These very bytes will be flowing to the forum via the pfsense appliance flashing prettily under my desk.
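For anyone hitting the same symptom, the stale-cache behaviour can be modelled with a toy ARP table in Python. This is a sketch, not how any particular router implements its cache: the 1200 second lifetime is borrowed from the FreeBSD figure mentioned earlier in the thread and is only an assumption for the ISP router, the IP is a documentation-range stand-in for the masked 8x.xx.xx.242, and the `02:00:00` MAC prefixes are made up.

```python
class ArpCache:
    """Toy IP -> MAC cache with a fixed entry lifetime in seconds."""

    def __init__(self, ttl):
        self.ttl = ttl
        self.entries = {}  # ip -> (mac, time the entry was learned)

    def learn(self, ip, mac, now):
        self.entries[ip] = (mac, now)

    def lookup(self, ip, now):
        entry = self.entries.get(ip)
        if entry is None or now - entry[1] >= self.ttl:
            return None  # missing or expired: a real stack would ARP again
        return entry[0]

    def clear(self):
        self.entries.clear()


WAN_IP = "203.0.113.242"          # stand-in for the masked WAN address
IPCOP_MAC = "02:00:00:2b:fa:aa"   # made-up prefix, old firewall
PFSENSE_MAC = "02:00:00:2a:15:c1" # made-up prefix, new firewall

router = ArpCache(ttl=1200)
router.learn(WAN_IP, IPCOP_MAC, now=0)

# Firewalls swapped a minute later: the router still answers to the old MAC.
stale = router.lookup(WAN_IP, now=60)

# Clearing the cache (or power-cycling the router) forces a fresh ARP.
router.clear()
after_clear = router.lookup(WAN_IP, now=61)
router.learn(WAN_IP, PFSENSE_MAC, now=61)
fresh = router.lookup(WAN_IP, now=62)

print(stale, after_clear, fresh)
```

Until the entry expires or is cleared, replies keep going to the old firewall's MAC address, so the new box sees them only in a promiscuous capture.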

    Great work friend.. I really appreciate your time and patience as I was at the point of sending the boxes back while I could still get a refund.  So pleased that I don't have to now and that I can retire my poor old IPCop before its hardware gave up the ghost once and for all.

    Take care,
    Darren.

