SOLVED: TCP disconnects with second pfsense router



  • Hi,

    Apologies for any misuse of terminology here - I'm something of a networking newbie. For a while, I've used pfSense 1.2.3 between me and the internet. It runs inside an ESXi VM, with one virtual interface on an 'inside' vSwitch (with other VMs and one physical interface), and the other virtual interface on the 'outside' vSwitch (with only the WAN physical interface). My internal network is 192.168.0.0/24, and everything is working fine.

    What I'm trying to add is a second virtual router, sitting between 192.168.0.0 and another internal network, 192.168.10.0.
    This second pfSense VM is on 192.168.10.1, with its 'WAN' interface on an IP within 192.168.0.0.
    I've set a firewall rule on the WAN interface to allow all traffic on all protocols, and a static route on the first router to direct traffic for 192.168.10.0 to the second router.
    I'm trying to keep this separate from the first router. The setup nearly works, but there's something fundamentally wrong I hope someone can help with:

    1. from 192.168.10.245, I SSH to 192.168.10.1, go into shell, and start pinging an internet IP. All is well.
    2. from 192.168.10.245, I SSH to an IP address within 192.168.0.0, bring up some activity in the window. All is well.
    3. from 192.168.10.245, I RDP to a machine within 192.168.0.0. All is well.
    4. from a machine in 192.168.0.0, I RDP to 192.168.10.245. This will connect, but stops working after maybe a minute.
    5. from a machine in 192.168.0.0, I SSH to 192.168.10.1. This will connect, but stops working after maybe a minute.

    From what I can see, connections which are initiated from the 192.168.0.0 side seem to 'die' after a short while. The connections in scenarios 1-3 will all work indefinitely, it seems.
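
    A rough way to put a number on "after maybe a minute" is a small probe that sends one byte per second over a single TCP connection and reports how long it survived. This is only a sketch, not something from the thread: the names `probe` and `start_echo_server` and the port 7007 are made up for illustration, and it assumes Python 3.9+ can run on a machine on each side.

```python
import socket
import socketserver
import threading
import time


class Echo(socketserver.BaseRequestHandler):
    """Far side: send back whatever arrives until the peer closes."""

    def handle(self):
        while data := self.request.recv(1024):
            self.request.sendall(data)


def start_echo_server(host="0.0.0.0", port=7007):
    """Run the echo service in a background thread (port 7007 is arbitrary)."""
    server = socketserver.ThreadingTCPServer((host, port), Echo)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def probe(host, port, interval=1.0, max_probes=300, timeout=5.0):
    """Near side: return (successful round trips, seconds the connection lasted)."""
    start = time.monotonic()
    ok = 0
    with socket.create_connection((host, port), timeout=timeout) as sock:
        for _ in range(max_probes):
            try:
                sock.sendall(b"x")
                if not sock.recv(1):
                    break  # peer closed the connection
            except OSError:
                break  # timeout or reset: the connection has stalled
            ok += 1
            time.sleep(interval)
    return ok, time.monotonic() - start


# Example (hypothetical addresses): run start_echo_server() on 192.168.10.245,
# then call probe("192.168.10.245", 7007) from a machine in 192.168.0.0.
```

    Running the probe in both directions should show whether only connections initiated from the 192.168.0.0 side stall, and after roughly how many seconds.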

    Looking at pftop (from within 192.168.10.0, since that works), I see the likes of the following when I observe these 'disconnects'. So I SSH, and it works for a few seconds:

    PR    D SRC                   DEST                 STATE   AGE   EXP  PKTS BYTES
    tcp   I 192.168.0.122:63121   192.168.10.1:22       4:4     18 86400    93 12544

    But now I've been kicked off, so I connect again:

    PR    D SRC                   DEST                 STATE   AGE   EXP  PKTS BYTES
    tcp   I 192.168.0.122:63121   192.168.10.1:22       4:4     53 86396   103 19648
    tcp   I 192.168.0.122:63122   192.168.10.1:22       4:4     13 86395    79  9808

    … which unsurprisingly creates a new connection, but the original connection is still there, and established? This phenomenon occurs with RDP traffic too.
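
    For reading the pftop rows above: as far as I know, the STATE column shows the numeric TCP state of each side, using the numbering from BSD's tcp_fsm.h, so 4:4 would mean both sides are ESTABLISHED. An EXP close to 86400 seconds also fits pf's default 24-hour timeout for established connections. The sketch below decodes the two rows under that assumption; `parse_pftop_row` is an illustrative helper, not part of pftop.

```python
# TCP state numbers from BSD's tcp_fsm.h (assumed to be what pftop displays).
TCP_STATES = {
    0: "CLOSED", 1: "LISTEN", 2: "SYN_SENT", 3: "SYN_RCVD",
    4: "ESTABLISHED", 5: "CLOSE_WAIT", 6: "FIN_WAIT_1", 7: "CLOSING",
    8: "LAST_ACK", 9: "FIN_WAIT_2", 10: "TIME_WAIT",
}

# The two rows quoted in the post, copied verbatim.
SAMPLE = """\
tcp   I 192.168.0.122:63121   192.168.10.1:22       4:4     53 86396   103 19648
tcp   I 192.168.0.122:63122   192.168.10.1:22       4:4     13 86395    79  9808
"""


def parse_pftop_row(line):
    """Split one pftop row into named fields and decode the STATE column."""
    pr, direction, src, dest, state, age, exp, pkts, nbytes = line.split()
    src_state, dst_state = (TCP_STATES[int(n)] for n in state.split(":"))
    return {
        "proto": pr, "dir": direction, "src": src, "dest": dest,
        "src_state": src_state, "dst_state": dst_state,
        "age": int(age), "exp": int(exp),
        "pkts": int(pkts), "bytes": int(nbytes),
    }


rows = [parse_pftop_row(line) for line in SAMPLE.splitlines()]
for row in rows:
    print(row["src"], "->", row["dest"],
          row["src_state"], "/", row["dst_state"],
          "age", row["age"], "s, expires in", row["exp"], "s")
```

    Read that way, the router running pftop still considers the first connection healthy, which suggests the packets are being lost somewhere else on the path.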

    Any help is appreciated around how I can troubleshoot further! So far I'm led to believe that it's any machine within 192.168.0.0 talking to any machine in 192.168.10.0 which has this problem. The other way around is fine.

    Thanks,
    Chris



  • Can you draw a diagram and post more output? See how I laid out my thread: http://forum.pfsense.org/index.php/topic,43854.0.html



  • While I don't have somewhere I can quickly upload an image, maybe some ASCII art will help:

    Before:
                          _
        ADSL Modem   --  |v|
                         |s|
                         |w|
                         |i|
                         |t|
                         |c|
                   _     |h|
                  |p| -- |_|
                  |f|
                  |s|
                  |0|     _
    192.168.0.1   |_| -- |v| -- [vm0]
                         |s|
                         |w| -- [vm1]
                         |i|
                         |t| -- [vm2]
                         |c|
                         |h| -- [vm3]
    physical switch  --  |_|

    Note the 'outside' and 'inside' virtual switches. They're connected together with a pfsense VM, which has an interface in each switch. Anything on the physical network will be able to see the Internet thanks to that VM. Everything in that regard is working fine.

    The new setup:
                          _
        ADSL Modem   --  |v|
                         |s|
                         |w|
                         |i|
                         |t|
                         |c|
                   _     |h|
                  |p| -- |_|
                  |f|
                  |s|
                  |0|     _
    192.168.0.1   |_| -- |v| -- [vm0]
                         |s|
                         |w| -- [vm1]
    physical switch  --  |i|
                         |t| -- [vm2]
                         |c|
                   _     |h| -- [vm3]
    192.168.0.107 |p| -- |_|
                  |f|
                  |s|
                  |1|     _
    192.168.10.1  |_| -- |v|
                         |s|
                         |w|
                         |i|
                         |t|
                         |c|
                         |h|
                         |_| -- [vm4] 192.168.10.245

    In this setup, everything works, but only for a short period of time, which is what I can't understand. vm4 at the bottom of the setup can see all the way through to the internet fine, and can RDP or SSH to machines in the 192.168.0.0 network. But connections which are initiated from the 192.168.0.0 network into the 192.168.10.0 network don't last very long before causing the applications using them to hang.



  • I've been researching virtualisation compatibility with pfSense, and from what I understand, people using the 1.2.x branch have had more problems than those using a 2.x release.

    I gather that this is due to a poor driver in the older release. Try using pfSense 2.x.

    I'm eager to find out if your performance issues go away.



  • Good idea, but unfortunately it didn't make any difference. I took a snapshot of the VM, upgraded firmware from 1.2.3-RELEASE to 2.0-RELEASE, but the issue remained. I've now reverted to the snapshot, as I can't use the web interface with IE8.

    I'm still not sure where the problem might lie. Hopefully not with the fact that the system is virtual, since the first pfsense router is working fine in this virtual environment. Considering the internet-facing pfsense VM, I can connect through the WAN interface to a machine on the LAN side using (e.g.) RDP, and it's fine. Do the same thing using the internal pfsense VM, and I'm disconnected after around a minute. However, reverse the direction of the connection, and it's fine. That's the bit I don't understand.



  • I also read something about putting the vSwitch in promiscuous mode. Let me know if that setting helps.



  • Sorry if this is obvious to you but I have to ask why the two separate pfSense VMs?  Are you trying to create a DMZ?



  • @cjmurray:

    I've now reverted to the snapshot, as I can't use the web interface with IE8.

    That's only a problem in IE's compatibility mode; turn that off.

    Your general issue is trying to statefully filter asymmetrically routed traffic, which can't work. You need "Bypass firewall rules for traffic on the same interface" under System>Advanced on the first system to work around that.
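
    To illustrate the asymmetry, here is a toy hop-by-hop trace of the two directions. It is a sketch under assumptions the thread doesn't spell out: machines in 192.168.0.0/24 use 192.168.0.1 (pfs0) as their default gateway, vm4 uses 192.168.10.1 (pfs1), and 192.168.10.0 is a /24. The node names and the `ROUTES` table are hypothetical.

```python
import ipaddress

# Hypothetical per-node routing tables: (destination network, next hop).
# A next hop of None means the destination is directly connected.
ROUTES = {
    "host0": [("192.168.0.0/24", None), ("0.0.0.0/0", "pfs0")],
    "pfs0": [("192.168.0.0/24", None), ("192.168.10.0/24", "pfs1"),
             ("0.0.0.0/0", "internet")],
    "pfs1": [("192.168.0.0/24", None), ("192.168.10.0/24", None),
             ("0.0.0.0/0", "pfs0")],
    "vm4": [("192.168.10.0/24", None), ("0.0.0.0/0", "pfs1")],
}
ADDRESSES = {"host0": "192.168.0.122", "vm4": "192.168.10.245"}


def next_hop(node, dst_ip):
    """Longest-prefix match of dst_ip against the node's routing table."""
    dst = ipaddress.ip_address(dst_ip)
    matches = [(ipaddress.ip_network(net), hop)
               for net, hop in ROUTES[node]
               if dst in ipaddress.ip_network(net)]
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def trace(src_node, dst_node):
    """Return the list of nodes a packet visits from src_node to dst_node."""
    path, node = [src_node], src_node
    while True:
        hop = next_hop(node, ADDRESSES[dst_node])
        if hop is None:  # destination is on a directly connected network
            path.append(dst_node)
            return path
        path.append(hop)
        node = hop


forward = trace("host0", "vm4")
reverse = trace("vm4", "host0")
print("forward:", " -> ".join(forward))
print("reverse:", " -> ".join(reverse))
```

    In this model the packets from the 192.168.0.0 machine pass through pfs0, but the replies go from pfs1 straight back onto the 192.168.0.0 segment and never touch pfs0. As I understand it, pfs0's state tracking therefore only ever sees half of each TCP conversation and ends up dropping it, which is what the bypass option on the first system avoids.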



  • cmb, thank you, that's both problems solved!  ;D

    I'm now on 2.0 with the setup functioning fine, although I did need to set "Disable DNS Rebinding Checks" on the second router for DNS resolution to work after the upgrade to 2.0.

    biggsy, no problem, I'd ask the same! It's actually because I'm bound by physical interfaces on the first ESXi server. Now that I've had a good look, I can't get another NIC in there, so I'll have to move this second router VM onto another box which does have enough NICs. Plus, I'm trying to investigate some NFS usage over time and am quite interested in the RRD graphs on the second router (keeping them separate from the ones on the first router, which should only be doing internet routing)

    Thanks again,
    Chris

