Can only communicate in one direction. (A bit complicated.)
-
Well.. I'm basically stuck on an issue I've encountered. I've set up site-to-site OpenVPN tunnels with pfSense in the past (we have 3 different ones currently), but for some reason I am just stuck. lol The scenario I currently have is this:
I have an HA cluster at an offsite location with two pfSense 2.3.4 boxes. XMLRPC sync is enabled between them, state sync is off (it conflicts with limiters, which will be used in the future), and there are CARP IPs for both the LAN and the WAN. Our main office has a single pfSense machine (2.3-RELEASE). Currently, there are no multi-WAN VPN setups. We do have two WAN connections at Site A, but right now I'm just trying to get a single, stable VPN connection up from interface "WAN2" to Site B.
The problem is that I am only able to communicate in one direction through the VPN tunnel. For the most part, I can't communicate from Site A to Site B. I can ping Site B's firewall from machines on Site A, but not any other machines. I also can't do a traceroute to Site B's firewall. :/ All communication seems to be fine if it is initiated by a machine on Site B trying to contact machines on Site A. I can ping/trace/etc. from Site B to Site A.
Site A = Main Office ( OpenVPN Server )
Site B = Colocation ( OpenVPN Client )

          (192.168.0.0/21)          <-Local Subnet… (And yes, it's very weird... it's from the last guy that worked here.)
              |SITE A|              <-Running OpenVPN server, tied to WAN address and port 1197
             (1.2.3.1)
                 |
              INTERNET              (OpenVPN tunnel is 172.16.50.0/30)
                 |
             (2.2.3.3)              <-CARP WAN IP, OpenVPN client connecting through this address.
              /     \
      (2.2.3.1)     (2.2.3.2)
[SITE B Primary]   [SITE B Secondary]
 (172.16.172.1)     (172.16.172.2)
              \     /
          (172.16.172.3)            <-CARP LAN IP
                 |
             [SWITCH]               <-Layer 2
                 |
         (172.16.172.101)           <-Set static, with the CARP LAN IP as GW
            [SERVER 1]

FROM SITE A TO SITE B. (Notice I can ping the firewall but not host 172.16.172.101.)
brom@brom-D415MT-BM2DK ~ $ ping 172.16.172.1
PING 172.16.172.1 (172.16.172.1) 56(84) bytes of data.
64 bytes from 172.16.172.1: icmp_seq=1 ttl=63 time=35.7 ms
64 bytes from 172.16.172.1: icmp_seq=2 ttl=63 time=35.8 ms
64 bytes from 172.16.172.1: icmp_seq=3 ttl=63 time=35.5 ms
^C
--- 172.16.172.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 35.548/35.733/35.898/0.210 ms
brom@brom-D415MT-BM2DK ~ $ traceroute 172.16.172.1
traceroute to 172.16.172.1 (172.16.172.1), 30 hops max, 60 byte packets
 1  fw01.pride1.local (192.168.5.254)  0.175 ms  0.163 ms  0.151 ms
 2  * * *
 3  * * *
 4  * * *
 5  * * *
 6  * * *
 7  * * *
 8  * * *
 9  * * *
10  * * *
11  * * *
12  * *^C
brom@brom-D415MT-BM2DK ~ $ ping 172.16.172.101
PING 172.16.172.101 (172.16.172.101) 56(84) bytes of data.
^C
--- 172.16.172.101 ping statistics ---
10 packets transmitted, 0 received, 100% packet loss, time 8999ms
I CAN ping host 172.16.172.101 from the firewall at site B. I've disabled the local firewall on host 172.16.172.101.
FROM SITE B TO SITE A
C:\Users\Admin>ping 192.168.1.30

Pinging 192.168.1.30 with 32 bytes of data:
Reply from 192.168.1.30: bytes=32 time=37ms TTL=62
Reply from 192.168.1.30: bytes=32 time=35ms TTL=62
Reply from 192.168.1.30: bytes=32 time=35ms TTL=62

Ping statistics for 192.168.1.30:
    Packets: Sent = 3, Received = 3, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 35ms, Maximum = 37ms, Average = 35ms
Control-C
^C
C:\Users\Admin>tracert 192.168.1.30

Tracing route to BROM-D415MT-BM2 [192.168.1.30]
over a maximum of 30 hops:

  1    <1 ms    <1 ms    <1 ms  172.16.172.1
  2    36 ms    35 ms    35 ms  172.16.50.1
  3    35 ms    35 ms    35 ms  BROM-D415MT-BM2 [192.168.1.30]

Trace complete.

C:\Users\Admin>
Here is the conf for openvpn from Site A's firewall..
dev ovpns5
verb 1
dev-type tun
tun-ipv6
dev-node /dev/tun5
writepid /var/run/openvpn_server5.pid
#user nobody
#group nobody
script-security 3
daemon
keepalive 10 60
ping-timer-rem
persist-tun
persist-key
proto udp
cipher AES-256-CBC
auth SHA1
up /usr/local/sbin/ovpn-linkup
down /usr/local/sbin/ovpn-linkdown
local 1.2.3.1
ifconfig 172.16.50.1 172.16.50.2
lport 1197
management /var/etc/openvpn/server5.sock unix
route 172.16.172.0 255.255.255.0
secret /var/etc/openvpn/server5.secret
And here is the conf from the openvpn client on Site B's firewall..
dev ovpnc1
verb 1
dev-type tun
tun-ipv6
dev-node /dev/tun1
writepid /var/run/openvpn_client1.pid
#user nobody
#group nobody
script-security 3
daemon
keepalive 10 60
ping-timer-rem
persist-tun
persist-key
proto udp
cipher AES-256-CBC
auth SHA1
up /usr/local/sbin/ovpn-linkup
down /usr/local/sbin/ovpn-linkdown
local 2.2.3.3
lport 0
management /var/etc/openvpn/client1.sock unix
remote 1.2.3.1 1197
ifconfig 172.16.50.2 172.16.50.1
route 192.168.0.0 255.255.248.0
secret /var/etc/openvpn/client1.secret
resolv-retry infinite
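One quick sanity check on the two configs is whether each side's route statement actually covers the hosts being pinged on the other side. Below is a minimal sketch of that check, assuming Python 3 and only the standard-library ipaddress module. The helper names parse_route and is_routed are made up for illustration; the addresses are the ones from the configs and ping tests in this post.

```python
import ipaddress
from typing import List

# Route statements copied from the two configs, in OpenVPN's "network netmask" form.
SERVER_ROUTES = ["172.16.172.0 255.255.255.0"]  # Site A sends Site B's LAN into the tunnel
CLIENT_ROUTES = ["192.168.0.0 255.255.248.0"]   # Site B sends Site A's LAN into the tunnel


def parse_route(route: str) -> ipaddress.IPv4Network:
    """Turn an OpenVPN 'network netmask' pair into an IPv4Network (hypothetical helper)."""
    network, netmask = route.split()
    return ipaddress.IPv4Network(f"{network}/{netmask}")


def is_routed(host: str, routes: List[str]) -> bool:
    """Return True if any route in the list covers the given host (hypothetical helper)."""
    address = ipaddress.IPv4Address(host)
    return any(address in parse_route(route) for route in routes)


if __name__ == "__main__":
    # Host at Site B that Site A cannot reach, checked against Site A's route.
    print("172.16.172.101 routed by Site A:", is_routed("172.16.172.101", SERVER_ROUTES))
    # Host at Site A that Site B can reach, checked against Site B's route.
    print("192.168.1.30 routed by Site B:", is_routed("192.168.1.30", CLIENT_ROUTES))
    # The /30 tunnel network should leave exactly the two ifconfig endpoints usable.
    print("tunnel hosts:", [str(h) for h in ipaddress.IPv4Network("172.16.50.0/30").hosts()])
```

This only confirms that the route statements cover the addresses in question. It says nothing about firewall rules, NAT, or state on either firewall.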
From the logs I can see that the VPN service is stable and not flapping/going up and down. Below is the activity since yesterday.
Jul 13 16:41:10 openvpn 69474 Peer Connection Initiated with [AF_INET]2.2.3.3:45950
Jul 13 16:40:46 openvpn 69474 Peer Connection Initiated with [AF_INET]2.2.3.3:25270
Jul 13 15:40:01 openvpn 69474 Initialization Sequence Completed
Jul 13 15:40:00 openvpn 69474 Peer Connection Initiated with [AF_INET]2.2.3.3:43229
Jul 13 15:40:00 openvpn 69474 UDPv4 link remote: [undef]
Jul 13 15:40:00 openvpn 69474 UDPv4 link local (bound): [AF_INET]1.2.3.1:1197
Jul 13 15:40:00 openvpn 69474 /usr/local/sbin/ovpn-linkup ovpns5 1500 1560 172.16.50.1 172.16.50.2 init
Jul 13 15:40:00 openvpn 69474 /sbin/ifconfig ovpns5 172.16.50.1 172.16.50.2 mtu 1500 netmask 255.255.255.255 up
Jul 13 15:40:00 openvpn 69474 do_ifconfig, tt->ipv6=1, tt->did_ifconfig_ipv6_setup=0
Jul 13 15:40:00 openvpn 69474 TUN/TAP device /dev/tun5 opened
Jul 13 15:40:00 openvpn 69474 TUN/TAP device ovpns5 exists previously, keep at program end
Jul 13 15:40:00 openvpn 69474 NOTE: the current --script-security setting may allow this configuration to call user-defined scripts
PS: I replaced the IP's listed in the logs with the fake ones used in the diagram.
The OpenVPN interfaces on both Site A's and Site B's firewalls are set to allow anything through the VPN tunnel. Additionally, the WAN2 interface on Site A's firewall has a rule to allow UDP port 1197. The LAN and OpenVPN interfaces at both sites allow anything as well. These firewall rules are at the top of their lists.
SITE A WAN2 FIREWALL RULE (Rule at Top)
Protocol      Source   Port  Destination  Port       Gateway  Queue  Schedule
IPv4 TCP/UDP  2.2.3.3  *     *            1197-1198  *        *      none

SITE A OPENVPN FIREWALL RULE (Rule at Top)
Protocol      Source   Port  Destination  Port       Gateway  Queue  Schedule
IPv4 *        *        *     *            *          *        *      none

SITE A LAN1 FIREWALL RULE (Rule at Top)
Protocol      Source   Port  Destination  Port       Gateway  Queue  Schedule
IPv4 TCP/UDP  LAN net  *     *            *          *        *      none

SITE B OPENVPN FIREWALL RULE (Rule at Top)
Protocol      Source   Port  Destination  Port       Gateway  Queue  Schedule
IPv4 *        *        *     *            *          *        *      none

SITE B LAN1 FIREWALL RULE (Rule at Top)
Protocol      Source   Port  Destination  Port       Gateway  Queue  Schedule
IPv4 *        LAN net  *     *            *          *        *      none
Outgoing NAT for Site A is set to manual.. I have tried both not having any additional outgoing NAT rules (which I believe should be correct) and also have tried having them, but it doesn't seem to make a difference in the behaviour. Below is the outgoing NAT rule I currently have in place on Site A.
SITE A OUTGOING NAT (The Rule below is at the top of the list.)
Interface  Source          Source Port  Destination      NAT Address    NAT Port  Static Port
WAN2       192.168.0.0/21  *            172.16.172.0/24  WAN2 address   *         Randomized Port
--- Various Other rules Below --
And these are the current Outgoing NAT rules on Site B's firewall. It is set to Hybrid and has set 2.2.3.3 as the outgoing address.
SITE B OUTGOING NAT
Interface  Source                                       Source Port  Destination  NAT Address  NAT Port  Static Port
WAN        172.16.172.0/24                              *            *            2.2.3.3      *         Randomized Port
---Automatic Rules:---
WAN        127.0.0.0/8 172.16.172.0/24 172.16.50.0/24   *            *            WAN Address  *         "Keep Source Port Static"
WAN        127.0.0.0/8 172.16.172.0/24 172.16.50.0/24   *            *            WAN Address  *         Randomized Port
Packet Captures are showing that the pings are making their way from Site A to site B, reaching the destination host at site B, getting a response from that host, and then timing out trying to get back to site A. I'm not seeing the response coming back to site A. This same behaviour shows up when not using the CARP Wan IP address for the Openvpn address and removing the hybrid NAT rules.
Well… I've dumped every bit of relevant info I can think of in this post, but just let me know if you'd like more or would like some packet captures. Thank you again for any help you guys may give me! My brain is so shot from the various unrelated issues I've run into that I'm just not seeing what is wrong here..
-
Packet Captures are showing that the pings are making their way from Site A to site B, reaching the destination host at site B, getting a response from that host, and then timing out trying to get back to site A. I'm not seeing the response coming back to site A.
Please post a quick packet capture showing this. Please take the capture on the 172.16.172.X interface on whichever node is currently CARP MASTER.
Set detail to advanced and hit view so we get the MAC addresses, etc.
Thanks.
-
No problem, I'll get right on it!
Not going to lie… I was hoping you were going to come back and say, "Stupid person! You forgot this super obvious thing! Just check box a and you'll be good." lol
-
Ok, here is a ping from a workstation on site b (172.16.172.52) to a machine at site A. (Captured on Interface LAN) It behaves as expected.. (I filtered the output using '192.168.1.30,172.16.172.52' in the host address field.)
06:16:13.846635 e8:40:f2:73:28:a9 > 00:00:5e:00:01:02, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 128, id 17628, offset 0, flags [none], proto ICMP (1), length 60)
172.16.172.51 > 192.168.1.30: ICMP echo request, id 3, seq 52082, length 40
06:16:13.880564 0c:c4:7a:7f:97:62 > e8:40:f2:73:28:a9, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 62, id 20767, offset 0, flags [none], proto ICMP (1), length 60)
192.168.1.30 > 172.16.172.51: ICMP echo reply, id 3, seq 52082, length 40
06:16:14.851254 e8:40:f2:73:28:a9 > 00:00:5e:00:01:02, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 128, id 17630, offset 0, flags [none], proto ICMP (1), length 60)
172.16.172.51 > 192.168.1.30: ICMP echo request, id 3, seq 52083, length 40
06:16:14.885552 0c:c4:7a:7f:97:62 > e8:40:f2:73:28:a9, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 62, id 20997, offset 0, flags [none], proto ICMP (1), length 60)
192.168.1.30 > 172.16.172.51: ICMP echo reply, id 3, seq 52083, length 40
06:16:15.855463 e8:40:f2:73:28:a9 > 00:00:5e:00:01:02, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 128, id 17632, offset 0, flags [none], proto ICMP (1), length 60)
172.16.172.51 > 192.168.1.30: ICMP echo request, id 3, seq 52084, length 40
06:16:15.889537 0c:c4:7a:7f:97:62 > e8:40:f2:73:28:a9, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 62, id 21096, offset 0, flags [none], proto ICMP (1), length 60)
192.168.1.30 > 172.16.172.51: ICMP echo reply, id 3, seq 52084, length 40
06:16:16.865600 e8:40:f2:73:28:a9 > 00:00:5e:00:01:02, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 128, id 17637, offset 0, flags [none], proto ICMP (1), length 60)
172.16.172.51 > 192.168.1.30: ICMP echo request, id 3, seq 52085, length 40
06:16:16.900040 0c:c4:7a:7f:97:62 > e8:40:f2:73:28:a9, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 62, id 21205, offset 0, flags [none], proto ICMP (1), length 60)
192.168.1.30 > 172.16.172.51: ICMP echo reply, id 3, seq 52085, length 40
And the ping from a workstation on Site A to Site B doesn't show as coming through the VPN tunnel at all when viewed from Site B. :( Sorry, I know it did at some point because I took packet captures earlier, but that doesn't appear to be the case now… What I can show, though, is that it is showing as going into the OpenVPN tunnel at Site A. (This is monitoring the OpenVPN virtual interface at Site A.)
17:32:04.746976 AF IPv4 (2), length 88: (tos 0x0, ttl 63, id 45740, offset 0, flags [DF], proto ICMP (1), length 84)
192.168.1.30 > 172.16.172.52: ICMP echo request, id 19109, seq 1, length 64
17:32:05.755154 AF IPv4 (2), length 88: (tos 0x0, ttl 63, id 45837, offset 0, flags [DF], proto ICMP (1), length 84)
192.168.1.30 > 172.16.172.52: ICMP echo request, id 19109, seq 2, length 64
17:32:06.762968 AF IPv4 (2), length 88: (tos 0x0, ttl 63, id 45843, offset 0, flags [DF], proto ICMP (1), length 84)
192.168.1.30 > 172.16.172.52: ICMP echo request, id 19109, seq 3, length 64
17:32:07.770909 AF IPv4 (2), length 88: (tos 0x0, ttl 63, id 46000, offset 0, flags [DF], proto ICMP (1), length 84)
192.168.1.30 > 172.16.172.52: ICMP echo request, id 19109, seq 4, length 64
17:32:08.778926 AF IPv4 (2), length 88: (tos 0x0, ttl 63, id 46083, offset 0, flags [DF], proto ICMP (1), length 84)
192.168.1.30 > 172.16.172.52: ICMP echo request, id 19109, seq 5, length 64
17:32:09.786900 AF IPv4 (2), length 88: (tos 0x0, ttl 63, id 46201, offset 0, flags [DF], proto ICMP (1), length 84)
192.168.1.30 > 172.16.172.52: ICMP echo request, id 19109, seq 6, length 64
So… Hmm....
-
Check the local, "software" firewall on 172.16.172.52. If the echo requests are going out and nothing is coming back either the replies are not being sent, they are being filtered, or they are being sent someplace else.
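To make "going out and nothing is coming back" easy to spot in a longer capture, the requests and replies can be paired up by ICMP id and sequence number. The following is a rough sketch, assuming Python 3 and the tcpdump-style text shown in the captures above. The function name unanswered_requests and the regular expression are illustrative, not part of any pfSense or tcpdump tooling.

```python
import re

# Matches the second line of each packet in the captures above, for example:
#   192.168.1.30 > 172.16.172.52: ICMP echo request, id 19109, seq 1, length 64
ICMP_LINE = re.compile(
    r"(?P<src>\d+\.\d+\.\d+\.\d+) > (?P<dst>\d+\.\d+\.\d+\.\d+): "
    r"ICMP echo (?P<kind>request|reply), id (?P<id>\d+), seq (?P<seq>\d+)"
)


def unanswered_requests(capture_text):
    """Return (src, dst, id, seq) for every echo request that has no matching reply."""
    requests, replies = set(), set()
    for match in ICMP_LINE.finditer(capture_text):
        ident, seq = int(match["id"]), int(match["seq"])
        if match["kind"] == "request":
            requests.add((match["src"], match["dst"], ident, seq))
        else:
            # A reply travels in the opposite direction, so swap src and dst
            # to line it up with the request it answers.
            replies.add((match["dst"], match["src"], ident, seq))
    return sorted(requests - replies)


if __name__ == "__main__":
    site_a_capture = (
        "192.168.1.30 > 172.16.172.52: ICMP echo request, id 19109, seq 1, length 64\n"
        "192.168.1.30 > 172.16.172.52: ICMP echo request, id 19109, seq 2, length 64\n"
    )
    for src, dst, ident, seq in unanswered_requests(site_a_capture):
        print(f"no reply seen: {src} -> {dst} id={ident} seq={seq}")
```

Running the same function over captures taken on each interface along the path (LAN, OpenVPN, WAN) shows at which hop the request or the reply stops appearing. It cannot tell you why the packet was dropped there.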
-
Ah, I actually had the firewall off for that test..
I essentially hit the point where I said, "Everything is configured right. What else can I try?" and then FINALLY had the good sense to reboot all firewalls involved. That fixed everything. ^^;;;
Incredibly frustrating, but I seem to have been neglecting the IT creed of "Have you tried turning it off and on again?" ^_^;;; Still not sure why it fixed anything… I had looked at the routing tables and all the routing seemed correct. But such is life I guess. lol
-
The routing was correct. The packets were being sent out the correct interface.
Rebooting other devices must have cleared something elsewhere.
Glad you got it sorted out.