Routing problems after changing physical WAN connection
-
Hi all,
I've been using pfSense for several months, and it works great. We have two connections set up through it to different ISPs, with the following physical and load balancer (LB) configuration.

WAN -> Inspire
LAN -> 10.0.1.0/24
DMZ(OPT1) -> 10.0.2.0/24
WAN2(OPT2) -> Thin Air

3 Load Balancers
Default -> WAN (x5) OPT2
Inspire-First (Failover) -> WAN, OPT2
ThinAir-First (Failover) -> OPT2, WAN

I have this alias
Production_Servers (removed last octet for security)
204.228.224.xxx/32
203.152.99.xxx/29

Special Routing Rule
Interface: LAN
Protocol: any
Source: any
Destination: Singlehost/alias, Production_Servers
Log Packets: true
Schedule:None
Gateway: ThinAir-First

With our previous Internet connection, I could run a traceroute from my LAN to any of our production hosts. I would be routed over our Thin Air (OPT2) connection and everything would work as expected. If our OPT2 interface were to go down, it would route the traffic over the WAN.

Two days ago we replaced our WAN connection, moving from DSL to high-speed wireless. I changed the IP address and default gateway of the WAN, applied the rules, then restarted the router. Everything works as it did previously except for the server 204.228.224.xxx/32: it isn't routed over OPT2 (Thin Air), it's getting routed over the WAN (Inspire). Given that I haven't touched the aliases or the routing rules, I'm a bit stumped as to why this is happening.

To complicate matters, if I check our syslog server, I can see that the traceroutes to both 204.228.224.xxx/32 and 203.152.99.xxx/29 match the same rule (rule 173/0(match): pass in on sk1). To say I'm thoroughly confused would be an understatement. Can anyone shed some light on this? I'm using 1.2.3-RC3.
thanks,
Todd -
Hey guys,
Anyone? I'm also getting some really strange behavior when I run a traceroute to our servers. If I run a traceroute from an external host, I get this output.

1 3 ms <1 ms <1 ms vlan24.dc1.boi.spro.net [204.228.201.129]
2 <1 ms 4 ms 1 ms gi2-3.core1.boi.spro.net [198.60.194.25]
3 <1 ms <1 ms <1 ms gi2-2.core2.boi.spro.net [198.60.194.22]
4 1 ms <1 ms <1 ms p7-1.gw03.bois.eli.net [209.63.202.241]
5 26 ms 27 ms 27 ms tg9-1.cr01.boisidpz.integra.net [209.63.114.25]
6 27 ms 26 ms 27 ms tg13-1.cr01.slkcutxd.integra.net [209.63.114.30]
7 27 ms 27 ms 26 ms tg13-4.cr02.slkcutxd.integra.net [209.63.114.174]
8 27 ms 26 ms 26 ms tg9-1.cr02.renonvmp.integra.net [209.63.97.161]
9 27 ms 27 ms 26 ms tg9-4.cr01.renonvmp.integra.net [209.63.97.165]
10 26 ms 26 ms 26 ms tg13-2.cr02.sntdcabl.integra.net [209.63.97.158]
11 26 ms 27 ms 26 ms tg1-1.br01.plalcaxc.integra.net [209.63.115.14]
12 30 ms 30 ms 29 ms paix.bdr01.sjc01.ca.vocusconnect.net [198.32.176.134]
13 155 ms 155 ms 155 ms ge-0-0-0-52.bdr02.akl01.akl.VOCUS.net.au [114.31.199.39]
14 159 ms 170 ms 162 ms as38477.akl01.akl.VOCUS.net.au [114.31.203.21]
15 * 173 ms 180 ms ge-0-0-0.core-0.matthews.pmr.unleash.net.nz [116.90.128.44]
16 170 ms 170 ms 167 ms proxy1-core.eth.sta.thinair.net.nz [116.90.128.5]
17 172 ms 169 ms 172 ms vero-01.ap.sta.thinair.net.nz [116.90.138.67]
18 169 ms 187 ms 172 ms 116-90-129-42.wireless.sta.thinair.net.nz [116.90.129.42]
19 180 ms 190 ms 174 ms 203-114-161-15.wir.sta.inspire.net.nz [203.114.161.15]

What's really strange is that the second-to-last line is the one with our virtual IP on it. It should be the end of the route. For some reason the connection is getting bounced from our OPT2 (Thin Air) to our WAN (Inspire); that last line shouldn't even exist! Any ideas why this is happening? This wasn't an issue with our old connection.
thanks,
Todd -
Hey guys,
I've done a bit more digging with tcpdump, and I'm truly at a loss. It appears the connection is coming in correctly, but the reply is never routed back out through the interface the connection came in on. This only happens with a single host, so it's something specific to that host. I thought perhaps something was wrong with the LB routes, but they haven't changed. Is it possible this is a bug in pfSense with routing and state tables? As I said earlier, I haven't changed anything other than my WAN interface's IP and gateway address and the IP address I ping in my LB. Here is what I'm seeing via tcpdump, coming in on OPT2 and going out on WAN.

INPUT OPT 2
tcpdump -c 10 -i fxp0 src aff.spidertracks.com and (port 9997 or port 80 or port 443)
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on fxp0, link-type EN10MB (Ethernet), capture size 96 bytes
16:45:11.651424 IP host-204-228-224-228.ns1.spro.net.49963 > 116-90-140-43.wireless.sta.thinair.net.nz.9997: P 2349331092:2349331096(4) ack 3786880561 win 258
16:45:11.758126 IP host-204-228-224-228.ns1.spro.net.49963 > 116-90-140-43.wireless.sta.thinair.net.nz.9997: P 1384:1515(131) ack 1 win 258
16:45:12.345016 IP host-204-228-224-228.ns1.spro.net.49963 > 116-90-140-43.wireless.sta.thinair.net.nz.9997: P 4:1384(1380) ack 1 win 258
16:45:12.524759 IP host-204-228-224-228.ns1.spro.net.49963 > 116-90-140-43.wireless.sta.thinair.net.nz.9997: P 1515:2895(1380) ack 1 win 258
16:45:12.525500 IP host-204-228-224-228.ns1.spro.net.49963 > 116-90-140-43.wireless.sta.thinair.net.nz.9997: P 2895:3099(204) ack 1 win 258
16:45:12.759144 IP host-204-228-224-228.ns1.spro.net.49963 > 116-90-140-43.wireless.sta.thinair.net.nz.9997: P 3099:3103(4) ack 1 win 258
16:45:12.760376 IP host-204-228-224-228.ns1.spro.net.49963 > 116-90-140-43.wireless.sta.thinair.net.nz.9997: P 3103:4483(1380) ack 1 win 258
16:45:12.761112 IP host-204-228-224-228.ns1.spro.net.49963 > 116-90-140-43.wireless.sta.thinair.net.nz.9997: P 4483:4629(146) ack 1 win 258
16:45:12.762045 IP host-204-228-224-228.ns1.spro.net.49963 > 116-90-140-43.wireless.sta.thinair.net.nz.9997: P 4629:6009(1380) ack 1 win 258
16:45:12.771209 IP host-204-228-224-228.ns1.spro.net.49963 > 116-90-140-43.wireless.sta.thinair.net.nz.9997: P 6009:6307(298) ack 1 win 258
10 packets captured
33 packets received by filter
0 packets dropped by kernel

OUTPUT WAN (these seem to be TCP ACKs for the incoming requests above, but I'm no expert, so please correct me if I'm wrong)
tcpdump -c 10 -i rl0 dst 204.228.224.228 and ( port 9997 or port 80 or port 443 )
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on rl0, link-type EN10MB (Ethernet), capture size 96 bytes
16:45:11.651807 IP 116-90-140-43.wireless.sta.thinair.net.nz.9997 > host-204-228-224-228.ns1.spro.net.49963: . ack 2349331096 win 12301
16:45:11.759107 IP 116-90-140-43.wireless.sta.thinair.net.nz.9997 > host-204-228-224-228.ns1.spro.net.49963: . ack 1 win 12301 <nop,nop,sack 1 {1381:1512}>
16:45:12.345385 IP 116-90-140-43.wireless.sta.thinair.net.nz.9997 > host-204-228-224-228.ns1.spro.net.49963: . ack 1512 win 12301
16:45:12.525118 IP 116-90-140-43.wireless.sta.thinair.net.nz.9997 > host-204-228-224-228.ns1.spro.net.49963: . ack 2892 win 12301
16:45:12.525706 IP 116-90-140-43.wireless.sta.thinair.net.nz.9997 > host-204-228-224-228.ns1.spro.net.49963: . ack 3096 win 12301
16:45:12.759393 IP 116-90-140-43.wireless.sta.thinair.net.nz.9997 > host-204-228-224-228.ns1.spro.net.49963: . ack 3100 win 12301
16:45:12.760677 IP 116-90-140-43.wireless.sta.thinair.net.nz.9997 > host-204-228-224-228.ns1.spro.net.49963: . ack 4480 win 12301
16:45:12.761371 IP 116-90-140-43.wireless.sta.thinair.net.nz.9997 > host-204-228-224-228.ns1.spro.net.49963: . ack 4626 win 12301
16:45:12.762267 IP 116-90-140-43.wireless.sta.thinair.net.nz.9997 > host-204-228-224-228.ns1.spro.net.49963: . ack 6006 win 12301
16:45:12.771509 IP 116-90-140-43.wireless.sta.thinair.net.nz.9997 > host-204-228-224-228.ns1.spro.net.49963: . ack 6304 win 12301
10 packets captured

thanks,
Todd -
Anyone? Maybe this is a better question. Using the command line, is there a way I can print out all routing information/rules as they would be applied to 204.228.224.228? I need more diagnostic information out of our rules than I currently know how to obtain. If I had a tool to print the rules as they would be applied on the different interfaces, it could help me track down my problem.
thanks,
Todd -
Which snapshot date are you using?
Sounds like this issue, possibly:
http://forum.pfsense.org/index.php/topic,19763.0.html

If so, upgrade to the latest snapshot.
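
On the command-line question above: below is a minimal sketch, assuming shell access to the firewall, of stock FreeBSD/pf commands that should show how one host is being handled. It has not been checked against 1.2.3-RC3 specifically, and the exact output format may differ between versions. The function name `show_host_path` is made up for illustration. The host address and rule number 173 are taken from the tcpdump and syslog output earlier in the thread.

```shell
#!/bin/sh
# Sketch only: HOST is the production server seen in the tcpdump output above.
HOST="204.228.224.228"

# Hypothetical helper: print what the kernel routing table and pf say about one host.
show_host_path() {
    host="$1"
    if ! command -v pfctl >/dev/null 2>&1; then
        echo "pfctl not found; run this on the firewall itself"
        return 0
    fi
    # Kernel routing table: gateway and interface used when no pf policy route applies.
    route -n get "$host"
    # Loaded rules with their @numbers; pf policy routing shows up as
    # route-to / reply-to in the rule text.
    pfctl -vvsr | grep -E 'route-to|reply-to'
    # The specific rule the syslog entries reported (rule 173), with its counters.
    pfctl -vvsr | grep -A 3 -E '^@173[^0-9]'
    # State entries involving this host; -v adds the number of the rule
    # that created each state.
    pfctl -vss | grep -A 2 -F "$host"
    return 0
}

show_host_path "$HOST"
```

If the state for that host was created by a rule other than the one with Gateway: ThinAir-First, or the reply leaves via the interface that `route -n get` reports instead of OPT2, that would be consistent with the asymmetric path in the captures above.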