"Why would you be natting to internal rfc1918 networks?"
Because I connect to a wireless mesh network that I don't manage, and it uses RFC 1918 addresses. Each wireless node (router) is assigned a random /29 out of 10.0.0.0/8 during its initial setup, and routing between the nodes is handled by OLSR on the wireless network. pfSense apparently used to have an OLSR package but no longer does, so I cannot advertise routes for my internal LAN into OLSR.
Nodes can come and go without notification or coordination, so I can't reasonably maintain an accurate list of static routes. Instead, I have a single generic static route for 10.0.0.0/8 out the wireless interface to cover all of the wireless networks.
I'm only allocated a /29 on the wireless network, and I provide services from multiple internal LAN IPs, so I have NAT configured so that it consumes only one wireless IP. This is on my OPT1 interface, and its IP and gateway are provided via DHCP by the wireless node.
I'm open to suggestions for better ways to do this, but this is the only way I could see getting it to work with the restrictions I have.
My internal LAN is 10.10.6.0/24. Although that sits inside 10.0.0.0/8, it works fine because the LAN interface's /24 route is more specific than the wireless /8 route, so traffic is routed properly.
The WAN port connects to my ISP and has a 73.x.x.x/24 address, which is provided via DHCP by my cable modem.
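To illustrate why the overlapping /24 and /8 don't conflict, here is a minimal Python sketch of longest-prefix matching over the routes described in this post. The destination addresses in the loop are made-up examples, and this only models the selection rule, not how the FreeBSD kernel actually implements its lookup.

```python
import ipaddress

# Routes taken from this post: (prefix, where it points).
ROUTES = [
    ("0.0.0.0/0", "WAN gateway 73.x.x.1"),
    ("10.0.0.0/8", "OPT1 gateway 10.117.100.153"),
    ("10.10.6.0/24", "LAN (directly connected)"),
    ("10.117.100.152/29", "OPT1 (directly connected)"),
]

def lookup(destination):
    """Return (prefix, target) of the most specific route containing destination."""
    addr = ipaddress.ip_address(destination)
    candidates = [
        (ipaddress.ip_network(prefix), target)
        for prefix, target in ROUTES
        if addr in ipaddress.ip_network(prefix)
    ]
    # The most specific match is the one with the largest prefix length.
    network, target = max(candidates, key=lambda item: item[0].prefixlen)
    return str(network), target

# Hypothetical destinations: a LAN host, some other mesh node, an internet host.
for dest in ("10.10.6.25", "10.42.7.3", "8.8.8.8"):
    print(dest, "->", lookup(dest))
```

A LAN host matches both 10.0.0.0/8 and 10.10.6.0/24, and the /24 wins because it is longer. Anything else in 10.x.x.x falls through to the /8 toward the wireless gateway.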
So to recap:
       To internet
        73.x.x.1 (gateway)
            |
       73.x.x.x/24
           WAN
      +-----------+
      |  pfsense  | LAN--10.10.6.1/24----To internal LAN
      +-----------+
          OPT1
    10.117.100.157/29
            |
     10.117.100.153 (gateway)
To a couple dozen or so random 10.x.x.x/29 networks routed by OLSR
"Do you have both gateways you get via dhcp as "default"?" "Post up your gateway section"
The only gateway that is set default is the WAN (internet) side.
This is from my /conf/config.xml file:
<gateways>
  <gateway_item>
    <interface>opt1</interface>
    <gateway>dynamic</gateway>
    <name>MESH_NMT_DHCP</name>
    <weight>1</weight>
    <ipprotocol>inet</ipprotocol>
    <monitor_disable></monitor_disable>
  </gateway_item>
  <gateway_item>
    <interface>wan</interface>
    <gateway>dynamic</gateway>
    <name>WAN_DHCP</name>
    <weight>1</weight>
    <ipprotocol>inet</ipprotocol>
    <monitor_disable></monitor_disable>
    <defaultgw></defaultgw>
    <latencyhigh>1500</latencyhigh>
    <losshigh>100</losshigh>
  </gateway_item>
  <gateway_item>
    <interface>wan</interface>
    <gateway>dynamic</gateway>
    <name>WAN_DHCP6</name>
    <weight>1</weight>
    <ipprotocol>inet6</ipprotocol>
    <monitor_disable></monitor_disable>
    <defaultgw></defaultgw>
  </gateway_item>
</gateways>
and just for info:
<staticroutes>
  <route>
    <network>10.0.0.0/8</network>
    <gateway>MESH_NMT_DHCP</gateway>
  </route>
</staticroutes>
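To double-check which gateways in the gateway section above carry the default flag, something like the following Python sketch could be run against a copy of config.xml. It assumes the layout shown in this post, where the flag is an empty `<defaultgw>` element inside a `<gateway_item>`. Other pfSense versions may store the default differently. `SAMPLE` is a trimmed copy of the posted section, and `default_gateways` is just a name chosen for this example.

```python
import xml.etree.ElementTree as ET

def default_gateways(config_xml):
    """Return (name, ipprotocol) for every gateway_item containing <defaultgw>."""
    root = ET.fromstring(config_xml)
    found = []
    for item in root.iter("gateway_item"):
        # iter() searches all descendants, so the flag is found even if it
        # ended up nested inside another element when the config was pasted.
        if next(item.iter("defaultgw"), None) is not None:
            found.append((item.findtext("name"), item.findtext("ipprotocol")))
    return found

# Trimmed copy of the gateway section from this post.
SAMPLE = """
<gateways>
  <gateway_item>
    <interface>opt1</interface>
    <name>MESH_NMT_DHCP</name>
    <ipprotocol>inet</ipprotocol>
    <monitor_disable></monitor_disable>
  </gateway_item>
  <gateway_item>
    <interface>wan</interface>
    <name>WAN_DHCP</name>
    <ipprotocol>inet</ipprotocol>
    <monitor_disable></monitor_disable>
    <defaultgw></defaultgw>
  </gateway_item>
  <gateway_item>
    <interface>wan</interface>
    <name>WAN_DHCP6</name>
    <ipprotocol>inet6</ipprotocol>
    <monitor_disable></monitor_disable>
    <defaultgw></defaultgw>
  </gateway_item>
</gateways>
"""

print(default_gateways(SAMPLE))
```

For the section as posted, only the two WAN gateways are reported, which matches the statement that the mesh gateway is not marked as default.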
Normally netstat -nr shows this:
Internet:
Destination        Gateway            Flags Netif Expire
default            73.x.x.1           UGS   em0
10.0.0.0/8         10.117.100.153     UGS   em2
10.10.6.0/24       link#2             U     em1
10.10.6.1          link#2             UHS   lo0
10.117.100.152/29  link#3             U     em2
10.117.100.153     10.117.100.153     UGHS  em2
10.117.100.157     link#3             UHS   lo0
73.x.x.0/24        link#1             U     em0
73.x.x.x           link#1             UHS   lo0
75.75.75.75        73.x.x.1           UGHS  em0
75.75.76.76        73.x.x.1           UGHS  em0
127.0.0.1          link#8             UH    lo0
172.16.0.0/12      10.117.100.153     UGS   em2
When it goes bad, the default route has switched to the wireless gateway on em2:
Internet:
Destination        Gateway            Flags Netif Expire
default            10.117.100.153     UGS   em2
10.0.0.0/8         10.117.100.153     UGS   em2
10.10.6.0/24       link#2             U     em1
10.10.6.1          link#2             UHS   lo0
10.117.100.152/29  link#3             U     em2
10.117.100.153     10.117.100.153     UGHS  em2
10.117.100.157     link#3             UHS   lo0
73.x.x.0/24        link#1             U     em0
73.x.x.x           link#1             UHS   lo0
75.75.75.75        73.x.x.1           UGHS  em0
75.75.76.76        73.x.x.1           UGHS  em0
127.0.0.1          link#8             UH    lo0
172.16.0.0/12      10.117.100.153     UGS   em2
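The only difference between the two tables is the default line. To catch the moment it flips, a small check like this Python sketch could be fed saved `netstat -nr` output. It assumes the IPv4 table is introduced by an `Internet:` header, as in the output above, and that em0 is the WAN interface. `default_route` and `EXPECTED_NETIF` are names made up for this example, and the two sample strings are shortened copies of the tables in this post.

```python
EXPECTED_NETIF = "em0"  # WAN interface, per the tables in this post

def default_route(netstat_text):
    """Return (gateway, netif) of the IPv4 default route, or None if absent."""
    in_ipv4 = False
    for line in netstat_text.splitlines():
        stripped = line.strip()
        # Section headers such as "Internet:" or "Internet6:" end with a colon.
        if stripped.endswith(":"):
            in_ipv4 = stripped == "Internet:"
            continue
        fields = stripped.split()
        # Data rows look like: Destination Gateway Flags Netif [Expire]
        if in_ipv4 and len(fields) >= 4 and fields[0] == "default":
            return fields[1], fields[3]
    return None

GOOD = """Internet:
Destination Gateway Flags Netif Expire
default 73.x.x.1 UGS em0
10.0.0.0/8 10.117.100.153 UGS em2
"""

BAD = """Internet:
Destination Gateway Flags Netif Expire
default 10.117.100.153 UGS em2
10.0.0.0/8 10.117.100.153 UGS em2
"""

for label, text in (("normal", GOOD), ("broken", BAD)):
    gateway, netif = default_route(text)
    status = "ok" if netif == EXPECTED_NETIF else "WRONG INTERFACE"
    print(label, gateway, netif, status)
```

Run periodically with a timestamp, this would at least narrow down when the default route changes, so the logs around that time can be checked more closely.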
I've looked through the various logs from when the problem happens, and I don't see anything obviously wrong.
I've tried various gateway monitoring values and ultimately disabled gateway monitoring altogether to make sure it isn't causing the problem.