Exclude CARP traffic from Traffic Shaping



  • Hello,
      I am going mad because my two boxes sometimes fail over from one to the other for no apparent reason.

    The last time this happened was a few hours ago. Maybe the CARP traffic gets shaped/dropped under heavy load. Maybe this is what happens:

    Under heavy traffic, the traffic for 224.0.0.18 gets shaped/limited/dropped, so CARP goes down. Is that reasonable?
    For the moment I have just started to exclude 224.0.0.18/32 from being processed by the traffic shaper (or rather, I assign it to a "wire speed queue").

    Thanks,
    Michele



  • It happened again… I am trying to set the "unlimited queue" as the "default queue"; I hope this will solve it...

    This was under very heavy traffic, 100% of the available bandwidth, with some queue drops in the RRD graph...




  • OK! I set the default queue for each interface to a "gigabit queue", i.e. one with no limit, and set the protocol for each of the other "match" rules to TCP/UDP… so now the CARP traffic should definitely not be shaped anymore... Everything passed the upload/download tests I ran from the different networks in my office...



  • Fingers crossed: I haven't had this problem for a week, even though the firewalls have been running under very heavy traffic…

    Maybe I found it. So, CARP and Traffic Shaping really need to be coordinated, and the Traffic Shaper must be configured so that it does not catch any CARP packets.

    Who updates the "CARP Configuration Troubleshooting" document on the wiki? http://doc.pfsense.org/index.php/CARP_Configuration_Troubleshooting
    Can I?

    Ciao,
    Michele



  • The same thing is happening to me. My default queue is the p2p queue (I'm using traffic shaping for the p2p protocols), so if I set my default queue to the 1 gigabit one, all uncategorized traffic will go through it (including the uncategorized p2p). Can you help me? Can you be more specific (as in step by step) about what you did to resolve this, please?

    Thanks!



  • Hi,
      yes, I can confirm that if the default queue is limited and CARP is running on that interface, then when the default queue starts to drop packets CARP will freak out and cause random results (in my case, some VIPs were master on one box, some others on the other box).

    To avoid these problems, I organized the queues as follows (summarized, but I hope it is clear):

    
    INTERFACE: 1 Gbit. 
      = qWire. Bandwidth: "90%". This is the Default queue and is used for the local routes
      = qWan. Bandwidth: "20mbit" (the bandwidth I currently have from my provider). This queue will NOT be assigned to any rule in the firewall settings.
            = qHighPriority: 75%. Assigned as FIRST queue in the "floating rules" from LAN to "any". 
            = qMediumPriority: 20%: Assigned as SECOND queue in the "floating rules", from LAN to "any" where the traffic match some ports
            = qLowPriority: 5%: Assigned as LAST queue as above, matched on different ports. 
    
    

    The Floating Rules (Firewall, Rules, Floating) are something like this (again, I summarize, considering only the DMZ and WAN networks, which are the most complete case, but I hope it is clear). Note that the order of the following rules is important.


    Interface: WAN, Protocol: TCP/UDP, from: any, to: any, Queue: qHighPriority, Mark packets (in Advanced Options): qHighPriority
    Interface: WAN, Protocol: TCP/UDP, from: wan, to: any, Dest.Ports: "medium priority ports", Queue: qMediumPriority, Mark packets: qMediumPriority
    Interface: WAN, Protocol: TCP/UDP, from: any, to: any, Dest.Ports: "low priority ports", Queue: qLowPriority, Mark packets: qLowPriority
    Interface: WAN, Protocol: TCP/UDP, from: "WAN Network", to: "DMZ Network", Queue: qWire, Mark packets: qWire

    Interface: DMZ, Protocol: TCP/UDP, from: any, to: any, Queue: qHighPriority, Mark packets (in Advanced Options): qHighPriority
    Interface: DMZ, Protocol: TCP/UDP, from: wan, to: any, Dest.Ports: "medium priority ports", Queue: qMediumPriority, Mark packets: qMediumPriority
    Interface: DMZ, Protocol: TCP/UDP, from: any, to: any, Dest.Ports: "low priority ports", Queue: qLowPriority, Mark packets: qLowPriority
    Interface: DMZ, Protocol: TCP/UDP, from: any, to: "WAN network", Queue: qWire, Mark packets: qWire

    Interface: EMPTY, Protocol: any, from: any, to: any, Match packets (in Advanced Options, literally "You can match packet on a mark placed before on another rule."): qHighPriority, Queue: qHighPriority
    Interface: EMPTY, Protocol: any, from: any, to: any, Match packets: qMediumPriority, Queue: qMediumPriority
    Interface: EMPTY, Protocol: any, from: any, to: any, Match packets: qLowPriority, Queue: qLowPriority
    Interface: EMPTY, Protocol: any, from: any, to: any, Match packets: qWire, Queue: qWire


    This closes the circle: after you create the queues and the rules as above, the "Internet Traffic" will follow a different "queue root" from the local traffic, and it will no longer interfere with the CARP traffic, which is assigned to the default queue (that is, 90% of 1 Gbit).
    This example is the result of many experiments and attempts. I don't know if there is a better solution, but I was not able to identify the CARP traffic itself and assign it to a specific queue (filtering by protocol, by source/destination address and so on), so I had to use this workaround and make the default queue work at "wire speed".

    I hope this example will help you…

    Have a nice day,
    Michele

    This assumes that you are able to categorize the traffic you want to shape, i.e. the traffic from the LAN (or DMZ) to the Internet.
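
    As a rough sanity check of the layout above, the percentages can be turned into absolute figures. This is only a sketch: the helper name `share` and the variable names are made up for illustration, and the numbers are simply the ones quoted in the post (1 Gbit interface, 20 Mbit provider line).

```python
# Figures from the queue layout above; names are hypothetical.
INTERFACE_MBIT = 1000  # 1 Gbit interface


def share(parent_mbit, percent):
    """Bandwidth of a child queue given as a percentage of its parent."""
    return parent_mbit * percent / 100


# Default queue: carries local traffic and, implicitly, CARP.
q_wire = share(INTERFACE_MBIT, 90)

# qWan is fixed at the provider's bandwidth; its children split it up.
q_wan = 20
children = {
    name: share(q_wan, pct)
    for name, pct in [
        ("qHighPriority", 75),
        ("qMediumPriority", 20),
        ("qLowPriority", 5),
    ]
}

print("qWire:", q_wire, "Mbit")
for name, mbit in children.items():
    print(name + ":", mbit, "Mbit")
```

    The point of the design is visible in the numbers: the default queue has 900 Mbit available, so CARP advertisements are not competing for the 20 Mbit that the Internet traffic is squeezed into.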



  • I just ran across this too.  It would be nice to have an option in the queue section to set a carp/pfsync queue.  This setting would then need to modify the built-in rules below to use the queue you specify.

    pass quick proto carp
    pass quick proto pfsync

    I guess it really isn't a big deal, as long as you know that those carp rules are hidden and go to the default queue.  It would be nice if that were mentioned in the shaper docs.



  • Hehe! Thanks for the reply, it looked like this problem affected almost no one and this thread looked like a monologue… but I think that the combination CARP + Traffic Shaper isn't that rare...

    By the way, your suggestions would prevent other users from running into this problem.

    Ciao,
    Michele



  • I'm running something like this, just not at 100% at any time really, but the options adam posted here should work just fine.



  • I added http://redmine.pfsense.org/issues/2997, but it probably won't be looked at before 2.1 gets released.



  • Thanks ermal



  • We have struggled with random CARP failovers for almost two years, and have come to the same conclusion now that the traffic shaper is the culprit.
    We will verify this shortly and, if that turns out to be the case, we'll add a floating shaper rule that allows the carp protocol with priority 7 (higher than ACK).

    Manually enabling/disabling CARP on either member fixed it for us in the past, but due to the loss of state on failback this was annoying…



  • Hi Namezero,
    probably the solution by adam is the easiest to implement:

    pass quick proto carp
    pass quick proto pfsync

    Just add these two floating rules at the top of the list (pay attention to the "quick" option).

    Ciao,
    Michele



  • thanks for the reply.
    In the GUI, would a pass quick translate to an unqualified rule with only the protocol selected?

    Also, we don't run pfsync on a shaped interface. Should the rule still be added just to be sure?



  • Hi namezero,
      if only the protocol is specified in the rule then yes, it would apply only to that protocol. Consider that the CARP protocol only makes sense between devices attached to the same physical network, and should already be protected by the password specified for each virtual IP.

    As for pfsync, I would add it just to be sure, even though shaping it does not have such bad effects.



  • Thank you for your quick answer. I have created the rule as can be seen in the screenshots.
    However, it doesn't seem to catch CARP traffic.
    I have also tried any protocol with destination 224.0.0.18 with the same results.
    Then I tried both with state set to none instead of keep state with the same results.

    But somehow the rule doesn't catch. The /tmp/rules.debug looks like it is correct:

    
    # User-defined rules follow
    
    anchor "userrules/*"
    pass log  quick  proto carp  from any to any keep state  label "USER_RULE: CARP must never be shaped"
    ...
    ...
    
    

    I have also noticed that before the user rules section there is something from a snort package:

    
    # Snort package
    block quick from <snort2c> to any label "Block snort2c hosts"
    block quick from any to <snort2c> label "Block snort2c hosts"
    block in log quick proto carp from (self) to any
    pass quick proto carp
    pass quick proto pfsync
    

    However, snort isn't installed on this machine…

    Any ideas?

    EDIT: Could this be because the state for CARP is NO_TRAFFIC:SINGLE?

    
    carp 	224.0.0.18 <- xxx.yyy.134.23 	NO_TRAFFIC:SINGLE
    
    






  • Greetings,

    I have dug a little deeper, and it seems that in 2.0.1 release this has been taken into consideration…
    In functions.inc, on line 2752 there is a function "function filter_process_carp_rules()" that is called during the rule generation:

    
    function filter_process_carp_rules() {
    	global $g, $config;
    	if(isset($config['system']['developerspew'])) {
    		$mt = microtime();
    		echo "filter_process_carp_rules() being called $mt\n";
    	}
    	$lines = "";
    	/* return if there are no carp configured items */
    	if(isset($config['installedpackages']['carp']['config']) &&
    			 $config['installedpackages']['carpsettings']['config'] <> "" or
    			 $config['virtualip']['vip'] <> "") {
    		$lines .= "block in log quick proto carp from (self) to any\n";
    		$lines .= "pass quick proto carp\n";
    		$lines .= "pass quick proto pfsync\n";
    	}
    	return $lines;
    }
    
    

    It also has nothing to do with snort (functions.inc, line 2111):

    
    	$ipfrules .= <<<EOD
    # Snort package
    block quick from <snort2c> to any label "Block snort2c hosts"
    block quick from any to <snort2c> label "Block snort2c hosts"
    
    EOD;
    
    	$ipfrules .= filter_process_carp_rules();
    
    	$ipfrules .= "\n# SSH lockout\n";
    	if(is_array($config['system']['ssh']) && !empty($config['system']['ssh']['port'])) {
    		$ipfrules .= "block in log quick proto tcp from <sshlockout> to any port ";
    		$ipfrules .= $config['system']['ssh']['port'];
    		$ipfrules .= " label \"sshlockout\"\n";
    	} else {
    		if($config['system']['ssh']['port'] <> "")
    			$sshport = $config['system']['ssh']['port'];
    		else
    			$sshport = 22;
    		if($sshport)
    			$ipfrules .= "block in log quick proto tcp from <sshlockout> to any port {$sshport} label \"sshlockout\"\n";
    	}
    

    This should take care of the problem, no?
    If this rule is in there, why are we missing CARP packets under heavy load?



  • Hi Namezero,
    good question, but unfortunately I don't have an answer.

    If you want to be sure that the CARP traffic is not limited by the traffic shaper, I can suggest the same solution I implemented in my production environment: create a "gigabit speed queue" and make it the default queue (do this for each interface). Then use the floating rules to assign the traffic that you explicitly want to be limited/shaped.
    This solution has a bigger impact on the configuration, but I can guarantee that it works (since I implemented it I have had no more issues with CARP under heavy load).

    You can find the detailed description of the solution in this thread, read my message posted on: "August 24, 2012, 06:03:57 pm".

    Ciao,
    Michele



  • There is no way to override the pfsync and carp rules going to the default queue.  The rules I listed are already present and hidden before any rules you create.  They will always go to the default queue.  You must make sure your default queue is high priority or even its own dedicated queue.  Then make sure to classify all other traffic.  I have essentially 4 queues: low, medium, high, very high.  I set the default queue to very high for the built-in hidden rules that you cannot override and I put everything else in different queues on the floating tab with match rules.  I first have a catch-all to put everything in low priority.  I then classify other traffic into the medium and high queues (http, https, VPN, etc).  This leaves BitTorrent on the low queue.
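
    The behaviour described above can be illustrated with a small toy model. This is a deliberately simplified sketch and not how pf is implemented: it assumes only that the last matching rule wins unless a matching rule is marked quick, and that a rule without a queue leaves the packet in the default queue. The `Rule` class, `assign_queue` and the queue names are made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    proto: Optional[str]         # None matches any protocol
    quick: bool = False          # a matching quick rule ends evaluation
    queue: Optional[str] = None  # None means the rule names no queue


def assign_queue(rules, proto, default_queue):
    """Return the queue a packet of the given protocol ends up in.

    Simplified model: the last matching rule wins, except that a
    matching quick rule stops evaluation immediately.
    """
    matched = None
    for rule in rules:
        if rule.proto is None or rule.proto == proto:
            matched = rule
            if rule.quick:
                break
    if matched is None or matched.queue is None:
        return default_queue
    return matched.queue


# The hidden built-in rules: quick, and no queue specified.
hidden = [Rule("carp", quick=True), Rule("pfsync", quick=True)]

# User rules, which are always evaluated after the hidden ones.
user = [Rule("carp", quick=True, queue="qVeryHigh"), Rule(None, queue="qLow")]

print(assign_queue(hidden + user, "carp", "qDefault"))  # hidden rule wins
print(assign_queue(hidden + user, "tcp", "qDefault"))   # catch-all applies
print(assign_queue(user, "carp", "qDefault"))           # without hidden rules
```

    In this model the user's carp rule is never reached while the hidden rules are in place, which is why the only lever left is what the default queue itself looks like.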



  • Hmm that seems like a bug then.

    I was able to provoke the issue by jamming the connection: it started missing heartbeats right away and failed over after approximately 10-15 seconds.
    Then I tested by changing the link speed to 950Gbps and the problem disappeared.
    I then set it back to the normal value and I have not been able to reproduce the problem since. Maybe a reboot would make the problem reoccur, though, when the states are cleared.

    In that case, I will comment out the code in functions.inc so that I can assign CARP & PFSYNC to the highest available queue in the GUI, since it seems too difficult changing everything going through the default queue to something else.



  • Hi Namezero,
       well… I did not know about that code:

    @namezero111111:

    I have dug a little deeper, and it seems that in 2.0.1 release this has been taken into consideration…
    In functions.inc, on line 2752 there is a function "function filter_process_carp_rules()" that is called during the rule generation:

    but even though I find those rules in my rules.debug, I still had to implement a "gigabit default queue" to solve all the CARP/shaping problems… :S

    I will update the ticket in Redmine...



  • The hidden rules that you are talking about that specify carp do not have a queue specified, so they go to the default queue.



  • OK, now I understand why I can't "shape" the CARP traffic (assign it to another queue): there's a quick rule earlier on that assigns the CARP protocol to the default queue… now everything is clearer, thanks! ;)



  • Thanks for all your replies.
    You are correct, Adam. Since those rules catch first they go to the default queue.

    Hence my workaround to comment the code out so I can define the rules in the GUI.
    I haven't decided yet whether to do it that way or whether I'll patch a queue (qAck, qVeryHigh) into functions.inc, but the first way seems cleaner somehow.
    Either way should work though.

    I suppose those two "hidden" rules were well intentioned, but they do seem to wreak havoc when congested and shaped interfaces come into play.



  • I really wish rules like this were not hidden.  Put them in floating and give them a comment that they are built-in system rules for pfsync and carp and maybe allow users to disable them.  Hidden rules can become a problem for audits if the user does not know they exist.  If there is a rule that allows access to something I honestly feel it should be shown to the user somehow.  I feel that even DHCP rules should be shown on the floating tab as a system rule.

    For example… the hidden pfsync and carp rules when using carp seem to allow all matching traffic through the firewall because they don't specify an in, out, src, or dst (users might not know how open those rules are).  Immediately I would worry that this would allow any traffic with that protocol to pass not only to your firewall but through your firewall.  A quick test using hping seems to indicate the traffic is passed through any of the interfaces from any to any (I modified the built-in rules to log and the logs show pass).  If you don't allow any incoming traffic from your WAN or you don't allow outgoing traffic from your DMZ network, your security policy would have a hole in it for those protocols if the firewall forwards pfsync or carp protocol packets because of the default rules.  I doubt many ISPs will pass that traffic, but even with that restriction I doubt many people would really like the idea of someone on their WAN subnet being able to craft packets through their firewall if they use carp features that cause those hidden rules to be put in place.

    The hidden rules I think really need to be restricted to certain IPs to restrict forwarding.

    That is just an example of why I think hidden rules are a bad idea.  Out of sight... out of mind.



  • Hmm yes you're right about that.
    This could become problematic, because theoretically someone on a DMZ or WAN on the same subnet would be able to pass malicious CARP / PFSYNC traffic through the firewall to interfere with these protocols on other subnets, no?
    Maybe my understanding of PFSYNC is a little too limited, but couldn't someone try to gratuitously inject invalid states into the firewall?



  • I can think of two scenarios right off the top of my head that could potentially be concerning even though they are probably not likely to happen.

    Someone messing with VRRP by sending traffic through the firewall to routers using VRRP (not carp) is probably the more likely one, since VRRP is not encrypted like CARP.  I do not know the specifics of the encryption in CARP, but I am assuming it does help with that for the CARP protocol.

    Someone on a subnet that is defined to not have any outbound access could theoretically send data to a device on another subnet or potentially further than that if pfsync or carp traffic gets routed by whatever device it passes through.  A custom app could be programmed to tunnel traffic through CARP or PFSYNC protocol to an outside custom service listening on pfsync/carp protocol.  A fully restricted subnet/interface is one example where this could be an issue.



  • CARP has protections: not only are the contents signed, but the TTL of the packet is checked as well.

    For pfsync this is a bit more troublesome, but pfsync should get rules that allow traffic in only from the configured hosts.



  • The built in rules allow both pfsync and carp protocol packets from anywhere and to anywhere according to rules.debug.  Any user rules which come after that would be ignored from what I understand.

    Even though I have dedicated pfsync traffic to a specific interface for security the builtin rules would allow external WAN packets (or any other interface) to add states to the firewall.

    There are protections on carp itself as we mentioned.  VRRP uses the same protocol number though.  If someone can pass traffic from your WAN to your inside networks to mess with your internal Cisco routers' VRRP traffic, that could still be an issue.

    It seems like pfsync and carp would both need to have more restrictive rules to protect from this.

    Pfsync should only need to get traffic on the defined sync network. That one seems easiest to fix by restricting that rule.

    Pseudocode:

    pass pfsync from defined_sync_subnet to defined_sync_subnet
    pass pfsync from defined_sync_subnet to 224.0.0.x  (assuming this is used)

    Carp has to accept traffic from another system on the WAN which exposes it to spoofing but the encryption that you mentioned should protect it from that.  If the builtin rule for carp were also changed to put the same restrictions on the carp traffic but for each individual interface then that would help the firewall bypass issue.

    pass carp from lan_subnet to lan_subnet
    pass carp from lan_subnet to 224.0.0.x

    pass carp from wan_subnet to wan_subnet
    pass carp from wan_subnet to 224.0.0.x

    These rules would at least protect from being able to push pfsync and carp traffic into your protected networks from the WAN (or any other interface).  They could of course be further restricted to their sync partners src IP.
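
    To make the effect of the restricted carp pseudocode concrete, here is a small sketch that checks a packet against it. Assumptions for illustration only: the function name `carp_allowed` and the example addresses are made up, and the "224.0.0.x" of the pseudocode is narrowed to 224.0.0.18, the CARP multicast address mentioned at the start of this thread.

```python
import ipaddress

# CARP multicast group, as mentioned at the start of the thread.
CARP_GROUP = ipaddress.ip_address("224.0.0.18")


def carp_allowed(src, dst, subnet):
    """Mirror of the pseudocode for one interface:

    pass carp from subnet to subnet
    pass carp from subnet to the CARP multicast group
    """
    net = ipaddress.ip_network(subnet)
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    return src_ip in net and (dst_ip in net or dst_ip == CARP_GROUP)


lan = "192.168.1.0/24"  # example LAN subnet (hypothetical)

# Advertisement from the sync partner on the LAN: allowed.
print(carp_allowed("192.168.1.2", "224.0.0.18", lan))

# Crafted packet from a WAN host aimed at an internal router: not allowed.
print(carp_allowed("203.0.113.7", "192.168.1.10", lan))
```

    The same check with the source further narrowed to the sync partner's address would correspond to the tighter variant mentioned above.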

    It would be great to be able to set up a queue on these rules too, to make sure they get top priority, or to have some other solution, which is what this discussion was really supposed to be about.



  • I thought in 2.1 this is already the case!

    I am looking at fixing this queue issue.



  • Thanks ermal!

    It would be great if you could post the changes here so we could backport the "official" solution to 2.0.1 for our systems that are already deployed and won't be upgraded soon.



  • This issue still exists with the latest snapshots. There needs to be a default bypass for carp/pfsync traffic when the traffic shaper is configured, or a way to edit the queue for the built-in rules. Enabling the traffic shaper in a CARP/pfsync environment without making sure the default queue isn't rate-limited will break things.

    Thanks,

    Jon



  • Hi all,
    running pfSense 2.1.5-RELEASE (amd64) with CARP and Traffic shaper configured is still causing false failovers from master to backup under heavy traffic.

    Any update about this subject? Does the latest version (2.2.6) solve this problem?

    Thanks,
    armando



  • Hello all,

    I'd also like to know if this issue is still present in pfSense 2.2.6. Is anyone using such a configuration?

    Regards,

    Régis

