    Load balancer behind TNSR, Poor NAT....

    • D
      dans last edited by dans

      Maybe I should just open a ticket on this, but.... here goes:

      So, I am running HAProxy on a machine behind TNSR, with a simple static NAT coming through on port 80. Once I open it up, about 8000 sessions build up within roughly 2 minutes, and then it stops making connections to HAProxy. At that point I can see this:

      TNSR01 tnsr(config)# show packet-counters
         Count                    Node                  Reason
              25                null-node               blackholed packets
               3        nat44-ed-out2in-slowpath        no translation
           24516        nat44-ed-out2in-slowpath        maximum sessions exceeded
      

      So... the "maximum sessions exceeded" is the issue here, I believe. I need to push a LOT of traffic through this NAT rule. What's the best way to optimize settings? Surely a measly 8000 NAT sessions doesn't max this beast out, right?
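For anyone who wants to watch this counter over time, here is a minimal Python sketch that parses text in the layout shown above and totals the "maximum sessions exceeded" drops. The sample text is copied from the output in this post; the function names are hypothetical, and how the text is captured from the TNSR CLI is left out because it depends on the setup.

```python
import re

# Sample text copied from the `show packet-counters` output in this post.
SAMPLE = """\
   Count                    Node                  Reason
        25                null-node               blackholed packets
         3        nat44-ed-out2in-slowpath        no translation
     24516        nat44-ed-out2in-slowpath        maximum sessions exceeded
"""

# A data row is: integer count, node name (no spaces), free-text reason.
# The header row does not start with digits, so it never matches.
ROW = re.compile(r"^\s*(\d+)\s+(\S+)\s+(.+?)\s*$")


def parse_counters(text):
    """Return a list of (count, node, reason) tuples from counter output."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            rows.append((int(match.group(1)), match.group(2), match.group(3)))
    return rows


def session_limit_drops(rows):
    """Sum the packets dropped because the NAT session limit was hit."""
    return sum(
        count
        for count, _node, reason in rows
        if reason == "maximum sessions exceeded"
    )


if __name__ == "__main__":
    print(session_limit_drops(parse_counters(SAMPLE)))
```

Running the parser on two captures taken a minute apart and comparing the totals would show whether the drops are still growing.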

      I'm on the default NAT mode: "Endpoint Dependent"

      Looking at https://docs.netgate.com/tnsr/en/latest/advanced/dataplane-nat.html#dataplane-nat I'm thinking I need to set:

      dataplane nat max-translations-per-thread
      

      To some much larger number. What's the valid range? I don't see that in the docs.

      Also, how can I tell how many threads I have?

      TNSR01 tnsr(config)# show dataplane cpu threads
      ID Name     Type PID    LCore Core Socket
      -- -------- ---- ------ ----- ---- ------
       0 vpp_main      102880     1    1      0
      

      That would seem to indicate ONE thread.... and this is on the XG-1537... so... is something wrong?

      Suggestions welcome.

      Thanks,
      Dan

      • D
        dans @dans last edited by

        So... I did end up opening a ticket.

        Turns out:

        dataplane nat max-translations-per-thread 1000000
        dataplane cpu workers 1
        

        Then restarting the dataplane got things working.

        However, I started thinking (perhaps wrongly) that in order to get the full throughput out of this XG-1537, I really should have more workers, otherwise most of the cores would just be sitting idle.

        So, I changed "cpu workers" to 2, and restarted dataplane - it would not come up.... changing workers back to 1 worked fine...

        Next, I figured maybe 2 million translations was somehow "maxing out" the box. (It has 16Gig of RAM). OK, I'll reduce to 200k translations-per-thread, and spin up 3 workers. That seemed to work fine, as the primary web server based on the LAN was processing transactions no problem.
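The arithmetic behind those numbers can be sketched in a few lines of Python. This assumes the overall session ceiling is simply the per-thread limit multiplied by the number of worker threads, which is how the "2 million" figure above was reached; whether the main thread also gets its own table is not something this thread confirms, so it is left out. The function name is hypothetical.

```python
def total_translations(per_thread, workers):
    """Aggregate NAT session ceiling, ASSUMING limit = per_thread * workers."""
    if per_thread < 0 or workers < 1:
        raise ValueError("per_thread must be >= 0 and workers >= 1")
    return per_thread * workers


# The three combinations tried in this post: (per-thread limit, workers).
ATTEMPTS = [
    (1_000_000, 1),  # came up and worked
    (1_000_000, 2),  # dataplane would not start
    (200_000, 3),    # started, but other static NAT rules broke later
]

if __name__ == "__main__":
    for per_thread, workers in ATTEMPTS:
        total = total_translations(per_thread, workers)
        print(f"{per_thread:>9} x {workers} workers = {total:>9} sessions")
```

Under that assumption the three attempts come to 1,000,000, 2,000,000, and 600,000 sessions, so the 3-worker setup actually had a lower overall ceiling than the single-worker one.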

        However, the next morning, I found out that this "somehow" broke other simple static NAT rules being used for small services (remote SSH access to a couple of hosts, etc.). I moved it back to 1 million max-translations-per-thread and a single worker, and that fixed it.

        Very puzzling.... Can I process 10 gig worth of traffic through the box, most of it NATed to an internal webserver, with only the single worker? If so, that's fine I guess...

        Anyone have any thoughts? Am I doing something "unusual" here? I know networking, but this is my first experience with VPP.

        Thanks,
        Dan

        • audian
          audian @dans last edited by

          @dans I know you are working with our TAC Support team on this, but I wanted to say thanks for sharing your journey here with the community.

          • D
            danvdx @dans last edited by

            @dans I might be about to have a similar setup as yours. How did it end?
            Thanks from a fellow "Dan"
