Netgate Discussion Forum

    Resolving IP addresses from media providers

    Posted by dotOne

      Being extremely privacy- and security-conscious, I route all my outbound connections through VPNs.
      Most if not all media providers (Netflix, Amazon Prime, etc.) are allergic to VPN connections because they don't own the rights to offer every movie and TV series in every country.
      One option is to use a VPN provider that they do not block.
      Since I'm also a big supporter of IPv6, that is pretty much impossible: the majority of VPN providers do not support IPv6.

      In the end I decided to actively collect the IP addresses of the media providers and use them in pfBlockerNG IP match lists.
      This way I can select the WAN gateway for this traffic instead of the default VPN gateway.
      My initial process was pretty straightforward:

      • Collect the FQDNs used by the media providers
      • Resolve the FQDNs to IP addresses
      • Add these IP addresses to pfBlockerNG match lists
      • Create a firewall policy that uses the IP match list to select the correct gateway

      Initially I used a TAP and a packet broker to capture the traffic from my Apple TVs and filtered out the DNS requests and HTTPS Client Hello packets to collect all the FQDNs.
      It worked pretty well and provided a lot of information, but it is a cumbersome and laborious process, so it definitely wasn't the way forward.
      My second approach was to run tcpdump on the firewall and capture all DNS requests sent by the Apple TVs.
      This also worked pretty well, but I found tcpdump not particularly flexible, especially when you know how much more convenient tshark is, so I abandoned this method as well.
      Then I realised that all DNS requests are redirected by a NAT rule to the Unbound DNS resolver on the firewall, so its log would make a great source for all the FQDNs.
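To illustrate how a queried name can be pulled out of a resolver log line, here is a minimal sketch. The sample line is an assumption about what such a log entry looks like (it is not copied from a real log); the field position 11 is the one the first script uses with `cut`. The function name `extract_fqdn` is made up for this example.

```shell
#!/bin/sh
# Hypothetical log line; the real format depends on the Unbound and syslog settings.
SAMPLE='<30>1 2025-01-01T12:00:00.000000+01:00 fw.example unbound 1234 - - [1234:0] info: resolving www.netflix.com. AAAA IN'

# Print the queried name when the line is an A or AAAA "resolving" entry;
# print nothing for any other record type.
extract_fqdn() {
    printf '%s\n' "$1" |
        grep -E 'resolv.*[[:blank:]](A|AAAA)[[:blank:]]IN' |
        cut -d ' ' -f 11
}

extract_fqdn "${SAMPLE}"
```

If the log on your firewall is laid out differently, the field number passed to `cut` has to be adjusted accordingly.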

      A script can continuously read the Unbound log file, extract the FQDNs and save them in a per-provider file.
      Once I have the FQDNs, I only have to resolve them to get the IP addresses.
      It turned out that Amazon returns different IP addresses every time an A or AAAA request is received, so multiple requests are required to get all possible IP addresses.
      In the end I created two scripts. The first reads the Unbound log file for the required domain names; the second reads all domain names from the files created by the first script, periodically resolves them to IP addresses and provides those to pfBlockerNG.
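The idea of asking several times and keeping every distinct answer can be sketched as follows. `collect_unique` and `fake_resolver` are hypothetical names; the fake resolver only stands in for a real lookup so the example runs without network access.

```shell
#!/bin/sh
# collect_unique COUNT CMD [ARGS...]
# Run CMD COUNT times and print the sorted, de-duplicated union of its output.
collect_unique() {
    count=$1
    shift
    i=0
    while [ "${i}" -lt "${count}" ]; do
        "$@"
        i=$((i + 1))
    done | sort -u
}

# Stand-in for a resolver that rotates its answers (documentation addresses only).
CALLS=0
fake_resolver() {
    CALLS=$((CALLS + 1))
    echo "192.0.2.$((CALLS % 3 + 1))"
    echo "192.0.2.10"
}

collect_unique 5 fake_resolver
```

On the firewall the command would be something like `collect_unique 10 dig -t a +short "${FQDN}"`. Keep in mind that a caching resolver returns the same cached answer until its TTL expires, so spreading the queries over time (as the second script does with its periodic loop) is likely to find more addresses than a tight loop.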

      The first script:

      #!/usr/local/bin/bash
      #
      # name: extractMediaDomains
      # extract the FQDN's from the media providers and store them in a per provider file
      #
      #
      declare -A PROVIDER
      PROVIDER+=(["Netflix"]='(netflix\.com|nflxvideo\.net|nflxso\.net)')
      PROVIDER+=(["AmazonPrime"]='(amazon\.com|amazon\.co\.uk|amazonvideo\.com|media-amazon\.com|aiv-cdn\.net|aiv-delivery\.net|pv-cdn\.net|cloudfront\.net|llnwnd\.net)')
      PROVIDER+=(["NPOplus"]='(npo\.nl|npoplayer\.nl|bitmovin\.com|nepworldwide\.nl|scalia\.network|streamgate\.nl|2cnt\.net|npostart)')
      
      TMPDIR='/local/db/pfbng'
      LOCK="${TMPDIR}/lock"
      EXT='Domains.txt'
      PIDFILE="${TMPDIR}/${0##*/}.pid"
      
      [ -d ${TMPDIR} ] || mkdir -p ${TMPDIR}
      
      # Check if we are not already running.
      # kill -0 only tests whether the PID exists; it sends no signal.
      PID=''
      [ -f "${PIDFILE}" ] && PID=$(cat "${PIDFILE}")
      if [ -n "${PID}" ] && kill -0 "${PID}" 2>/dev/null; then
              echo "Already running"
              exit
      else
              echo $BASHPID > "${PIDFILE}"
      fi
      
      
      function process() {
              while IFS= read -r DOMAIN
              do
                      # Check for termination request
                      if [ -f ${TMPDIR}/terminate ]; then
                              rm -f ${PIDFILE}
                              exit 1
                      fi
      
                      if [ ! -f ${LOCK} ]; then
                              for INDEX in ${!PROVIDER[@]}
                              do
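                                      # Match the base domain itself or any subdomain; names in the log end in a dot
                                      FQDN="(^|\.)${PROVIDER[${INDEX}]}\.$"
                                      # -x -F compares whole lines literally, so a dot in the name is not a wildcard
                                      if echo "${DOMAIN}" | grep -qE "${FQDN}" && ! grep -qxF "${DOMAIN}" "${TMPDIR}/${INDEX}${EXT}"; then
                                              /usr/bin/logger "Adding domain ${DOMAIN} to ${INDEX}"
                                              echo "${DOMAIN}" >> "${TMPDIR}/${INDEX}${EXT}"
                                      fi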
                              done
                      fi
              done
      }
      
      
      # Create output files if they do not exist
      for INDEX in ${!PROVIDER[@]}
      do
              [ -f ${TMPDIR}/${INDEX}${EXT} ] || touch ${TMPDIR}/${INDEX}${EXT}
      done
      
      tail -F /var/log/resolver.log | grep -E ".*resolv.*[[:blank:]](A{1}|A{4})[[:blank:]]IN" | cut -d" " -f 11 |process
      

      The second script:

      
      #!/usr/local/bin/bash
      #
      # name: resolvMediaDomains
      # resolve the FQDN's provided by the extractMediaDomains script and feed them to pfBlockerNG
      #
      #
      PROVIDER=("Netflix" "AmazonPrime" "NPOplus")
      TMPDIR='/local/db/pfbng'
      TMPv4="${TMPDIR}/IPv4.tmp"
      TMPv6="${TMPDIR}/IPv6.tmp"
      LOCK="${TMPDIR}/lock"
      PFBDIR='/var/db/pfblockerng'
      EXT='Domains.txt'
      PIDFILE="${TMPDIR}/${0##*/}.pid"
      
      [ -d ${TMPDIR} ] || mkdir -p ${TMPDIR}
      
      # Check if we are not already running.
      # kill -0 only tests whether the PID exists; it sends no signal.
      PID=''
      [ -f "${PIDFILE}" ] && PID=$(cat "${PIDFILE}")
      if [ -n "${PID}" ] && kill -0 "${PID}" 2>/dev/null; then
              echo "Already running"
              exit
      else
              echo $BASHPID > "${PIDFILE}"
      fi
      
      
      IFS=$'\n'
      
      # Create output files if they do not exist
      for INDEX in ${PROVIDER[@]}
      do
              [ -f ${TMPDIR}/${INDEX}Domains.txt ] || touch ${TMPDIR}/${INDEX}Domains.txt
              [ -f ${TMPDIR}/${INDEX}IPv4.org ]    || touch ${TMPDIR}/${INDEX}IPv4.org
              [ -f ${TMPDIR}/${INDEX}IPv6.org ]    || touch ${TMPDIR}/${INDEX}IPv6.org
      done
      
      while true
      do
              touch ${LOCK}
              for INDEX in ${PROVIDER[@]}
              do
                      # Check for termination request
                      if [ -f ${TMPDIR}/terminate ]; then
                              rm -f ${LOCK}
                              rm -f ${PIDFILE}
                              exit
                      fi
      
                      # clean up temp files for the next iteration (truncate, so no blank line ends up in the lists)
                      : > ${TMPv4}
                      : > ${TMPv6}
      
                      # logger "Resolving ${INDEX} media hosts"
                      [ -f ${TMPDIR}/${INDEX}IPv4.txt ] && cp ${TMPDIR}/${INDEX}IPv4.txt ${TMPDIR}/${INDEX}IPv4.org
                      [ -f ${TMPDIR}/${INDEX}IPv6.txt ] && cp ${TMPDIR}/${INDEX}IPv6.txt ${TMPDIR}/${INDEX}IPv6.org
      
                      for DOMAIN in $(cat ${TMPDIR}/${INDEX}Domains.txt)
                      do
                              # ignore comments and empty lines
                              FQDN=$(echo ${DOMAIN} | sed 's/^[[:blank:]]*//;s/[[:blank:]]*$//')
                              if [[ -n ${FQDN} && ${FQDN::1} != '#' ]]; then
                                      # Process IPv4 (keep only address lines, not CNAME targets)
                                      dig -t a +short ${FQDN} |grep -E '^([0-9]{1,3}\.){3}[0-9]{1,3}$' >> ${TMPv4}
      
                                      # Process IPv6
                                      dig -t aaaa +short ${FQDN} |grep -E '^[0-9a-fA-F]{1,4}:' >> ${TMPv6}
                              fi
                      done
                      cat ${TMPDIR}/${INDEX}IPv4.org ${TMPv4} |sort -u > ${TMPDIR}/${INDEX}IPv4.txt
                      cat ${TMPDIR}/${INDEX}IPv6.org ${TMPv6} |sort -u > ${TMPDIR}/${INDEX}IPv6.txt
      
                      # Copy updated IP lists to pfBlockerNG
                      cp ${TMPDIR}/${INDEX}IPv?.txt ${PFBDIR}
      
                      rm -f ${TMPDIR}/${INDEX}IPv?.org
              done
      
              sleep 5
              rm -f ${LOCK}
              sleep 25
      done
      

      My initial experiment capturing the DNS requests and Client Hello packets had already provided me with the base domains of each provider.
      These are used in the first script to capture the required FQDNs.
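As an illustration of matching a logged name against a provider's base domains, here is a small sketch. The pattern is the Netflix one from the first script; the helper name `matches_provider` is hypothetical. The expression is anchored on both sides so that a name such as `notnetflix.com.` or `netflix.com.evil.example.` is not picked up.

```shell
#!/bin/sh
NETFLIX='(netflix\.com|nflxvideo\.net|nflxso\.net)'

# matches_provider NAME PATTERN
# Succeeds when NAME (fully qualified, with trailing dot) is one of the
# base domains in PATTERN or a subdomain of one of them.
matches_provider() {
    printf '%s\n' "$1" | grep -qE "(^|\.)$2\.\$"
}

for NAME in www.netflix.com. netflix.com. notnetflix.com.; do
    if matches_provider "${NAME}" "${NETFLIX}"; then
        echo "${NAME} belongs to Netflix"
    else
        echo "${NAME} does not match"
    fi
done
```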

      I created a startup script to easily start and stop both scripts:

      #!/usr/local/bin/bash
      #
      
      BASEDIR='/local'
      TMPDIR='/local/db/pfbng'
      EXTRACT_PIDFILE='extractMediaDomains.pid'
      RESOLVE_PIDFILE='resolvMediaDomains.pid'
      
      
      startme() {
              ${BASEDIR}/bin/extractMediaDomains &
              ${BASEDIR}/bin/resolvMediaDomains &
      }
      
      
      stopme() {
              # Graceful stop
              touch ${TMPDIR}/terminate
              COUNT=60
              while [ ${COUNT} != 0 ]
              do
                      if [ -f ${TMPDIR}/${EXTRACT_PIDFILE} ] || [ -f ${TMPDIR}/${RESOLVE_PIDFILE} ]; then
                              ((COUNT--))
                      else
                               rm -f ${TMPDIR}/terminate
                               # return (not exit) so that "restart" can go on to start the scripts
                               return
                      fi
      
                      echo -n .
                      sleep 2
              done
              echo "processes not stopping... going to kill"
              killme
      }
      
      killme() {
              # Kill whatever is still running, then clean up the PID files and the terminate flag
              for PIDFILE in ${EXTRACT_PIDFILE} ${RESOLVE_PIDFILE}
              do
                      [ -f ${TMPDIR}/${PIDFILE} ] || continue
                      kill -9 $(cat ${TMPDIR}/${PIDFILE}) 2>/dev/null
                      rm -f ${TMPDIR}/${PIDFILE}
              done
              rm -f ${TMPDIR}/terminate
      }
      
      statusme() {
              # Reset PID before each check so a missing PID file cannot reuse the previous value
              PID=''
              [ -f ${TMPDIR}/${EXTRACT_PIDFILE} ] && PID=$(cat ${TMPDIR}/${EXTRACT_PIDFILE})
              if [ -n "${PID}" ] && kill -0 "${PID}" 2>/dev/null; then
                      echo "extract process running"
              else
                      echo "extract process not running"
              fi
      
              PID=''
              [ -f ${TMPDIR}/${RESOLVE_PIDFILE} ] && PID=$(cat ${TMPDIR}/${RESOLVE_PIDFILE})
              if [ -n "${PID}" ] && kill -0 "${PID}" 2>/dev/null; then
                      echo "resolve process running"
              else
                      echo "resolve process not running"
              fi
      }
      
      case "$1" in
              start)          startme ;;
              stop)           stopme ;;
              kill)           killme ;;
              restart)        stopme; startme ;;
              status)         statusme ;;
              *) echo         "Usage: $0 start|stop|kill|restart|status" >&2
                                      exit 1
                                      ;;
      esac
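The PID-file check that all three scripts rely on can be written as one small helper. This is a sketch, not part of the original scripts; `is_running` is a hypothetical name. It uses `kill -0`, which sends no signal and only reports whether the process exists (and whether you are allowed to signal it, which as root is always the case).

```shell
#!/bin/sh
# is_running PIDFILE
# Succeeds when PIDFILE exists, is non-empty and names a live process.
is_running() {
    [ -f "$1" ] || return 1
    pid=$(cat "$1")
    [ -n "${pid}" ] && kill -0 "${pid}" 2>/dev/null
}

# Example: a PID file that holds the PID of this very shell.
DEMO_PIDFILE=$(mktemp)
echo $$ > "${DEMO_PIDFILE}"
if is_running "${DEMO_PIDFILE}"; then
    echo "process from ${DEMO_PIDFILE} is running"
fi
rm -f "${DEMO_PIDFILE}"
```

Compared with searching the output of `ps`, this avoids false matches where one PID is a prefix of another and does not depend on how `ps` pads its columns.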
      

      This setup is working pretty well. I hardly ever get locked out by a "VPN in use" message.
      If that happens, it is usually resolved once the IP match list has been updated with the latest IP addresses.

      The next version will probably be Python-based and will put timestamps on the FQDNs and IP addresses, so that they can be removed when they are no longer periodically refreshed by a capture or an address lookup.
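The timestamp idea could look roughly like this, sketched in shell to stay close to the scripts above. Everything here is an assumption made for illustration: the store format (one `epoch value` pair per line) and the function names `refresh` and `expire` do not come from an existing implementation.

```shell
#!/bin/sh
# refresh STORE NOW
# Read values (addresses or FQDNs) on stdin and record NOW as their
# last-seen time, keeping the newest timestamp for each value.
refresh() {
    store=$1 now=$2
    touch "${store}"
    { cat "${store}"; sed -e '/^$/d' -e "s/^/${now} /"; } |
        awk '{ if (!($2 in last) || ($1 + 0) > (last[$2] + 0)) last[$2] = $1 }
             END { for (v in last) print last[v], v }' |
        sort -k2 > "${store}.new" && mv "${store}.new" "${store}"
}

# expire STORE NOW MAX_AGE
# Drop every record that was last seen more than MAX_AGE seconds before NOW.
expire() {
    store=$1 now=$2 max_age=$3
    awk -v now="${now}" -v max="${max_age}" '(now - $1) <= max' "${store}" \
        > "${store}.new" && mv "${store}.new" "${store}"
}

# Example with fixed timestamps; in real use NOW would be $(date +%s).
STORE=$(mktemp)
printf '192.0.2.1\n192.0.2.2\n' | refresh "${STORE}" 1000
printf '192.0.2.2\n' | refresh "${STORE}" 2000
expire "${STORE}" 2000 500
cat "${STORE}"
```

The list handed to pfBlockerNG would then be the second column of the store, for example `cut -d ' ' -f 2 "${STORE}"`.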

      Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.