Resolving IP addresses from media providers
-
Being extremely privacy- and security-conscious, I route all my outbound connections through VPNs.
Most, if not all, media providers (Netflix, Amazon Prime, etc.) are allergic to VPN connections because they don't own the rights to offer every movie and TV series in every country.
One of the options I have is to use a VPN provider that is not blocked by them.
Since I'm also a big supporter of IPv6, this is pretty much impossible: the majority of VPN providers do not support IPv6.
In the end I decided to actively collect the IP addresses of the media providers and use them in pfBlockerNG IP match lists.
This way I can select the WAN gateway for this traffic instead of the default VPN gateway.
My initial process was pretty straightforward:
- Collect the FQDNs used by the media providers
- Resolve the FQDNs to IP addresses
- Add these IP addresses to pfBlockerNG match lists
- Create a firewall policy with the IP match list to select the correct gateway
Initially I used a TAP and a packet broker to capture the traffic from my Apple TVs and filter out the DNS requests and HTTPS client hello packets, which together reveal all the FQDNs.
It worked pretty well and provided a lot of information, but it is a cumbersome and laborious process, so it definitely wasn't the way forward.
My second approach was to run tcpdump on the firewall and capture all DNS queries sent by the Apple TVs.
This worked pretty well but I found tcpdump not particularly flexible, especially if you know tshark is much more convenient, so I abandoned this method as well.
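For reference, the tcpdump approach comes down to pulling the queried name out of each printed DNS query. The following is only a minimal sketch, assuming tcpdump's usual one-line text output for DNS queries; `extract_query_name` is a hypothetical helper, and the interface name and client address in the comment are made up.

```sh
#!/bin/sh
# Print the queried name from tcpdump DNS query lines read on stdin.
# Assumes lines ending in something like "12345+ A? www.example.com. (33)".
# Hypothetical use (interface and client IP are placeholders):
#   tcpdump -l -n -i em1 udp port 53 and src host 192.168.1.20 | extract_query_name
extract_query_name() {
    sed -n -E 's/.* (A|AAAA)\? ([^ ]+)\. \([0-9]+\)$/\2/p'
}
```

Only A and AAAA queries are printed; everything else on the wire is silently dropped by the `-n` and `p` combination.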
Then I realised that all DNS requests are redirected via NAT to the unbound DNS resolver on the firewall, so its log would make a great source for all FQDNs.
A script can continuously read the unbound log file, extract the FQDNs and save them in a per-provider file.
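The extraction step can be sketched without depending on a fixed field position in the log line. This is an illustration only: it assumes each relevant log line contains the queried name followed by the record type and `IN`, and `extract_fqdn` is a hypothetical helper name.

```sh
#!/bin/sh
# Print the name that precedes "A IN" or "AAAA IN" on each log line,
# regardless of how many syslog fields come before it.
extract_fqdn() {
    awk '{
        for (i = 2; i < NF; i++) {
            if (($i == "A" || $i == "AAAA") && $(i + 1) == "IN") {
                print $(i - 1)
                break
            }
        }
    }'
}
```

Fed from something like `tail -F /var/log/resolver.log | extract_fqdn`, this keeps working if the syslog prefix gains or loses a field.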
Having the FQDNs I only have to resolve them to get the IP addresses.
It turned out that Amazon provides different IP addresses every time an A or a quad A request is received. So, to get all possible IP addresses, multiple requests are required.
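Because each lookup may return a different set of addresses, the results have to be accumulated rather than overwritten. A minimal sketch of that idea, assuming one address per line; `merge_addresses` and the file name in the comment are hypothetical.

```sh
#!/bin/sh
# Merge newly resolved addresses (stdin) into an accumulated list file,
# keeping every address seen so far exactly once.
# Hypothetical use, repeated on every polling round:
#   dig -t a +short "$fqdn" | merge_addresses /tmp/AmazonPrimeIPv4.txt
merge_addresses() {
    list=$1
    tmp="${list}.new"
    [ -f "$list" ] || : > "$list"
    # Drop empty lines, then let sort read the old list plus stdin ("-").
    grep -v '^$' | sort -u "$list" - > "$tmp" && mv "$tmp" "$list"
}
```

Writing to a temporary file and renaming it means a reader of the list never sees a half-written file.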
In the end I created two scripts. The first reads the unbound log file and extracts the required domain names. The second reads all domain names from the files created by the first script, periodically resolves them to IP addresses and provides these to pfBlockerNG.
The first script:

    #!/usr/local/bin/bash
    #
    # name: extractMediaDomains
    # Extract the FQDNs used by the media providers and store them in a per-provider file.
    #

    declare -A PROVIDER
    PROVIDER+=(["Netflix"]='(netflix\.com|nflxvideo\.net|nflxso\.net)')
    PROVIDER+=(["AmazonPrime"]='(amazon\.com|amazon\.co\.uk|amazonvideo\.com|media-amazon\.com|aiv-cdn\.net|aiv-delivery\.net|pv-cdn\.net|cloudfront\.net|llnwnd\.net)')
    PROVIDER+=(["NPOplus"]='(npo\.nl|npoplayer\.nl|bitmovin\.com|nepworldwide\.nl|scalia\.network|streamgate\.nl|2cnt\.net|npostart)')

    TMPDIR='/local/db/pfbng'
    LOCK="${TMPDIR}/lock"
    EXT='Domains.txt'
    PIDFILE="${TMPDIR}/${0##*/}.pid"

    [ -d "${TMPDIR}" ] || mkdir -p "${TMPDIR}"

    # Check if we are not already running
    [ -f "${PIDFILE}" ] && PID=$(cat "${PIDFILE}")
    if [ -n "${PID}" ] && [ "$(ps | grep -c "^${PID}")" != 0 ]; then
        echo "Already running"
        exit
    else
        echo $BASHPID > "${PIDFILE}"
    fi

    function process() {
        while IFS= read -r DOMAIN
        do
            # Check for termination request
            if [ -f "${TMPDIR}/terminate" ]; then
                rm -f "${PIDFILE}"
                exit 1
            fi
            # Skip updates while the resolver script holds the lock
            if [ ! -f "${LOCK}" ]; then
                for INDEX in "${!PROVIDER[@]}"
                do
                    # Provider pattern anchored to the trailing dot of the logged name
                    FQDN="${PROVIDER[${INDEX}]}\.$"
                    # Add the domain if it matches the provider and is not listed yet.
                    # -xF compares whole lines literally, so a stored www.example.com.
                    # does not hide a new example.com.
                    if echo "${DOMAIN}" | grep -qE "${FQDN}" &&
                       ! grep -qxF -- "${DOMAIN}" "${TMPDIR}/${INDEX}${EXT}"; then
                        /usr/bin/logger "Adding domain ${DOMAIN} to ${INDEX}"
                        echo "${DOMAIN}" >> "${TMPDIR}/${INDEX}${EXT}"
                    fi
                done
            fi
        done
    }

    # Create output files if they do not exist
    for INDEX in "${!PROVIDER[@]}"
    do
        [ -f "${TMPDIR}/${INDEX}${EXT}" ] || touch "${TMPDIR}/${INDEX}${EXT}"
    done

    # Follow the resolver log, keep A and AAAA lookups, and pass the name (field 11) on
    tail -F /var/log/resolver.log | grep -E ".*resolv.*[[:blank:]](A{1}|A{4})[[:blank:]]IN" | cut -d" " -f 11 | process

The second script:

    #!/usr/local/bin/bash
    #
    # name: resolvMediaDomains
    # Resolve the FQDNs provided by the extractMediaDomains script and feed them to pfBlockerNG.
    #

    PROVIDER=("Netflix" "AmazonPrime" "NPOplus")
    TMPDIR='/local/db/pfbng'
    TMPv4="${TMPDIR}/IPv4.tmp"
    TMPv6="${TMPDIR}/IPv6.tmp"
    LOCK="${TMPDIR}/lock"
    PFBDIR='/var/db/pfblockerng'
    EXT='Domains.txt'
    PIDFILE="${TMPDIR}/${0##*/}.pid"

    [ -d "${TMPDIR}" ] || mkdir -p "${TMPDIR}"

    # Check if we are not already running
    [ -f "${PIDFILE}" ] && PID=$(cat "${PIDFILE}")
    if [ -n "${PID}" ] && [ "$(ps | grep -c "^${PID}")" != 0 ]; then
        echo "Already running"
        exit
    else
        echo $BASHPID > "${PIDFILE}"
    fi

    IFS=$'\n'

    # Create output files if they do not exist
    for INDEX in "${PROVIDER[@]}"
    do
        [ -f "${TMPDIR}/${INDEX}${EXT}" ] || touch "${TMPDIR}/${INDEX}${EXT}"
        [ -f "${TMPDIR}/${INDEX}IPv4.org" ] || touch "${TMPDIR}/${INDEX}IPv4.org"
        [ -f "${TMPDIR}/${INDEX}IPv6.org" ] || touch "${TMPDIR}/${INDEX}IPv6.org"
    done

    while true
    do
        touch "${LOCK}"
        for INDEX in "${PROVIDER[@]}"
        do
            # Check for termination request
            if [ -f "${TMPDIR}/terminate" ]; then
                rm -f "${LOCK}"
                rm -f "${PIDFILE}"
                exit
            fi

            # Clean up temp files for the next iteration (truncate without adding an empty line)
            : > "${TMPv4}"
            : > "${TMPv6}"

            # logger "Resolving ${INDEX} media hosts"
            [ -f "${TMPDIR}/${INDEX}IPv4.txt" ] && cp "${TMPDIR}/${INDEX}IPv4.txt" "${TMPDIR}/${INDEX}IPv4.org"
            [ -f "${TMPDIR}/${INDEX}IPv6.txt" ] && cp "${TMPDIR}/${INDEX}IPv6.txt" "${TMPDIR}/${INDEX}IPv6.org"

            for DOMAIN in $(cat "${TMPDIR}/${INDEX}${EXT}")
            do
                # Ignore comment and empty lines
                FQDN=$(echo "${DOMAIN}" | sed 's/^[[:blank:]]*//;s/[[:blank:]]*$//')
                if [[ -n ${FQDN} && ${FQDN::1} != '#' ]]
                then
                    # Process IPv4
                    dig -t a +short "${FQDN}" | grep -E '^[0-9]{1,3}\.' >> "${TMPv4}"
                    # Process IPv6
                    dig -t aaaa +short "${FQDN}" | grep -E '^[0-9a-fA-F]{1,4}:' >> "${TMPv6}"
                fi
            done

            # Merge the previous list with the new results
            cat "${TMPDIR}/${INDEX}IPv4.org" "${TMPv4}" | sort -n | uniq > "${TMPDIR}/${INDEX}IPv4.txt"
            cat "${TMPDIR}/${INDEX}IPv6.org" "${TMPv6}" | sort -n | uniq > "${TMPDIR}/${INDEX}IPv6.txt"

            # Copy updated IP lists to pfBlockerNG
            cp "${TMPDIR}/${INDEX}"IPv?.txt "${PFBDIR}"
            rm -f "${TMPDIR}/${INDEX}"IPv?.org
        done
        sleep 5
        rm -f "${LOCK}"
        sleep 25
    done

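`dig +short` prints CNAME targets as well as addresses when a name is an alias, which is why the output is filtered before it is stored. A somewhat stricter filter could look like the sketch below. It checks the shape of a line only and is not a full address validator; `only_ipv4` and `only_ipv6` are hypothetical helper names.

```sh
#!/bin/sh
# Keep only lines shaped like an IPv4 address: four dot-separated groups of 1-3 digits.
only_ipv4() {
    grep -E '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
}

# Keep only lines made of hex digits and colons that contain at least two colons.
only_ipv6() {
    grep -E '^[0-9a-fA-F:]+$' | grep -E ':.*:'
}
```

A host name can never pass `only_ipv6`, because dig prints names with dots and host names cannot contain colons.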
My initial experiment, capturing the DNS requests and client hello packets, had already provided me with the base domains (netflix.com, nflxvideo.net and so on).
These are used in the first script to capture the required FQDNs.
I created a start-up script to easily start and stop both scripts:

    #!/usr/local/bin/bash
    #
    BASEDIR='/local'
    TMPDIR='/local/db/pfbng'
    EXTRACT_PIDFILE='extractMediaDomains.pid'
    RESOLVE_PIDFILE='resolvMediaDomains.pid'

    startme() {
        ${BASEDIR}/bin/extractMediaDomains &
        ${BASEDIR}/bin/resolvMediaDomains &
    }

    stopme() {
        # Graceful stop: both scripts remove their PID file when they see the terminate file
        touch ${TMPDIR}/terminate
        COUNT=60
        while [ ${COUNT} != 0 ]
        do
            if [ -f ${TMPDIR}/${EXTRACT_PIDFILE} ] || [ -f ${TMPDIR}/${RESOLVE_PIDFILE} ]; then
                ((COUNT--))
            else
                rm -f ${TMPDIR}/terminate
                # return rather than exit, so that "restart" can continue with startme
                return
            fi
            echo -n .
            sleep 2
        done
        echo "processes not stopping... going to kill"
        killme
    }

    killme() {
        kill -9 $(cat ${TMPDIR}/${EXTRACT_PIDFILE}) && rm -f ${TMPDIR}/${EXTRACT_PIDFILE}
        kill -9 $(cat ${TMPDIR}/${RESOLVE_PIDFILE}) && rm -f ${TMPDIR}/${RESOLVE_PIDFILE}
        [ -f ${TMPDIR}/terminate ] && rm -f ${TMPDIR}/terminate
    }

    statusme() {
        PID=''
        [ -f ${TMPDIR}/${EXTRACT_PIDFILE} ] && PID=$(cat ${TMPDIR}/${EXTRACT_PIDFILE})
        if [ -n "${PID}" ] && [ "$(ps | grep -c "^${PID}")" != 0 ]; then
            echo "extract process running"
        else
            echo "extract process not running"
        fi
        # Reset PID so a missing PID file does not reuse the value from the check above
        PID=''
        [ -f ${TMPDIR}/${RESOLVE_PIDFILE} ] && PID=$(cat ${TMPDIR}/${RESOLVE_PIDFILE})
        if [ -n "${PID}" ] && [ "$(ps | grep -c "^${PID}")" != 0 ]; then
            echo "resolve process running"
        else
            echo "resolve process not running"
        fi
    }

    case "$1" in
        start)   startme ;;
        stop)    stopme ;;
        kill)    killme ;;
        restart) stopme; startme ;;
        status)  statusme ;;
        *)       echo "Usage: $0 start|stop|kill|restart|status" >&2
                 exit 1
                 ;;
    esac

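The scripts decide whether a process is alive by grepping the output of `ps`. An alternative that does not depend on how `ps` formats its columns is `kill -0`, sketched below. This is a suggestion rather than what the scripts do; `is_running` is a hypothetical helper, and it reports "not running" for processes the caller is not allowed to signal.

```sh
#!/bin/sh
# Succeed if the PID stored in the given PID file belongs to a live process.
is_running() {
    pidfile=$1
    # A missing or empty PID file means nothing is running.
    [ -s "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    # kill -0 sends no signal; it only reports whether the process can be signalled.
    kill -0 "$pid" 2>/dev/null
}
```

With this helper the status check becomes `is_running "${TMPDIR}/${EXTRACT_PIDFILE}" && echo "extract process running"`.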
This setup works pretty well. I hardly ever get locked out with a "VPN in use" message.
If that happens, it is usually resolved after the IP match list is updated with the latest IP addresses.
The next version will probably be Python-based and will have timestamps on the FQDNs and IP addresses, so they can be removed when they are no longer periodically refreshed by a capture or an address resolution.
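The timestamp idea can already be tried out in shell, the language of the current scripts. The sketch below is an illustration of the expiry mechanism only, not the planned implementation: the `<epoch-seconds> <address>` line format and the helper names `touch_entry` and `prune_entries` are assumptions.

```sh
#!/bin/sh
# Each line of the list is "<epoch-seconds> <address>".

# Record an address as seen at time "now", replacing any older entry for it.
touch_entry() {
    list=$1 addr=$2 now=$3
    tmp="${list}.new"
    [ -f "$list" ] || : > "$list"
    {
        # Keep all other entries; the appended "" forces a string comparison.
        awk -v a="$addr" '($2 "") != (a "")' "$list"
        printf '%s %s\n' "$now" "$addr"
    } > "$tmp" && mv "$tmp" "$list"
}

# Drop entries that were not refreshed within max_age seconds.
prune_entries() {
    list=$1 max_age=$2 now=$3
    tmp="${list}.new"
    awk -v now="$now" -v max="$max_age" 'now - $1 <= max' "$list" > "$tmp" &&
        mv "$tmp" "$list"
}
```

In practice `now` would come from `date +%s`, and the list handed to pfBlockerNG would be the second column only, for example via `cut -d" " -f 2`.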