
Added public updated list link

newworkflow_parseropti v1.1.0
Geoffrey Frogeye 2 years ago
parent
commit
e0f28d41d2
  1. README.md (12 changes)
  2. filter_subdomains.sh (14 changes)

12
README.md

@@ -2,6 +2,8 @@
Generates a host list of first-party trackers for ad-blocking.
The latest list is available here: <https://hostfiles.frogeye.fr/firstparty-trackers-hosts.txt>
**DISCLAIMER:** I'm in no way an expert on this subject, so my vocabulary or other details might be wrong. Use at your own risk.
## What's a first-party tracker?
@@ -51,6 +53,10 @@ Just to build the list, you can find an already-built list in the releases.
## Usage
This is only if you want to build the list yourself.
If you just want to use the list, the latest build is available here: <https://hostfiles.frogeye.fr/firstparty-trackers-hosts.txt>
It was built using additional sources not included in this repository for privacy reasons.
### Add personal sources
The list of websites provided in this script is by no means exhaustive,
@@ -58,14 +64,16 @@ so adding your own browsing history will help create a better list.
Here are reference commands for possible sources:
- **Pi-hole**: `sqlite3 /etc/pihole-FTL.db "select distinct domain from queries" > /path/to/eulaurarien/subdomains/my-pihole.custom.list`
- **Firefox**: `cp ~/.mozilla/firefox/<your_profile>.default/places.sqlite temp; sqlite3 temp "select distinct rev_host from moz_places" | rev | sed 's|^\.||' > /path/to/eulaurarien/subdomains/my-firefox.custom.list`
- **Firefox**: `cp ~/.mozilla/firefox/<your_profile>.default/places.sqlite temp; sqlite3 temp "select distinct rev_host from moz_places" | rev | sed 's|^\.||' > /path/to/eulaurarien/subdomains/my-firefox.custom.list; rm temp`
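The `rev | sed 's|^\.||'` part of the Firefox command can be tried in isolation. Firefox stores `rev_host` as the hostname reversed character by character, with a trailing dot, so reversing each line and stripping the leading dot gives the plain hostname back. The following is a minimal sketch of that step only; the function name `unreverse_hosts` and the sample values are made up for illustration and are not part of the repository.

```shell
#!/bin/sh
# Hypothetical helper (not in the repository): turn rev_host-style
# values back into plain hostnames.
unreverse_hosts() {
    # rev flips each line character by character; sed then removes the
    # leading dot that comes from the trailing dot stored in rev_host.
    rev | sed 's|^\.||'
}

# Sample input shaped like rev_host values (invented for this example).
printf '%s\n' 'moc.elpmaxe.www.' 'gro.elpmaxe.' | unreverse_hosts
```

With the sample input above, this prints `www.example.com` and `example.org`, one per line.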
### Collect subdomains from websites
This step is optional if you already added personal sources.
Just run `collect_subdomain.sh`.
This is a long step, and might be memory-intensive from time to time.
This step is optional if you already added personal sources.
Alternatively, you can just download the list of subdomains used to generate the official block list here: <https://hostfiles.frogeye.fr/from_websites.cache.list> (put it in the `subdomains` folder).
### Extract tracking domains
Make sure your system is configured with a DNS server without limitation.

14
filter_subdomains.sh

@@ -9,8 +9,18 @@ sort -u temp/all_toblock.list > dist/firstparty-trackers.txt
# Format the blocklist so it can be used as a hostlist
(
echo "# First-party trackers"
echo "# List generated on $(date -Isec) by eulaurarien $(git describe --tags --dirty)"
echo "# First-party trackers host list"
echo "#"
echo "# About first-party trackers: https://git.frogeye.fr/geoffrey/eulaurarien#whats-a-first-party-tracker"
echo "# Source code: https://git.frogeye.fr/geoffrey/eulaurarien"
echo "# Latest version of this list: https://hostfiles.frogeye.fr/firstparty-trackers-hosts.txt"
echo "#"
echo "# Generation date: $(date -Isec)"
echo "# Generation version: eulaurarien $(git describe --tags --dirty)"
echo "# Number of source websites: $(wc -l temp/all_websites.list | cut -d' ' -f1)"
echo "# Number of source subdomains: $(wc -l temp/all_subdomains.list | cut -d' ' -f1)"
echo "# Number of blocked subdomains: $(wc -l dist/firstparty-trackers.txt | cut -d' ' -f1)"
echo
cat dist/firstparty-trackers.txt | while read host;
do
echo "0.0.0.0 $host"
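The header-plus-hosts formatting shown in this hunk can be sketched as a standalone function. This is an illustrative variant under stated assumptions, not the script from the commit: the function name `format_hostlist` and the demo file are hypothetical. It reads the count with `wc -l < file` rather than `wc -l file | cut -d' ' -f1`, since some `wc` implementations pad their output with leading spaces, and it uses `while IFS= read -r` with a redirect instead of piping from `cat`.

```shell
#!/bin/sh
# Hypothetical helper (not the script from the commit): print a
# hosts-format list from a file containing one hostname per line.
format_hostlist() {
    list="$1"
    # Redirecting into wc prints only the count, without the file name;
    # tr removes any padding spaces some wc implementations add.
    count=$(wc -l < "$list" | tr -d ' ')
    echo "# Number of blocked subdomains: $count"
    echo
    # IFS= and -r keep each line exactly as written in the input file.
    while IFS= read -r host
    do
        echo "0.0.0.0 $host"
    done < "$list"
}

# Usage example with a throwaway input file (hostnames are invented).
demo=$(mktemp)
printf '%s\n' 'tracker.example.com' 'metrics.example.org' > "$demo"
format_hostlist "$demo"
rm -f "$demo"
```

Each hostname is mapped to `0.0.0.0`, as in the original loop, so the output can be used directly as a hosts file.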
