2019-11-10 18:14:25 +01:00
# eulaurarien

Generates a host list of first-party trackers for ad-blocking.

**DISCLAIMER:** I'm by no means an expert on this subject, so my vocabulary or other details might be wrong. Use at your own risk.
## What's a first-party tracker?

Traditionally, websites load tracker scripts directly from the tracker's own host.
For example, `website1.com` and `website2.com` both load `https://trackercompany.com/trackerscript.js` to track their users.
To block those, one can simply block the host `trackercompany.com`.

However, to circumvent this easy block, tracker companies made the websites using them load trackers from `somethingirelevant.website1.com`.
That subdomain is a DNS redirection to `website1.trackercompany.com`, which points directly to a server serving the tracking script.
Those are the first-party trackers.

Blocking `trackercompany.com` doesn't work any more, and blocking `*.trackercompany.com` isn't really possible since:

1. Most ad-blockers don't support wildcards
2. It's a DNS redirection, meaning that most ad-blockers will only see `somethingirelevant.website1.com`

So the only solution is to block every known `somethingirelevant.website1.com`-like subdomain, which is a lot.
That's where this script comes in: it generates a list of such subdomains.
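
The redirection problem described above can be illustrated with a small offline sketch. The DNS data here is a made-up dictionary, and the helper names `follow_cnames` and `naive_block` are hypothetical, not part of this project. It only shows why a blocker that compares the requested hostname misses the tracker, while looking at the redirection target reveals it.

```python
# Hypothetical DNS data for illustration only: each key is a hostname and
# each value is the name it redirects to.
FAKE_CNAME_RECORDS = {
    "somethingirelevant.website1.com": "website1.trackercompany.com",
}


def follow_cnames(hostname, records, max_hops=10):
    """Return the chain of names reached by following redirections."""
    chain = [hostname]
    # max_hops bounds the loop in case the records contain a cycle.
    while chain[-1] in records and len(chain) <= max_hops:
        chain.append(records[chain[-1]])
    return chain


def naive_block(hostname, blocked_hosts):
    """What a plain host-list blocker does: compare only the requested name."""
    return hostname in blocked_hosts


requested = "somethingirelevant.website1.com"
chain = follow_cnames(requested, FAKE_CNAME_RECORDS)

# The requested name itself is not on the block list...
print(naive_block(requested, {"trackercompany.com"}))
# ...but the redirection chain ends at the tracker's domain.
print(any(name.endswith(".trackercompany.com") for name in chain))
```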
## How does this script work?

It takes as input a list of websites with trackers included.
So far, this list is manually generated from the client lists of such first-party trackers
(later we should use a general list of websites to be more exhaustive).

It opens each of those websites (just the homepage) in a web browser and records the domains of the network requests the page makes.
It then finds the DNS redirections of those domains and compares them with regexes of known tracking domains.
It finally outputs the matching ones.
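
The matching step could be sketched roughly as follows. This is not the project's actual code: the function name `find_first_party_trackers`, the pattern, and the dictionary-backed resolver are assumptions made so the example runs offline. In a real run, the resolver would perform an actual DNS lookup (for instance through dnspython), and the request domains would come from the browser session.

```python
import re

# Hypothetical pattern; the real ones live in `regexes.py`.
TRACKER_PATTERNS = [re.compile(r"^.+\.trackercompany\.com\.?$")]


def find_first_party_trackers(request_domains, resolve_cname, patterns):
    """Yield requested domains whose redirection target matches a pattern.

    `resolve_cname` is any callable taking a hostname and returning the
    name it redirects to as a string, or None when there is no redirection.
    """
    for domain in sorted(set(request_domains)):
        target = resolve_cname(domain)
        if target is None:
            continue
        if any(pattern.match(target) for pattern in patterns):
            yield domain


# Stand-in resolver backed by a dict, so the example needs no network.
fake_dns = {
    "somethingirelevant.website1.com": "website1.trackercompany.com.",
    "cdn.website1.com": "website1.somecdn.example.",
}
seen = ["www.website1.com", "cdn.website1.com", "somethingirelevant.website1.com"]

matches = list(find_first_party_trackers(seen, fake_dns.get, TRACKER_PATTERNS))
print(matches)
```

Passing the resolver in as a parameter keeps the matching logic testable without a browser or DNS access.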
## Requirements

These are only needed to build the list yourself; an already-built list is available in the releases.

- Bash
- Python 3.4+
- Firefox
- Selenium
- seleniumwire
- dnspython
- [progressbar2](https://pypi.org/project/progressbar2/)

And then just run `eulaurarien.sh`.
## Contributing

### Adding websites

Just add them to `websites.list`.

### Adding first-party tracker regexes

Just add them to `regexes.py`.
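
Before adding a pattern, it may help to check it against a few sample names. The sketch below does not assume anything about the layout of `regexes.py`; the domain and the variable names `loose` and `strict` are hypothetical. It shows two common pitfalls: an unescaped `.` matches any character, and an unanchored pattern also matches unrelated domains that merely contain the tracker's name.

```python
import re

# Hypothetical tracker domain, used only to illustrate the pitfalls.
loose = re.compile(r".*trackercompany.com")
strict = re.compile(r"(.+\.)?" + re.escape("trackercompany.com") + r"\.?")

samples = [
    "website1.trackercompany.com",   # intended match
    "website1.trackercompany.com.",  # same name with a trailing root dot
    "nottrackercompany.com",         # different domain sharing a suffix
    "trackercompanyxcom.example",    # an unescaped "." would match the "x"
]

for name in samples:
    # Columns: name, loose result, strict result.
    print(name, bool(loose.match(name)), bool(strict.fullmatch(name)))
```

Here the loose pattern accepts all four samples, while the escaped pattern checked with `fullmatch` accepts only the first two.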