They changed their privacy / pricing model and as such I don't have access to their massive DNS dataset anymore, even after asking. Since 2022-01-02, I put the list on freeze while looking for an alternative, but couldn't find any. To make the list update again with the remaining DNS sources I have, I put the last version of the list generated with the Rapid7 dataset as an input for subdomains, that will now get resolved with MassDNS.
# Geoffrey Frogeye's block list of first-party trackers

## What's a first-party tracker?

A tracker is a script put on many websites to gather information about the visitor.
They can be used for multiple reasons: statistics, risk management, marketing, ad serving…
In any case, they are a threat to Internet users' privacy and many may want to block them.

Traditionally, trackers are served from a third party.
For example, `website1.com` and `website2.com` both load their tracking script from `https://trackercompany.com/trackerscript.js`.
In order to block those, one can simply block the hostname `trackercompany.com`, which is what most ad blockers do.

However, to circumvent this block, tracker companies made the websites using them load trackers from `somestring.website1.com`.
The latter is a DNS redirection to `website1.trackercompany.com`, or directly to an IP address belonging to the tracking company.

Those are called first-party trackers.
On top of the aforementioned privacy issues, they also cause security issues, as websites usually trust those scripts more.
For more information, learn about [Content Security Policy](https://developer.mozilla.org/en-US/docs/Web/HTTP/CSP), [same-origin policy](https://developer.mozilla.org/en-US/docs/Web/Security/Same-origin_policy) and [Cross-Origin Resource Sharing](https://enable-cors.org/).

In order to block those trackers, ad blockers would need to block every subdomain pointing to anything under `trackercompany.com` or to their network.
Unfortunately, most don't support those blocking methods as they are not DNS-aware, i.e. they only see `somestring.website1.com`.

This list is an inventory of every `somestring.website1.com` found, to allow non-DNS-aware ad blockers to still block first-party trackers.
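
To make the difference between the two kinds of blockers concrete, here is a minimal Python sketch. It is only an illustration, not the software used to build this list: the DNS records are hard-coded (a real blocker would query a resolver), and the names `CNAME_RECORDS`, `BLOCKED_DOMAINS`, `naive_block`, `dns_aware_block` and `PRECOMPUTED` are hypothetical.

```python
# Hypothetical DNS data, hard-coded so the sketch runs offline.
# A real DNS-aware blocker would ask a resolver instead.
CNAME_RECORDS = {
    "somestring.website1.com": "website1.trackercompany.com",
}

# Hypothetical rule of a classic block list.
BLOCKED_DOMAINS = {"trackercompany.com"}


def is_under(hostname, domain):
    """True if hostname is the domain itself or one of its subdomains."""
    return hostname == domain or hostname.endswith("." + domain)


def naive_block(hostname):
    """Non-DNS-aware check: only the requested hostname is examined."""
    return any(is_under(hostname, d) for d in BLOCKED_DOMAINS)


def dns_aware_block(hostname, max_depth=10):
    """DNS-aware check: follow the CNAME chain and test every name on it."""
    current = hostname
    for _ in range(max_depth):
        if naive_block(current):
            return True
        if current not in CNAME_RECORDS:
            return False
        current = CNAME_RECORDS[current]
    return False


# A flat inventory computed ahead of time lets a non-DNS-aware blocker
# catch the same hostnames with a plain exact-match lookup.
PRECOMPUTED = {h for h in CNAME_RECORDS if dns_aware_block(h)}
```

In this sketch, `naive_block("somestring.website1.com")` misses the tracker while `dns_aware_block` catches it; checking membership in `PRECOMPUTED` gives the non-DNS-aware blocker the same result without any DNS lookup.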

### Learn more

- [CNAME Cloaking, the dangerous disguise of third-party trackers](https://medium.com/nextdns/cname-cloaking-the-dangerous-disguise-of-third-party-trackers-195205dc522a) from NextDNS
- [Trackers first-party](https://blog.imirhil.fr/2019/11/13/first-party-tracker.html) from Aeris, in French
- [uBlock Origin issue](https://github.com/uBlockOrigin/uBlock-issues/issues/780)
- [CNAME Cloaking and Bounce Tracking Defense](https://webkit.org/blog/11338/cname-cloaking-and-bounce-tracking-defense/) on WebKit's blog
- [Characterizing CNAME cloaking-based tracking](https://blog.apnic.net/2020/08/04/characterizing-cname-cloaking-based-tracking/) on APNIC's website
- [Characterizing CNAME Cloaking-Based Tracking on the Web](https://tma.ifip.org/2020/wp-content/uploads/sites/9/2020/06/tma2020-camera-paper66.pdf), a research paper from Sokendai and ANSSI

## List variants

### First-party trackers

**Recommended for hosts-file-based ad blockers, such as [Pi-hole](https://pi-hole.net/) (before v5.0, which introduced CNAME blocking).**
**Recommended for Android ad blockers as applications, such as [Blokada](https://blokada.org/).**

- Hosts file: <https://hostfiles.frogeye.fr/firstparty-trackers-hosts.txt>
- Raw list: <https://hostfiles.frogeye.fr/firstparty-trackers.txt>

This list contains every hostname redirecting to [a hand-picked list of first-party trackers](https://git.frogeye.fr/geoffrey/eulaurarien/src/branch/master/rules/first-party.list).
It should be safe from false positives.
It also contains all tracking hostnames under company domains (e.g. `website1.trackercompany.com`),
useful for ad blockers that don't support mass regex blocking,
while still preventing fallback to third-party trackers.
Don't be afraid of the size of the list, as this is due to the nature of first-party trackers: a single tracker generates at least one hostname per client (typically two).
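
For readers wondering how the two formats relate, the following Python sketch turns a raw list into hosts-file entries. It rests on assumptions that are not confirmed here: that the raw list holds one hostname per line, that comment lines start with `#`, and that the sink address is `0.0.0.0`. The function name `to_hosts_lines` and the sample data are hypothetical.

```python
def to_hosts_lines(raw_text, sink_ip="0.0.0.0"):
    """Turn a one-hostname-per-line list into hosts-file entries.

    The sink address and the comment convention are assumptions made
    for this sketch, not a description of the published files.
    """
    lines = []
    for line in raw_text.splitlines():
        hostname = line.strip()
        # Skip blank lines and comments (assumed to start with '#').
        if not hostname or hostname.startswith("#"):
            continue
        lines.append(f"{sink_ip} {hostname}")
    return lines


# Hypothetical excerpt of a raw list.
SAMPLE_RAW = """# example only
somestring.website1.com
website1.trackercompany.com
"""

HOSTS_LINES = to_hosts_lines(SAMPLE_RAW)
```

Since each hostname becomes one entry, the hosts file grows in step with the number of client hostnames, which is why the list is long.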

### First-party only trackers

**Recommended for ad blockers as web browser extensions, such as [uBlock Origin](https://ublockorigin.com/) (before v1.25.0, or on Chromium-based browsers, as v1.25.0 introduced CNAME uncloaking for Firefox).**

- Hosts file: <https://hostfiles.frogeye.fr/firstparty-only-trackers-hosts.txt>
- Raw list: <https://hostfiles.frogeye.fr/firstparty-only-trackers.txt>

This is the same list as above, but without the hostnames under the tracking company domains (e.g. `website1.trackercompany.com`).
This reduces the size of the list for ad blockers that already block those third-party trackers through their support of regex blocking.
Use in conjunction with other block lists used in regex mode, such as [Peter Lowe's](https://pgl.yoyo.org/adservers/).
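
The relation between the full list and its "only" variant can be sketched in Python as a suffix filter. This is an illustration under assumptions, not the actual generation code: `TRACKER_DOMAINS`, `FULL_LIST` and `only_first_party` are hypothetical names with made-up sample data.

```python
# Hypothetical tracking company domains.
TRACKER_DOMAINS = ["trackercompany.com"]

# Hypothetical excerpt of the full list.
FULL_LIST = [
    "somestring.website1.com",
    "website1.trackercompany.com",
    "otherstring.website2.com",
]


def is_under(hostname, domain):
    """True if hostname is the domain itself or one of its subdomains.

    The leading dot avoids matching unrelated names that merely end
    with the same characters, such as 'nottrackercompany.com'.
    """
    return hostname == domain or hostname.endswith("." + domain)


def only_first_party(hostnames, tracker_domains):
    """Drop every hostname that sits under a tracking company domain."""
    return [
        h for h in hostnames
        if not any(is_under(h, d) for d in tracker_domains)
    ]


ONLY_LIST = only_first_party(FULL_LIST, TRACKER_DOMAINS)
```

The hostnames removed this way are the ones a regex-capable blocker already covers with a single rule on `trackercompany.com`.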

### Multi-party trackers

- Hosts file: <https://hostfiles.frogeye.fr/multiparty-trackers-hosts.txt>
- Raw list: <https://hostfiles.frogeye.fr/multiparty-trackers.txt>

As first-party trackers usually evolve from third-party trackers, this list contains every hostname redirecting to trackers found in existing lists of third-party trackers (see next section).
Since the latter were not designed with first-party trackers in mind, they are likely to contain false positives.
On the other hand, they might protect against first-party trackers that we're not aware of or have not yet confirmed.

#### Source of third-party trackers

- [EasyPrivacy](https://easylist.to/easylist/easyprivacy.txt)
- [AdGuard](https://github.com/AdguardTeam/AdguardFilters)

(Yes, there are only two for now. Many of the existing ones cause a lot of false positives.)

### Multi-party only trackers

- Hosts file: <https://hostfiles.frogeye.fr/multiparty-only-trackers-hosts.txt>
- Raw list: <https://hostfiles.frogeye.fr/multiparty-only-trackers.txt>

This is the same list as above, but without the hostnames under the tracking company domains (e.g. `website1.trackercompany.com`).
This reduces the size of the list for ad blockers that already block those third-party trackers through their support of regex blocking.
Use in conjunction with other block lists used in regex mode, such as the ones in the previous section.

## Meta

In case of false positives/negatives, or for any other question, contact me the way you like: <https://geoffrey.frogeye.fr>

The software used to generate this list is available here: <https://git.frogeye.fr/geoffrey/eulaurarien>

## Acknowledgements

Some of the first-party trackers included in this list have been found by:

- [Aeris](https://imirhil.fr/)
- NextDNS and [their blocklist](https://github.com/nextdns/cname-cloaking-blocklist)'s contributors
- Yuki2718 from [Wilders Security Forums](https://www.wilderssecurity.com/threads/ublock-a-lean-and-fast-blocker.365273/page-168#post-2880361)
- Ha Dao, Johan Mazel, and Kensuke Fukuda, ["Characterizing CNAME Cloaking-Based Tracking on the Web", Proceedings of IFIP/IEEE Traffic Measurement Analysis Conference (TMA), 9 pages, 2020.](https://tma.ifip.org/2020/wp-content/uploads/sites/9/2020/06/tma2020-camera-paper66.pdf)
- AdGuard and [their blocklist](https://github.com/AdguardTeam/cname-trackers)'s contributors

The list was generated using data from:

- [Cisco Umbrella Popularity List](http://s3-us-west-1.amazonaws.com/umbrella-static/index.html)
- [Public DNS Server List](https://public-dns.info/)

Similar projects:

- [NextDNS blocklist](https://github.com/nextdns/cname-cloaking-blocklist): for DNS-aware ad blockers
- [Stefan Froberg's lists](https://www.orwell1984.today/cname/): subset of those lists grouped by tracker
- [AdGuard blocklist](https://github.com/AdguardTeam/cname-trackers): same thing with a bigger scope, maintained by a bigger team