Geoffrey Frogeye's block list of first-party trackers
What's a first-party tracker?
A tracker is a script put on many websites to gather information about the visitor. Trackers can be used for multiple reasons: statistics, risk management, marketing, ad serving… In any case, they are a threat to Internet users' privacy and many may want to block them.
Traditionally, trackers are served from a third party.
For example, website1.com and website2.com both load their tracking script from https://trackercompany.com/trackerscript.js.
In order to block those, one can simply block the hostname trackercompany.com, which is what most ad blockers do.
However, to circumvent this block, tracker companies made the websites using them load trackers from somestring.website1.com.
The latter is a DNS redirection to website1.trackercompany.com, or directly to an IP address belonging to the tracking company.
Those are called first-party trackers. On top of the aforementioned privacy issues, they also cause security issues, as websites usually trust those scripts more. For more information, learn about Content Security Policy, same-origin policy and Cross-Origin Resource Sharing.
In order to block those trackers, ad blockers would need to block every subdomain pointing to anything under trackercompany.com or to their network.
Unfortunately, most don't support those blocking methods as they are not DNS-aware, i.e. they only see somestring.website1.com.
This list is an inventory of every somestring.website1.com found, so that non-DNS-aware ad blockers can still block first-party trackers.
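To make the difference concrete, here is a minimal sketch of what a DNS-aware check could look like. It does not query the network: the CNAME records are passed in as a plain dictionary, and the function name `tracker_behind` and the sample records are hypothetical, chosen only for illustration. It is not the code used to generate this list.

```python
from typing import Dict, Iterable, Optional


def is_under(hostname: str, domain: str) -> bool:
    """Return True if hostname equals domain or is a subdomain of it."""
    hostname = hostname.lower().rstrip(".")
    domain = domain.lower().rstrip(".")
    return hostname == domain or hostname.endswith("." + domain)


def tracker_behind(
    hostname: str,
    cnames: Dict[str, str],
    tracker_domains: Iterable[str],
    max_hops: int = 10,
) -> Optional[str]:
    """Follow the CNAME chain of hostname.

    Returns the tracker domain reached by the chain, or None.
    Assumes the keys of `cnames` are lowercase and have no trailing dot.
    """
    domains = tuple(tracker_domains)
    current = hostname.lower().rstrip(".")
    seen = set()
    for _ in range(max_hops + 1):
        for domain in domains:
            if is_under(current, domain):
                return domain
        # Stop on a CNAME loop or when the chain ends.
        if current in seen or current not in cnames:
            return None
        seen.add(current)
        current = cnames[current].lower().rstrip(".")
    return None


# Hypothetical records mirroring the example above.
records = {"somestring.website1.com": "website1.trackercompany.com"}
print(tracker_behind("somestring.website1.com", records, ["trackercompany.com"]))
print(tracker_behind("www.website1.com", records, ["trackercompany.com"]))
```

An ad blocker that is not DNS-aware only sees the first hostname and never gets to perform the lookup step, which is why the hostnames are listed explicitly here.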
Learn more
- CNAME Cloaking, the dangerous disguise of third-party trackers from NextDNS
- Trackers first-party from Aeris, in French
- uBlock Origin issue
- CNAME Cloaking and Bounce Tracking Defense on WebKit's blog
- Characterizing CNAME cloaking-based tracking on APNIC's website
- Characterizing CNAME Cloaking-Based Tracking on the Web is a research paper from Sokendai and ANSSI
List variants
First-party trackers
Recommended for hosts-file-based ad blockers, such as Pi-hole. Recommended for ad blockers running as Android applications, such as Blokada.
- Hosts file: https://hostfiles.frogeye.fr/firstparty-trackers-hosts.txt
- Raw list: https://hostfiles.frogeye.fr/firstparty-trackers.txt
This list contains every hostname redirecting to a hand-picked list of first-party trackers.
It should be safe from false positives.
It also contains all tracking hostnames under company domains (e.g. website1.trackercompany.com), useful for ad blockers that don't support mass regex blocking, while still preventing fallback to third-party trackers.
Don't be afraid of the size of the list, as this is due to the nature of first-party trackers: a single tracker generates at least one hostname per client (typically two).
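The two downloads differ only in presentation. As a rough sketch, a raw list could be turned into hosts-file lines as follows. This assumes the raw list holds one hostname per line with optional `#` comments, and that blocked names are pointed at `0.0.0.0`; both are assumptions for illustration, and the helper name `to_hosts_lines` is hypothetical.

```python
from typing import List


def to_hosts_lines(raw_list: str, sink_ip: str = "0.0.0.0") -> List[str]:
    """Convert a raw hostname list into hosts-file lines.

    Blank lines and '#' comments are skipped (assumed raw-list format).
    """
    lines = []
    for line in raw_list.splitlines():
        # Drop anything after a '#' and surrounding whitespace.
        hostname = line.split("#", 1)[0].strip()
        if hostname:
            lines.append(f"{sink_ip} {hostname}")
    return lines


raw = "# sample\nsomestring.website1.com\nwebsite1.trackercompany.com\n"
for entry in to_hosts_lines(raw):
    print(entry)
```

In practice there is no need to do this by hand, since both forms are published above.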
First-party only trackers
Recommended for ad blockers as web browser extensions, such as uBlock Origin.
- Hosts file: https://hostfiles.frogeye.fr/firstparty-only-trackers-hosts.txt
- Raw list: https://hostfiles.frogeye.fr/firstparty-only-trackers.txt
This is the same list as above, albeit not containing the hostnames under the tracking company domains (e.g. website1.trackercompany.com).
This reduces the size of the list for ad blockers that already block those third-party trackers through their support of regex blocking.
Use in conjunction with other block lists used in regex-mode, such as Peter Lowe's list.
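The relationship between the two variants can be sketched as a simple filter: drop every hostname that sits under a tracking company's own domain and keep the rest. The function name `first_party_only` and the sample data are hypothetical; this illustrates the idea and is not the generator's actual code.

```python
from typing import Iterable, List


def is_under(hostname: str, domain: str) -> bool:
    """Return True if hostname equals domain or is a subdomain of it."""
    hostname = hostname.lower().rstrip(".")
    domain = domain.lower().rstrip(".")
    return hostname == domain or hostname.endswith("." + domain)


def first_party_only(
    hostnames: Iterable[str], tracker_domains: Iterable[str]
) -> List[str]:
    """Keep only hostnames that are NOT under a tracking company domain."""
    domains = tuple(tracker_domains)
    return [
        name
        for name in hostnames
        if not any(is_under(name, domain) for domain in domains)
    ]


full_list = [
    "somestring.website1.com",
    "website1.trackercompany.com",
]
print(first_party_only(full_list, ["trackercompany.com"]))
```

The match is done on whole labels, so an unrelated name that merely ends with the same characters (for example nottrackercompany.com) is not treated as belonging to the tracker.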
Multi-party trackers
- Hosts file: https://hostfiles.frogeye.fr/multiparty-trackers-hosts.txt
- Raw list: https://hostfiles.frogeye.fr/multiparty-trackers.txt
As first-party trackers usually evolve from third-party trackers, this list contains every hostname redirecting to trackers found in existing lists of third-party trackers (see next section). Since the latter were not designed with first-party trackers in mind, they are likely to contain false positives. On the other hand, they might protect against first-party trackers that we're not aware of or have not yet confirmed.
Source of third-party trackers
(Yes, there are only two for now. Many of the existing ones cause a lot of false positives.)
Multi-party only trackers
- Hosts file: https://hostfiles.frogeye.fr/multiparty-only-trackers-hosts.txt
- Raw list: https://hostfiles.frogeye.fr/multiparty-only-trackers.txt
This is the same list as above, albeit not containing the hostnames under the tracking company domains (e.g. website1.trackercompany.com).
This reduces the size of the list for ad blockers that already block those third-party trackers through their support of regex blocking.
Use in conjunction with other block lists used in regex-mode, such as the ones in the previous section.
Meta
In case of false positives/negatives, or for any other question, contact me the way you like: https://geoffrey.frogeye.fr
The software used to generate this list is available here: https://git.frogeye.fr/geoffrey/eulaurarien
Acknowledgements
Some of the first-party trackers included in this list have been found by:
- Aeris
- NextDNS and their blocklist's contributors
- Yuki2718 from Wilders Security Forums
- Ha Dao, Johan Mazel, and Kensuke Fukuda, "Characterizing CNAME Cloaking-Based Tracking on the Web", Proceedings of IFIP/IEEE Traffic Measurement Analysis Conference (TMA), 9 pages, 2020.
- AdGuard and their blocklist's contributors
The list was generated using data from
- Rapid7 OpenData, who kindly provided a free account
- Cisco Umbrella Popularity List
- Public DNS Server List
Similar projects:
- NextDNS blocklist: for DNS-aware ad blockers
- Stefan Froberg's lists: subset of those lists grouped by tracker
- AdGuard blocklist: same thing with a bigger scope, maintained by a bigger team