Commit Graph

21 Commits

Author SHA1 Message Date
Geoffrey Frogeye cec96b7e50
Add Fukuda & co research paper to test suite 2020-12-06 22:13:05 +01:00
Geoffrey Frogeye 0cc18303fd
Re-import Rapid7 datasets when rules have been updated 2020-01-04 10:54:46 +01:00
Geoffrey Frogeye 808e36dde3
Improvements to subdomain collection
I use this for tracker identification so it's not perfect but still it's
a bit better.
2020-01-03 22:08:06 +01:00
Geoffrey Frogeye e93807142c
Explanations folder 2019-12-27 15:35:30 +01:00
Geoffrey Frogeye a4a908955a
Added index webpage 2019-12-27 15:21:33 +01:00
Geoffrey Frogeye d3b244f317
Forgot one dependency 2019-12-26 00:16:18 +01:00
Geoffrey Frogeye 2bcf6cbbf7
Added SINGLE_PROCESS environment variable 2019-12-25 15:15:49 +01:00
Geoffrey Frogeye b310ca2fc2
Clever pruning mechanism 2019-12-25 14:54:57 +01:00
Geoffrey Frogeye c65ae94892
Added ability to use Rapid7 API
Closes #11
2019-12-24 15:08:18 +01:00
Geoffrey Frogeye 7d1c1a1d54
Implement pruning 2019-12-21 19:38:20 +01:00
Geoffrey Frogeye 1a6e64da3d
Forgot numpy dependency 2019-12-20 21:08:21 +01:00
Geoffrey Frogeye 8b7e538677
Updated links
(could not bother guessing them)
2019-12-20 17:24:05 +01:00
Geoffrey Frogeye 38cf532854
Updated README
Split in two actually (program and list).

Closes #3

Also,
Closes #1
Because I forgot to do it earlier.
2019-12-20 17:15:39 +01:00
Geoffrey Frogeye e882e09b37
Added outdated documentation warning in README 2019-12-17 14:27:43 +01:00
Geoffrey Frogeye 7d01d016a5 Can now use AdBlock lists for tracking matching
It's not very performant by itself, especially since pyre2 isn't
maintained nor really compilable/installable anymore.

The performance seems to have decreased from 200 req/s to 0.2 req/s when
using 512 threads, and to 80 req/s when using 64 threads.
This might or might not be related, as the CPU doesn't seem to be the
bottleneck.

I will probably add support for host-based rules, matching the
subdomains of such hosts (as for now there doesn't seem to be any other
pattern for first-party trackers than subdomains, and this would be a
very broad improvement in both performance and compatibility with
existing lists), and convert the AdBlock lists to this format, only
keeping domain-only rules.
2019-11-15 08:57:31 +01:00
Geoffrey Frogeye 4e69bdbfc3 CI Test commit 2 2019-11-11 12:41:22 +01:00
Geoffrey Frogeye e0f28d41d2 Added public updated list link 2019-11-11 12:10:46 +01:00
Geoffrey Frogeye a0a2af281f Added possibility to add personal sources 2019-11-11 11:19:46 +01:00
Geoffrey Frogeye 2f1af3c850 Added progressbar and ETA 2019-11-10 21:59:06 +01:00
Geoffrey Frogeye d49a7803e9 Fixed typos 2019-11-10 18:29:16 +01:00
Geoffrey Frogeye 80b23e2d5c Initial commit 2019-11-10 18:14:25 +01:00