15 Commits

Author SHA1 Message Date
Geoffrey Frogeye 2b97ee4cb9
Better list output 2019-12-27 21:46:57 +01:00
Geoffrey Frogeye 171fa93873
Force pv output
Even if redirected to a file.
Allows seeing progress when run from cron or similar.
2019-12-26 15:38:56 +01:00
Geoffrey Frogeye 883942ba55
Allow custom massdns path 2019-12-26 00:33:23 +01:00
Geoffrey Frogeye 2bcf6cbbf7
Added SINGLE_PROCESS environment variable 2019-12-25 15:15:49 +01:00
Geoffrey Frogeye b310ca2fc2
Clever pruning mechanism 2019-12-25 14:54:57 +01:00
Geoffrey Frogeye c65ae94892
Added ability to use Rapid7 API
Closes #11
2019-12-24 15:08:18 +01:00
Geoffrey Frogeye 7d1c1a1d54
Implement pruning 2019-12-21 19:38:20 +01:00
Geoffrey Frogeye aca5023c3f
Fixed scripting around 2019-12-18 13:01:32 +01:00
Geoffrey Frogeye 5023b85d7c
Added intermediate representation for DNS datasets
It's just CSV.
The DNS records in the datasets are not ordered consistently,
so we need to parse them completely.
It seems that converting to an IR before sending data to ./feed_dns.py
through a pipe is faster than decoding the JSON in ./feed_dns.py.
This will also reduce the storage of the resolved subdomains by
about 15% (compressed).
2019-12-13 21:59:35 +01:00
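
The conversion described in the commit above could look roughly like the following. This is a minimal sketch, not the repository's actual converter: it assumes the dataset is JSON Lines with `type`, `name` and `value` keys, and the function name `json_to_csv` and the CSV column order are hypothetical. Reading fields by key rather than by position is what makes the inconsistent ordering harmless.

```python
import csv
import io
import json


def json_to_csv(json_lines, out):
    """Convert JSON Lines DNS records into a compact CSV stream.

    Assumption: each record has "type", "name" and "value" keys.
    Other keys (such as a timestamp) are dropped, which is where
    any storage saving would come from.
    """
    writer = csv.writer(out, lineterminator="\n")
    for line in json_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Access by key, so the field order inside each record is irrelevant.
        writer.writerow((record["type"], record["name"], record["value"]))


# Small in-memory demonstration with two differently ordered records.
sample = [
    '{"name": "www.example.com", "type": "cname", "value": "example.com"}',
    '{"value": "192.0.2.1", "type": "a", "name": "example.com"}',
]
buffer = io.StringIO()
json_to_csv(sample, buffer)
```

In a pipeline, the same function could be called with `sys.stdin` and `sys.stdout`, so that the consumer only has to split on commas instead of decoding JSON.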
Geoffrey Frogeye 8d94b80fd0
Integrated DNS resolving to workflow
Since the bigger datasets are only updated once a month,
this might help for quick updates.
2019-12-13 13:38:23 +01:00
Geoffrey Frogeye 2b0a723c30
Fix log in scripts
Closes #8
2019-12-07 18:45:48 +01:00
Geoffrey Frogeye fe5f0c6c05
Added more rule sources 2019-12-03 17:33:46 +01:00
Geoffrey Frogeye 0159c6037c
Improved DNS resolving performance
Also various fixes.
Also some debug stuff, make sure to remove that later.
2019-12-03 15:35:21 +01:00
Geoffrey Frogeye c609b90390
Append top 1M subdomains rather than replacing it 2019-12-03 09:04:19 +01:00
Geoffrey Frogeye 69b82d29fd
Improved rules handling
Rules can now come in 3 different formats:
- AdBlock rules
- Host lists
- Domain lists
All will be converted into domain lists and aggregated
(only AdBlock rules matching a whole domain will be kept).

A subdomain will now be matched if it is a subdomain of any domain in the
rules.
It is way faster (seconds rather than hours!) but less flexible
(although it shouldn't be a problem).
2019-12-03 08:48:12 +01:00
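
The rules handling described in the commit above can be sketched as follows. This is an illustration under stated assumptions, not the repository's code: the helper names (`rule_to_domain`, `build_blocklist`, `is_blocked`) are hypothetical, whole-domain AdBlock rules are assumed to have the form `||domain^`, and host-list lines are assumed to start with `0.0.0.0` or `127.0.0.1`. Matching by looking up each suffix of a hostname in a set is one plausible way to get the speed-up mentioned, at the cost of the flexibility of full AdBlock patterns.

```python
import re

# Assumed shape of an AdBlock rule that blocks a whole domain: ||domain^
ADBLOCK_WHOLE_DOMAIN = re.compile(r"^\|\|([a-z0-9.-]+)\^$")
# A plain domain name with at least one dot.
DOMAIN = re.compile(r"^[a-z0-9-]+(\.[a-z0-9-]+)+$")


def rule_to_domain(line):
    """Return the domain a rule line covers, or None if it is not a whole-domain rule."""
    line = line.strip().lower()
    if not line or line.startswith(("#", "!")):
        return None  # blank line or comment
    match = ADBLOCK_WHOLE_DOMAIN.match(line)
    if match:
        return match.group(1)
    parts = line.split()
    if len(parts) == 2 and parts[0] in ("0.0.0.0", "127.0.0.1"):
        line = parts[1]  # host-list entry: keep only the hostname
    return line if DOMAIN.match(line) else None


def build_blocklist(lines):
    """Aggregate rules in any of the three formats into one set of domains."""
    return {d for d in map(rule_to_domain, lines) if d is not None}


def is_blocked(hostname, blocklist):
    """True if hostname equals, or is a subdomain of, any domain in blocklist."""
    labels = hostname.lower().rstrip(".").split(".")
    # Check every suffix: a.b.c -> "a.b.c", "b.c", "c"
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))


rules = [
    "||tracker.example.com^",      # AdBlock rule matching a whole domain: kept
    "||example.com^$third-party",  # AdBlock rule with options: dropped
    "0.0.0.0 ads.example.net",     # host list entry
    "example.org",                 # domain list entry
]
blocklist = build_blocklist(rules)
```

With this blocklist, `is_blocked("a.b.tracker.example.com", blocklist)` is true, while `is_blocked("example.com", blocklist)` is false because only the more specific `tracker.example.com` was kept.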