Commit graph

20 commits

b43cb1725c
Autosave
Not strictly needed, but since the import may take multiple hours, I get
frustrated if it gets interrupted for some reason.
2019-12-17 15:02:42 +01:00
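
The diff itself isn't shown here; as a rough sketch of the idea, commit periodically during a long import so an interruption only loses the last interval (the sqlite3 backend, table name, and interval are assumptions, not the actual code):

```python
import sqlite3
import time

AUTOSAVE_INTERVAL = 60  # seconds between saves (assumed value)

def import_records(db: sqlite3.Connection, records) -> None:
    """Insert records, committing periodically so an interrupted
    multi-hour import does not lose all progress."""
    last_save = time.monotonic()
    for record in records:
        # 'rules' is a hypothetical table name for illustration
        db.execute("INSERT INTO rules (value) VALUES (?)", (record,))
        if time.monotonic() - last_save > AUTOSAVE_INTERVAL:
            db.commit()  # autosave: keep what we have so far
            last_save = time.monotonic()
    db.commit()  # final save
```
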
d65107f849
Save duplicates too
Maybe I won't publish them, but this will help me track trackers.
2019-12-17 14:10:41 +01:00
03a4042238
Added level
Also fixed the IP logic because it was really messed up
2019-12-16 09:31:29 +01:00
aec8d3f8de
Reworked how paths work
Get those tuples out of my eyes
2019-12-15 22:21:05 +01:00
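
The new path representation isn't visible in the log; one reading of "get those tuples out of my eyes" is wrapping the raw label tuples in a small class. A hypothetical sketch, not the actual code:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DomainPath:
    """A domain as a path of labels, most significant first,
    e.g. 'tracker.example.com' -> ('com', 'example', 'tracker')."""
    parts: tuple[str, ...]

    @classmethod
    def from_str(cls, domain: str) -> "DomainPath":
        return cls(tuple(reversed(domain.split('.'))))

    def __str__(self) -> str:
        return '.'.join(reversed(self.parts))
```
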
7af2074c7a
Small optimisation of feed_switch 2019-12-15 17:12:44 +01:00
45325782d2
Multi-processed parser 2019-12-15 17:05:41 +01:00
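
A minimal sketch of a multi-processed parser in the same spirit, fanning stdin lines out to worker processes (the parse function is a placeholder):

```python
import multiprocessing
import sys

def parse_line(line: str) -> str:
    # Placeholder for the real parsing logic
    return line.strip().lower()

if __name__ == '__main__':
    with multiprocessing.Pool() as pool:
        # imap preserves input order while spreading work across processes;
        # a large chunksize amortizes inter-process communication costs
        for parsed in pool.imap(parse_line, sys.stdin, chunksize=1024):
            print(parsed)
```
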
ce52897d30
Smol fixes 2019-12-15 16:48:17 +01:00
954b33b2a6
Slightly better Rapid7 parser 2019-12-15 16:38:01 +01:00
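
Rapid7's Sonar FDNS dumps are gzipped newline-delimited JSON with name/type/value fields; a hedged sketch of a tolerant parser (what exactly got "slightly better" is not shown in the log):

```python
import gzip
import json

def parse_fdns(path: str):
    """Yield (name, type, value) from a Rapid7 Sonar FDNS dump,
    skipping lines that fail to decode."""
    with gzip.open(path, 'rt', encoding='utf-8') as fd:
        for line in fd:
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # damaged line: skip rather than abort
            yield record['name'], record['type'], record['value']
```
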
4d966371b2
Workflow: SQL -> Tree
Welp. All that for this.
2019-12-15 15:56:26 +01:00
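
The message doesn't describe the tree; one plausible shape is a nested dict keyed by domain labels in reverse, so a lookup walks from the TLD down and matches parent-domain rules along the way. A sketch under that assumption:

```python
def add_domain(tree: dict, domain: str) -> None:
    """Insert 'ads.example.com' as com -> example -> ads."""
    node = tree
    for label in reversed(domain.split('.')):
        node = node.setdefault(label, {})
    node['\0'] = True  # sentinel: a rule ends at this node

def matches(tree: dict, domain: str) -> bool:
    """True if the domain or any of its parent domains is in the tree."""
    node = tree
    for label in reversed(domain.split('.')):
        if '\0' in node:
            return True  # a parent domain is listed
        node = node.get(label)
        if node is None:
            return False
    return '\0' in node
```
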
ddceed3d25
Workflow: Can now import DnsMass output
Well, in a specific format, but DnsMass nonetheless
2019-12-15 00:28:08 +01:00
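
massdns's simple text output is one "name. TYPE value" record per line; a sketch of a reader for that shape (whether this matches the "specific format" above is an assumption):

```python
import sys

def parse_dnsmass(lines):
    """Yield (name, rtype, value) from massdns-style simple output,
    e.g. 'tracker.example.com. CNAME cdn.example.net.'."""
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # blank or unexpected line
        name, rtype, value = parts
        yield name.rstrip('.'), rtype, value.rstrip('.')

if __name__ == '__main__':
    for name, rtype, value in parse_dnsmass(sys.stdin):
        print(f'{name},{rtype},{value}')
```
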
189deeb559
Workflow: Multiprocess
Still trying.
It's better than multithreading, though.

Merge branch 'newworkflow' into newworkflow_threaded
2019-12-14 17:27:46 +01:00
d7c239a6f6 Workflow: Some modifications 2019-12-14 16:04:19 +01:00
5023b85d7c
Added intermediate representation for DNS datasets
It's just CSV.
The DNS records from the datasets are not ordered consistently,
so we need to parse them completely.
It seems that converting to an IR before sending data to ./feed_dns.py
through a pipe is faster than decoding the JSON in ./feed_dns.py.
This will also reduce the storage of the resolved subdomains by
about 15% (compressed).
2019-12-13 21:59:35 +01:00
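
A sketch of that pipeline step: decode the dataset JSON once, emit the CSV intermediate representation, and pipe it into ./feed_dns.py (the converter script name and column order here are assumptions):

```python
#!/usr/bin/env python3
# Usage sketch: zcat a.json.gz | ./json_to_ir.py | ./feed_dns.py
import csv
import json
import sys

writer = csv.writer(sys.stdout)
for line in sys.stdin:
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        continue
    # The CSV IR is cheaper to parse downstream than JSON, and per the
    # commit message it also compresses ~15% smaller for the resolved
    # subdomains
    writer.writerow([record['timestamp'], record['name'],
                     record['type'], record['value']])
```
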
ab7ef609dd
Workflow: Various optimisations and fixes
I forgot to close this one earlier, so:
Closes #7
2019-12-13 18:08:22 +01:00
f3eedcba22
Updated now based on timestamp
Did I forget to add feed_asn.py a few commits ago?
Oh well...
2019-12-13 13:54:00 +01:00
231bb83667
Threaded feed_dns
Largely disappointing
2019-12-13 12:36:11 +01:00
9050a84670
Read-only mode 2019-12-13 12:35:05 +01:00
57416b6e2c
Workflow: OOP and individual tables per type
Mostly for performance reasons.
The first is to make threading easier to implement later.
The second is to speed up the binary search,
but it doesn't seem that much better so far.
2019-12-13 00:11:21 +01:00
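
A sketch of the second point: one sorted table per record type, so each binary search bisects a smaller list (class and method names are assumptions):

```python
import bisect

class TypedTables:
    """One sorted table per record type, so each lookup bisects a
    smaller list than a single combined table would require."""
    def __init__(self) -> None:
        self.tables: dict[str, list[bytes]] = {}

    def add(self, rtype: str, key: bytes) -> None:
        bisect.insort(self.tables.setdefault(rtype, []), key)

    def contains(self, rtype: str, key: bytes) -> bool:
        table = self.tables.get(rtype, [])
        i = bisect.bisect_left(table, key)
        return i < len(table) and table[i] == key
```
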
55877be891
IP parsing C accelerated, use bytes everywhere 2019-12-09 09:47:48 +01:00
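
socket.inet_aton is implemented in C and yields the packed 4-byte form directly, which fits "use bytes everywhere"; a small sketch of that approach:

```python
import socket

def ip_to_bytes(ip: str) -> bytes:
    """Parse an IPv4 address into its 4-byte packed form using the
    C-implemented socket.inet_aton instead of Python-level parsing."""
    return socket.inet_aton(ip)

assert ip_to_bytes('93.184.216.34') == b'\x5d\xb8\xd8\x22'
```
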
7937496882
Workflow: Base for new one
Until I automate this, you'll need to download the A set from
https://opendata.rapid7.com/sonar.fdns_v2/ to the file a.json.gz.
2019-12-09 08:12:48 +01:00