It's just CSV. The DNS records from the datasets are not ordered consistently, so we need to parse them completely. It seems that converting to an IR before sending data to ./feed_dns.py through a pipe is faster than decoding the JSON inside ./feed_dns.py. This also reduces the storage of the resolved subdomains by about 15% (compressed).
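The exact IR format is not shown here; as a rough sketch, assuming newline-delimited JSON records (field names `type`, `name`, `value` are hypothetical and may differ from the real datasets), a converter could emit flat comma-delimited lines that ./feed_dns.py can split without invoking a JSON parser:

```python
import json
import sys


def json_to_ir(line: str) -> str:
    """Convert one JSON DNS record into a flat delimited line.

    Field names are assumptions for illustration; the real datasets
    and ./feed_dns.py may use different keys and a different order.
    """
    record = json.loads(line)
    # Keep only the fields the consumer needs: type, name, value
    return f"{record['type']},{record['name']},{record['value']}"


if __name__ == "__main__":
    # Stream records from stdin so this can sit in a shell pipeline
    for line in sys.stdin:
        line = line.strip()
        if line:
            print(json_to_ir(line))
```

Doing the JSON decoding once, in the producer, means the consumer only does a cheap `str.split(',')` per line, which is where the speedup would come from.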
13 lines · 400 B · Bash · Executable file
#!/usr/bin/env bash

# Print a message in yellow so progress stands out from tool output
function log() {
	echo -e "\033[33m$*\033[0m"
}

log "Compiling locally known subdomains…"
# Sort by last character to utilize the DNS server caching mechanism
# (pv shows progress; sed strips Windows carriage returns)
pv subdomains/*.list | sed 's/\r$//' | rev | sort -u | rev > temp/all_subdomains.list

log "Resolving locally known subdomains…"
pv temp/all_subdomains.list | ./resolve_subdomains.py --output temp/all_resolved.csv
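The `rev | sort -u | rev` pipeline sorts names by their reversed characters, so subdomains sharing a suffix (and therefore the same authoritative zone) end up adjacent, keeping the resolver's cache warm. A small Python illustration of the same idea, with made-up example names:

```python
# Sorting by the reversed string clusters names that share a DNS suffix,
# mimicking the `rev | sort -u | rev` pipeline in the script above.
names = [
    "mail.example.com",
    "cdn.other.net",
    "www.example.com",
    "api.other.net",
]

# set() deduplicates like `sort -u`; the key reverses each name
suffix_sorted = sorted(set(names), key=lambda name: name[::-1])
print(suffix_sorted)
# → ['mail.example.com', 'www.example.com', 'api.other.net', 'cdn.other.net']
```

Both `.example.com` names and both `.other.net` names are now consecutive, so consecutive queries hit zones the resolver has just cached.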