not even past 1 year.
Oof. This makes me cringe so hard. I once took over a project (but the developer didn't know they were getting fired) and the guy was doing everything on his laptop, from his laptop. Deployments and builds were from his laptop. Even dependencies weren't checked into the code (using global installs of them on unknown versions). The owner had me come in because, after talking to several people, he realized he was in a bad place.
It took me ~2 months to learn everything and document all the things. Then the owner fired him. That guy kept development back for so long by simply not documenting/sharing code and configuration. Now there's an entire team with a healthy development flow. But wow, I had some flashbacks reading that...
When I was looking for ad-block solutions on Android, Rethink DNS was actually at the top of my list. However, when I found out that their server was written in JavaScript, I did some benchmarking.
Rethink's server processed DNS requests in 400-500ms, which could potentially make a new webpage render up to half a second slower the first time:
~ > curl -w "DNS Time: %{time_namelookup}s\nConnect: %{time_connect}s\nTotal: %{time_total}s\n" -H "accept: application/dns-message" -H "content-type: application/dns-message" --data "<binary-data>" -o nul -s https://sky.rethinkdns.com/1:6AcDACIBLIDAAFQwIAAACA==
DNS Time: 0.004995s
Connect: 0.142332s
Total: 0.462496s
Cloudflare's server, meanwhile, took just 5-10ms, as seen below:
~ > curl -w "DNS Time: %{time_namelookup}s\nConnect: %{time_connect}s\nTotal: %{time_total}s\n" -H "accept: application/dns-message" -H "content-type: application/dns-message" --data "<binary-data>" -o nul -s 1.1.1.1
DNS Time: 0.000026s
Connect: 0.006273s
Total: 0.008822s
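For anyone reading these numbers: curl's `-w` timers are cumulative from the start of the transfer, so the phases can be separated by subtraction. A minimal sketch using the Rethink figures above; the phase labels are my own, not curl's.

```python
# curl's -w timers are cumulative (seconds since the transfer started),
# so each phase is the difference between consecutive timers.
def phases(namelookup, connect, total):
    return {
        # Time curl spent resolving the server's own hostname locally.
        "resolve_hostname": namelookup,
        # Time to establish the TCP connection.
        "tcp_connect": connect - namelookup,
        # For an https:// URL this covers the TLS handshake, the request,
        # and the wait for the server's response.
        "after_connect": total - connect,
    }

rethink = phases(0.004995, 0.142332, 0.462496)
for name, seconds in rethink.items():
    print(f"{name}: {seconds * 1000:.1f} ms")
```

Note that `time_namelookup` is curl looking up the hostname in the URL, not the time the remote resolver took to answer the DNS query in the request body.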
In the end, I chose AdAway and have stuck to that choice.
You can choose "system default" in the Rethink Android app from Configure -> DNS -> System DNS. You don't have to use Rethink's DoH (DNS over HTTPS) / DoT (DNS over TLS) servers. You can set up any DNSCrypt, Oblivious DoH, or plain old DNS endpoint with Rethink to forward DNS requests to.
> I chose AdAway and have stuck to that choice for now
A decent choice.
A word of caution when using it in non-root mode, though: AdAway borrows code from another (now discontinued but pretty solid) project viz. DNS66. From when I looked at the code, DNS66 never handled DNS over TCP. You can test this with Termux: dig +tcp example.com A.
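For context on why TCP needs separate handling: over TCP, each DNS message is prefixed with a two-byte length field (RFC 1035, section 4.2.2), so a forwarder that only deals in UDP datagrams won't handle it. A minimal sketch of that framing; the function names are mine and this is not code from DNS66 or AdAway.

```python
import struct

def build_query(name: str, qtype: int = 1, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query in wire format (RFC 1035)."""
    # Header: ID, flags (RD=1), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: length-prefixed labels terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE (1 = A) and QCLASS (1 = IN)
    return header + qname + struct.pack("!HH", qtype, 1)

def frame_for_tcp(message: bytes) -> bytes:
    """Prefix the message with its length as a 2-byte big-endian integer."""
    return struct.pack("!H", len(message)) + message

query = build_query("example.com")
framed = frame_for_tcp(query)
print(len(query), len(framed))
```

The same `query` bytes could be sent as-is in a UDP datagram; only the TCP path needs the extra prefix.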
> doubt their choice of implementing something so performance-critical as a DNS server in JavaScript
For a remotely-hosted DNS resolver, network latency & concurrency are likely to dominate over the performance of the code itself. You're right that JS is slow (as compared to Rust, say), but looking at metrics from serving 300m to 3bn queries/day on Cloudflare, our median CPU time spent processing a DNS request was consistently less than 2ms, and IO wait (there are no "disks" so, this is mostly hitting the caches or upstream resolvers) was estimated to be less than 7ms.
Also, we employ a bunch of optimizations on client & the server, as appropriate (if we're missing any we should implement, let us know).
- Coalesce multiple requests into one upstream request: For example, if there's 40 clients all wanting to resolve the same domain (say, ipv4only.arpa) within milliseconds of each other, only one request is upstreamed.
- Utilize in-process LFU cache and Cloudflare's Cache API to avoid upstreaming where possible.
- Connection pool egress to avoid reconnect overheads.
- Race response for a query from multiple upstreams in parallel.
- Support TLS session resumption.
- Prefer the lighter TLS_AES_128_GCM_SHA256 cipher suite.
- Dynamically adjust the TLS frame size. DNS queries aren't that big.
- Disable Nagling on TCP.
- Rehydrate responses for top domains stored in the in-process cache in the background.
- Admission Control (load shedding) & queuing disciplines (CoDel etc)
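To illustrate the first item in the list above (coalescing), here is a minimal sketch in Python with asyncio. It is not Rethink's implementation (their server is JavaScript); `Coalescer` and `fake_upstream` are hypothetical names, and the upstream is simulated with a short sleep.

```python
import asyncio

class Coalescer:
    """Share one in-flight upstream lookup among concurrent callers."""

    def __init__(self, resolve_upstream):
        self._resolve_upstream = resolve_upstream  # async callable: key -> answer
        self._in_flight = {}  # key -> asyncio.Task for the pending lookup

    async def resolve(self, key):
        task = self._in_flight.get(key)
        if task is None:
            # The first caller for this key starts the upstream request.
            task = asyncio.create_task(self._resolve_upstream(key))
            self._in_flight[key] = task
            # Forget the task once it finishes so later queries go upstream again.
            task.add_done_callback(lambda _t: self._in_flight.pop(key, None))
        # Every caller (first or not) waits on the same task.
        return await task


async def demo():
    upstream_calls = 0

    async def fake_upstream(name):  # hypothetical stand-in for a real resolver
        nonlocal upstream_calls
        upstream_calls += 1
        await asyncio.sleep(0.01)
        return f"answer for {name}"

    coalescer = Coalescer(fake_upstream)
    answers = await asyncio.gather(
        *(coalescer.resolve("ipv4only.arpa") for _ in range(40))
    )
    return upstream_calls, answers

calls, answers = asyncio.run(demo())
print(calls, len(answers))
```

In this demo, 40 concurrent lookups for the same name result in a single simulated upstream call. A real resolver would also key on query type and handle cancellation and errors, which this sketch leaves out.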
---
> curl -w "DNS Time: %{time_namelookup}s\nConnect: %{time_connect}s\nTotal: %{time_total}s\n" -H "accept: application/dns-message" -H "content-type: application/dns-message" --data "<binary-data>" -o nul -s 1.1.1.1
Shouldn't the URL after the "-s" switch be "https://one.one.one.one/dns-query"? For DoH perf specifically, I use https://github.com/ameshkov/godnsbench as it supports parallel and cache-busting (random) queries.
Ex:
Rethink on Workers:
./godnsbench -a https://sky.rethinkdns.com/dns-query -p 10 -c 300 -t 1 -q {random}.dnsleaktest.com
godnsbench with the following configuration:
{
"Address": "https://sky.rethinkdns.com/dns-query",
"Connections": 10,
"Timeout": 1,
"QueriesCount": 300,
}
The test results are:
Elapsed: 6.120078941s
Average QPS: 49.018394
Processed queries: 300
Average per query: 20.400765ms
Errors count: 0
Cloudflare 1.1.1.1:
./godnsbench -a https://cloudflare-dns.com/dns-query -p 10 -c 300 -t 1 -q {random}.dnsleaktest.com
godnsbench with the following configuration:
{
"Address": "https://cloudflare-dns.com/dns-query",
"Connections": 10,
"Timeout": 1,
"QueriesCount": 300,
}
The test results are:
Elapsed: 6.780474724s
Average QPS: 44.244383
Processed queries: 299
Average per query: 22.677588ms
Errors count: 1
If we set the count parameter to 1 (closer to the mentioned real-world use cases), we can reproduce the result in my comment above. Rethink DNS actually takes 450 ms to return a response.
./godnsbench -c 1 -q google.com -a https://sky.rethinkdns.com/dns-query
...
2025/05/04 23:22:47 [info] Average per query: 452.3948ms
With Cloudflare DNS, it takes only 35 ms, even when using DNS over HTTPS.
./godnsbench -c 1 -q google.com -a https://cloudflare-dns.com/dns-query
...
2025/05/04 23:22:49 [info] Average per query: 34.4343ms
As sky.rethinkdns.com is hosted on Cloudflare, there shouldn't be any notable network differences between Rethink DNS and Cloudflare DNS. If you inspect the list of servers in the following two lists, they are almost identical.
- https://check-host.net/check-ping?host=https://sky.rethinkdn...
- https://check-host.net/check-ping?host=https://cloudflare-dn...
If CPU time isn't the bottleneck of response time, any idea what the reason is behind the difference in response time?
Could the ~430ms be startup time for a sleeping worker?
Or is it the cost for a cache miss?
Ed: hm, I suppose google.com should pretty much always be a cache hit... Unless they have some really strange TTLs?
./godnsbench -c 1 -q google.com -a https://sky.rethinkdns.com/dns-query
...
2025/05/05 02:50:14 [info] Average per query: 532.1144ms
Subsequent queries for google.com within a few minutes would result in a slightly shorter response time, like my second comment above.
./godnsbench -c 1 -q google.com -a https://sky.rethinkdns.com/dns-query
...
2025/05/05 02:50:48 [info] Average per query: 468.5749ms
./godnsbench -c 1 -q google.com -a https://sky.rethinkdns.com/dns-query
...
2025/05/05 02:55:04 [info] Average per query: 472.1229ms
Some of the lengthy response times originate from the overhead of DNS over HTTPS. If we switch to the DNS over TLS version of Rethink DNS, there will be about a 240-320 ms improvement in response time.
./godnsbench -c 1 -q google.com -a tls://max.rethinkdns.com
...
2025/05/05 02:50:26 [info] Average per query: 290.4762ms
./godnsbench -c 1 -q google.com -a tls://max.rethinkdns.com
...
2025/05/05 02:50:51 [info] Average per query: 145.1868ms
However, 290ms is still quite a large delay for DNS queries, which makes webpage rendering a quarter of a second slower the first time you visit it.
Is it all per-request overhead that gets eaten by a single large request doing multiple lookups?
One won't really gain much insight for DoH / DoT resolver performance with one query per connection, I don't think.
For something as lightweight as domain name resolution, network latency / round-trip time dominates so much (for a public DNS resolver anyway) that CPU-bound optimizations are almost always pointless if you don't first optimize your network deployment. For example, if your servers aren't running close to your users, it won't move the needle much however big your caches are, or whatever language your resolver is written in (C or Rust or hand-rolled assembly or SIMD/SIMT built-ins etc). A good indicator of performance of a public DNS resolver would be to monitor it over a period of time, to effectively rule out noise introduced by variances.
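As a rough illustration of why single runs are noisy, here is a sketch that summarizes the four "-c 1" numbers posted for sky.rethinkdns.com earlier in this thread. Four samples are far too few for real monitoring; the `summarize` helper is just for illustration.

```python
import statistics

# Per-query averages (ms) reported for sky.rethinkdns.com with "-c 1" in this thread.
sky_doh_ms = [452.3948, 532.1144, 468.5749, 472.1229]

def summarize(samples_ms):
    """Return (median, min, max) so a single slow run doesn't dominate."""
    return statistics.median(samples_ms), min(samples_ms), max(samples_ms)

median, fastest, slowest = summarize(sky_doh_ms)
print(f"median={median:.1f} ms, range={fastest:.1f}..{slowest:.1f} ms")
```

Even within these four runs the spread is about 80 ms, which is larger than the whole Cloudflare response time quoted above.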
And 400ms ping is remarkably bad for a client that can otherwise do 20ms to a nearby server?
Our servers see 2500-25000 requests per second. So, my intention was to test the server, not the client (original post points out issues with the server).
> which is very different from real-world use cases
Not really. A decent DoH/DoT/ODoH client would pool TLS/HTTP connections (on top of multiplexing HTTP/2 on TCP) & have its own caches (probably hydrate the top domains in the background, even).
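A back-of-the-envelope sketch of why pooling matters. It assumes a fresh connection costs two round trips before the first query (one for TCP, one for a full TLS 1.3 handshake, no resumption or 0-RTT) and one round trip per query after that. The 100 ms RTT is a made-up number, not a measurement, and server processing time is ignored.

```python
def avg_latency_ms(rtt_ms, queries_per_connection, setup_round_trips=2):
    """Average per-query latency when connection setup is shared.

    Assumes setup_round_trips RTTs to establish the connection
    (1 for TCP + 1 for a TLS 1.3 handshake) and 1 RTT per query,
    ignoring server processing time and parallel in-flight queries.
    """
    total = rtt_ms * (setup_round_trips + queries_per_connection)
    return total / queries_per_connection

RTT_MS = 100  # hypothetical round-trip time, not measured
for n in (1, 10, 100):
    print(n, avg_latency_ms(RTT_MS, n))
```

Under these assumptions a one-query connection averages three round trips per query, while a connection reused for 100 queries averages barely more than one, which is why "-c 1" mostly measures connection setup.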
> As sky.rethinkdns.com is hosted on Cloudflare, there shouldn't be any notable network differences between Rethink DNS and Cloudflare DNS.
Cloudflare likely co-locates its DNS resolvers with all major ISPs, which might explain the speed. For instance, ping 1.1.1.1 reports ridiculous numbers (<3ms) on some networks.
> With Cloudflare DNS, it takes only 35 ms, even when using DNS over HTTPS
In my experiments, for (non-random) repeat queries, the fastest Cloudflare response is about 20% faster with "-c 1" than Max and 30% faster than Sky.
You should check whether any other non-BigTech public DNS posts comparable numbers for "-c 1": Like ControlD, AdGuard, NextDNS, CleanBrowsing, DNSFilter, Quad9 etc. In my experiments, these aren't significantly better than Max or Sky. And imo, concurrent "random" (cache-busting) queries are a better test of a remote resolver's capabilities.
For Max specifically, caches speed up response times significantly, even if Max is deployed to only 30+ locations (as compared to Cloudflare's 300+) running 40+ low-powered servers on Fly.io's network. For Sky, a lot is under the control of Cloudflare: How they route & shape our traffic and just where they run our Workers. For the most part though, Sky is comparatively pretty fast. If you use Rethink (enable Configure -> DNS -> DNS Booster to turn on optimizations), you should be able to see per-query round-trip time in Configure -> Logs -> DNS.
Could WASM AoT bring about those? ;)
> rewrites ... compiled language
With stuff like AutoFDO on Android, we've come a full circle with folks "JITing" compiled blocks... https://lwn.net/Articles/995397/
> Propeller requires specific software and hardware support to do its job
WASM AoT is nothing special, we have had plenty of those since UNCOL (1958).
Two of the oldest systems using bytecode as distribution format are Burroughs from 1961, nowadays Unisys ClearPath MCP, and IBM AS/400 from 1988, nowadays IBM i.
This while not bothering to actually list all the products and attempts between 1958 and 2025.
Naturally, in the magpie developer culture it is easier to sell shiny new things than to have people care about our history.
Not that many people talk about its usage as a stub resolver, and I don't know why!
chrisweekly•2d ago
Tangent: Bunny.net is my new favorite CDN / cloud service provider. They have scriptable DNS too.
Imustaskforhelp•1d ago
So I am off to cloudflare workers free tier.
chrisweekly•1d ago