Hypothesis, Antithesis, Synthesis

https://antithesis.com/blog/2026/hegel/

66•alpaylan•1h ago

Comments

DRMacIver•1h ago

Post author here btw, happy to take questions, whether they're about Hegel in particular, property-based testing in general, or some variant on "WTF do you mean you wrote rust bindings to a python library?"

anentropic•56m ago

TBH reading the first few words of that section I was definitely expecting it to continue "so we used Claude to rewrite Hypothesis in Rust..." so that was quite a surprise!

DRMacIver•53m ago

It's on the agenda! We definitely want to rewrite the Hegel core server in rust, but not as much as we wanted to get it working well first.

My personal hope is that we can port most of the Hypothesis test suite to hegel-rust, then point Claude at all the relevant code and tell it to write us a hegel-core in rust with that as its test harness. Liam thinks this isn't going to work, I think it's like... 90% likely to get us close enough to working that we can carry it over the finish line. It's not a small project though. There are a lot of fiddly bits in Hypothesis, and the last time I tried to get Claude to port it to Rust the result was better than I expected but still not good enough to use.

Chinjut•41m ago

You mention in the post that there are design differences between Hegel/Hypothesis and QuickCheck, partly due to attitude differences between Python/non-Haskell programmers and Haskell programmers. As someone coming from the Haskell world (though by no means considering Haskell a perfect language), could you expand on what kinds of differences these are?

DRMacIver•31m ago

So I think a short list of big API differences is something like:

* Hypothesis/Hegel are very much focused on using test assertions rather than a single property that can be true or false. This naturally drives a style that is much more like "normal" testing, but also has the advantage that you can distinguish between different types of failing test. We don't go too hard on this, but both Hegel and Hypothesis will report multiple distinct failures if your test can fail in multiple ways.

* Hegelothesis's data generation and how it interacts with testing is much more flexible and basically fully imperative. You can basically generate whatever data you like wherever in your test you like, freely interleaving data generation and test execution.

* QuickCheck is very much type-first and explicit generators as an afterthought. I think this is mostly a mistake even in Haskell, but in languages where "just wrap your thing in a newtype and define a custom implementation for it" will get you a "did you just tell me to go fuck myself?" response, it's a nonstarter. Hygel is generator first, and you can get the default generator for a type if you want but it's mostly a convenience function with the assumption that you're going to want a real generator specification at some point soon.
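To make the second and third bullets concrete, here is a minimal stdlib-only sketch of imperative, generator-first data generation. It is a toy model and not the Hypothesis or Hegel API: `DataSource`, `draw`, `integers`, `lists`, and `run` are hypothetical names invented for illustration.

```python
import random
from typing import Any, Callable

# In this toy model a generator is just a function from a DataSource to a value.
Generator = Callable[["DataSource"], Any]


class DataSource:
    """Hypothetical handle that a test can ask for data at any point."""

    def __init__(self, seed: int) -> None:
        self._rng = random.Random(seed)

    def draw_int(self, lo: int, hi: int) -> int:
        return self._rng.randint(lo, hi)

    def draw(self, gen: Generator) -> Any:
        return gen(self)


def integers(lo: int, hi: int) -> Generator:
    return lambda src: src.draw_int(lo, hi)


def lists(elements: Generator, max_size: int = 5) -> Generator:
    def gen(src: DataSource) -> list:
        size = src.draw_int(0, max_size)
        return [src.draw(elements) for _ in range(size)]
    return gen


def check_pop_removes_one(src: DataSource) -> None:
    xs = src.draw(lists(integers(-10, 10)))
    if not xs:
        return  # nothing to pop
    # This draw happens mid-test and depends on the value drawn earlier.
    i = src.draw(integers(0, len(xs) - 1))
    before = len(xs)
    xs.pop(i)
    # Plain assertions rather than a single boolean property.
    assert len(xs) == before - 1


def run(test: Callable[[DataSource], None], runs: int = 200) -> int:
    for seed in range(runs):
        test(DataSource(seed))
    return runs


run(check_pop_removes_one)
```

The point of the sketch is that the index generator is built from data the test already drew, which is awkward to express when generators are chosen purely by type.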

From an implementation point of view, and what enables the big conveniences, Hypothesis has a uniform underlying representation of test cases and does all its operations on them. This means you get:

* Test caching (if you rerun a failing test, it will immediately fail in the same way with the previously shrunk example)

* Validity guarantees on shrinking (your shrunk test case will always be one your generators could have produced. It's a huge footgun in QuickCheck that you can shrink to an invalid test case)

* Automatically improving the quality of your generators, never having to write your own shrinkers, and a whole bunch of other quality of life improvements that the universal representation lets us implement once and users don't have to care about.

The validity thing in particular is a huge pain point for a lot of users of PBT, and is what drove a lot of the core Hypothesis model to make sure that this problem could never happen.

The test caching is because I personally hated rerunning tests and not knowing whether it was just a coincidence that they were passing this time or that the test case had changed.
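A rough way to picture the "uniform underlying representation" idea is a toy model where every test case is a list of integer choices, and both shrinking and replay act on that list. This is only an illustrative sketch under that assumption, not how Hypothesis or Hegel are actually implemented; `ChoiceSource`, `even_list`, `fails`, `shrink`, and `find_failure` are hypothetical names.

```python
import random


class ChoiceSource:
    """Records each random decision as an int, or replays a recorded list."""

    def __init__(self, replay=None, seed=0):
        self.replay = list(replay) if replay is not None else None
        self.rng = random.Random(seed)
        self.choices = []

    def draw_int(self, lo, hi):
        i = len(self.choices)
        if self.replay is None:
            value = self.rng.randint(lo, hi)
        elif i < len(self.replay):
            value = min(max(self.replay[i], lo), hi)  # clamp into range
        else:
            value = lo  # recorded choices ran out: use the simplest value
        self.choices.append(value)
        return value


def even_list(src):
    """Generator whose output is always a list of even numbers."""
    n = src.draw_int(0, 5)
    return [2 * src.draw_int(0, 50) for _ in range(n)]


def prop(src):
    xs = even_list(src)
    assert sum(xs) < 30, xs  # deliberately false, so there is something to shrink


def fails(test, choices):
    """Replay choices; return the choices actually used if the test fails."""
    src = ChoiceSource(replay=choices)
    try:
        test(src)
    except AssertionError:
        return src.choices
    return None


def sort_key(choices):
    return (len(choices), choices)  # shorter first, then lexicographically smaller


def shrink(test, choices):
    """Greedy shrinker that edits the choice list, never the generated value."""
    best = list(choices)
    improved = True
    while improved:
        improved = False
        candidates = []
        for i in range(len(best)):
            candidates.append(best[:i] + best[i + 1:])  # delete one choice
            for smaller in (0, best[i] // 2, best[i] - 1):  # lower one choice
                if 0 <= smaller < best[i]:
                    candidates.append(best[:i] + [smaller] + best[i + 1:])
        for cand in candidates:
            used = fails(test, cand)
            if used is not None and sort_key(used) < sort_key(best):
                best, improved = used, True
                break
    return best


def find_failure(test, runs=200):
    for seed in range(runs):
        src = ChoiceSource(seed=seed)
        try:
            test(src)
        except AssertionError:
            return src.choices
    return None


failing = find_failure(prop)
small = shrink(prop, failing)
print(small, even_list(ChoiceSource(replay=small)))
```

Because the shrinker only edits choices and the value is always rebuilt by running `even_list`, a shrunk example can never contain an odd number, whereas a shrinker working directly on the integers could produce one. Saving `small` somewhere and passing it to `fails` on the next run is the toy equivalent of the test caching described above. The greedy loop only reaches a local minimum, so the printed example is small but not guaranteed to be the smallest possible.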

tybug•1h ago

As possibly the one community on earth where it's actually better to post the code than the blog post: TL;DR this is a universal property-based testing protocol (https://github.com/hegeldev/hegel-core) and family of libraries (https://github.com/hegeldev/hegel-rust, more to come later).

I've talked with lots of people in the PBT world who have always seen something like this as the end goal of the PBT ecosystem. It seemed like a thing that would happen eventually, someone just had to do it. I'm super excited to actually be doing it and bringing great PBT to every and any language.

It doesn't hurt that this is coming right as great PBT in every language is suddenly a lot more important thanks to AI code!

hugeBirb•1h ago

Not that it matters at this point but the Hegelian dialectic is not thesis, antithesis and synthesis. It's usually attributed to Hegel, but as I understand it he actually pushed back on this mechanical view of it all and his views on these transitory states were much more nuanced.

DRMacIver•1h ago

Conversation with Will (Antithesis CEO) a couple months ago, heavily paraphrased:

Will: "Apparently Hegel actually hated the whole Hegelian dialectic and it's falsely attributed to him."

Me: "Oh, hm. But the name is funny and I'm attached to it now. How much of a problem is that?"

Will: "Well someone will definitely complain about it on hacker news."

Me: "That's true. Is that a problem?"

Will: "No, probably not."

(Which is to say: You're entirely right. But we thought the name was funny so we kept it. Sorry for the philosophical inaccuracy)

wwilson•59m ago

If I had been wearing my fiendish CEO hat at the time, I might have even said something like: "somebody pointing this out will be a great way to jumpstart discussion in the comments."

One of the evilest tricks in marketing to developers is to ensure your post contains one small inaccuracy so somebody gets nerdsniped... not that I have ever done that.

1-more•53m ago

A sort of broadening of Cunningham's Law (the fastest way to get an answer online is not by posting the question, but by posting the wrong answer—very true in my experience). If there's no issue of fact at hand, then you end up getting some engagement about the intentional malapropism/misattribution/mistake/whatever and then the forum rules tend to herd participants back to discussing the matter at hand: your company.

https://meta.wikimedia.org/wiki/Cunningham%27s_Law

dfabulich•57m ago

If that's not motivation enough for you to rename it, well, TypeScript already has a static type checker called Hegel. https://hegel.js.org/ (It's a stronger type system than TypeScript.)

DRMacIver•56m ago

We looked at it and given that the repo was archived nearly two years ago decided it wasn't a problem.

jjgreen•1h ago

"Not that it matters ...", What? Of course it matters! I only come to HN for extended arguments on the meaning of the Dialectic.

AndrewKemendo•56m ago

I gave you one in a sibling ;)

AndrewKemendo•56m ago

Eh… it’s always worth keeping in mind the time period and what was going on with the tooling for mathematics and science at the time.

Statistics wasn’t really quite mature enough to be applied to let’s say political economy a.k.a. economics which is what Hegel was working in.

JB Say (1) was the leading mind in statistics at the time but wasn’t as popular in political circles (Notably Proudhon used Say’s work as epistemology versus Hegel and Marx)

I’ve been in serious philosophy courses where they take the dialectic literally and it is the epistemological source of reasoning so it’s not gone

This is especially true in how Marx expanded into dialectical materialism - he got stuck on the process as the right epistemological approach, and Marxists still love the dialectic and Hegelian roots (Zizek is the biggest one here).

The dialectic eventually fell due to robust numerical methods and is a degenerate version of the sampling Markov Process which is really the best in class for epistemological grounding.

Someone posted this here years ago and I always thought it was a good visual: https://observablehq.com/@mikaelau/complete-system-of-philos...

sigbottle•46m ago

I thought the dialectic was just a proof methodology, and especially the modern political angles you might hear from say a YouTube video essay on Hegel, were because of a very careful narrative from some French dude (and I guess Marx with his dialectical materialism). I mean, I agree with many perspectives from 20th century continental philosophy, but it has to be agreed that they refactored Hegel for their own purposes, no?

AndrewKemendo•32m ago

Oh the amount of branching and forking and remixing of Hegel is more or less infinite

I think it’s worth again pointing out that Hegel was at the height of contemporary philosophy at the time but he wasn’t a mathematician and this is the key distinction.

Hegel lived in the pre-mathematical economics world, the continental philosophy world of words with Kant etc., and never crossed into the mathematical world. So I liken it to him doing what he could with the limited capabilities and tools that he had

Again compare this to the scientific process described by Francis Bacon. There are no remixes to that there’s just improvements.

Ultimately using the dialectic is trying to use an outdated technology for understanding human behavior

sigbottle•14m ago

I mean I don't know about Hegel, but Kant certainly dipped into mathematics. One of the reasons why he even wrote CPR was to unify in his mind, the rationalists (had Leibniz) versus the empiricists (had Newton). 20th century analytic philosophy was heavily informed by Kantian distinctions (Logical Positivism uses very similar terminology, and Carnap himself was a Neo-Kantian originally, though funnily enough Heidegger also was). In the 21st century, it seems like overall philosophy has gotten more specialized and grounded and people have moved away from one unified system of truth, and have gotten more domain-driven, both in continental and analytic philosophy.

It's no doubt that basically nobody could've predicted a priori 20th century mathematics and physics. Not too familiar with the physics side, but any modern philosopher who doesn't take computability seriously isn't worth their salt, for example. Not too familiar with statistics but I believe you that statistics and modern economic theories could disprove say, Marxism as he envisioned it.

That definitely doesn't mean that all those tools from back then are useless or even just misinformed IMO. I witness plenty of modern people (not you) being philosophically bankrupt when making claims.

sigbottle•25m ago

From what I understand, it's a proof technique (other techniques include Kant's Transcendental Deduction or Descartes's pure doubt) that requires generating new conceptual thoughts via internal contradiction and showing necessarily that you lead from one category to the next.

The necessity thing is the big thing - why unfold in this way and not some other way. Because the premises in which you set up your argument can lead to extreme distortions, even if you think you're being "charitable" or whatever. Descartes introduced mind-body dualisms with the method of pure doubt, which at a first glance seemingly is a legitimate angle of attack.

Unfortunately that's about as nuanced as I know. Importantly this rules out a wide range of "any conflict that ends in a resolution validates Hegel" kind of sophistry.

viccis•4m ago

>other techniques include Kant's Transcendental Deduction or Descartes's pure doubt

This is not quite accurate. Kant says very explicitly in the (rarely studied) Transcendental Doctrine of Method (Ch 1 Section 4, A789/B817) that this kind of proof method (he calls it "apagogic") is unsuitable to transcendental proofs.

You might be thinking of the much more well studied Antinomies of Pure Reason, in which he uses this kind of proof negatively (which is to say, to circumscribe the limits of reason) as part of his proof against the way the metaphysical arguments from philosophers of his time (which he called "dogmatic" use of reason) about the nature of the cosmos were posed.

pron•58m ago

> property-based testing is going to be a huge part of how we make AI-agent-based software development not go terribly.

There's no doubt, I think, testing will remain important and possibly become more important with more AI use, and so better testing is helpful, PBT included. But the problem remains verifying that the tests actually test what they're supposed to. Mutation tests can allow agents to get good coverage with little human intervention, and PBT can make tests better and more readable. But still, people have to read them and understand them, and I suspect that many people who claim to generate thousands of LOC per day don't.

And even if the tests were great and people carefully reviewed them, that's not enough to make sure things don't go terribly wrong. Anthropic's C compiler experiment didn't fail because of bad testing. Not only were the tests good, it took humans years to write the tests by hand, and the agents still failed to converge.

I think good tests are a necessary condition for AI not generating terrible software, but we're clearly not yet at a point where they're a sufficient one. So "a huge part" - possibly, but there are other huge parts still missing.

tybug•55m ago

I actually think there's another angle here where PBT helps, which wasn't explored in the blog post.

That angle is legibility. How do you know your AI-written slop software is doing the right thing? One would normally read all the code. Bad news: that's not much less labor intensive than not using AI at all.

But, if one has comprehensive property-based tests, they can instead read only the property-based tests to convince themselves the software is doing the right thing.

By analogy: one doesn't need to see the machine-checked proof to know the claim is correct. One only needs to check the theorem statement is saying the right thing.

pron•48m ago

Right, I said that property based tests are easier to read, and that's good. But people still have to actually read them. Also, because they still work best at the "unit" level, to understand them, the people reading them need to know how all the units are connected (e.g. a single person cannot review even PBTs required for 10KLOC per day [1]).

My point isn't so much about PBT, but about how we don't yet know just how much agents help write real software (and how to get the most help from them).

[1]: I'm only using that number because Garry Tan, CEO of YC, claimed to generate 10K lines of text per day that he believes to be working code and developers working with AI agents know they can't be.

DRMacIver•47m ago

> But the problem remains verifying that the tests actually test what they're supposed to.

Definitely. It's a lot harder to fake this with PBT than with example-based testing, but you can still write bad property-based tests and agents are pretty good at doing so.

I have generally found that agents with property-based tests are much better at not lying to themselves about it than agents with just example-based testing, but I still spend a lot of time yelling at Claude.

> So "a huge part" - possibly, but there are other huge parts still missing.

No argument here. We're not claiming to solve agentic coding. We're just testing people doing testing things, and we think that good testing tools are extra important in an agentic world.

pron•43m ago

> We're not claiming to solve agentic coding. We're just testing people doing testing things, and we think that good testing tools are extra important in an agentic world.

Yeah, I know. Just an opportunity to talk about some of the delusions we're hearing from the "CEO class". Keep up the good work!

ngruhn•18m ago

> I have generally found that agents with property-based tests are much better at not lying to themselves

I also observed the cheating increase. I recently tried to do a specific optimization on a big complex function. Wrote a PBT that checks that the original function returns the same values as the optimized function on all inputs. I also tracked the runtime to confirm that performance improved. Then I let Claude loose. The PBT was great at spotting edge cases but eventually Claude always started cheating: it modified the test, it modified the original function, it implemented other (easier) optimizations, ...
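For readers who haven't seen this style of test, the setup described here (original versus optimized function on generated inputs, plus a runtime check) might look roughly like the following stdlib-only sketch. The two functions are stand-ins chosen for illustration, not the commenter's code, and `reference`, `optimized`, `check_equivalent`, and `timed` are hypothetical names. A real property-based testing library would also shrink any mismatch it finds.

```python
import random
import time


def reference(xs):
    """Slow but obviously correct: best sum over every non-empty slice."""
    return max(
        sum(xs[i:j])
        for i in range(len(xs))
        for j in range(i + 1, len(xs) + 1)
    )


def optimized(xs):
    """Candidate replacement (Kadane's algorithm), linear in len(xs)."""
    best = current = xs[0]
    for x in xs[1:]:
        current = max(x, current + x)
        best = max(best, current)
    return best


def check_equivalent(runs=300, seed=0):
    """Compare both implementations on randomly generated non-empty lists."""
    rng = random.Random(seed)
    for _ in range(runs):
        xs = [rng.randint(-50, 50) for _ in range(rng.randint(1, 30))]
        expected, actual = reference(xs), optimized(xs)
        assert actual == expected, f"mismatch on {xs}: {actual} != {expected}"
    return runs


def timed(fn, xs, repeats=3):
    """Wall-clock seconds for `repeats` calls of fn(xs)."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn(xs)
    return time.perf_counter() - start


check_equivalent()
big = [random.Random(1).randint(-50, 50) for _ in range(200)]
print(f"reference: {timed(reference, big):.4f}s, optimized: {timed(optimized, big):.4f}s")
```

Nothing in this sketch stops an agent from editing `reference` or the check itself, which is exactly the failure mode described above.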

rdevilla•53m ago

This is the first time in my HN membership where I was excited to read about the dialectic, only to be disappointed upon finding out the article is about Rust.

PBT is for sure the future - which is apparently now? 10 years ago when I was talking about QuickCheck [0] all the JS and Ruby programmers in my city just looked at me like I had two heads.

[0] https://github.com/ryandv/chesskell/blob/master/test/Test/Ch...

DRMacIver•6m ago

TBF PBT has been the present in Python for a while now.

10 years ago might have been a little early (Hypothesis 1.0 came out 11 years ago this coming Thursday), but we had pretty wide adoption by year two and it's only been growing. It's just that the other languages have all lagged behind.

It's by no means universally adopted, but it's not a weird rare thing that nobody has heard of.
