Fully homomorphic encryption and the dawn of a private internet

157•barisozmen•3h ago

Comments

dcow•3h ago

Assuming speed gets solved as predicted, for an application like search, the provider would have to sync a new database of “vectors” to all clients every time the index updates. On top of that, these DBs are tens if not hundreds of GB huge.

blintz•3h ago

I say this as a lover of FHE and the wonderful cryptography around it:

While it’s true that FHE schemes continue to get faster, they don’t really have hope of being comparable to plaintext speeds as long as they rely on bootstrapping. For deep, fundamental reasons, bootstrapping isn’t likely to ever be less than ~1000x overhead.

When folks realized they couldn’t speed up bootstrapping much more, they started talking about hardware acceleration, but it’s a tough sell at time when every last drop of compute is going into LLMs. What $/token cost increase would folks pay for computation under FHE? Unless it’s >1000x, it’s really pretty grim.

For anything like private LLM inference, confidential computing approaches are really the only feasible option. I don’t like trusting hardware, but it’s the best we’ve got!

ipnon•2h ago

Don't you think there is a market for people who want services that have provable privacy even if it costs 1,000 times more? It's not as big a segment as Dropbox but I imagine it's there.

poly2it•2h ago

???

For the equivalent of $500 in credit you could self host the entire thing!

haiku2077•2h ago

You're not joking. If you're like most people and have only a few TiB of data in total, self hosting on a NAS or spare PC is very viable. There are even products for non-technical people to set this up (e.g. software bundled with a NAS). The main barrier is having an ISP with a sufficient level of service.

kube-system•1h ago

Sure, hardware is cheap.

However if you actually follow the 3-2-1 rule with your backups, then you need to include a piece of real estate in your calculation as well, which ain’t cheap.

bcraven•32m ago

I keep a small backup drive at my office which I bring home each month to copy my most sensitive documents and photos onto.

All my ripped media could be ripped again: I only actually have a couple of Tb of un-lose-able data.

adastra22•17m ago

FHE is so much more expensive that it would still be cheaper.

mahmoudimus•2h ago

there is, it's called governments. however this technology is so slow that using it in mission critical systems (think communication / coordinates during warfare) that it is not feasible IMO.

the parent post is right, confidential compute is really what we've got.

landl0rd•2h ago

For LLM inference, the market that will pay $20,000 for what is now $20 is tiny.

oakwhiz•1h ago

For most this would mean only specially treating a subset of all the sensitive data they have.

txdv•1h ago

I get that there is a big LLM hype, but is there really no other application for FHE? Like for example trading algorithms (not the high speed once) that you can host on random servers knowing your stuff will be safe or something similar?

seanhunter•1h ago

I speak as someone who used to build trading algorithms (not the high speed ones) for a living for several years, so knows that world pretty well. I highly doubt anyone who does that will host their stuff on random servers even if you had something like FHE. Why? Because it's not just the code that is confidential.

1) if you are a registered broker dealer you will just incur a massive amount of additional regulatory burden if you want to host this stuff in any sort of "random server"

2) Whoever you are, you need the pipe from your server to the exchange to be trustworthy, so no-one can MITM your connection and front-run your (client's) orders.

3) This is an industry where when people host servers in something like an exchange data center it's reasonably common to put them in a locked cage to ensure physical security. No-one is going to host on a server that could be physically compromised. Remember that big money is at stake and data center staff typically aren't well paid (compared to someone working for an IB or hedge fund), so social engineering would be very effective if someone wanted to compromise your servers.

4)Even if you are able to overcome #1 and are very confident about #2 and #3, even for slow market participants you need to have predictable latency in your execution or you will be eaten for breakfast by the fast players[1]. You won't want to be on a random server controlled by anyone else in case they suddenly do something that affects your latency.

[1] For example, we used to have quite slow execution ability compared with HFTs and people who were co-located at exchanges, so we used to introduce delays when we routed orders to multiple exchanges so the orders would arrive at their destinations at precisely the same time. Even though our execution latency was high, this meant no-one who was colocated at the exchange could see the order at one exchange and arb us at another exchange.

bruce511•2h ago

I get the "client side" of this equation; some number of users want to keep their actions/data private enough that they are willing to pay for it.

What I don't think they necessarily appreciate is how expensive that would be, and consequently how few people would sign up.

I'm not even assuming that the compute cost would be higher than currently. Let's leave aside the expected multiples in compute cost - although they won't help.

Assume, for example, a privacy-first Google replacement. What does that cost? (Google revenue is a good place to start that Calc.) Even if it was say $100 a year (hint; it's not) how many users would sign up for that? Some sure, but a long long way away from a noticeable percentage.

Once we start adding zeros to that number (to cover the additional compute cost) it gets even lower.

While imperfect, things like Tor provide most of the benefit, and cost nothing. As an alternative it's an option.

I'm not saying that HE is useless. I'm saying it'll need to be paid for, and the numbers that will pay to play will be tiny.

barisozmen•2h ago

An FHE Google today would be incredible expensive and incredibly slow. No one would pay for it.

The key question I think is how much computing speed will improve in the future. If we assume FHE will take 1000x more time, but hardware also becomes 1000x faster, then the FHE performance will be similar to today's plaintext speed.

Predicting the future is impossible, but as software improves and hardware becoming faster and cheaper every year, and as FHE provides a unique value of privacy, it's plausible that at some point it can become the default (if not 10 years, maybe in 50 years).

Today's hardware is many orders of magnitudes faster compared to 50 years ago.

There are of course other issues too. Like ciphertext size being much larger than plaintext, and requirement of encrypting whole models or indexes per client on the server side.

FHE is not practical for most things yet, but its venn diagram of feasible applications will only grow. And I believe there will be a time in the future that its venn diagram covers search engines and LLMs.

demaga•2h ago

> If we assume FHE will take 1000x more time, but hardware also becomes 1000x faster, then the FHE performance will be similar to today's plaintext speed

Yeah but this also means you can do 1000x more things on plaintext.

paulrudy•2h ago

> FHE enables computation on encrypted data

This is fascinating. Could someone ELI5 how computation can work using encrypted data?

And does "computation" apply to ordinary internet transactions like when using a REST API, for example?

pluto_modadic•2h ago

a simple example of partial homomorphic encryption (not full), would be if a system supports addition or multiplication. You know the public key, and the modulus, so you can respect the "wrap around" value, and do multiplication on an encrypted number.

other ones I imagine behave kinda like translating, stretching, or skewing a polynomial or a donut/torus, such that the point/intercepts are still solveable, still unknown to an observer, and actually represent the correct mathematical value of the operation.

just means you treat the []byte value with special rules

paulrudy•2h ago

Thank you. So based on your examples it sounds like the "computation" term is quite literal. How would this apply at larger levels of complexity like interacting anonymously with a database or something like that?

dachrillz•1h ago

A very basic way of how it works: encryption is basically just a function e(m, k)=c. “m” is your plaintext and “c” is the encrypted data. We call it an encryption function if the output looks random to anyone that does not have the key

If we could find some kind of function “e” that preserves the underlying structure even when the data is encrypted you have the outline of a homomorphic system. E.g. if the following happens:

e(2,k)*e(m,k) = e(2m,k)

Here we multiplied our message with 2 even in its encrypted form. The important thing is that every computation must produce something that looks random, but once decrypted it should have preserved the actual computation that happened.

It’s been a while since I did crypto, so google might be your friend here; but there are situations when e.g RSA preserves multiplication, making it partially homomorphic.

littlecranky67•1h ago

I get how that works for arithmetic operations - what about stuff like sorting, finding an element in a set etc? This would require knowledge of the cleartext data, wouldn't it?

barisozmen•1h ago

You can reduce anything happening on the computer to arithmetic operations. If you can do additions and multiplications, then it's turing complete. All others can be constructed from them.

littlecranky67•1h ago

While correct, that doesn't answer the question at all, though. If I have my address book submited into an FHE system and want to sort by name - how do you do that if the FHE system does not have access to cleartext names?

barisozmen•1h ago

You can do that by encrypting the names. You send encrypted names to the FHE-server, and then the server does necessary sorting computations on it.

The point of FHE is it can operate on gibberish-looking ciphertext, and when this ciphertext decrypted afterwards, the result is correct.

Indeed, there are those working on faster FHE sorting: https://eprint.iacr.org/2021/551.pdf

harvie•2h ago

Ok, lets stop being delusional here. I'll tell you how this will actualy work:

Imagine your device sending Google an encrypted query and getting back the exact results it wanted — without you having any way of knowing what that query was or what result they returned. The technique to do that is called Fully Homomorphic Encryption (FHE).

pluto_modadic•2h ago

queries are Oblivious Transfer - a second limited case of FHE that actually addresses the filter threat model.

teo_zero•1h ago

I think the opening example involving Google is misleading. When I hear "Google" I think "search the web".

The articles is about getting an input encrypted with key k, processing it without decrypting it, and sending back an output that is encrypted with key k, too. Now it looks to me that the whole input must be encrypted with key k. But in the search example, the inputs include a query (which could be encrypted with key k) and a multi-terabyte database of pre-digested information that's Google's whole selling point, and there's no way this database could be encrypted with key k.

In other words this technique can be used when you have the complete control of all the inputs, and are renting the compute power from a remote host.

Not saying it's not interesting, but the reference to Google can be misunderstood.

ElFitz•26m ago

> Now it looks to me that the whole input must be encrypted with key k. But in the search example, the inputs include a query […] and a multi-terabyte database […]

That’s not the understanding I got from Apple’s CallerID example[0][1]. They don’t seem to be making an encrypted copy of their entire database for each user.

[0]: https://machinelearning.apple.com/research/homomorphic-encry...

[1]: https://machinelearning.apple.com/research/wally-search

aitchnyu•1h ago

E2EE git was invented. I asked the creator if server can enforce protected branches or force pushes. He has no solution for evil clients. Maybe this could lead to E2EE Github?

https://news.ycombinator.com/item?id=44530927

athrowaway3z•1h ago

> Internet's "Spy by default" can become "Privacy by default".

I've been building and promoting digital signatures for years. Its bad for people and market-dynamics to have Hacker News or Facebook be the grand arbiter of everyone's identity in a community.

Yet here we are because its just that much simpler to build and use it this way, which gets them more users and money which snowballs until alternatives dont matter.

In the same vein, the idea that FHE is a missing piece many people want is wrong. Everything is still almost all run on trust, and that works well enough that very few use cases want the complexity cost - regardless of operation overhead - to consider FHE.

gblargg•1h ago

The idea that these will keep being improved on in speed reminds me of the math problem about average speed:

> An old car needs to go up and down a hill. In the first mile–the ascent–the car can only average 15 miles per hour (mph). The car then goes 1 mile down the hill. How fast must the car go down the hill in order to average 30 mph for the entire 2 mile trip?

Past improvement is no indicator of future possibility, given that each improvement was not re-application of the same solution as before. These are algorithms, not simple physical processes shrinking.

perching_aix•57m ago

41 mph, assuming the person asking the question was just really passionate about rounding numbers and/or had just the bare minimum viable measurement tooling available :)))

DeathArrow•47m ago

Most states will probably either forbid this or demand back doors.

IshKebab•17m ago

I think this should talk about the kinds of applications you can actually do with FHE because you definitely can't implement most applications (not at a realistic scale anyway).

utf_8x•16m ago

As someone who knows basically nothing about cryptography - wouldn't training an LLM to work on encrypted data also make that LLM extremely good at breaking that encryption?

I assume that doesn't happen? Can someone ELI5 please?

mynameismon•7m ago

From my understanding of cryptography, most schemes are created with the assumption that _any_ function that does not have access to the secret key will have a probabilistically small chance of decoding the correct message (O(exp(-key_length)) usually). As LLMs are also a function, it is extremely unlikely for cryptographic protocols to be broken _unless_ LLMs can allow for new types of attacks all together.

Jgoauh•10m ago

Homomophobic just 18 days after pride month :( damn

zkmon•10m ago

What baffles me is, how can code perform computations and comparisons on data that is still encrypted in memory.

VMG•8m ago

> 3. Data while processing is un-encrypted, as code need to 'see' the data

read the article again

baby•6m ago

code in FHE doesn't need to see the data

orwin•3m ago

I interrupted this fascinating read to tell that "actually", quantum computers are great at multi-dimensional calculation if you find the correct algorithms. It's probably the only thing they will ever be great at. You want to show that finding the algorithm is not possible with our current knowledge.

anyway, making the computer do the calculation is one thing, getting it to spew the correct data is another.... But still, the article (which seems great at the moment) brushes it of a bit too quickly.

GitHub abused to distribute payloads on behalf of malware-as-a-service

Show HN: UML is dead – so I'm building the tool to revive it

The Pragmatic Engineer 2025 Survey: What's in your tech stack?

The NEC PC Engine FX Game Console

Refactoring to Rust: integrate Rust performance surgically into other languages

"Changing elves to wolves makes a difference"

What happens when an octopus engages with art?

India hits 50% non-fossil power milestone five years ahead of 2030 target

How AI Vibe Coding Is Destroying Junior Developers' Careers

Amazon, Google and Vibe Coding with Steve Yegge [video]

OpenZFS Bug Ported to C

OrioleDB fastpath search (faster random key lookups for PostgreSQL)

Dictionary.com "devastated" paid users by abruptly deleting saved words lists

Upcoming changes to the Bitnami catalog (effective August 28th, 2025)

Cloudflare's Transparency Deserves More Credit

After Firing 9000 Employees for AI, Xbox Shares Cringe Job Post with AI Graphics

India became a French fry superpower

Astrophotographer Captures Solar Eclipse on Saturn

US passes first major national crypto legislation

Show HN: Take – process file lines with a logic-like language

Running a vibe-code platform (Coder) on your own infrastructure

YouTuber faces criminal charges, jail time for reviewing handheld consoles

Ask HN: Is Linux for laptop worth the trouble?

Will AI make you stupid?

Dotenv's Promotion on Runtime Message

RNNoise: Learning Noise Suppression (2017)

Sexually aggressive behavior triggered by parasitic infection [pdf]

Improving End-to-End Tests to Reduce Flakiness: Tools and Strategies

BatakJava: An Object-Oriented Programming Language with Versions [pdf]

Ancient DNA solves mystery of Hungarian, Finnish language family's origins