Do you consent to these personal data processing activities by us and our 1,666 partners?
In the fullness of time, we'll be able to generate everything locally. When that happens, why would we need middlemen?
Llama, Stable Diffusion, and Wan Video are a glimpse of what's to come with local AI. It'll get faster and higher quality.
At bare minimum, local agents will one day act like a filter in front of the internet and kill all the ads and bad content like hyper advanced ad blockers. No more rage bait. No more negativity. We'll be able to use AI to scrape the bullshit off the internet before it hits our eyeballs.
It could be nice to go back to that. Hopefully they’ll collapse faster than they grew though.
As far as I can tell, the big SaaS models also suck and have done so for quite a long time by now. They do some autocomplete stuff and destroy people's minds.
I'll give this business that it can do machine translation cheaply, but it's not good and likely never will be.
Anyway "as if values and decisions are always shared absolutely by everyone" is moot because you can GDPR opt out, or else read it via the archive sites.
Yeah, um, we can't... let... that... what. You're right, I should put my phone down.
Imagine a world where all rivers flow north, the wind always blows from the East, and no one is a schemer.
Imagine the river overflows and your secure pod goes to the bottom of the sea...
Last month, my phone decided to die. Luckily most of my info is in the cloud.
And the railway bulls are blind
There's a lake of stew and of whiskey too
You can paddle all around it in a big canoe
In the big rock candy mountain
Not completely related, but jobs like these always fall victim to searching for a problem. If the problem they are getting paid to solve gets solved, they will need to find another job.
I'm sharing it here hoping to get a convincing counter argument.
The privacy emphasis we had for the past 6-7 years was solely because of a lack of tech innovation. The best innovation we had during the period was crypto, which was mostly scummy. The moment LLMs came out, everyone forgot about privacy.
Here's my supporting observation.
- a year or so back, we'd have wanted privacy focused IDE, browser, Notepad, todo list etc.
- launch something with that selling point and nobody wants it right now
- privacy focused tools (ddg etc) are losing steam, and innovation focused ones (AI in this case) like Perplexity are winning.
Privacy will come back as a main selling point once we've exhausted innovation. Here's my even spicier take - don't trust proponents of privacy to push tech innovation forward.
###
I've worded the comment not to be dismissive but to start a conversation. If you're downvoting, please let me know what your thought process is.
Why should they?
To me it sounds more that they don’t think they could sell LLMs without privacy and they put a lot of effort to try to combine the two.
For Apple one might say that privacy is marketing, but surely nobody will say that for Meta and Google.
Privacy is about people controlling their data, which includes your ability as a user to have a say in where your data is sent, used, and processed. All implementations mentioned above fail here.
Does nobody mean 'actually nobody', 'the VC community' or 'the people I hang around with'?
It's probably true that the VC community has abandoned a lot of stuff in favour of 'the new buzzword'. It doesn't mean the rest of humanity has though.
If you don't care about privacy, send nudes.
People for the most part value security and community far more than privacy.
Even in the old days, most people chose to live in a village with zero privacy rather than out in the wilderness all alone.
People passed the income tax which gave far more information to the government than other ways of taxation.
Don’t have to imagine it. This is how it was just a few decades ago.
From the link that you posted, the first thing it asks for is your full name. This is the same when attempting to sign up for the VPN service. If paying with crypto, the process tries to steal even more info from you, such as zip code and country. No other normal privacy or VPN services ask for a full name or zip code.
The "freedom policy" on the site talks about how the company won't spy on anyone or do anything to compromise privacy, then goes on to talk about "think of the children!"
Minors might subscribe to privacy services. Vp.net would harm minors more than a child predator due to the false sense of security and privacy.
The business lies to everyone saying it is secure because of SGX, which has new vulnerabilities pop up weekly.
What happens when data is shared between Meta + Google? Now it doesn't matter if my data is portable, every player has a copy. We first need to establish a data pool (not one, but a whole city of people with private data pools) before big companies will come asking you for it. How do we get there if all of our apps are designed for us to dump data into their mega pool?
My point about the Steam Machines is that they are essentially NFS boxes with a graphics card and Steam OS. That type of setup would enable personal data pools.
This isn't an inherently bad thing. If an AI could theoretically help you live your life better, nudge you in ways that stabilize your psychology and behaviors for the better of yourself, that's good.
The danger is an AI that decides to re-perpetrate the class division that our existing system does. Less fortunate people lose their upward mobility while being guided into subtle traps.
It WILL be turned against you at some point, be it a denial of insurance in the US, political imprisonment when visiting a non-democratic country, and so on.
https://en.wikipedia.org/wiki/Pluribus_(TV_series)
> The show follows author Carol Sturka, played by Seehorn, as the rest of humanity is suddenly joined into a hive mind that seeks to amicably assimilate Carol and other immune individuals into the mind. The title of the series refers to e pluribus unum, a Latin phrase meaning 'out of many, one'.
> Set in Albuquerque, New Mexico, the series follows author Carol Sturka, who is one of only thirteen people in the world immune to the effects of "the Joining", resulting from an extraterrestrial virus that had transformed the world's human population into a peaceful and content hive mind (the "Others").
To me, one of the greatest dangers of the present moment is that we can't tell whether the LLMs are being asked to give subtly biased answers (or product-placement) on some questions. One cannot readily tell from the output.
The training compute footprints are enormous, several orders of magnitude beyond what the average person has access to. Even if a company came out and said "here's our completely open-source model. All the training data, the training procedure, and here's the final-product model."
Maybe you could hire an auditing company? But how long would it take to audit? Would the state of the art advance drastically in the time between?
And people like to keep downvoting my "Make Classwarfare MAD Again" but like I'll wager 90% of people on HN are on the losing side of the war.
Or the people in charge use it for that.
Given human political cycles, every generation or so there's some attempt to demonise a minority or three, and every so often it goes from "demonise" to "genocide".
In principle, AIs have plenty of other ways to go wrong besides the human part. No idea how long it would take for them to be competent enough for traditional "doom" scenarios, but the practical reality we can already witness is chronic human laziness: just as "vibe coding" was coined as "don't even bother looking at what the AI does just accept it", there are going to be similar trends in every other domain.
What this means for personalised recommendations, I don't know for sure, but I suspect it'll look halfway between a cult and taking horrorscopes and fashion guides too seriously.
They get to record people's innermost thoughts, the proprietary code (or derivatives of it) of countless corporations, the contract drafts and political speeches of dumb decision makers all over the world, and more.
The author cites AI therapists, but the people chose to use it themselves? Nobody is forcing them?
Now why bother? Your customer will ask their silver ball (LLMs) anything and everything, and you can directly do bulk analysis on (in theory) the entire interaction, including all of your customer's emotions available via text.
Lastly, your customers are now eager about this tool, so they're excited to integrate/connect everything to it. In a rush to satisfy customers, many companies have lazily built LLM integrations that could even undermine their business model. This pushes yet more data into the LLM. This isn't just telemetry like file names, this is full read access to all of your files. How is that not connected to privacy?
The article mentions numerous examples of how "AI" services erode privacy. And, really, you don't need anyone to tell you these things. If you're remotely technically inclined, which is a fair assumption on this forum, you should be well aware of how these companies operate. It would be more interesting to discuss other related points the article brings up, rather than the known connection between LLMs and privacy.
"Your data" isn't your contact details. It's the record of your interactions with all the external services which by definition they get when you interact with them. Your name is...really the most worthless part of it. Facebook doesn't care and can't monetize your name, but they very much care and do monetize what you browsed and searched, how long you spent looking at it, when you did it and who you talked to.
I want to have _that_ data locally. And why are all these companies making it so incredibly difficult for me to get it? It's MY data after all!
But the default approach of every app is:
> We'll manage your data for you, in our cloud! Ah, btw, you'll only have access to it when online, and only for as long as you pay the subscription fee. Also, if we go out of business, sorry, it's gone.
> <fineprint> You _can_ request a copy of it (damn GDPR), but only once every 30 days, it'll take us 48 hours to prepare it, and we'll send (some of) it to you as a badly-formatted CSV carefully crafted to make it as useless as possible. </fineprint>
gone for you. advertisers and "partners" still can have it. storage is cheap
Another reason to build all these data centers is to apply more active processing of collected information. The COVID scenario and Gaza genocide showed the government is losing tight control of online discourse and so they are doing more to get back into the driver seat.
Of course it doesn't help that people tell their most secret thoughts to an LLM, but before ChatGPT people did that to Google.
The recent AI advancements do make it easier though to process large amounts of data that is already being collected through existing means, and distill them, which has negative consequences on privacy.
But the distillation power of LLMs can also be used for privacy-preserving purposes, namely local inference. You don't need to go to recipe websites any more, or go to Wikipedia, or Stack Overflow, but can ask your local model. Sadly though, the non-local ones are still distinguishably better than the locally running ones, and this is probably going to stay.
This is almost exactly what I say on the landing page¹ of the product I'm building (an open-source personal database, with an AI assistant on top).
I want to believe this can be a reality, and I'm trying to make it become one, but there are two significant challenges:
1. AI = cloud. Taking my app as an example, it'll be at least 2 years before consumer hardware will be able to run the smallest model that performs somewhat decently (gpt-oss-20b). And of course in 2 years that model will be beyond obsolete. Would a regular user pay the price of a subpar experience in order to get data ownership and privacy? It's a very hard sell.
2. Apps/services are very jealous of their users' data. As a user I have to jump through incredible hoops just to get a point-in-time copy of my data. If I can get that at all. There is no incentive for apps to allow their users to own their data. On the contrary, it's better if they don't, so they remain locked in the app. Also, regular Joe and Jane users are not really asking to have access to their data, because there's no benefit for them either.
That is, I think, the key to overcoming challenge #2: giving regular Joes and Janes an immediate and obvious benefit. If they see that only by owning their data they can do $INCREDIBLY_VALUABLE_THING, then they will themselves start demanding access to it from companies, or they will jump through the hoops to get it. (That's the way I'm going about it. I'm nowhere near the end goal, of course, but I see promising results.²)
I have no idea how to overcome challenge #1 yet. Mainly because currently there aren't really any big downsides to using cloud models. Or, at least, we haven't seen them yet. Maybe if OpenAI starts injecting ads in GPT-8 responses, people will reconsider using a "stupider" but local, ad-free model.
Kindles?