
Project Genie: Experimenting with infinite, interactive worlds

https://blog.google/innovation-and-ai/models-and-research/google-deepmind/project-genie/
380•meetpateltech•6h ago•190 comments

PlayStation 2 Recompilation Project Is Absolutely Incredible

https://redgamingtech.com/playstation-2-recompilation-project-is-absolutely-incredible/
164•croes•4h ago•60 comments

Claude Code daily benchmarks for degradation tracking

https://marginlab.ai/trackers/claude-code/
487•qwesr123•9h ago•252 comments

Grid: Forever free, local-first, browser-based 3D printing/CNC/laser slicer

https://grid.space/stem/
19•cyrusradfar•46m ago•1 comment

Drug trio found to block tumour resistance in pancreatic cancer

https://www.drugtargetreview.com/news/192714/drug-trio-found-to-block-tumour-resistance-in-pancre...
188•axiomdata316•7h ago•90 comments

Flameshot

https://github.com/flameshot-org/flameshot
79•OsrsNeedsf2P•3h ago•33 comments

Compressed Agents.md > Agent Skills

https://vercel.com/blog/agents-md-outperforms-skills-in-our-agent-evals
81•maximedupre•10h ago•39 comments

Launch HN: AgentMail (YC S25) – An API that gives agents their own email inboxes

100•Haakam21•6h ago•120 comments

Where to Sleep in LAX

https://cadence.moe/blog/2025-12-30-where-to-sleep-in-lax
19•surprisetalk•6d ago•7 comments

Retiring GPT-4o, GPT-4.1, GPT-4.1 mini, and OpenAI o4-mini in ChatGPT

https://openai.com/index/retiring-gpt-4o-and-older-models/
119•rd•2h ago•163 comments

The Value of Things

https://journal.stuffwithstuff.com/2026/01/24/the-value-of-things/
43•vinhnx•4d ago•19 comments

Cutting Up Curved Things (With Math)

https://campedersen.com/tessellation
6•ecto•50m ago•0 comments

County pays $600k to pentesters it arrested for assessing courthouse security

https://arstechnica.com/security/2026/01/county-pays-600000-to-pentesters-it-arrested-for-assessi...
234•MBCook•4h ago•121 comments

A lot of population numbers are fake

https://davidoks.blog/p/a-lot-of-population-numbers-are-fake
219•bookofjoe•9h ago•206 comments

Is the RAM shortage killing small VPS hosts?

https://www.fourplex.net/2026/01/29/is-the-ram-shortage-killing-small-vps-hosts/
90•neelc•7h ago•123 comments

Waymo robotaxi hits a child near an elementary school in Santa Monica

https://techcrunch.com/2026/01/29/waymo-robotaxi-hits-a-child-near-an-elementary-school-in-santa-...
259•voxadam•9h ago•460 comments

Show HN: Kolibri, a DIY music club in Sweden

https://kolibrinkpg.com/
25•EastLondonCoder•7h ago•7 comments

The WiFi only works when it's raining (2024)

https://predr.ag/blog/wifi-only-works-when-its-raining/
17•epicalex•2h ago•3 comments

Reflex (YC W23) Senior Software Engineer Infra

https://www.ycombinator.com/companies/reflex/jobs/Jcwrz7A-lead-software-engineer-infra
1•apetuskey•6h ago

EmulatorJS

https://github.com/EmulatorJS/EmulatorJS
80•avaer•6d ago•11 comments

My Mom and Dr. DeepSeek (2025)

https://restofworld.org/2025/ai-chatbot-china-sick/
109•kieto•4h ago•72 comments

How to choose colors for your CLI applications (2023)

https://blog.xoria.org/terminal-colors/
140•kruuuder•8h ago•79 comments

Box64 Expands into RISC-V and LoongArch territory

https://boilingsteam.com/box64-expands-into-risc-v-and-loong-arch-territory/
29•ekianjo•4d ago•2 comments

Deep dive into Turso, the "SQLite rewrite in Rust"

https://kerkour.com/turso-sqlite
93•unsolved73•8h ago•88 comments

Run Clawdbot/Moltbot on Cloudflare with Moltworker

https://blog.cloudflare.com/moltworker-self-hosted-ai-agent/
129•ghostwriternr•8h ago•45 comments

The Hallucination Defense

https://niyikiza.com/posts/hallucination-defense/
33•niyikiza•3h ago•80 comments

US cybersecurity chief leaked sensitive government files to ChatGPT: Report

https://www.dexerto.com/entertainment/us-cybersecurity-chief-leaked-sensitive-government-files-to...
364•randycupertino•7h ago•189 comments

AI's impact on engineering jobs may be different than expected

https://semiengineering.com/ais-impact-on-engineering-jobs-may-be-different-than-initial-projecti...
74•rbanffy•5h ago•134 comments

Usenet personality

https://en.wikipedia.org/wiki/Usenet_personality
61•mellosouls•3d ago•28 comments

Apple buys Israeli startup Q.ai

https://techcrunch.com/2026/01/29/apple-buys-israeli-startup-q-ai-as-the-ai-race-heats-up/
78•ishener•2h ago•28 comments

Retiring GPT-4o, GPT-4.1, GPT-4.1 mini, and OpenAI o4-mini in ChatGPT

https://openai.com/index/retiring-gpt-4o-and-older-models/
117•rd•2h ago

Comments

__loam•2h ago
Last time they tried to do this they got huge pushback from the AI boyfriend people lol
cactusplant7374•2h ago
I wonder if they have run the analytics on how many users are doing that. I would love to see that number.
NitpickLawyer•1h ago
> only 0.1% of users still choosing GPT‑4o each day.

If the 800M MAU figure still holds, that's 800k people.

simonw•1h ago
/r/MyBoyfriendIsAI https://www.reddit.com/r/MyBoyfriendIsAI/ is a whole thing. It's not a joke subreddit.
bananaflag•1h ago
And it's a pity that this highly prevalent phenomenon (to exaggerate a bit: probably the way tech in general will become most influential over the next couple of years) is barely mentioned on HN.
pxc•1h ago
I dunno. Tbf that subreddit has a combination of

  - a large number of incredibly fragile users
  - extremely "protective" mods
  - a regular stream of drive-by posts that regulars there see as derogatory or insulting
  - a fair amount of internal diversity and disagreement
I think discussion on forums larger than it, like HN or popular subreddits, is likely to drive traffic that will ultimately fuel a backfiring effect for the members. It's inevitable, and it's already happening, but I'm not sure it needs to increase.

I do think the phenomenon is a matter of legitimate public concern, but idk how that can best be addressed. Maybe high-quality, long form journalism? But probably not just cross-posting the sub in larger fora.

nomel•1h ago
> highly prevalent phenomenon

Any numbers/reference behind this?

ChatGPT has ~300 million active users a day. A 0.02% rate (the prevalence of delusional disorder) would be 60k people.
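
A quick sanity check of the two back-of-the-envelope figures in this thread (both inputs are the commenters' numbers, not official OpenAI statistics):

  # Prevalence math quoted in this thread; neither input is an OpenAI figure.
  monthly_actives = 800_000_000   # the "800M MAU" number from upthread
  daily_actives = 300_000_000     # "~300 million active users a day"
  still_on_4o = monthly_actives * 0.001          # 0.1% of users -> 800,000
  delusional_disorder = daily_actives * 0.0002   # 0.02% prevalence -> 60,000
  print(f"{still_on_4o:,.0f} vs {delusional_disorder:,.0f}")  # 800,000 vs 60,000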

bananaflag•1h ago
I'm talking about romance, not delusion. Of course, you can consider AI romance a delusion, but it's not included in that percentage you mentioned.
nomel•1h ago
The percentage I mentioned was an example of how a very small prevalence can result in a reasonable number of people, like enough to fill a subreddit, because ChatGPT has a user count that exceeds all but 3 countries of the world.

Again, do you have anything behind this "highly prevalent phenomenon" claim?

pxc•1h ago
The range of attitudes in there is interesting. There are a lot of people who take a fairly sensible "this is interactive fiction" kind of attitude, and there are others who bristle at any claim or reminder that these relationships are fictitious. There are even people with human partners who have "married" one or more AIs.
unethical_ban•1h ago
IIRC you'll get modded or banned for being critical of the use case. Which is their "right", but it's freaking weird.
chasd00•39m ago
do you think they know they're just one context reset away from the llm not recognizing them at all and being treated like a stranger off the street? For someone mentally ill and somehow emotionally attached to the context it would be... jarring to say the least.
hamdingers•12m ago
Many of them are very aware of how LLMs work, they regularly interact with context limits and there have been threads about thoughtfully pruning context vs letting the LLM compact, making backups, etc.

Their hobby is... weird, but they're not stupid.

ragazzina•1h ago
>It's not a joke subreddit.

Spend a day on Reddit and you'll quickly realize many subreddits are just filled with lies.

unethical_ban•1h ago
Any sub that is based on storytelling or reposting memes, videos, etc. is a karma farm full of lies.

Most subs that are based on politics or current events are at best biased, at worst completely astroturf.

The only subs that I think still have mostly legit users are municipal subs (which still get targeted by bots when anything political comes up) and hobby subs where people show their works or discuss things.

moomoo11•1h ago
Those people need to be uploaded into the Matrix and the data servers sent far, deep into space.
leumon•1h ago
Well, now you can unlock an 18+ version for sexual role-play, so I guess it's the other way around.
ora-600•2h ago
I can't see o3 in my model selector either?

RIP

MagicMoonlight•2h ago
That’s really going to upset the crazies.

Despite 4o being one of the worst models on the market, they loved it. Probably because it was the most insane and delusional. You could get it to talk about really fucked up shit. It would happily tell you that you are the messiah.

BeetleB•1h ago
It was the first model I used that was half decent at coding. Everyone remembers their gateway drug.
giancarlostoro•1h ago
I wonder if it will still be up on Azure? How much do you think I could make if I set up 4o under a domain like yourgirlfriendis.ai or w/e

Note: I wouldn't actually; I find it terrible to prey on people.

lifetimerubyist•1h ago
ChatGPT Made Me Delusional: https://www.youtube.com/watch?v=VRjgNgJms3Q

Should be essential watching for anyone who uses these things.

patrickmcnamara•1h ago
The reaction to its original removal on Instagram Reels, r/ChatGPT, etc., was genuinely so weird and creepy. I didn't realise before this how many people had genuine parasocial (?) relationships with these LLMs.
pks016•45m ago
I was mostly using 4o for academic searches and planning. It was the best model for me. Based on the context I was giving and the questions I was asking, 4o was the most consistent model.

It used to get things wrong for sure but it was predictable. Also I liked the tone like everyone else. I stopped using ChatGPT after they removed 4o. Recently, I have started using the newer GPT-5 models (got free one month). Better than before but not quite. Acts way over smart haha

simonw•1h ago
> [...] the vast majority of usage has shifted to GPT‑5.2, with only 0.1% of users still choosing GPT‑4o each day.
SecretDreams•1h ago
What's the default model when a random user goes to use the chatgpt website or app?
mrec•1h ago
5.2 on the website. You can see what was used for a specific response by hovering over the refresh icon at the end.
AlexeyBrin•1h ago
On the paid version it is 5.2.
bananaflag•1h ago
5.2.

You can go to chatgpt.com and ask "what model are you" (it doesn't hallucinate on this).

SecretDreams•1h ago
There's probably a relationship between what the default is and which model is being used the most. It's more about what OAI sets than what users care about. The flip side is that "good enough is good enough" for most users.
johndough•1h ago
> (it doesn't hallucinate on this)

But how do we know that you did not hallucinate the claim that ChatGPT does not hallucinate its version number?

We could try to exfiltrate the system prompt which probably contains the model name, but all extraction attempts could of course be hallucinations as well.

(I think there was an interview with Sam Altman or someone else at OpenAI where it was mentioned that they hardcoded the model name in the prompt because people did not understand that models don't work like that, so they made it work. I might be hallucinating though.)

razodactyl•1h ago
Confabulating* If you were hallucinating we would be more amused :)
lifetimerubyist•1h ago
won't somebody think of the goonettes?!
deciduously•1h ago
This was not a word I was prepared to learn about today.
navigate8310•43m ago
https://old.reddit.com/r/myboyfriendisai/top?t=all
fpgaminer•1h ago
Well yeah, because 5.2 is the default and there's no way to change the default. So every time you open up a new chat you either use 5.2 or go out of your way to select something else.

(I'm particularly annoyed by this UI choice because I always have to switch back to 5.1)

arrowsmith•1h ago
What about 5.1 do you prefer over 5.2?
fpgaminer•28m ago
As far as I can tell 5.2 is the stronger model on paper, but it's been optimized to think less and do less web searches. I daily drive Thinking variants, not Auto or Instant, and usually want the _right_ answer even if it takes a minute. 5.1 does a very good job of defensively web searching, which avoids almost all of its hallucinations and keeps docs/APIs/UIs/etc up-to-date. 5.2 will instead often not think at all, even in Thinking mode. I've gotten several completely wrong, hallucinated answers since 5.2 came out, whereas maybe a handful from 5.1. (Even with me using 5.2 far less!)

The same seems to persist in Codex CLI, where again 5.2 doesn't spend as much time thinking so its solutions never come out as nicely as 5.1's.

That said, 5.1 is obviously slower for these reasons. I'm fine with that trade off. Others might have lighter workloads and thus benefit more from 5.2's speed.

adamiscool8•1h ago
0.1% of users is not necessarily 0.1% of conversations…
jedbrooke•1h ago
I still don’t know how openAI thought it was a good idea to have a model named "4o" AND a model named "o4", unless the goal was intentional confusion
afro88•1h ago
Considering how many people say ChatGTP too
uh_uh•1h ago
The other day I heard ChatGBD.
ben_w•1h ago
Have you heard Boris Johnson's version?

https://m.youtube.com/shorts/JAVMEs5CG1Y

lichenwarp•1h ago
I'm gonna watch this again about 5 times because it's so fucking funny
mandeepj•1h ago
The comments have their own overdose of deliciousness. Clicking to look at them never disappoints :-)
razodactyl•1h ago
This one was great hahaha
tweakimp•1h ago
My favourite is ChatJippiddy
Imustaskforhelp•1h ago
Do you watch Primeagen by any chance?

A fellow Primeagen viewer spotted.

lifetimerubyist•1h ago
Or just "gippity" for short.
tibbydudeza•1h ago
The Primeagen :).
bee_rider•1h ago
ChatGDP, because a country's worth of money was spent to train it.
Insanity•1h ago
I’ve been hearing that consistently from a friend; I gave up on correcting them because “ChatGPT” just wouldn’t stick
throw-the-towel•1h ago
I still don't like how French people don't call it "chat j'ai pété" (sounds like "ChatGPT" in French and literally means "cat, I farted").
adzm•1h ago
GTP goes forward from the middle, teeth, then lips, as compared to GPT which goes middle, lips, teeth; you'll see this pattern happen with a lot of words in linguistic history
cryptoz•1h ago
Even more than that, I've seen a lot of people confuse 4 and 4o, probably because 4o sounds like a shorthand for 4.0 which would be the same thing as 4.
mimischi•1h ago
Come to think of it, maybe they had a play on 4o being “40”, and o4-mini being “04”, and having to append the “mini” to bring home the message of 04<40
Someone1234•1h ago
Even ChatGPT (and certainly Google) confuses the names.

I'm sure there is some internal/academic reason for them, but to an outside observer they're simply horrible.

jsheard•1h ago
Wasn't "ChatGPT" itself only supposed to be a research/academic name, until it accidentally broke containment and they ended up having to roll with it? The naming was cursed from the start.
nipponese•1h ago
When picking a fight with product marketing, just don't.
razodactyl•1h ago
How many times have you noticed people confusing the name itself: ChatGBT, ChatGTP etc.

We're the technical crowd cursed and blinded by knowledge.

recursive•1h ago
"4o" was bad to begin with, as "four-oh" is a common verbalization of "4.0".
pdntspa•1h ago
It's almost always marketing and some stupid idea someone there had. I don't know why non-technical people try to claim so much ownership over versioning. You nearly always end up with these ridiculous outcomes.

"I know! Let's restart the version numbering for no good reason!" becomes DOOM (2016), Mortal Kombat 1 (2023), Battlefield 1 (2016), Xbox One (not to be confused with the original Xbox 1)

As another example, look at how much of a trainwreck USB 3 naming has become

Or how Nvidia restarted Geforce card numbering

recursive•1h ago
Xbox should be in the hall of fame for terrible names.

There's also Xbox One X, which is not in the X series. Did I say that right? Playstation got the version numbers right. I couldn't make names as incomprehensible as Xbox if I tried.

femiagbabiaka•1h ago
There will be a lot of mentally unwell people unhappy with this, but this is a huge net positive decision, thank goodness.
haunter•1h ago
Which one is the AI boyfriend model? Tumblr, Twitter, and reddit will go crazy
goldenarm•1h ago
4o is the most popular one for that
NewsaHackO•1h ago
>We brought GPT‑4o back after hearing clear feedback from a subset of Plus and Pro users, who told us they needed more time to transition key use cases, like creative ideation, and that they preferred GPT‑4o’s conversational style and warmth.

This does verify the idea that OpenAI does not make models sycophantic as an attempt to subvert users by buttering them up so that they use the product more; it's because people actually want AI to talk to them like that. To me, that's insane, but they have to play the market I guess

Scene_Cast2•1h ago
As someone who's worked with population data, I found that there is an enormous rift between reported opinion (including HN and Reddit opinion) and population preferences revealed through experimentation.
toss1•1h ago
Sounds both true and interesting. Any particularly wild and/or illuminating examples you can share more detail on?
hnuser123456•1h ago
The "my boyfriend is AI" subreddit.

A lot of people are lonely and talking to these things like a significant other. They value roleplay instruction following that creates "immersion." They tell it to be dark and mysterious and call itself a pet name. GPT-4o was apparently their favorite because it was very "steerable." Then the news broke that people were doing this, some of them going off the deep end with it, so they had to tone the steerability back a bit with 5, and these users seem to say 5 breaks immersion with more safeguards.

jaggederest•40m ago
My favorite, somewhat off-topic example of this is some qualitative research I was building the software for a long time ago.

The difference between the responses and the pictures was illuminating, especially in one study in particular: you'd ask people "how do you store your lunch meat" and they'd say "in the fridge, in the crisper drawer, in a ziploc bag", and when you asked them to take a picture of it, it was just ripped open and tossed in anywhere.

This apparently horrified the lunch meat people ("But it'll get all crusty and dried out!", to paraphrase). That study and ones like it are the reason lunch meat now comes with disposable containers, or is resealable, instead of just a tear-to-open packet. Every time I go grocery shopping it's an interesting experience knowing that specific thing is, in a small way, a result of some of the work I did a long time ago.

cm2012•1h ago
This is why I work in direct performance advertising. Our work reveals the truth!
make3•1h ago
Your work exploits people's addictive propensity and behaviours, and gives corporations incentives and tools to build on that.

Insane spin you're putting on it. At best, you're a cog in one of the worst recent evolutions of capitalism.

marrone12•1h ago
Advertising is not a recent evolution of capitalism; it's a foundational piece of it. Whatever you do as a job would not exist if no one were marketing it. This hostility seems insane.
12345ieee•1h ago
The early theorists of capitalism didn't imagine that advanced psychology (that didn't even exist back then) would be used to convince people to buy $product.

Messages of that sophistication are always dangerous, and modern advertising is the most widespread example of it.

The hostility is more than justified, I can only hope the whole industry is regulated downwards, even if whatever company I work for sells less.

eru•29m ago
> Messages of that sophistication [...]

By demonising them, you are making ads sound way more glamorous than they are.

losteric•58m ago
Advertising always seems like a prisoner’s dilemma. If no one advertised, people would still buy things.
cm2012•24m ago
Yes, but the advantage would shift much more towards incumbents
q3k•55m ago
Not having my job would be a tiny price to pay compared to the benefit of living in a world with no advertisements.
DetroitThrow•42m ago
>it's a foundational piece of it

No it's not

cm2012•28m ago
Exploitative ads are a small minority. I also think gambling advertising should be banned.
make3•1h ago
Exactly. That sounds to me like a TikTok vs NPR/books thing: people tell everyone what they read, then go spend 11 hours watching TikToks until 2am.
Macha•22m ago
I always thought the idea that "revealed preferences" are true preferences discounts the fact that people often make decisions they would rather not make. It's like the whole idea that if you're on a diet, it's easier to not have junk food in the house to begin with than to have junk food and not eat more than your target amount. Are you saying these people want to put on weight? Or is it just that they've been put in a situation that defeats their impulse control?

I feel a lot of the "revealed preference" stuff in advertising is similar: advertisers find that if they get past the easier barriers users put in place, it's easier to sell them stuff that, at a higher level, the users do not want.

9x39•1h ago
I thought this was almost entirely due to the AI personality splinter groups (trying to be charitable) like /r/myboyfriendisai and the wrapper apps, who vocally let them know they used those models the last time they sunset them.
cornonthecobra•1h ago
Put on a good show, offer something novel, and people will gleefully march right off a cliff while admiring their shiny new purchase.
PlatoIsADisease•1h ago
You're absolutely right. You're not imagining it. Here is the quiet truth:

You're not imagining it, and honestly? You're not broken for feeling this—it's perfectly natural as a human to have this sentiment.

cj•1h ago
I was one of those pesky users who complained when o3 suddenly was unavailable.

When 5.2 was first launched, o3 did a notably better job at a lot of analytical prompts (e.g. "Based on the attached weight log and data from my calorie tracking app, please calculate my TDEE using at least 3 different methodologies").

o3 frequently used tables to present information, which I liked a lot. 5.2 rarely does this - it prefers to lay out information in paragraphs / blog post style.

I'm not sure if o3 responses were better, or if it was just the format of the reply that I liked more.

If it's just a matter of how people prefer to have their information presented, that should be something LLMs are equipped to adapt to at a user-by-user level based on preferences.
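
For reference, the TDEE methodologies a prompt like that leans on are just simple closed-form formulas; a minimal sketch of two common ones (Mifflin-St Jeor and Katch-McArdle) with illustrative inputs, nothing taken from my actual logs or from what o3/5.2 produced:

  # Two common TDEE estimates; all inputs below are illustrative placeholders.
  def mifflin_st_jeor_bmr(weight_kg, height_cm, age_yr, male=True):
      # Resting metabolic rate in kcal/day
      return 10 * weight_kg + 6.25 * height_cm - 5 * age_yr + (5 if male else -161)

  def katch_mcardle_bmr(lean_mass_kg):
      # Based on lean body mass, kcal/day
      return 370 + 21.6 * lean_mass_kg

  ACTIVITY = {"sedentary": 1.2, "light": 1.375, "moderate": 1.55, "heavy": 1.725}

  bmr = mifflin_st_jeor_bmr(weight_kg=80, height_cm=180, age_yr=35)
  print(round(bmr * ACTIVITY["moderate"]))  # TDEE = BMR x activity factor, ~2720 here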

pdntspa•1h ago
I thought it was based on the user thumbs-up and thumbs-down reactions; it evolving the way that it does makes it pretty obvious that users want their asses licked
josephg•1h ago
They have added settings for this now - you can dial up and down how “warm” and “enthusiastic” you want the models to be. I haven’t done back to back tests to see how much this affects sycophancy, but adding the option as a user preference feels like the right choice.

If anyone is wondering, the setting for this is called Personalisation in user settings.

SeanAnderson•42m ago
This doesn't come as too much of a surprise to me. Feels like it mirrors some of the reasons why toxic positivity occurs in the workplace.
GaggiX•1h ago
If people want an AI as a boyfriend at least they should use one that is open source.

If you disagree with something, you can also train a LoRA.
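
A minimal sketch of what that looks like with Hugging Face transformers + peft, assuming some open-weights chat model; the model name, target modules, and hyperparameters are placeholders, not recommendations:

  # Hypothetical LoRA fine-tune setup; only the adapter weights train, the base model stays frozen.
  from transformers import AutoModelForCausalLM, AutoTokenizer
  from peft import LoraConfig, get_peft_model

  base = "some-org/some-open-weights-chat-model"   # placeholder
  tokenizer = AutoTokenizer.from_pretrained(base)
  model = AutoModelForCausalLM.from_pretrained(base)

  lora_cfg = LoraConfig(
      r=16, lora_alpha=32, lora_dropout=0.05,
      target_modules=["q_proj", "v_proj"],  # which modules get adapters varies by architecture
      task_type="CAUSAL_LM",
  )
  model = get_peft_model(model, lora_cfg)
  model.print_trainable_parameters()

  # ...then train with your usual loop (or trl's SFTTrainer) on your own chat transcripts.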

jaggederest•1h ago
I think this kind of thing is a pretty strong argument for the entire open source model ecosystem, not just open weights but open data and the whole gamut.
fpgaminer•1h ago
I wish they would keep 4.1 around for a bit longer. One of the downsides of the current reasoning based training regimens is a significant decrease in creativity. And chat trained AIs were already quite "meh" at creative writing to begin with. 4.1 was the last of its breed.

So we'll have to wait until "creativity" is solved.

Side note: I've been wondering lately about a way to bring creativity back to these thinking models. For creative writing tasks you could add the original, pretrained model as a tool call. So the thinking model could ask for its completions and/or query it and get back N variations. The pretrained model's completions will be much more creative and wild, though often incoherent (think back to the GPT-3 days). The thinking model can then review these and use them to synthesize a coherent, useful result. Essentially giving us the best of both worlds. All the benefits of a thinking model, while still giving it access to "contained" creativity.
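
Roughly the shape I have in mind, sketched against an OpenAI-compatible client; both model names are placeholders, and I've written it as a plain two-stage pipeline rather than an actual tool call:

  # Sketch: use a raw pretrained (base) model as a "creativity source" for a reasoning model.
  from openai import OpenAI

  client = OpenAI()

  def wild_continuations(prompt: str, n: int = 5) -> list[str]:
      # High-temperature completions from a base model: varied and wild, often incoherent.
      resp = client.completions.create(
          model="some-base-model",             # placeholder
          prompt=prompt, n=n, temperature=1.2, max_tokens=300,
      )
      return [c.text for c in resp.choices]

  def synthesize(prompt: str) -> str:
      drafts = "\n---\n".join(wild_continuations(prompt))
      # The reasoning model reviews the raw drafts and writes one coherent continuation.
      resp = client.chat.completions.create(
          model="some-reasoning-model",        # placeholder
          messages=[{"role": "user", "content":
                     f"Rough drafts:\n{drafts}\n\nWrite one coherent continuation of: {prompt}"}],
      )
      return resp.choices[0].message.content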

MillionOClock•1h ago
My theory, based on what I would see with non-thinking models, is that as soon as you start detailing something too much (i.e. not just "speak in the style of X" but "speak in the style of X with [a list of adjectives detailing the style of X]"), they lose creativity, stop fitting the style very well, etc. I don't know how things have evolved with new training techniques, but I suspected that overthinking their tasks by detailing too much what they have to do can lower quality in some models for creative tasks.
perardi•1h ago
Have you tried the relatively recent Personalities feature? I wonder if that makes a difference.

(I have no idea. LLMs are infinite code monkeys on infinite typewriters for me, with the occasional “how do I evolve this Pokémon” utility. But worth a shot.)

tom1337•1h ago
Would be cool if they'd release the weights for these models so users could now use them locally.
WorldPeas•1h ago
They'd only do that if they were some kind of open ai company /s
amelius•1h ago
lol :)
tgtweak•1h ago
gpt-oss is pretty great tbh - one of the better all-around local models for knowledge and grounding.
IhateAI•1h ago
Why would someone want to spend half a million dollars on GPUs and components (if not more) to run one-year-old models that genuinely aren't useful? You can't self-host trillion-parameter models unless you own a datacenter lol (or want to just light money on fire).
tom1337•1h ago
Are the mini / omni models really trillion parameter models?
IhateAI•55m ago
I don't think so, but you're still looking at a giant investment that can't really be justified for their capability.
jostmey•1h ago
I noticed how ChatGPT got progressively worse at helping me with my research. I gave up on ChatGPT 5 and just switched to Grok and Gemini. I couldn’t be happier that I switched.
amelius•1h ago
Why not Claude?
jostmey•1h ago
I personally find Claude the best at coding, but its usefulness doesn’t seem to extend to scientific research and writing
esperent•1h ago
The limits on the $20 plan are too low compared to Gemini and ChatGPT. They're too low to do any serious work at all.
650REDHAIR•1h ago
Because I’m sick of paying $20 for an hour of claude before it throttles me.
azan_•1h ago
It's amazing how different the experiences of different people are. To me every new version of ChatGPT was an improvement, and Gemini is borderline unusable.
tgtweak•1h ago
Very curious for what use cases you're finding gemini unusable.
azan_•47m ago
Scientific research and proofreading. Gemini is the laziest LLM I've used. Frequently he will lie that he searched for something and just make stuff up; that basically never happens to me when I'm using GPT-5.2.
flexagoon•26m ago
Do you use it directly? I've only used it through Kagi Assistant but it works better than any other model for me
azan_•23m ago
Yes, only directly (I mean through the default gemini interface, not API).
double0jimb0•45m ago
In my experience with Gemini, I find it incapable of not hallucinating.
farcitizen•4m ago
I've had the same experience. Don't get how people are saying Gemini is so good.
sundarurfriend•1h ago
ChatGPT 5.2 has been a good motivator for me to try out other LLMs because of how bad it is. Both 5.1 and 5.2 have been downgrades in terms of instruction following and accuracy, but 5.2 especially so. The upside is that that's had me using Claude much more, and I like a lot of things about it, both in terms of UI and the answers. It's also gotten me more serious about running local models. So, thank you OpenAI, for forcing me to broaden my horizons!
PlatoIsADisease•1h ago
nah bruh you are just imagining it.

It's just as good as ever /s

orphea•58m ago
Have you had a chance to compare with Gemini 3?
leumon•1h ago
> We’re continuing to make progress toward a version of ChatGPT designed for adults over 18, grounded in the principle of treating adults like adults, and expanding user choice and freedom within appropriate safeguards. To support this, we’ve rolled out age prediction for users under 18 in most markets. https://help.openai.com/en/articles/12652064-age-prediction-...

interesting

kace91•1h ago
What’s the goal there? Sexting?

I’m guessing age is needed to serve certain ads and the like, but what’s the value for customers?

jandrese•1h ago
If you don't think the potential market for AI sexbots is enormous you have not paid attention to humanity.
leumon•1h ago
according to the age-prediction page, the changes are:

> If [..] you are under 18, ChatGPT turns on extra safety settings. [...] Some topics are handled more carefully to help reduce sensitive content, such as:

- Graphic violence or gore

- Viral challenges that could push risky or harmful behavior

- Sexual, romantic, or violent role play

- Content that promotes extreme beauty standards, unhealthy dieting, or body shaming

jacquesm•58m ago
Porn has driven just about every bit of progress on the internet; I don't see why AI would be the exception to that rule.
elevation•44m ago
Even when you're making PG content, the general propriety limits of AI can hinder creative work.

The "Easter Bunny" has always seemed creepy to me, so I started writing a silly song in which the bunny is suspected of eating children. I had too many verses written down and wanted to condense the lyrics, but found LLMs telling me "I cannot help promote violence towards children." Production LLM services would not help me revise this literal parody.

Another day I was writing a romantic poem. It was abstract and colorful, far from a filthy limerick. But when I asked LLMs for help encoding a particular idea sequence into a verse, the models refused (except for grok, which didn't give very good writing advice anyway.)

estimator7292•18m ago
Just today I asked how to shut down a Mac with "maximal violence". I was looking for the equivalent of "systemctl shutdown -f -f" and it refused to help me do violence.

Believe me, the Mac deserved it.

shmel•10m ago
It reminds me of that story about a teenager learning Rust who got a refusal because he had asked about "unsafe" code =)
robotnikman•41m ago
There is a subreddit called /r/myboyfriendisAI; you can look through it and see for yourself.
ekianjo•21m ago
There is a huge book market for sexual stories, in case you were not aware.
chilmers•1h ago
Sexual and intimate chat with LLMs will be a huge market for whoever corners it. They'd be crazy to leave that money on the table.
thayne•1h ago
If your goal is to make money, sure. If your goal is to make AI safe, not so much.
koakuma-chan•59m ago
It will be an even bigger market when robotics are sufficiently advanced.
palmotea•52m ago
That's why laws against drugs are so terrible: they force law-abiding businesses to leave money on the table. Repeal the laws and I'm sure there will be tons of startups to profit off of drug addiction.
georgemcbay•46m ago
> Repeal the laws and I'm sure there will be tons of startups to profit off of drug addiction.

Worked for gambling.

(Not saying this as a message of support. I think legalizing/normalizing easy app-based gambling was a huge mistake and is going to have an increasingly disastrous social impact).

LPisGood•6m ago
Why do you think it will be increasingly bad? It seems to me like it’s already as bad as it’s capable of getting.
chilmers•40m ago
There are many companies making money off alcohol addiction, video game addiction, porn addiction, food addiction, etc. Should we outlaw all these things? Should we regulate them and try to make them safe? If we can do that for them, can't we do it for AI sex chat?
shmel•11m ago
what about laws against porn? Oh, wait, no, that's a legitimate business.
ekianjo•22m ago
That market is for local models right now.
thayne•1h ago
It says what to do if you are over 18 but it thinks you are under 18. But what if it identifies someone under 18 as being older?

And what if you are over 18, but don't want to be exposed to that "adult" content?

> Viral challenges that could push risky or harmful behavior

And

> Content that promotes extreme beauty standards, unhealthy dieting, or body shaming

Seem dangerous regardless of age.

GoatInGrey•56m ago
Pornographic use has long been the "break glass in case of emergency" for the LLM labs when it comes to finances.

My personal opinion is that while smut won't hurt anyone in and of itself, LLM smut will have weird and generally negative consequences, as it will be crafted specifically for you, on top of the intermittent-reinforcement component of LLM generation.

estimator7292•20m ago
While this is a valid take, I feel compelled to point out Chuck Tingle.

The sheer amount and variety of smut books (just books) is vastly larger than anyone wants to realize. We passed the mark decades ago where there is smut available for any and every taste. Like, to the point that even LLMs are going to take a long time to put a dent in the smut market. Humans have been making smut for longer than we've had writing.

But again, I don't think you're wrong; the scale of the problem is just way distorted.

MBCook•9m ago
That’s all simple one-way consumption though. I suspect the effect on people is very different when it’s interactive in the way an LLM can be, which we’ve never had to reckon with before.

That’s where the danger may lie.

pixl97•4m ago
Alien 1: "How did the earthlings lose control of their own planet?"

Alien 2: "AI generated porn"

chasd00•48m ago
eh there's an old saying that goes "no Internet technology can be considered a success until it has been adopted by (or in this case integrated with) the porn industry".
europeanNyan•1h ago
After they pushed the limits on the Thinking models to 3000 per week, I haven't touched anything else. I am really satisfied with their performance, and the 200k context window is quite nice.

I'd been using Gemini exclusively for the 1 million token context window, but went back to ChatGPT after the limits were raised and created a Project system for myself which allows me to have much better organization with Projects + only Thinking chats (big context) + project-only memory.

Also, it seems like Gemini is really averse to googling (which is ironic by itself), while ChatGPT, at least in the Thinking modes, loves to look up current and correct info. If I ask something a bit more involved in Extended Thinking mode, it will think for several minutes and look up more than 100 sources. It's really good, practically a Deep Research inside of a normal chat.

tgtweak•1h ago
I find Gemini does the most searching (and the quickest... regularly pulls 70+ search results on a query in a matter of seconds, likely due to Googlebot's cache of pretty much every page). ChatGPT seems to only search if you have it in thinking/research mode now.
toxic72•1h ago
I REALLY struggle with Gemini 3 Pro refusing to perform web searches / getting combative with the current date. Ironically their flash model seems much more likely to opt for web search for info validation.

Not sure if others have seen this...

I could attribute it to:

1. It's a known quantity with the pro models (I recall that the pro/thinking models from most providers were not immediately equipped with web search tools when they were originally released)

2. Google wants you to pay more for grounding via their API offerings vs. including it out of the box

eru•27m ago
Gemini refused to believe that I was using MacOS 26.
jackblemming•1h ago
They should open source GPT-4o.
ClassAndBurn•1h ago
They will have to update the openai.com footer, I guess.

Latest Advancements

GPT-5

OpenAI o3

OpenAI o4-mini

GPT-4o

GPT-4o mini

Sora

tgtweak•1h ago
5.2 is back to being a sycophantic hallucinating mess for most use cases - I've anecdotally caught it out on many of the sessions I've had where it apologizes "You're absolutely right... that used to be the case but as of the latest version as you pointed out, it no longer is." when it never existed in the first place. It's just not good.

On the other hand - 5.0-nano has been great for fast (and cheap) quick requests and there doesn't seem to be a viable alternative today if they're sunsetting 5.0 models.

I really don't know how they're measuring improvements in the model since things seem to have been getting progressively worse with each release since 4o/o4 - Gemini and Opus still show the occasional hallucination or lack of grounding but both readily spend time fact-checking/searching before making an educated guess.

I've had ChatGPT blatantly lie to me and say there are several community posts and Reddit threads about an issue; then, after failing to find them, I asked it where it found those and it flat out said "oh yeah it looks like those don't exist"

650REDHAIR•1h ago
That’s been my experience and it has led to hours of wasted time. It’s faster for me to read through docs and watch YouTube.

Even if I submit the documentation or reference links they are completely ignored.

perardi•1h ago
OK, everyone is (rightly) bringing up that relatively small but really glaringly prominent AI boyfriend subreddit.

But I think a lot more people are using LLMs for relationship surrogates than that (pretty bonkers) subreddit would suggest. Character AI (https://en.wikipedia.org/wiki/Character.ai) seems quite popular, as do the weird fake friend things in Meta products, and Grok’s various personality mode and very creepy AI girlfriends.

I find this utterly bizarre. LLMs are peer coders in a box for me. I care about Claude Code, and that’s about it. But I realize I am probably in the vast minority.

razodactyl•1h ago
We're very echo-chambered here. That graph OpenAI released had coding at 4% or something.
siquick•1h ago
Two weeks' notice to migrate to a different style of model (“normal” 4.1-mini to reasoning 5.1) is bad form.
htrp•54m ago
Sora + OpenAI voice Cloning + AdultGPT = Virtual Girlfriend/Boyfriend

(Upgrade for only 1999 per month)

thedudeabides5•41m ago
Will this nuke my old convos?

Opus 4.5 is better than GPT at everything except code execution (but with Pro you get a lot of Claude Code usage), and if they nuke all my old convos I'll prob downgrade from Pro to free

renewiltord•26m ago
Oh good. Not in the API. The 4o-mini is super cheap and useful for a bunch of things I do (evaluating post vector-search for relevancy).
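
That post-vector-search relevance check is basically a cheap yes/no judgment per hit; a minimal sketch with the OpenAI Python SDK, where the query, documents, and prompt wording are made up for illustration:

  # Sketch: grade vector-search hits for relevance with a small, cheap model.
  from openai import OpenAI

  client = OpenAI()

  def is_relevant(query: str, document: str) -> bool:
      resp = client.chat.completions.create(
          model="gpt-4o-mini",
          messages=[{
              "role": "user",
              "content": f"Query: {query}\n\nDocument: {document}\n\n"
                         "Answer with exactly one word, YES or NO: is the document relevant to the query?",
          }],
          temperature=0,
      )
      return resp.choices[0].message.content.strip().upper().startswith("YES")

  # Filter the top-k hits from the vector store before using them.
  hits = ["doc about RAM prices", "doc about sourdough starters"]
  relevant = [d for d in hits if is_relevant("VPS hosting costs", d)]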