Updates to Advanced Voice Mode for paid users

https://help.openai.com/en/articles/6825453-chatgpt-release-notes
33•mfiguiere•4h ago

Comments

kubb•4h ago
I have the feeling that the Advanced Voice Mode is significantly worse than when I used it earlier this week. The voice sounds disinterested, and has weird intonation. It used to be excellent for foreign language conversation practice, now significantly worse.

Edit: After using up my 15 minutes for testing, I have to say that the new voice is actually not bad, although I was used to something else. But it has a very clear "artificial" quality to it. It also sometimes misinterprets my input as something completely different than what I said, for example "please like my video and subscribe to my channel".

bigshot•3h ago
There’s a 15 minute limit?
kubb•3h ago
In the Plus subscription, yes. You can also pay 200 dollars per month for Pro, and in that plan, the advanced voice mode is unlimited. 200 bucks is quite a lot, I've gotta say. I wish there was a middle ground option, but even for the 20 dollars for Plus, they should give you more than 15 minutes.
vunderba•3h ago
Is this new? I'm on the Plus plan and just a few days ago carried on a conversation for around 45 minutes while on a walk with my dog.

Agreed though, the new voice's accent (at least for Sol) sounds significantly degraded, particularly when conversing in Chinese.

kubb•3h ago
Apparently it's 6 months old [1]. You might be using the standard voice mode (the advanced one has just 1 voice IIUC).

[1] https://www.reddit.com/r/OpenAI/comments/1hdamrm/so_advanced...

vunderba•3h ago
Thanks. OpenAI's docs are frustratingly vague about the whole thing. It seems (assuming the 15 minute hard limit holds true) that I must have been conversing with advanced mode for 15 minutes since Advanced is the default for Plus subscribers on the mobile app, and then it must have possibly handed it off to the standard voice mode after that.

Advanced https://help.openai.com/en/articles/9617425-advanced-voice-m...

Standard https://help.openai.com/en/articles/8400625-voice-mode-faq

arthurcolle•2h ago
No, advanced voice mode has multiple voices.
doctorhandshake•40m ago
Stumbled across the new voice this afternoon after months of not using voice mode. I was impressed by the naturalness but let down by the disinterested tone. That, combined with the platitudes and the tendency to repeat back what I was saying without adding new information, left me disappointed with the update.
tallytarik•4h ago
> Additionally, rare hallucinations in Voice Mode persist with this update, resulting in unintended sounds resembling ads, gibberish, or background music.

This would be really funny if it weren’t real life.

zaptrem•4h ago
> Additionally, rare hallucinations in Voice Mode persist with this update, resulting in unintended sounds resembling ads, gibberish, or background music. We are actively investigating these issues and working toward a solution.

Would be cool to hear some samples of this. I remember there was some hallucinated background music during the meditation demo in the original reveal livestream but haven't seen much beyond that. Artifact of training on podcasts to get natural intonation.

transcriptase•2h ago
I use advanced voice a lot and have come across many weird bugs.

1) Every response would be normal except that it ended with a “whoosh”, like one of those sound effects some mail clients use when a message is sent, and the model itself either couldn’t or wouldn’t acknowledge it.

2) The same, except with someone knocking on a door, like something someone would play on a soundboard.

3) The entire history in the conversation disappearing after several minutes of back and forth, leading to the model having no idea what I’m talking about and acting as if it’s a fresh conversation.

4) Advanced voice mode stuttering because it hears its own voice and thinks it’s me interrupting (on a brand new iPhone 16 Pro, medium-low built in speaker volume and built-in mic).

5) Really weird changes in pronunciation or randomly saying certain words high-pitched, or suddenly using a weird accent.

And all of this was prior to these most recent changes.

It also stutters and repeats sometimes and says poor connection even though I know the connection is near-ideal.

zaptrem•2h ago
I may know why that first one happens! They’re not correctly padding the latent in their decoder (by default torch pads with zeros, they should pad with whatever their latent’s representation of silence is). You can hear the same effect in songs generated with our music model: https://sonauto.ai/

Yeah we’re too lazy to fix it too

transcriptase•1h ago
I’m super curious now: how does padding lead to repeatedly ending TTS replies with what seems to be an actual non-speech sound effect?
Centigonal•1h ago
If you pad your output with something that doesn't represent silence, then any outputs that happen to have a non-standard length (i.e. nearly all outputs) will end with whatever sound your padding bits represent in the model's embedding space. If "0000" represents "whoosh," then most of your outputs will end in "whoosh."

Here's a non-AI example: If all HN comments had to be some multiple of 50 characters long and comments were padded with the letter "A," then most HN comments would look like the user was screaming at the end. AAAAAAAAAAAAAAAAAA
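The comment analogy above can be sketched in a few lines of Python. This is only an illustration of the padding idea, not anything from a real audio pipeline; the function name `pad_to_multiple` and the sample comment are made up for the example.

```python
def pad_to_multiple(text: str, block: int = 50, fill: str = "A") -> str:
    """Right-pad text with `fill` so its length is a multiple of `block`."""
    remainder = len(text) % block
    if remainder == 0:
        return text  # already a whole number of blocks, nothing to add
    return text + fill * (block - remainder)


# Hypothetical comment; almost any real comment has a "non-standard" length.
comment = "Nice write-up, thanks for sharing."

# Padding with a visible character: the comment ends in "screaming".
shouting = pad_to_multiple(comment, fill="A")

# Padding with the text equivalent of silence: the tail is invisible.
quiet = pad_to_multiple(comment, fill=" ")

print(repr(shouting))
print(repr(quiet))
```

Both results have the same length; the only difference is whether the filler is something the reader perceives.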

zaptrem•1h ago
In addition to what Centigonal said, even if the autoencoder was trained on only speech data, an all-zero vector is probably just out of distribution (the decoder has never seen it before) and causes weird sounds. However, given the hallucinations we're seeing, the AE has (maybe unintentionally) likely seen a bunch of non-speech data like music and sound effects too.
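To make the zeros-versus-silence point concrete, here is a toy sketch in plain Python. It is an assumption-heavy stand-in, not OpenAI's or Sonauto's decoder: the "decoder" is a made-up affine map, and `WEIGHTS`, `BIAS`, `SILENCE_LATENT`, `decode`, and `pad_frames` are all hypothetical names. The only thing it shows is that the latent that decodes to silence need not be the all-zero vector.

```python
# Toy affine "decoder": each output sample is a weighted sum of the latent plus a bias.
# All values are invented for illustration.
WEIGHTS = [[1.0, 1.0],
           [0.0, 2.0]]
BIAS = [-3.0, -4.0]

# In this toy model, silence (all-zero output) comes from this latent, not from zeros.
SILENCE_LATENT = [1.0, 2.0]


def decode(latent):
    """Decode one latent frame into a short list of audio samples."""
    return [sum(w * z for w, z in zip(row, latent)) + b
            for row, b in zip(WEIGHTS, BIAS)]


def pad_frames(frames, target_len, pad_frame):
    """Append copies of pad_frame until there are target_len frames."""
    missing = max(0, target_len - len(frames))
    return frames + [list(pad_frame) for _ in range(missing)]


speech = [[2.0, 1.0], [0.5, 3.0]]  # two frames of "real" content

zero_padded = pad_frames(speech, 4, [0.0, 0.0])
silence_padded = pad_frames(speech, 4, SILENCE_LATENT)

# Decode only the padded tail of each sequence.
tail_from_zeros = [decode(f) for f in zero_padded[len(speech):]]
tail_from_silence = [decode(f) for f in silence_padded[len(speech):]]

print("zero padding decodes to:   ", tail_from_zeros)
print("silence padding decodes to:", tail_from_silence)
```

In the toy model the zero-padded tail decodes to a constant non-zero signal, while the silence-padded tail decodes to zeros. A real neural decoder is nonlinear, so an unseen all-zero input could produce something far less predictable than a constant offset.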
arthurcolle•2h ago
they still need to post-train out the emissions of all the trapped souls
automationist•1h ago
If anyone's wondering, here's a short sample. It quietly updated last night, and I ended up chatting for like an hour. It sounds as smart as before, but like 10x more emotionally intelligent. Laughter is the biggest giveaway, but the serious/empathetic tones for more therapy-like conversations are noticeable, too. https://drive.google.com/file/d/16kiJ2hQW3KF4IfwYaPHdNXC-rsU...
candiddevmike•51m ago
Did it really say partwheel or is it garbled?
TheTaytay•4h ago
I wish they still had the voice mode that was _only_ text-to-speech and speech-to-text. It didn't sound as good, but it was as smart as the underlying model. The advanced voice mode regularly goes off the rails for me, makes the same mistake repeatedly, and does other things that the text versions of advanced LLMs haven't done for months now.
adeelk93•4h ago
Don’t they? Press the microphone button for speech-to-text, and the speaker button for text-to-speech
og_kalu•3h ago
In the App:

Settings > Personalization > Custom Instructions, then the Advanced dropdown. Uncheck Advanced Voice.

On Desktop site:

Profile Button > Customize ChatGPT, then the Advanced dropdown. Uncheck Advanced Voice.

cladopa•3h ago
Today I used ChatGPT and the voice was disgusting for the first time since I started using ChatGPT (months ago).

It was the voice of someone (a woman) who was confrontational, someone who does not like you.

It made me want to close and remove the chat immediately.

transcriptase•1h ago
I don’t suppose you have a bunch of custom instructions telling ChatGPT to be concise, terse, etc., do you? Those impact the voice model too, and it turns out the “get to the point I’m not an idiot” pre-prompts people have been recommending really don’t translate well when the voice mode uses them as a personality.
ed_mercer•3h ago
I keep using standard voice mode (Cove) because I like its grounded voice a lot. The advanced Cove’s voice sounds too much like an overly happy guy. I wish I could tell it to chill and talk normally but it won’t.
arnaudsm•2h ago
If there's an OpenAI PM reading this: please add the model selector for voice modes. 80% of this thread is users confused about which model they're using.
dedicate•59m ago
In my daily use, I just want the answer, not a performance. I'd rather it sound like a smart assistant, not my best friend.
patwolf•21m ago
I was using it earlier today and noticed something was different. It sounded more lethargic, and added a lot more "umms". It's not necessarily bad, just something I need to get used to.

I always get a laugh asking it to talk like an Ent, and I made sure to check that it could still do that.
