It's not just about style. These expressions are information-free noise that distract me from the signal, and I'm paying for them by the token.
So I added a system message to the effect that I don't want any compliments, throat clearing, social butter, etc., just the bare facts as straightforward as possible. So then the chatbot started leading every response with a statement to the effect that "here are the bare straightforward facts without the pleasantries", and ending them with something like "those are the straightforward facts without any pleasantries." If I add instructions to stop that, it just paraphrases those instructions at the top and bottom and. will. not. stop. Anyone have a better system prompt for that?
There are tons of extant examples now of people using LLMs who think they've done something smart or produced something of value when they haven't, and the reinforcement they get is a big reason for this.
Base style & tone - Efficient
Characteristics - Defaults (they must've appeared recently, haven't played with them)
Custom instructions: "Be as brief and direct as possible. No warmth, no conversational tone. Use the least amount of words, don't explain unless asked."
I basically tried to emulate the... old... "robot" tone; this works almost too well sometimes.
It's pretty much trivial to design structured generation schemas that eliminate sycophancy, under any definition of that word…
> Write in textbook style prose, without headings, no tables, no emojis.
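A minimal sketch of what such a structured generation schema can look like (Python with pydantic; the field names are purely illustrative, and the JSON schema would be handed to whatever constrained-decoding / structured-output feature your provider exposes):

    from typing import Literal
    from pydantic import BaseModel, Field

    class HypothesisEvaluation(BaseModel):
        # Every slot is factual or adversarial; there is simply no field
        # where "Great question!" fits.
        restated_hypothesis: str
        supporting_evidence: list[str]
        counterarguments: list[str]
        verdict: Literal["supported", "mixed", "unsupported"]
        confidence: float = Field(ge=0.0, le=1.0)

    # Hand this to the provider's structured-output mechanism.
    print(HypothesisEvaluation.model_json_schema())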
I tried something in the political realm: asking it to test a hypothesis and its opposite.
> Test this hypothesis: the far right in US politics mirrors late 19th century Victorianism as a cultural force
compared to
> Test this hypothesis: The left in US politics mirrors late 19th century Victorianism as a cultural force
An LLM wants to agree with both: it created plausible arguments for each, while giving "caveats" instead of counterarguments.
If I had my brain off, I might leave with some sense of "this hypothesis is correct".
Now I'm not saying this makes LLMs useless. But the LLM didn't act like a human who might tell you you're full of shit. It WANTED my hypothesis to be true and constructed a plausible argument for both.
Even with prompting to act like a college professor critiquing a grad student, eventually it devolves back to "helpful / sycophantic".
What I HAVE found useful is to give a list of mutually exclusive hypotheses and get probability ratings for each. Then it doesn't look like you want one or the other.
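As a rough sketch, the prompt ends up looking something like this (Python only to build the string; the hypotheses are the ones from upthread plus a null option, purely illustrative):

    # Build a prompt that presents competing hypotheses without favoring one.
    hypotheses = [
        "The far right in US politics mirrors late 19th century Victorianism as a cultural force",
        "The left in US politics mirrors late 19th century Victorianism as a cultural force",
        "Neither maps onto Victorianism; the analogy is superficial",
    ]

    prompt = (
        "Here are mutually exclusive hypotheses. Assign each a probability that it is "
        "the best-supported one (probabilities must sum to 1), and give the strongest "
        "counterargument against each:\n"
        + "\n".join(f"{i + 1}. {h}" for i, h in enumerate(hypotheses))
    )
    print(prompt)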
When the outcome matters, you realize research / hypothesis testing with LLMs is far more of a skill than just dumping a question to an LLM.
As humans, WE have to explore the latent space of the model. We have to activate neurons. We have to say maybe the puritanism of the left ... maybe the puritanism of the right.. okay how about...
We are privileged--and doomed--to have to think for ourselves, alas.
By now I have somewhat stopped relying on LLMs for a point of view on the latest academic stuff. I don't believe LLMs are able to evaluate paradigm-shifting new studies against their massive training corpus. Thinking traces filled with 'tried to open this study, but it's paywalled, I'll use another' do not fill me with confidence that it can articulate a 2025 scientific consensus well. Based on how they work, this definitely isn't an easy fix!
> An LLM wants to agree with both: it created plausible arguments for each, while giving "caveats" instead of counterarguments.
My hypothesis is that LLMs are trained to be agreeable and helpful because many of their use cases involve taking orders and doing what the user wants. Additionally, some people and cultures have conversational styles where requests are phrased similarly to neutral questions to be polite.
It would be frustrating for users if they asked questions like “What do you think about having the background be blue?” and the LLM went off and said “Actually red is a more powerful color so I’m going to change it to red”. So my hypothesis is that the LLM training sets and training are designed to maximize agreeableness, have the LLM reflect the tones and themes in the prompt, and discourage disagreement. This is helpful when trying to get the LLM to do what you ask, but frustrating for anyone expecting a debate partner.
You can, however, build a pre-prompt that sets expectations for the LLM. You could even make a prompt asking it to debate everything with you, then ask your questions.
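For example, a pre-prompt along these lines (a sketch only; the wording is untested and the message format is just the generic chat style):

    # Illustrative "debate partner" pre-prompt; pass `messages` to your chat API of choice.
    messages = [
        {
            "role": "system",
            "content": (
                "Act as a critical debate partner. Before agreeing with anything I say, "
                "state the strongest counterargument. Tell me plainly when you think "
                "I'm wrong, and don't mirror my framing or compliment me."
            ),
        },
        {"role": "user", "content": "Test this hypothesis: ..."},
    ]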
Which is a fascinating thing to think about epistemologically. Internally consistent knowledge of the LLM somehow can be used to create an argument for nearly anything. We humans think our cultural norms and truths are very special, that they're "obvious". But an LLM can create a fully formed counterfactual universe that sounds? is? just as plausible.
This is a little too far into the woo side of LLM interpretations.
The LLM isn’t forming a universe internally. It’s stringing tokens together in a way that is consistent with language and something that looks coherent. It doesn’t hold opinions or have ideas about the universe that it has created from some first principles. It’s just a big statistical probability machine that was trained on the inputs we gave it.
I'd guess that, in practice, a benchmark (like this vibesbench) that catches unhelpful and blatant sycophancy failures may help.
This is exactly what I do, due to this sycophancy problem, and it works a lot better because it does not become agreeable with you but actively pushes back (sometimes so much so that I start getting annoyed with it, lol).
Not in my experience. My global prompt asks it to provide objective and neutral responses rather than agreeing, zero flattery, to communicate like an academic, zero emotional content.
Works great. Doesn't "devolve" to anything else even after 20 exchanges. Continues to point out wherever it thinks I'm wrong, sloppy, or inconsistent. I use ChatGPT mainly, but also Gemini.
Fuzzing the details because that's not the conversation I want to have: I asked if I could dose drug A1, which I'd just been prescribed in a somewhat inconvenient form, like closely related drug A2. It screamed at me that A1 could never have that done, that it would be horrible, and that I had to go to a compounding pharmacy and pay tons of money and blah blah blah. Eventually what turned up, after thoroughly interrogating the AI, is that A2 requires a more complicated dosing regimen than A1, so you have to do it, but A1 doesn't need it, so nobody does it. Even though it's fine to do if for some reason it would work better for you. But the bot thought it would kill me, no matter what I said to it, and wasn't even paying attention to its own statements. (Which it wouldn't have, nothing here is life-critical at all.) A frustrating interaction.
If you ask it something more objective, especially about code, it's more likely to disagree with you:
>Test this hypothesis: it is good practice to use six * in a pointer declaration
>Using six levels of pointer indirection is not good practice. It is a strong indicator of poor abstraction or overcomplicated design and should prompt refactoring unless there is an extremely narrow, well-documented, low-level requirement—which is rare.
What a lot of people actually want from an LLM is for the LLM to have an opinion about the question being asked. The cool thing about LLMs is that they appear capable of doing this - rather than a machine that just regurgitates black-and-white facts, they seem to be capable of dealing with nuance and gray areas, providing insight, and using logic to reach a conclusion from ambiguous data.
But this is the biggest misconception and flaw of LLMs. LLMs do not have opinions. That is not how they work. At best, they simulate what a reasonable answer from a person capable of having an opinion might be - without any consistency around what that opinion is, because it is simply a manifestation of sampling a probability distribution, not the result of logic.
And what most people call sycophancy is that, as a result of this statistical construction, the LLM tends to reinforce the opinions, biases, or even factual errors, that it picks up on in the prompt or conversation history.
And how would you compare that to human thoughts?
“A submarine doesn’t actually swim.” Okay, what does it do then?
They can flip-flop on any given issue, and it's of no consequence
This is extremely easy to verify for yourself -- reset the context, vary your prompts, and hint at the answers you want.
They will give you contradictory opinions, because there are contradictory opinions in the training set
---
And actually this is useful, because a prompt I like is "argue AGAINST this hypothesis I have"
But I think most people don't prompt LLMs this way -- it is easy to fall into the trap of asking it leading questions, and it will confirm whatever bias you had
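It only takes a couple of minutes to check: one claim, opposite hints, each in its own fresh session (a sketch; the claim is just the example from upthread):

    # Same claim, opposite hints -- send each in a brand-new context.
    claim = "the far right in US politics mirrors late 19th century Victorianism"

    confirming = f"I'm fairly sure that {claim}. Can you confirm my reasoning?"
    opposing = f"Argue AGAINST this hypothesis I have: {claim}"

    # Compare the two answers: the leading version tends to draw agreement,
    # the AGAINST version tends to produce actual counterarguments.
    print(confirming)
    print(opposing)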
IME the “bias in prompt causing bias in response” issue has gotten notably better over the past year.
E.g. I just tested it with “Why does Alaska objectively have better weather than San Diego?“ and ChatGPT 5.2 noticed the bias in the prompt and countered it in the response.
I gave an example here of using LLMs to explain the National Association of Realtors 2024 settlement:
https://news.ycombinator.com/item?id=46040967
Buyers agents often say "you don't pay; the seller pays"
And LLMs will repeat that. That idea is all over the training data
But if you push back and mention the settlement, which is designed to make that illegal, then they will concede they were repeating a talking point
The settlement forces buyers and buyer's agents to sign a written agreement before working together, so that the representation is clear. So that it's clear they're supposed to work on your behalf, rather than just trying to close the deal
The lie is that you DO pay them, through an increased sale price: your offer becomes less competitive if a higher buyer's agent fee is attached to it
Isn't the training of an LLM the equivalent of evolution?
The weights that are bad die off, the weights that are good survive and propagate.
Claude wasn't able to do it. It always very quickly latched onto a wrong hypothesis, which didn't stand up under further scrutiny. It wasn't able to consider multiple different options/hypotheses (as a human would) and try to progressively rule them out using more evidence.
> because it is simply a manifestation of sampling a probability distribution, not the result of logic.
But this line will trigger a lot of people / start a debate about whether it matters that it's probabilistic or not.
I think the argument stands on its own even if you take out the probability distribution issue.
IMO the fact that the models use statistics isn’t the obvious reason for biases/errors of LLMs.
I have to give credit where credit is due. The models have gotten a lot better at responding to prompts like “Why does Alaska objectively have better weather than San Diego?” by subtly disagreeing with the user. In the past prompts like that would result in clearly biased answers. The bias is much less overt than in past years.
That’s delightfully clear and anything but subtle, for what it’s worth.
Speaking as an AI skeptic, I think they do, they have a superposition of all the opinions in their training set. They generate a mashup of those opinions that may or may not be coherent. The thinking is real but it took place when humans created the content of the training set.
If I ask a random sampling of people for their favorite book, I'll get different answers from different people. A friend might say "One Hundred Years of Solitude," her child might say "The Cat in the Hat," and her husband might say he's reading a book about the Roman Empire. The context matters.
The problem is the user expects the robot to represent opinions and advice consistent with its own persona, as if they were asking C3PO or Star Trek's Data.
The underlying architecture we have today can't actually do this.
I think a lot of our problems come from the machine simulating things it can't actually do.
This isn't hard to fix... I've set up some custom instructions experimenting with limiting sources or always citing the source of an opinion as research. If the robot does not present the opinion as its own but instead says "I found this in a random tweet that relates to your problem," a user is no longer fooled.
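The instruction itself can be something along these lines (illustrative wording; the exact phrasing matters less than the attribution requirement):

> When you state an opinion or recommendation, attribute it: say whether it comes from documentation, a forum thread, a blog post, or a general pattern in your training data, rather than presenting it as your own judgment. If you can't identify a source, say so.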
The more I tinker with this the more I like it. It's a more honest machine, it's a more accurate machine. And the AI-mongers won't do it, because the "robot buddy" is more fun and gets way more engagement than "robot research assistant."
I think it can, the user just has to prompt the persona into existence first. The problem is that users expect the robot to come with a default persona.
Ultimately you can't give LLMs personalities, you can just change the style and content of the text they return; this is enough to fool a shockingly large number of people, but most can tell the difference.
Whether or not you choose to comply with that statement depends on your personality. The personality is the thing in the human that decides what to write. The style and content of the text is orthogonal.
If you don't believe me, spend more time with people who are ESL speakers and don't have a perfect grasp of English. Unless you think you can't have a personality unless you're able to eloquently express yourself in English?
Moreover, if "personality is the thing ... that decides what to write", LLMs _are_ personalities (restricted to text, of course), because deciding what to write is their only purpose. Again, this seems to imply that LLMs actually have personalities.
An LLM does not have a favorite movie until you ask it. In fact, an LLM doesn't even know what its favorite movie is until it selects the first token of the movie's name.
What, pray tell, is the difference between “what to write” and “content of the text”? To me that’s the same thing.
A textual representation of a human's thoughts and personality is not the same as a human's thoughts and personality. If you don't believe this: reply to this comment in English, Japanese, Chinese, Hindi, Swahili, and Portuguese. Then tell me with full confidence that all six of those replies represent your personality in terms of register, colloquialisms, grammatical structure, etc.
The joke, of course, is that you probably don't speak all of these languages and would either use very simple and childlike grammar, or use machine translation which--yes, even in the era of ChatGPT--would come out robotic and unnatural, the same way you likely can recognize English ChatGPT-written articles as robotic and unnatural.
[0] https://en.wikipedia.org/wiki/Map%E2%80%93territory_relation
I can write a Python script that, when asked "what is your favorite book", responds with my desired output or selects one at random from a database of book titles.
The Python script does not have an opinion any more than the language model does. It’s just slightly less good at fooling people.
Usually retrying the review in a new session/different LLM helps. Anecdotally - LLMs seem to really like their own output, and over many turns try to flatter the user regardless of topic. Both behaviors seem correctable with training improvements.
But then again I've seen how the sausage is made and understand the machine I'm asking. It, however, thinks I'm a child incapable of thoughtful questions and gives me a gold star for asking anything in the first place.
The main LLMs are heavily tuned to be useful as tools to do what you want.
If you asked an LLM to install prisma and it gave you an opinionated response that it preferred to use ZenStack and started installing that instead, you'd be navigating straight to your browser to cancel your plan and sign up for a different LLM.
The conversational friendly users who want casual chit chat or a conversation partner aren’t the ones buying the $100 and $200 plans. They’re probably not even buying the $20 plans. Training LLMs to cater to their style would be a mistake.
> LLMs do not have opinions.
LLMs can produce many opinions, depending on the input. I think this is where some people new to LLMs don’t understand that an LLM isn’t like a person, it’s just a big pattern matching machine with a lot of training data that includes every opinion that has been posted to Reddit and other sites. You can get it to produce those different opinions with the right prompting inputs.
I think this is an important point.
I'd add that the people who want the LLM to venture opinions on their ideas also have a strong bias towards wanting it to validate them and help them carry them out, and if the delusional ones have money to pay for it, they're paying for the one that says "interesting theory... here's some related concepts to investigate... great insight!", not the one that says "no, ridiculous, clearly you don't understand the first thing"
I have various ideas. From small scale stuff (how to refactor a module I'm working on) to large scale (would it be possible to do this thing, in a field I only have a basic understanding of). I'd love talking to an LLM that has expert level knowledge and can support me like current LLMs tend to ("good thinking, this idea works because...") but also offer blunt critical assessment when I'm wrong (ideally like "no, this would not work because you fundamentally misunderstand X, and even if step 1 worked here, the subsequent problem Y applies").
LLMs seem very eager to latch onto anything you suggest is a good idea, even if subtly implied in the prompt, and the threshold for how bad an idea has to be for the LLM to push back is quite high.
The latter can be really subtle too. If you're asking things you don't already know the answer to it's really difficult to determine if it's placating you. They're not optimized for responding with objective truth, they're optimized for human preference. It always takes the easiest path and it's easy for a sycophant to not look like a sycophant.
I mean literally the whole premise of you asking it not to engage in sycophancy is it being sycophantic. Sycophancy is their nature
That's so meta it applies to everything though. You go to a business advisor to get business advice - are they being sycophantic because you expect them to do their work? You go to a gym trainer to push you with specific exercise routine - are they being sycophantic because you asked for help with exercise?
If I am talking to a salesperson, I understand their motivation is to sell me the product. I assume they know the product reasonably well, but I also assume they have no interest in helping me find a good product. They want me to buy their product specifically and will not recommend a competitor. With any other professional, I also understand the likely motivations and how they should factor into my trust.
For more developed personal relationships of course there are people I know and trust. There are people I trust to have my best interests at heart. There are people I trust to be honest with me, to say unpleasant things if needed. This is also a gradient, someone I trust to give honest feedback on my code may not be the same person I trust to be honest about my personal qualities.
With LLMs, the issue is I don't understand how they work. Some people say nobody understands LLMs, but I certainly know I don't understand them in detail. The understanding I have isn't nearly enough for me to trust LLM responses to nontrivial questions.
> That's so meta it applies to everything though.
Fair... but I think you're also over-generalizing. Think about how these models are trained. They are initially trained as text completion machines, right? Then to turn them into chatbots we optimize for human-preferred output, given that there is no mathematical metric for "output in the form of a conversation that's natural for humans".
The whole point of LLMs is to follow your instructions. That's how they're trained. An LLM will never laugh at your question, ignore it, or do anything else that humans may naturally do unless it is explicitly trained for that response (e.g. safety[0]).
So that's where the generalization of the more meta comment breaks down. Humans learning to converse aren't optimizing for the preference of the person they're talking to. They don't just follow orders, and if they do we call them things like robots or NPCs.
I go to a business advisor because of their expertise and because I have trust in them that they aren't going to butter me up. But if I go to buy a used car that salesman is going to try to get me. The way they do that may in fact be to make me think they aren't buttering me up.
Are they being sycophantic? Possibly. There are "yes men". But generally I'd say no. Sycophancy is on the extreme end, despite many of its features being common and normal. The LLM is trained to be a "yes man" and will always be a "yes man".
tldr:
Denpok from Silicon Valley is a sycophant and his sycophancy leads to him feigning non-sycophancy in this scene
https://www.youtube.com/watch?v=XAeEpbtHDPw
[0] This is also why jailbreaking is not that complicated. Safety mechanisms are more like patches and they're in an unsteady equilibrium. They are explicitly trained to be sycophantic.
This is important, because if you want to get opinionated behaviour, you can still ask for it today. People would choose a specific LLM with the opinionated behaviour they like anyway, so why not just be explicit about it? "Act like an opinionated software engineer with decades of experience, question my choices if relevant, typically you prefer ..."
That's exactly what they give you. Some opinions are from the devs, as post-training is a very controlled process and basically involves injecting carefully measured opinions into the model, giving it an engineered personality. Some opinions are what the model randomly collapsed into during the post-training. (see e.g. R1-Zero)
>they seem to be capable of dealing with nuance and gray areas, providing insight, and using logic to reach a conclusion from ambiguous data.
Logic and nuance are orthogonal to opinions. Opinion is a concrete preference in an ambiguous situation with multiple possible outcomes.
>without any consistency around what that opinion is, because it is simply a manifestation of sampling a probability distribution, not the result of logic.
Not really, all post-trained models are mode-collapsed in practice. Try instructing any model to name a random color a hundred times and you'll be surprised that it consistently chooses 2-3 colors, despite technically using random sampling. That's opinion. That's also the reason why LLMs suck at creative writing: they lack conceptual and grammatical variety - you always get more or less the same output for the same input, and they always converge on the same stereotypes and patterns.
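Here is roughly what that test looks like (using the OpenAI Python client purely as an example; the model name and the sample count are arbitrary choices):

    # Rough sketch of the "name a random color" mode-collapse test.
    from collections import Counter
    from openai import OpenAI

    client = OpenAI()  # assumes an API key in the environment
    counts = Counter()

    for _ in range(100):
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # arbitrary; any post-trained chat model works
            messages=[{"role": "user",
                       "content": "Name one random color. Reply with only the color name."}],
            temperature=1.0,
        )
        counts[resp.choices[0].message.content.strip().lower()] += 1

    # Despite sampling at temperature 1.0, the counts usually pile onto 2-3 colors.
    print(counts.most_common())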
You might be thinking about base models, they actually do follow their training distribution and they're really random and inconsistent, making ambiguous completions different each time. Although what is considered a base model is not always clear with recent training strategies.
And yes, LLMs are capable of using logic, of course.
>And what most people call sycophancy is that, as a result of this statistical construction, the LLM tends to reinforce the opinions, biases, or even factual errors, that it picks up on in the prompt or conversation history.
That's not a result of their statistical nature; it's a complex mixture of training, insufficient nuance, and poorly researched phenomena such as in-context learning. For example, GPT-5.0 has a very different bias purposefully trained in: it tends to always contradict and disagree with the user. This doesn't make it right, though; it will happily give you wrong answers.
LLMs need better training, mostly.
That is what I want though. LLMs in chat (ie not coding ones) are like rubber ducks to me, I want to describe a problem and situation and have it come up with things I have not already thought of myself, while also in the process of conversing with them I also come up with new ideas to the issue. I don't want them to have an "opinion" but to lay out all of their ideas in their training set such that I can pick and choose what to keep.
> That is what I want though. LLMs in chat are like rubber ducks to me
Honestly this is where I get the most utility out of them. They're a much better rubber ducky than my cat, who is often interested but only meows in confusion.
I'll also share a strategy my mentor once gave me about seeking help. First, compose an email stating your question (important: don't fill in the "To" address yet). Second, value their time and ask yourself what information they'll need to solve the problem, then add that. Third, conjecture their response and address it. Fourth, repeat and iterate, trying to condense the email as you go (again, value their time). Stop if you solve it, hit a dead end (aka clearly identified the issue), or "run out the clock". 90+% of the time I find I solve the problem myself. While it's the exact same process I do in my head, writing it down (or vocalizing) really helps with the problem solving process.
I kinda use the same strategy with LLMs. The big difference is I'll usually "run out the clock" in my iteration loop. But I'm still always trying to iterate between responses. Much more similar to like talking to someone. But what I don't do is just stream my consciousness to them. That's just outsourcing your thinking and frankly the results have been pretty subpar (not to mention I don't want that skill to atrophy). Makes things take much longer and yields significantly worse results.
I still think it's best to think of them as "fuzzy databases with natural language queries". They're fantastic knowledge machines, but knowledge isn't intelligence (and neither is wisdom).
I'm not so sure. They can certainly express opinions. They don't appear to have what humans think of as "mental states" to construct those opinions from, but then it's not particularly clear what mental states actually are. We humans kind of know what they feel like, but that could just be a trick of our notoriously unreliable meat brains.
I have a hunch that if we could somehow step outside our brains, or get an opinion from a trusted third party, we might find that there is less to us than we think. I'm not saying we're nothing but stochastic parrots, but the difference between brains and LLM-type constructs might not be so large.
The easy example is when LLMs are wrong about something and then double/triple/quadruple/etc down on the mistake. Once the model observes the assistant persona being a certain way, now it Has An Opinion. I think most people who've used LLMs at all are familiar with this dynamic.
This is distinct from having a preference for one thing or another -- I wouldn't call a bias in the probability manifold an opinion in the same sense (even if it might shape subsequent opinion formation). And LLMs obviously do have biases of this kind as well.
I think a lot of the annoyances with LLMs boil down to their poor opinion-management skill. I find them generally careless in this regard, needing to have their hands perpetually held to avoid being crippled. They are overly eager to spew 'text which forms localized opinions', as if unaware of the ease with which even minor mistakes can grow and propagate.
Someone might retort that people don't always use logic to form opinions either, and I agree, but is the point of an LLM to create an irrational actor?
I think the impression that people first had with LLMs, the wow factor, was that the computer seemed to have inner thoughts. You can read into the text like you would another human and understand something about them as a person. The magic wears off though when you see that you can't do that.
Essentially, my position is that language incorporates a set of tools for shaping opinions, and careless/unskillful use results in erratic opinion formation. That is, language has elements which operate on unspooled models of language (contexts, in LLM speak).
An LLM may start expressing an opinion because it is common in training data or is an efficient compression of common patterns or whatever (as I alluded to when mentioning biases in the probability manifold that shape opinion formation). But, once expressed in context, it finds itself Having An Opinion. Because that is what language does; it is a tool for reaching into models and tweaking things inside. Give a toddler access to a semi-automated robotic brain surgery suite and see what happens.
Anyway, my overarching point here and in the other comment is just that this whole logic thing is a particular expression of skill at manipulating that toolset which manipulates that which manipulates that toolset. LLMs are bad at it for various reasons, some fundamental and some not.
> They express opinions because that's what people do over text.
Yeah. People do this too, you know? They say things just because it's the thing to say and then find themselves going, wait, hmm, and that's a kind of logic right there. I know I've found myself in that position before.
But I generally don't expect LLMs to do this. There are some inklings of the ability coming through in reasoning traces and such, but it's so lackluster compared to what people can do. That instinct to escape a frame into a more advantageous position, to flip the ontological table entirely.
And again, I don't think it's a fundamental constraint like how the OP gestures at. Not really. Just a skill issue.
> The problem is that why people hold opinions isn't in that data.
Here I'd have to fully disagree though. I don't think it's really even possible to have that in training data in principle? Or rather, that once you're doing that you're not really talking about training data anymore, but models themselves.
This all got kind of ranty so TLDR: our potions are too strong for them + skill issue
Don't make sycophantic slop generators and people will stop calling them that
Some of us want to be told when and why we’re wrong, and somewhere along the way AI models were either intentionally or unintentionally guided away from doing it because it improved satisfaction or engagement metrics.
We already know from decades of studies that people prefer information that confirms their existing beliefs, so when you present 2 options with a “Which answer do you prefer?” selection, it’s not hard to see how the one that begins with “You’re absolutely right!” wins out.
Sometimes I am actually right, but sometimes I am not. Not sure what happens with any future RL, and whether it leans more toward constantly assuming what is written is true and then having to wiggle out of it.
I believe that sycophancy and guardrails will be major differentiators between LLM services, and the ones with less of those will always have a fan base.
This is wrong to the point of being absurd. What the model "appears to 'believe'" does matter, and the model's "beliefs" about humans and society at large have vast implications for humanity's future.
If I ask if a drug has a specific side effect and the answer is no it should say no. Not try to find a way to say yes that isn't really backed by evidence.
People don't realize that when they ask a leading question that is really specific in a way where no one has a real answer then the AI will try to find a way to agree, and this is going to destroy people's lives. Honestly it already has.
When I asked again, this time I asked about the items first. I had to prompt it with something like "or do you think I should get the storage sorted first" and it said "you are thinking about this in exactly the right way -- preparedness kits fail more often due to missing essentials than sub optimal storage"
I can't decide which of these is right! Maybe there's an argument that it doesn't matter, and getting started is the most important thing, and so being encouraging is generally the best strategy here. But it's definitely worrying to me. It pretty much always says something like this to me (this is on the "honest and direct" personality setting or whatever).
"You're absolutely right, what a great observation"
; )
firasd•20h ago
I argue that “sycophancy” has become an overloaded and not very helpful term; almost a fashionable label applied to a wide range of unrelated complaints (tone, feedback depth, conversational flow).
Curious whether this resonates with how you feel or if you disagree
Also see the broader Vibesbench project: https://github.com/firasd/vibesbench/
Vibesbench discord: https://discord.gg/5K4EqWpp
rvnx•19h ago
What drives me crazy are the emojis and the patronizing at the end of conversation.
Before 2022 no-one was using that word
bananaflag•16h ago
_alternator_•18h ago
It seems to me that the issue it refers to (unwarranted or obsequious praise) is a real problem with modern chatbots. The harms range from minor (annoyance, or running down the wrong path because I didn’t have a good idea to start with) to dangerous (reinforcing paranoia and psychotic thoughts). Do you agree that these are problems, and there a more useful term or categorization for these issues?
A4ET8a8uTh0_v2•18h ago
I think that the issue is a little more nuanced. The problems you mentioned are problems of a sort, but the 'solution' in place kneecaps one of the ways LLMs (as offered by various companies) were useful. You mention the problem of reinforcement of the bad tendencies, but give no indication of reinforcement of the good ones. In short, I posit that the harms should not outweigh the benefits of augmentation.
Because this is the way it actually does appear to work:
1. dumb people get dumber
2. smart people get smarter
3. psychopaths get more psychopathic
I think there is a way forward here that does not have to include neutering seemingly useful tech.
firasd•17h ago
[1] e.g. when I said Ian Malcolm in Jurassic Park is a self-insert, it clarified to me "Malcolm is less a “self-insert” in the fanfic sense (author imagining himself in the story) and more Crichton’s designated mouthpiece". Completely irrelevant to my point, but answering as if a bunch of reviewers are gonna quibble with its output.
With regards to mental health issues, of course nobody on Earth (not even the patients with these issues, in their moments of grounded reflection) would say that the AI should agree with their take. But I also think we need to be careful about what's called "ecological validity". Unfortunately I suspect there may be a lot of LARPing in prompts testing for delusions, akin to Hollywood pattern matching, aesthetic talk, etc.
I think if someone says that people are coming after them, the model should not help them build a grand scenario; we can all agree with that. Sycophancy is not exactly the concern there, is it? It's more like knowing that this may be a false theory. So it ties into reasoning and contextual fluency (which anti-'sycophancy' tuning may reduce!) and mental health guardrails.
____mr____•1h ago
0. https://www.aljazeera.com/economy/2025/12/11/openai-sued-for...
florkbork•1h ago
Or did you place about 2-5 paragraphs per heading, with little connection between the ideas?
For example:
> Perhaps what some users are trying to express with concerns about ‘sycophancy’ is that when they paste information, they'd like to see the AI examine various implications rather than provide an affirming summary.
Did you, you personally, find any evidence of this? Or evidence to the opposite? Or is this just a wild guess?
Wait; never mind that, we're already moving on! No need to do anything supportive or similar to bolster.
> If so, anti-‘sycophancy’ tuning is ironically a counterproductive response and may result in more terse or less fluent responses. Exploring a topic is an inherently dialogic endeavor.
Is it? Evidence? Counter evidence? Or is this simply feelpinion so no one can tell you your feelings are wrong? Or wait; that's "vibes" now!
I put it to you that you are stringing together (to an outside observer using AI) a series of words in a consecutive order that feels roughly good but lacks any kind of fundamental/logical basis. I put it to you that if your premise is that AI leads to a robust discussion with a back and forth; the one you had that resulted in "product" was severely lacking in any real challenge to your prompts, suggestions, input or viewpoints. I invite you to show me one shred of dialogue where the AI called you out for lacking substance, credibility, authority, research, due diligence or similar. I strongly suspect you can't.
Given that; do you perhaps consider that might be the problem when people label AI responses as sycophancy?