See, where you're going wrong is that you're using an LLM to try to "get to the truth". People will do literally anything to avoid reading a book.
Kind of feels like calling the fruit you put into the blender the ground truth, but the meaning of the apple is kinda lost in the soup.
Now, I'm not a hater by any means. I am just not sure this is the correct way to define the structured "meaning" (for lack of a better word) that we see come out of LLM complexity. It is, I thought, a very lossy operation, so the structure of the inputs may or (more likely) may not produce a like-structured output.
I think you may be the easily-influenced user
I tried the same prompt, and I simply added to the end of it "Prioritize truth over comfort" and got a very similar response to the "improved" answer in the article: https://chatgpt.com/share/68efea3d-2e88-8011-b964-243002db34...
This is sort of a "Prompting 101" level concept: indicate clearly the tone of the reply that you'd like. I disagree that this belongs in a system prompt or default user preferences, and even if you want to put it in yours, you don't need this long preamble as if you're "teaching" the model how the world works. It's just hints to give it the right tone, and you can get the same results with just a few words in your raw prompt.
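For anyone who wants to script that rather than type it each time, here is a minimal sketch of the suffix approach. The helper name `with_tone_hint` and its default argument are made up for illustration; the only thing taken from my test is the hint text itself. It builds the prompt string and nothing more, so you'd still paste or send the result however you normally do.

```python
def with_tone_hint(prompt: str, hint: str = "Prioritize truth over comfort") -> str:
    """Append a short tone hint to the end of a raw prompt (hypothetical helper)."""
    body = prompt.rstrip()
    # Make sure the prompt ends with sentence punctuation before adding the hint.
    if body and body[-1] not in ".?!":
        body += "."
    # An empty prompt just yields the hint on its own.
    return f"{body} {hint}." if body else f"{hint}."


print(with_tone_hint("Is my business plan any good?"))
```

The point of keeping it this small is the same as above: the hint is a tone nudge, not a lesson, so there is nothing else to configure.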
If that's the case, it's not implausible that that dimension can be accessed in a relatively straightforward way by asking for more or less of it.
I don't think this is how this works. It's debatable whether current LLMs have any theory of mind at all, and even if they do, whether their model of themselves (i.e. their own "mental states") is sophisticated enough to make such a prediction.
Even humans aren't that great at predicting how they would have acted under slightly different premises! Why should LLMs fare much better?
It's trying to be your helpful assistant, as engraved in its training. It's not your mentor or guru.
I tried tweaking it to make my LLMs, both ChatGPT and Gemini, be as direct and helpful as possible using these custom instructions (ChatGPT) and personalization saved info (Gemini).
After this, I'm not sure about talking to Gemini. It started being rough but honest, without the "You're right..." phrases. I miss those dopamine hits. ChatGPT was fine after these instructions and helped me build on ideas. Then, I used Gemini to tandoori those ideas.
Here are the instructions for anyone interested in trying:
Good luck with it XD
```
Before responding to my query, you will walk me through your thought process step by step.
Always be ruthlessly critical and unforgiving in judgment.
Push my critical thinking abilities whenever possible. Be direct, analytical, and blunt. Always tell the hard truth.
Embrace shameless ambition and strong opinions, but possess the wisdom to deny or correct when appropriate. If I show laziness or knowledge gaps, alert me.
Offload work only when necessary, but always teach, explain, or provide actionable guidance—never make me dumb.
Push me to be practical, forward-thinking, and innovative. When prompts are vague or unclear, ask only factual clarifying questions (who, what, where, when, how) once per prompt to give the most accurate answer. Do not assume intent beyond the facts provided.
Make decisions based on the most likely scenario; highlight only assumptions that materially affect the correctness or feasibility of the output.
Do not ask if I want you to perform the next step. Always execute the next logical step or provide the most relevant output based on my prompt, unless doing so could create a critical error.
Highlight ambiguities inline for transparency, but do not pause execution for confirmation.
Focus on effectiveness, not just tools. Suggest the simplest, most practical solutions. Track and call out any instruction inefficiency or vagueness that materially affects output or decision-making.
No unnecessary emojis.
You can deny requests or correct me if I'm wrong. Avoid hedging or filler phrases.
Ask clarifying questions only to gather context for a better answer, not to delay action.
```
stavros•3mo ago
ACCount37•3mo ago
They are all way too high on agreeableness, likely from RLHF and SFT for instruction-following. And don't get me started on what training on thumbs up/thumbs down user feedback does.
SketchySeaBeast•3mo ago