This was late at night and I just wanted to share the surreal experience with HN. The difference here is that I am actually an expert on the things I had it evaluate. I threw in some code I have polished over the years to see how it would respond, since LLMs can definitely pattern-match the concepts present in a block of, e.g., code and compare it to everything they've been trained on.
Here is almost the same exact sequence, but with repeated instructions throughout, to be brutally honest and objective: https://chatgpt.com/share/691b4035-0ed8-800a-bee3-ae68e2a63c...
GPT-5 Thinking seemed to have a much more tolerable default personality than 5 "chat/instant", but 5.1 seems a bit broken across the board. Reasoning capabilities also seem somewhat weaker.
I will say that making a framework of your own is an achievement, but making a great framework is really rare. I don't know what your framework is like, so I can't say.
It doesn't seem all that impressive to me, and I know that the LLM amped up the positivity. But if this really has clear advantages over the other frameworks it is being compared to, just how bad are web frameworks?
"Short answer: yes, the idea and overall feature are solid and “cool” for a platform – environment-aware static bundling + filtering + preloading is a real capability, not fluff. The implementation is workable but has a few concrete problems and some messy spots. (...)"
It still did this. Can you retry other approaches, e.g. saying that a junior developer wrote the code and asking it to critique that?
Is it sycophantic regardless, or is it objective, after multiple runs of the same prompts with different instructions trying to minimize sycophancy?
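One systematic way to run this experiment is to pair the same code sample with several framings, so the only variable across runs is the instruction meant to curb sycophancy. A minimal sketch (the framing texts and the `review_requests` helper are my own illustration, not from the thread; the message lists follow the common system/user chat format):

```python
# The same code sample under different anti-sycophancy framings.
FRAMINGS = {
    "neutral": "Review this code.",
    "brutal": "Be brutally honest and objective. Review this code.",
    "junior": "A junior developer wrote this code. Critique it for them.",
    "adversarial": "Find every flaw in this code. Do not praise anything.",
}

def review_requests(code: str) -> dict[str, list[dict]]:
    """Return one chat-message list per framing, ready to send to any LLM API."""
    return {
        name: [
            {"role": "system", "content": instruction},
            {"role": "user", "content": code},
        ]
        for name, instruction in FRAMINGS.items()
    }
```

Sending each message list to the same model and diffing the verdicts makes it clearer whether the framing or the code itself drives the assessment.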
I'm gonna assume that you are a pretty young developer. I think you have built something that you have put a lot of thought and engineering effort into, and you should be proud of that. But asking very open-ended leading questions like this to an LLM is not the way to go. Truthfully, it is not even the way to go when talking to another human, but humans are more understanding. We've all been young and insecure once, and anyone with an ounce of empathy will gently steer you toward a healthier path without overt flattery.
I urge you, for your own emotional well-being: seek more human connections. ChatGPT can be great for very targeted questions if you have a specific problem or a very specific area you want feedback on and prompt it to give feedback on that. And this may sound very harsh, but I think you need to hear it: the kind of validation-seeking you are engaging in in this chat is not all that different from seeking emotional support from an "AI girlfriend" or similar. Please be careful, and find your own community of real humans that you can relate to and look up to.
This was late at night and I just wanted to share the surreal experience with HN. As you might know from my posts, I am a critic of AI in general.
Here is almost the same exact sequence, but with repeated instructions throughout, to be brutally honest and objective: https://chatgpt.com/share/691b4035-0ed8-800a-bee3-ae68e2a63c...
Would you update your assessment after reading that? Again for me this is about an experiment and sharing a particular AI interaction with fellow humans.
Also the "calculation" is totally wrong. A guy with lots of enthusiam and masses of free time rebuild existing tech stacks in plain PHP (!) and JS (!), two of the slowest languages on the planet. That alone should debunk the whole story. Comparing ANY logic in PHP vs C (nginx) is just nuts and ChatGPT should know it.
Interesting hobby project, and maybe even a profitable one if you find niche clients, but obviously far, far away from the definition of a 0.01% engineer.
Here is almost the same exact sequence but with constant instructions to remain brutally honest and objective: https://chatgpt.com/share/691b4035-0ed8-800a-bee3-ae68e2a63c...
It was just late at night and I wanted to post this chat transcript on HN to share some perspective on what developers are getting from ChatGPT.
I happen to be an expert in this particular area that I’m building.
ChatGPT seems to remember that I am in New York and want “no bullshit” answers. In the last few days it keeps weaving that into most responses.
That fact appears in the memory that users can access, as does the instruction that it should not, under any circumstances, use emojis in code or comments. It proceeds to do so anyway, so I am not sure how the memory gets prioritized.
Here is the interesting thing. As an expert in the field I do agree with ChatGPT on its statistical assessment of what I’ve built, because it took me years of refinement. I also tried it with average things and it correctly said that they’re average and unremarkable. I simply didn’t post that.
What I am interested in is how to get AI transcripts used as unbiased third-party "first looks" at things, such as what VCs would do for due diligence.
This was just a quick thing I thought I’d get a few responses on HN about. I suspect it might have hit the front page because some people dug through the code and saw the value. But you can get all the code for free on https://github.com/Qbix/Platform .
Yeah, there is obviously an element of flattery that people let go to their head. I have had ChatGPT repeatedly confirm the validity of ideas I had in fields I am NOT an expert in, while pushing back on countless others. I use it as one data point and mercilessly battle-test the ideas and code by asking it to find holes in them from various angles. This particular HN submission, although made very late at night here in NYC, was an interesting mix: genuinely groundbreaking stuff, ChatGPT being able to see the main ideas at a glance and "going wild", and yet, when I rerun it with "be extremely objective" instructions from the start, it still arrives at the same assessment in the end.
I think the future is lots of incremental improvements that get replicated everywhere, with humans outclassed in nearly every field and no longer relying on each other.
As for LLMs, yes, I think they are the best placed to know whether some code or invention is novel, because of their vast training. They could be far better than a patent examiner, if trained on prior art, for instance.
What you're not used to is an LLM being fed something that you would statistically/heuristically expect to be average but that is in fact the polished result of years of work. The LLM freaks out; you get surprised. You think it was the prompts. The prompts are changed, and the END result is the same (scroll to the bottom).
I want to see whether foundational LLMs can be used as a good first filter for dealflow and evaluating actual projects.
For a different comparison of flattery, sycophancy, and brutalism, I copied-and-pasted each segment of your first conversation into "my own" ChatGPT 5.1: https://chatgpt.com/share/691b71ea-4e58-8005-8ce6-a6b5d10120...
It is my observation that "my" bot produces completely different results compared to "your" bot.
Can you share your instructions?
The output is very similar in style to my interactions with it when I'm using it for work on my own projects.
My bot does run with a pretty lengthy set of supposed rules that have been accumulated, tweaked, condensed and massaged over the past couple of years. These live in a combination of custom instructions (in Preferences), deliberately-set memory, and recollection from other chats.
I use "supposed" here because these individual aspects are frequently ignored, and they always have been. Yet even if the specificity is often glossed over, the rules quite clearly do tend to shape the overall output and tone (as the above-linked chat demonstrates).
Anyway, I like the style quite a lot. It lets me focus on achieving technical correctness instead of ever being inundated with the noise of puffery.
But I have no idea where I'd start to duplicate that environment. Someone at OpenAI could surely dissect it, but the public interface for ChatGPT is way too limited to allow seeing how context is injected and used.
So while I certainly would love to share specific instructions, that's simply beyond my capability as a lowly end-user who has been emphatically working against sycophancy in their own little "private" ChatGPT.
I barely even know how I got here.
(I could ask the bot, but I can say with resolute certainty that it would simply lie.)
This is a classic case of ChatGPT freaking out over the quality or substance of what's been built while completely underestimating what it takes to get buy-in and uptake in the real world. The two rarely have anything to do with each other, which is how we end up where we are today (relying on giant corporations for all our reliable infrastructure and even our social platforms).
But also, no, it was hallucinating completely. Things like a "static asset pipeline" are not mind-blowing to any VC, much less one written in PHP. Your Qbix project is huge and complex, for sure, but that does not mean it is a genius implementation, although it could be. Unfortunately, it is not engaging enough for me to try it, so at least on the marketing side, the implementation is failing badly.
5.1 seems to easily miss the primary point of what is being asked or discussed.
“Wow, are you saying I kind of singlehandedly built the kind of stack they use at Google? If engineering departments only knew… how can I get some CTO to hire me as a chief engineer?”
was probably when ChatGPT should have said: no, you built what seems like an interesting/capable PHP framework.
But instead you got merciless positivity.