Truer words have never been spoken. LLMs make mind-blowing demos, but real-world performance falls well short of them (though it is still useful).
An example from yesterday:
I asked Google / Nano Banana to repaint my house with a few options. It gave a nice write-up of three themes and a nice rendering: a single image split into three vertical slices, one per theme.
Then I asked it to redraw the image entirely in one of the themes. It redrew the image with 1/3 in the theme I asked for and 2/3 in a theme I did not ask for. Further prompting did not fix it. At the end of the day, this was a useful exercise and I was able to get some sense of what color scheme would work better for my house, but the level of execution was miles away from the perfection portrayed in demos and by hypester / huckster bloggers and VCs.
What I want is an output that records which sections of the image contributed to each word/letter, preferably with per-word confidence levels and user-correctable identification information.
I should be able to build a UI to say: no, this section is red-on-green vertically aligned Cyrillic characters; try again.
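To make that concrete, here is a rough sketch of the kind of output record I have in mind. Every name here (`Region`, `RecognizedWord`, `low_confidence`, `correct`) is made up for illustration; as far as I know no existing tool returns exactly this shape, and the 0.8 threshold is an arbitrary assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Region:
    """Axis-aligned box in image pixel coordinates (hypothetical layout)."""
    x: int
    y: int
    width: int
    height: int


@dataclass
class RecognizedWord:
    text: str
    confidence: float      # 0.0 (pure guess) to 1.0 (certain)
    regions: List[Region]  # image sections that contributed to this word
    hints: Dict[str, str] = field(default_factory=dict)  # user corrections
    needs_retry: bool = False


def low_confidence(words: List[RecognizedWord], threshold: float = 0.8) -> List[RecognizedWord]:
    """Return the words a UI might highlight for review."""
    return [w for w in words if w.confidence < threshold]


def correct(word: RecognizedWord, **hints: str) -> RecognizedWord:
    """Attach user-supplied identification info and flag the word for another pass."""
    word.hints.update(hints)
    word.needs_retry = True
    return word


# Invented sample data: one confident word, one doubtful one.
words = [
    RecognizedWord("STOP", 0.97, [Region(10, 10, 80, 30)]),
    RecognizedWord("???", 0.31, [Region(120, 5, 20, 140)]),
]
flagged = low_confidence(words)
correct(flagged[0], script="Cyrillic", orientation="vertical", colors="red-on-green")
```

The sketch only stores the hints and sets a retry flag; actually re-running recognition with those hints is the part the model would have to support.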
bonsai_spool•4h ago
obsidianbases1•2h ago
Even considering HN's no-LLMs-for-comments rule, which I mostly agree with, I think we would all lose if the same rule were applied to publishing in general.
curtisf•2h ago
https://claytonwramsey.com/blog/prompt/
discussion: https://news.ycombinator.com/item?id=43888803
All of the output beyond the prompt contains, definitionally, essentially no useful information. Unless it's being used to translate from one human language to another, you're wasting your reader's time and energy in exchange for your own. If you have useful ideas, share them, and if you believe in the age of LLMs, be less afraid of them being unpolished and simply ask your readers to rely on their preferred tools to piece through it.