Creating the feature means it's no longer misinformation.
The bigger issue isn't that ChatGPT produces misinformation - it's that it takes less effort to update reality to match ChatGPT than it takes to update ChatGPT to match reality. Expect to see even more of this as we march toward accepting ChatGPT's reality over other sources.
If a feature has enough customers to pay for itself, develop it.
https://www.soundslice.com/help/en/player/advanced/17/expand...
That's available for any music in Soundslice, not just music that was created via our scanning feature.
Your solution is the equivalent of asking Google to completely delist you because one page you don't want ended up in Google's search results.
I've been wanting them to do this for questions like "what is your context length?" for ages - it frustrates me how badly ChatGPT handles questions about its own abilities. It feels like that would be worth them using some kind of special case or RAG mechanism to support.
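A minimal sketch of what that special-casing could look like (everything here is hypothetical - the table, names, and values are placeholders, not OpenAI's actual mechanism):

    # Hypothetical sketch: intercept questions about the model's own
    # capabilities and answer from a curated table instead of letting
    # the model guess. All names and values are placeholders.
    MODEL_FACTS = {
        "context length": "N tokens (filled in by the operator)",
        "knowledge cutoff": "YYYY-MM (filled in by the operator)",
    }

    def answer_self_question(user_msg, llm):
        lowered = user_msg.lower()
        for topic, fact in MODEL_FACTS.items():
            if topic in lowered:
                return f"My {topic} is {fact}."
        return llm(user_msg)  # anything else falls through to the model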
this is my general philosophy and, in my case, this is why I deploy things on blockchains
so many people keep wondering whether there will ever be some mythical, never-quite-definable "mainstream" use case, ignoring that crypto natives just … exist, and have problems they will pay (a lot) to solve.
To the author's burning question about whether any other company has done this: I would say yes. I've discovered services recommended by ChatGPT and other LLMs that didn't do what was described of them, and they subsequently tweaked things once they figured out there was new demand.
Some people express concerns about AGI creating swarms of robots to conquer the earth and make humans do its bidding. I think market forces are a much more straightforward tool that AI systems will use to shape the world.
> Correct feature almost exists
> Creator profile: analytical, perceptive, responsive;
> Feature within product scope, creator ability
> Induce demand
> await "That doesn't work" => "Thanks!"
> update memory
- Do you keep bolting on new updates to match these hallucinations, potentially breaking existing behavior?
- Or do you resign yourself to following whatever spec the AI gods invent next?
- And what if different LLMs hallucinate conflicting behavior for the same endpoint?
I don’t have a great solution, but a few options come to mind:
1. Implement the hallucinated endpoint and return a 200 OK or 202 Accepted, but include an X-Warning header like "X-Warning: The endpoint you used was built in response to ChatGPT hallucinations. Always double-check an LLM's advice on building against 3rd-party APIs with the API docs themselves. Refer to https://api.example.com/docs for our docs. We reserve the right to change our approach to building against LLM hallucinations in the future." Most consumers won’t notice the header, but it’s a low-friction way to correct false assumptions while still supporting the request.
2. Fail loudly: respond with a 404 Not Found or 501 Not Implemented, and include a JSON body explaining that the endpoint never existed and may have been incorrectly inferred by an LLM. This is less friendly but more likely to get the developer's attention. (A rough sketch of both options follows.)
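Here's roughly what both options could look like in Python (Flask chosen purely for illustration; the routes, header text, and docs URL are hypothetical):

    # Sketch only: a hallucinated endpoint handled two different ways.
    from flask import Flask, jsonify

    app = Flask(__name__)

    # Option 1: serve the request, but flag it with a warning header.
    @app.route("/v1/hallucinated-endpoint", methods=["POST"])
    def serve_with_warning():
        resp = jsonify({"status": "accepted"})
        resp.status_code = 202
        resp.headers["X-Warning"] = (
            "This endpoint was built in response to LLM hallucinations; "
            "always double-check against https://api.example.com/docs"
        )
        return resp

    # Option 2: fail loudly with a machine-readable explanation.
    @app.route("/v1/never-existed", methods=["POST"])
    def fail_loudly():
        return jsonify({
            "error": "endpoint_not_implemented",
            "detail": "This endpoint never existed and may have been "
                      "incorrectly inferred by an LLM.",
            "docs": "https://api.example.com/docs",
        }), 501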
Normally I'd say that good API versioning would prevent this, but it feels like that all goes out the window unless an LLM user thinks to double-check what the LLM tells them against actual docs. And if that had happened, it seems like they wouldn't have built against a hallucinated endpoint in the first place.
It’s frustrating that teams now have to reshape their product roadmap around misinformation from language models. It feels like there’s real potential here for long-term erosion of product boundaries and spec integrity.
EDIT: for the down-voters, if you've got actual qualms with the technical aspects of the above, I'd love to hear them and am open to learning if / how I'm wrong. I want to be a better engineer!
Also, it's not like ChatGPT or users are directly querying their API. They're submitting images through the Soundslice website. The images just aren't of the sort that was previously expected.
This is generally how you work with LLMs.
We’ve never supported ASCII tab; ChatGPT was outright lying to people. And making us look bad in the process, setting false expectations about our service.... We ended up deciding: what the heck, we might as well meet the market demand.
[...]
My feelings on this are conflicted. I’m happy to add a tool that helps people. But I feel like our hand was forced in a weird way. Should we really be developing features in response to misinformation?
The feature seems pretty useless for practicing guitar, since ASCII tablature usually doesn't include the rhythm: it is a bit shady to present the music as faithfully representing the tab, especially since only beginner guitarists would ask ChatGPT for help - they might not realize the rhythm is wrong. If ChatGPT hadn't "forced their hand", I doubt they would have included a misleading and useless feature.

Amateur musicians often lack just one or two features in the program they use, and the devs won't respond to their pleas.
Adding support for guitar tabs has made OP's product almost certainly more versatile and useful for a larger set of people. Which, IMHO, is a good thing.
But I also get the resentment of "a darn stupid robot made me do it". We don't take kindly to being bossed around by robots.
> Hallucinations can sometimes serve the same role as TDD. If an LLM hallucinates a method that doesn’t exist, sometimes that’s because it makes sense to have a method like that and you should implement it.
— https://www.threads.com/@jimdabell/post/DLek0rbSmEM
I guess it’s true for product features as well.
> Maybe hallucinations of vibe coders are just a suggestion those API calls should have existed in the first place.
> Hallucination-driven-development is in.
https://x.com/pwnies/status/1922759748014772488?s=46&t=bwJTI...
Maybe I'll turn it into a feature request then ...
Conversely, I sometimes present it with some existing code and ask it what it does. If it gets it wrong, that's a good sign my API is confusing, and how.
These are ways to harness what neural networks are best at: not providing accurate information but making shit up that is highly plausible, "hallucination". Creativity, not logic.
(The best thing about this is that I don't have to spend my time carefully tracking down the bugs GPT-4 has cunningly concealed in its code, which often takes longer than just writing the code the usual way.)
There are multiple ways that an interface can be bad, and being unintuitive is the only one that this will fix. It could also be inherently inefficient or unreliable, for example, or lack composability. The AI won't help with those. But it can make sure your API is guessable and understandable, and that's very valuable.
Unfortunately, this only works with APIs that aren't already super popular.
Insanity driven development: altering your api to accept 7 levels of "broken and different" structures so as to bend to the will of the llms
Of course when it suggests a bad interface you shouldn't implement it.
If you automatically assume that what the LLM spits out is what the API ought to be then I agree that that’s bad engineering. But if you’re using it to brainstorm what an intuitive interface would look like, that seems pretty reasonable.
IMO this has always been the killer use case for AI—from Google Maps to Grammarly.
I discovered Grammarly at the very last phase of writing my book. I accepted maybe 1/3 of its suggestions, which is pretty damn good considering my book had already been edited by me dozens of times AND professionally copy-edited.
But if I'd accepted all of Grammarly's changes, the book would have been much worse. Grammarly is great for sniffing out extra words and passive voice. But it doesn't get writing for humorous effect, context, deliberate repetition, etc.
The problem is executives want to completely remove humans from the loop, which almost universally leads to disastrous results.
Examples:
* Active - concise, complete info: The manager approved the proposal.
* Passive - wordy, awkward: The proposal was approved by the manager.
* Passive - missing info: The proposal was approved. [by who?]
Most experienced writers will use active unless they have a specific reason not to, e.g., to emphasize another element of the sentence, as the third bullet's sentence emphasizes approval.
-
edited for clarity, detail
Unfortunately, the resulting correlation between the passive voice and formality does sometimes lead poor writers to use the passive in order to seem more formal, even when it's not the best choice.
https://en.wikipedia.org/wiki/E-Prime
E-Prime (short for English-Prime or English Prime, sometimes É or E′) denotes a restricted form of English in which authors avoid all forms of the verb to be.
E-Prime excludes forms such as be, being, been, present tense forms (am, is, are), past tense forms (was, were) along with their negative contractions (isn't, aren't, wasn't, weren't), and nonstandard contractions such as ain't and 'twas. E-Prime also excludes contractions such as I'm, we're, you're, he's, she's, it's, they're, there's, here's, where's, when's, why's, how's, who's, what's, and that's.
Some scholars claim that E-Prime can clarify thinking and strengthen writing, while others doubt its utility.
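As a toy illustration of the rule set above, a naive checker can simply flag the excluded forms (word list taken from the excerpt; the contraction handling is deliberately crude, since e.g. "he's" can mean "he has"):

    import re

    # Naive E-Prime checker sketch: flags forms of "to be" and the
    # contractions excluded above. Real disambiguation needs context.
    BE_FORMS = {"be", "being", "been", "am", "is", "are", "was", "were",
                "isn't", "aren't", "wasn't", "weren't", "ain't"}

    def eprime_violations(text):
        words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
        return [w for w in words
                if w.lower() in BE_FORMS
                or w.lower().endswith(("'s", "'re", "'m"))]

    print(eprime_violations("The cat is on the mat; I'm sure of it."))
    # -> ['is', "I'm"]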
We can smuggle in presumptions through the use of attributive adjectives. In the above comment (which you might have noticed I wrote in E-Prime) I mentioned smuggling in "covert presumptions" of "essential attributes". If I had instead written that in assembly language as follows:
I smuggled in presumptions of attributes.
The presumptions were covert.
The attributes were essential.
it would clearly violate E-Prime. And that forces you to ask: does he intend "covert" to represent an essential attribute of those presumptions, or merely a temporary or circumstantial state relative to a particular temporal context? Is "essential" intended to limit the subjects of discourse to only certain attributes (the essential ones rather than the accidental ones), and within what scope do those attributes have this purported essentiality? Universally, in every possible world, or only within the confines of a particular discourse?

In these particular cases, though, I smuggled in no such presumptions! Both adjectives merely delimit the topic of discourse, to clarify that it does not pertain to overt presumptions or to presumptions of accidental attributes.
But you can use precisely the same structure to much more nefarious rhetorical ends. Consider, "Chávez kicked the squalid capitalists out of the country." Well, he kicked out all the capitalists! We've smuggled in a covert presumption of essentiality, implying that capitalism entails squalidity. And E-Prime's prohibition on the copula did not protect us at all. If anything, we lose much rhetorical force if we have to explicitly assert their squalidity, using an explicit statement that invites contradiction:
The capitalists are squalid.
We find another weak point at alternative linking verbs. It clearly violates E-Prime to say, "Your mother's face is uglier than a hand grenade," and rightly so, because it projects the speaker's subjective perceptions out onto the world. Korzybski would prefer that we say, for example, "Your mother's face looks uglier to me than a hand grenade," or possibly, "I see your mother's face as uglier than a hand grenade," thus relativizing the attribute to a single speaker's perception. (He advocated clarity of thought, not civility.)

But we can cheat in a variety of ways that still smuggle in that judgment of essentiality!
Your mother's face turned uglier than a hand grenade.
We can argue this one. Maybe tomorrow, or after her plastic surgery, it will turn pretty again, rather than having ugliness as an essential attribute.

Your mother's face became uglier than a hand grenade.
This goes a little bit further down the line; "became" presupposes a sort of transformation of essence rather than a mere change of state. And English has a variety of verbs that we can use like that. For example, "find", as in "Alsup found Dahmer guilty." Although in that case it connects a state, we can also use it for essential attributes: I find your mother's face uglier than a hand grenade.
Or lie, more or less, about the agent or speaker: Your mother's face finds itself uglier than a hand grenade.
And of course we can retreat to attributive adjectives again: Your mother has a face uglier than a hand grenade.
Your mother comes with an uglier face than a hand grenade.
Or we can simply omit the prepositional phrase from the statement of subjective perception, thus completely erasing the real agent: Your mother's face looks uglier [...] than a hand grenade.
Korzybski didn't care about the passive voice much, though; E-Prime makes it more difficult but, mostly, not intentionally. As an exception, erasing the agent through the passive voice can misrepresent the speaker's subjective perception as objective: Your mother's face is found uglier than a hand grenade.
And that still works if we use any of the alternative, E-Prime-permitted passive-voice auxiliary verbs: Your mother's face gets found uglier than a hand grenade.
As another example, notice all the times I've used "as" here. Many of these times smuggle in a covert assertion of essential attributes or even of identity!

But I found it very interesting to notice these things when E-Prime forced me to rethink how I would say them with the copula. It seems like just the kind of mental exercise to heighten my attention to implicit assumptions of identity and essentiality that Korzybski intended.
I wrote the above in E-Prime, by the way. Just for fun.
https://youtube.com/playlist?list=PLNRhI4Cc_QmsihIjUtqro3uBk...
It means "I decided to do this, but I don't have the balls to admit it."
"A decision was made to..." is often code for "The current author didn't agree with [the decision that was made] but it was outside their ability to influence"
Often because they were overruled by a superior, or outvoted by peers.
- The Manage User menu item changes a user's status from active to inactive.
- A user's status is changed from active to inactive using the Manage User menu item.
Oh the horror. There are 2 additional words "was" and "by". The weight of those two tiny little words is so so cumbersome I can't believe anyone would ever use those words. WTF??? wordy? awkward?
Consider:
> Now we are engaged in a great civil war, testing whether that nation, or any nation so conceived and so dedicated, can long endure.
This would not be improved by rewriting it as something like:
> Now the Confederacy has engaged us in a great civil war, testing whether that nation, or any nation whose founders conceived and dedicated it thus, can long endure.
This is not just longer but also weaker, because what if someone else is so conceiving and so dedicating the nation? The people who are still alive, for example, or the soldiers who just fought and died? The passive voice cleanly covers all these possibilities, rather than just committing the writer to a particular choice of who it is whose conception and dedication matters.
Moreover, and unexpectedly, the passive voice "we are engaged" takes responsibility for the struggle, while the active-voice rephrasing "the Confederacy has engaged us" seeks to evade responsibility, blaming the Rebs. While this might be factually more correct, it is unbefitting of a commander-in-chief attempting to rally popular support for victory.
(Plausibly the active-voice version is easier to understand, though, especially if your English is not very good, so the audience does matter.)
Or, consider this quote from Ecclesiastes:
> For there is no remembrance of the wise more than of the fool for ever; seeing that which now is in the days to come shall all be forgotten.
You could rewrite it to eliminate the passive voice, but it's much worse:
> For there is no remembrance of the wise more than of the fool for ever; seeing that everyone shall forget all which now is in the days to come.
This forces you to present the ideas in the wrong order, instead of leaving "forgotten" for the resounding finale as in the KJV version. And the explicit agent "everyone" adds nothing to the sentence; it was already obvious.
Rewriting “the points already made” to “the points people have already made” would not have improved it.
I don't know enough about English grammar to know whether this is correct, but it's not the assertion you took issue with.
Why am I not sure it's correct? If I say, "In addition to the blood so red," I am quite sure that "red" is not in the passive voice, because it's not even a verb. It's an adjective. Past participles are commonly used as adjectives in English in contexts that are unambiguously not passive-voice verbs; for example, in "Vito is a made man now," the past participle "made" is being used as an attributive adjective. And this is structurally different from the attributive-verb examples of "truly verbal adjectives" in https://en.wikipedia.org/wiki/Attributive_verb#English, such as "The cat sitting on the fence is mine," and "The actor given the prize is not my favorite;" we could grammatically say "Vito is a man made whole now". That page calls the "made man" use of participles "deverbal adjectives", a term I don't think I've ever heard before:
> Deverbal adjectives often have the same form as (and similar meaning to) the participles, but behave grammatically purely as adjectives — they do not take objects, for example, as a verb might. For example: (...) Interested parties should apply to the office.
So, is "made" in "the points already made" really in passive voice as it would be in "the points that are already made", is it deverbal as it would be in "the already-made points" despite its positioning after the noun (occasionally valid for adjectives, as in "the blood so red"), or is it something else? I don't know. The smoothness of the transition to "the points already made by those numbskulls" (clearly passive voice) suggests that it is a passive-voice verb, but I'm not sure.
In sibling comment https://news.ycombinator.com/item?id=44493969 jcranmer says it's something called a "bare passive", but I'm not sure.
It's certainly a hilarious thing to put in a comment deploring the passive voice, at least.
As I said elsewhere, one of the problems with the passive voice is that people are so bad at spotting it that they can at best only recognize it in its worst form, and assume that the forms that are less horrible somehow can't be the passive voice.
Can you insert an elided copula into it without changing the meaning and grammatical structure? I'm not sure. I don't think so. I think "In addition to the points already being made" means something different: the object of the preposition "to" is now "being", and we are going to discuss things in addition to that state of affairs, perhaps other things that have happened to the points (being sharpened, perhaps, or being discarded), not things in addition to the points.
The problem is that many people have only a poor ability to recognize the passive voice in the first place. This results in the examples being clunky, wordy messes that are bad because they're, well, clunky and wordy, and not because they're passive--indeed, you've often got only a fifty-fifty chance of the example passive voice actually being passive in the first place.
I'll point out that the commenter you're replying to used the passive voice, as did the one they responded to, and I suspect that such uses went unnoticed. Hell, I just rewrote the previous sentence to use the passive voice, and I wonder how many people even recognized that, let alone thought it worse for being so written.
Internet posts have a very different style standard than a book.
- Active: The user presses the Enter key.
- Passive: The Enter key is to be pressed.
- Imperative (aka command): Press the Enter key.
The imperative mood is concise and doesn't dance around questions about who's doing what. The reader is expected to do it.
That’s like getting rid of all languages and accents and switching to the same language.
> The problem is executives want to completely remove humans from the loop, which almost universally leads to disastrous results.
Thanks for your words of wisdom, which touch on a very important other point I want to raise: often we (i.e., developers, researchers) construct a technology that would be helpful and "net benign" if deployed as a tool for humans to use, rather than deployed to replace humans. But then along comes a greedy business manager who recklessly reckons that using said technology not as a tool but in full automation mode will make results 5% worse but save 15% in staff costs, and decides that that is a fantastic trade-off for the company - yet employees may lose and customers may lose.
The big problem is that developers/researchers lose control of what they develop, usually once the project is completed if they ever had control in the first place. What can we do? Perhaps write open source licenses that are less liberal?
Stock your underground bunkers with enough food and water for the rest of your life and work hard to persuade the AI that you're not a threat. If possible, upload your consciousness to a starwisp and accelerate it out of the Solar System as close to lightspeed as you can possibly get it.
Those measures might work. (Or they might be impossible, or insufficient.) Changing your license won't.
AI is going to get the hang of coding to fill in the spaces (i.e. the part you’re doing) long before it’s able to intelligently design an API. Correct API design requires a lot of contextual information and forward planning for things that don’t exist today.
Right now it’s throwing spaghetti at the wall and you’re drawing around it.
I agree that it's also not currently capable of judging those creative ideas, so I have to do that.
It's not creative at all, any more than taking the sum of text on a topic, and throwing a dart at it. It's a mild, short step beyond a weighted random, and certainly not capable of any real creativity.
Myriads of HN enthusiasts often chime in here with "Are humans any more creative?" and other blather. Well, that's a whataboutism, and it doesn't detract from the fact that creativity does not exist in the AI sphere.
I agree that you have to judge its output.
Also, sorry for hanging my comment here. Might seem over the top, but anytime I see 'creative' and 'AI', I have all sorts of dark thoughts. Dark, brooding thoughts with a sense of deep foreboding.
Even if your API is for something that's never been done before, it can usually still take advantage of its training data to suggest a sensible shape once you describe the new nouns and verbs to it.
It also taught me to be more careful about checkpointing my work in git before letting an agent go wild on my codebase. It left a mess trying to fix its problems.
Many many python image-processing libraries have an `imread()` function. I didn't know about this when designing our own bespoke image-lib at work, and went with an esoteric `image_get()` that I never bothered to refactor.
When I ask ChatGPT for help writing one-off scripts using the internal library I often forget to give it more context than just `import mylib` at the top, and it almost always defaults to `mylib.imread()`.
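The low-friction fix is just to add the conventional alias alongside the bespoke name (a minimal sketch; `mylib`, `image_get`, and `imread` are the names from the comment above):

    # mylib.py - keep the bespoke loader, but expose the imread() alias
    # that most Python image libraries (and therefore LLMs) expect.
    import numpy as np
    from PIL import Image

    def image_get(path):
        """Bespoke loader: read an image file into an HxWxC uint8 array."""
        return np.asarray(Image.open(path))

    # Conventional alias; both human- and LLM-written scripts now work.
    imread = image_get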
(Unless, on the gripping hand, your image_get function is subtly different from Matlab's imread, for example by not returning an array, in which case a different name might be better.)
That's also how I'm approaching it. If all the condensed common wisdom poured into the model's parameters says that this is how my API is supposed to work to be intuitive, how on earth do I think it should work differently? There needs to be a good reason (like composability, for example). I break expectations otherwise.
Having an LLM demo your tool, then taking what it does wrong or uses incorrectly and adjusting the API works very very well. Updating the docs to instruct the LLM on how to use your tool does not work well.
I've found that LLMs can be kind of dumb about understanding things, and are particularly bad at reading between the lines for anything subtle. In this aspect, I find they make good proxies for inattentive anonymous reviewers, and so will try to revise my text until even the LLM can grasp the key points that I'm trying to make.
In both cases, you might get extra bonus usability if the reviewers or the API users actually give your output to the same LLM you used to improve the draft. Or maybe a more harshly quantized version of the same model, so it makes more mistakes.
No, because you'll be held responsible for the misinformation being accurate: users will say it is YOUR fault when they learn stuff wrong.
I find it interesting that any user would attribute this issue to Soundslice. As a user, I would be annoyed that GPT is lying and wouldn't think twice about Soundslice looking bad in the process
That kind of thinking is how you never get new customers and eventually fail as a business.
Downvoters here on HN seem to live in an egocentric fantasy world, where every human being in the outside world lives to serve them. But the reality is that business owners and leaders spend their whole day thinking about how to please their customers and their potential customers - not other random people who might be misinformed.
Ok, sure, maybe this feature was worth having?
But if some people start sending bad requests your way because they can't program, or program only poorly, it doesn't make sense to potentially degrade the service for your successful paying customers...
OTOH it's free(?) advertising, as long as that first impression isn't too negative.
"Would you still have added this feature if ChatGPT hadn't bullied you into it?" Absolutely not.
I feel like this resolves several longstanding time travel paradox tropes.
I also went back to just sleeping on those flights and using connected models for most of my code generation needs.
What surprised me initially was just how confidently wrong Llama was... Now I'm used to confident wrongness from smaller models. It's almost like working with real people...
If the super-intelligent AI understands human incentives and is in control of a very popular service, it can subtly influence people to its agenda by using the power of mass usage. Like how a search engine can influence a population's view of an issue by changing the rankings of news sources that it prefers.
The users are different, the music that is notated is different, and for the most part if you are on one side, you don't feel the need to cross over. Multiple efforts have been made (MusicXML, etc.) to unify these two worlds into a superset of information. But the camps are still different.
So what ChatGPT did is actually very interesting. It hallucinated a world in which tab readers would want to use Soundslice. But, largely, my guess is they probably don't... today. In a future world, they might? Especially if Soundslice then enables additional features that make tab readers get more out of the result.
I already strongly suspect that LLMs are just going to magnify the dominance of python as LLMs can remove the most friction from its use. Then will come the second order effects where libraries are explicitly written to be LLM friendly, further removing friction.
LLMs write code best in python -> python gets used more -> python gets optimized for LLMs -> LLMs write code best in python
We don't live in a nice world, so you'll probably end up right.
In our new post truth, anti-realism reality, pounding one's head against a brick wall is often instructive in the way the brain damage actually produces great results!
Figuring out the paths that users (or LLMs) actually want to take - not based on your original design or model of what paths they should want, but based on the paths that they actually do want and do tread down. Aka, meeting demand.
1. I might consider a thing like that like any other feature request. If not already added to the feature request tracker, it could be done. It might be accepted or rejected, or more discussion may be wanted, and/or other changes made, etc, like any other feature request.
2. I might add a FAQ entry to specify that it does not have such a feature, and that ChatGPT is wrong. This does not necessarily mean that it will not be added in the future, if there is a good reason to do so. If there is a good reason not to include it, this will be mentioned, too. Other programs that can be used instead, if this one doesn't work, might also be mentioned.
Also note that in the article, the second ChatGPT screenshot has a note on the bottom saying that ChatGPT can make mistakes (which, in this case, it does). Their program might also be made to detect ChatGPT screenshots and to display a special error message in that case.
what a wonderful incident / bug report my god.
totally sorry for the trouble and amazing find and fix honestly.
sorry i am more amazed than sorry :D. thanks for sharing this !!
so i am happy you implemented this, and will now look at using your service. thx chatgpt, and you.
Some users might share it. ChatGPT has so many users it's somewhat mind-boggling.
I know nothing about this. I imagine people are already working on it, wonder what they've figured out.
(Alternatively, in the future can I pay OpenAI to get ChatGPT to be more likely to recommend my product than my competitors?)
So winning AI SEO is not so different than regular SEO.
And in this case, OP didn't have to take ChatGPT's word for the existence of the pattern, it showed up on their (digital) doorstep in the form of people taking action based on ChatGPT's incorrect information.
So, pattern noticed and surfaced by an LLM as a hallucination, people take action on the "info", nonzero market demand validated, vendor adds feature.
Unless the phantom feature is very costly to implement, seems like the right response.
I would go on to say that this interaction between ‘holes’ exposed by LLM expectations _and_ demonstrated userbase interest _and_ expert input (via the devs’ decision to implement changes) is an ideal outcome that would not have occurred if each of the pieces were not in place to facilitate these interactions, and there’s probably something here to learn from and expand on in the age of LLMs altering user experiences.