Unless, of course, it clashes with the ToS; then it will never help you (unless you manage to engineer your prompts just right), even when it comes to something as basic as pharmacology.
At one point it refused to give me the pharmacokinetics of a medication because of its ToS.
(That sounds broken. It's basic, useful, harmless information ...)
This was (is) not limited to pharmacology, however.
I did find one, though: I asked it for the interaction between two medications, and it told me "I am sorry you are going through this, but I can't give you information about ...", which I guess is still pharmacology.
Edit: I remember another case. It was related to reverse engineering.
Edit #2: I found another! I cannot give you specifics, but I asked it to give me some example sentences for something, and it went on about "consensual relationships" and so forth, even though my partner consented. I told the LLM that she did consent, and it went on with "I'm sorry, but even if she consented, I cannot give you ... because ... and is not healthy ...", and so forth. (Do not think of anything bad here! :P)
In the end, I went to the doctor, who offered me the same choices. No regrets so far. Without it, I would have suffered for nothing.
It is a very good tool for investigating and discovering things (though it could also be the opposite). It seems a bit unproductive that it is censored, because for serious things doctors are mostly there for validation anyway.
Other people are not sensible - look at all the junkies shooting who knows what into who knows where. A system that can advise cooks to put glue on pizza is probably not the best mechanism for providing healthcare advice to at least the section of the population who are likely to try it.
For that matter, would you suppose that they don’t know what they’re doing is bad for them? Witness the smokers who crack jokes about “cancer sticks” as they light up.
It seems to me that, just as we all promise our dentists we floss every day and our doctors that we'll eat better, we might be more prone to be frank with "anonymous" chatbots—and thus get more frank advice in return, at least for clear-cut factual background and tradeoffs.
LLMs have real problems with both of those things: they are unreliable in the sense that they produce text with factual inaccuracies, and they are uncontrollable in that it's really hard to predict their behaviour from their past behaviour.
These are big open challenges in AI - I am sure there will be solutions, but there aren't right now. I think it's a lot like self-driving cars: implementing a demo and showing it on the road is one thing, but then it takes twenty years before the slow roll-out really gets underway.
I’ve reduced usage of journaling GPTs I created myself, despite having gained incredible utility from them.
“How sure are you, out of 20?”
If it says "yes, I am sure", and other LLMs confirm the same, then you can be fairly confident that it is a very good candidate answer.
If it says it is not sure, it is probably just agreeing with you, and you had better double-check by asking "Are there other solutions? What is the worst idea?" etc., to force it to think through the problem and the context.
It is cross-validation, and you can even cross-validate by searching on the internet.
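A rough sketch of what that cross-check can look like in practice, assuming the official OpenAI and Anthropic Python SDKs with API keys set in the environment; the question, model names, and the /20 follow-up prompt are only placeholders, not anything the commenter actually ran:

    # Ask two different LLMs the same question, then ask each how sure it is out of 20.
    from openai import OpenAI
    import anthropic

    QUESTION = "What is the recommended way to invalidate a JWT before it expires?"
    CONFIDENCE_PROMPT = "How sure are you of that answer, out of 20? Reply with a number only."

    def ask_openai(question: str) -> tuple[str, str]:
        client = OpenAI()  # reads OPENAI_API_KEY from the environment
        messages = [{"role": "user", "content": question}]
        first = client.chat.completions.create(model="gpt-4o", messages=messages)
        answer = first.choices[0].message.content
        messages += [{"role": "assistant", "content": answer},
                     {"role": "user", "content": CONFIDENCE_PROMPT}]
        second = client.chat.completions.create(model="gpt-4o", messages=messages)
        return answer, second.choices[0].message.content

    def ask_claude(question: str) -> tuple[str, str]:
        client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
        messages = [{"role": "user", "content": question}]
        first = client.messages.create(model="claude-3-5-sonnet-latest",
                                       max_tokens=1024, messages=messages)
        answer = first.content[0].text
        messages += [{"role": "assistant", "content": answer},
                     {"role": "user", "content": CONFIDENCE_PROMPT}]
        second = client.messages.create(model="claude-3-5-sonnet-latest",
                                        max_tokens=16, messages=messages)
        return answer, second.content[0].text

    # If both answers agree and both confidence scores are high, treat it as a good
    # candidate answer; if either model hedges, dig further ("Are there other
    # solutions? What is the worst idea?") or search the web.
    for name, ask in [("openai", ask_openai), ("claude", ask_claude)]:
        answer, confidence = ask(QUESTION)
        print(name, confidence, answer[:120])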
Though, 100% of the time, they say what you want them to say.
Except on religion and immigration, and some other topics where it will push its own opinion.
For example, if it claims A > B, then it shouldn't claim B > A in a fresh chat for comparisons.
In general, you shouldn't get both A and not-A, and you should expect either A or not-A.
If it can go from prompt -> result, assuming it's invertible, then result -> prompt should also partially work. An example of this is translation.
The results of some mathematical solutions should plug back into and solve the original equations. For example, the derivative of an antiderivative should give you back the original function.
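As a toy version of that round-trip check, here is what the derivative-of-an-antiderivative test looks like with SymPy, run on a symbolic expression rather than on actual LLM output, just to show the idea:

    import sympy as sp

    x = sp.symbols("x")
    f = sp.sin(x) * sp.exp(x)    # pretend this is the "original" expression

    F = sp.integrate(f, x)       # the antiderivative (what you might ask the model for)
    roundtrip = sp.diff(F, x)    # differentiate the answer again

    # The round trip should recover the original, up to simplification.
    assert sp.simplify(roundtrip - f) == 0
    print(F)                     # e.g. exp(x)*sin(x)/2 - exp(x)*cos(x)/2
    print(sp.simplify(roundtrip))  # exp(x)*sin(x)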
For example, the other day I asked ChatGPT about a problem I had with some generated code that didn't compile. It then told me about a setting in the generator (NSwag) that didn't exist. I told it that this setting does not exist, and it said something like "Sorry, my bad, try the following" and then kept inventing this setting with slightly different names and values over and over again. There are similar settings that do exist, so it just hallucinated a tiny bit of text inside the snippets that it learned from.
This is also not the first time this has happened; most of the times I tried using AI for help with things, it just made up some nonsense and wasted my time with it.
Artificial colors and artificial flavors and artificial intelligence all have something in common - they are not as good as the real thing, and probably never will be.
Honesty and truthfulness are of primary importance. Avoid American-style positivity, instead aim for German-style bluntness: I absolutely *do not* want to be told everything I ask is "great", and that goes double when it's a dumb idea.
* Or fawning, I don't know how to tell them apart from the outside, even in fellow humans where we don't need to wonder if we're anthropomorphising too much. Does anyone know how to tell them apart from the outside?

I'm also not limiting myself to ChatGPT, checking with other DIY sources — it isn't enough to only avoid sycophancy, it has to also be correct, and 90% right like it is in software is still 10% wrong, only this is wrong with the possibility of a shed drifting across a garden in a gale, or sinking into the soil as it shifts, if I do it wrong.
Measure twice, cut once.
Made me chuckle :)
Much how mainstream internet is tainted with American bias against nudity and for copyright.
(Meta: comment generated with ChatGPT of course)
So rather than “would you say that..” or “would you agree that…”, I approach it from the negative.
So “I think it’s not the case that…”, or “I disagree with X. Debate me?”
…and then see if it disagrees with me and presents solid counter arguments.
FWIW, I think ChatGPT can definitely be too eager to please, but Claude can be more direct and confrontational. I am a ChatGPT subscriber, but keep the Claude app installed and use it occasionally on the free tier for a second opinion. Copypasting your question is so easy on both apps that I will frequently get a second opinion if the topic merits it. I tried the same with Gemini, but get about two questions before it cuts me off…
> Claude never starts its response by saying a question or idea or observation was good, great, fascinating, profound, excellent, or any other positive adjective. It skips the flattery and responds directly.
https://docs.anthropic.com/en/release-notes/system-prompts#m...
Are mathematicians researchers? Then yes; for example, Terence Tao is using LLMs. AlphaProof / AlphaGeometry are also examples of this, using LLMs to generate Lean proofs, translate from natural language to Lean, and so on.
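For a sense of what "translate from natural language to Lean" is aiming at, here is a minimal, purely illustrative example (the theorem name is made up; the proof just reuses the core library's Nat.add_comm): the natural-language claim "addition of natural numbers is commutative" rendered as a Lean 4 theorem.

    -- Natural-language statement: "addition of natural numbers is commutative."
    -- One possible Lean 4 rendering, provable directly from the core lemma.
    theorem add_comm_of_nat (a b : Nat) : a + b = b + a :=
      Nat.add_comm a b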
And as for "researchers" in general, they use code generation to speed things up. Many scientists can code but aren't "coders" in the professional sense, so they can use the advances in code generation to speed up their own research efforts.
Once we have managed to transfer our skills to them (coding, analysis, maths, etc.), the next step is transferring our creativity to them. It is a gradual process with human oversight.
I don't quite agree when you look at second-order effects of applying more compute; for example, brute-forcing combinations of ideas and using a judge to evaluate them. I suspect there's quite a bit of low-hanging fruit in joining together different areas of deep expertise.
I do come back to agreeing again for paradigm shifts. You don't get to very interesting ideas without fresh approaches: questioning core assumptions, then rebuilding what we had before on new foundations. It is hard to see LLMs in their current shape being able to be naive and ignorant enough that existing doctrine doesn't rein in new ideas.
If you train AI to be super skeptical, it will be. But most people would rather talk with a yes-person than with a negative, inquisitive devil's advocate.
I also don't buy that yes-men and breakthroughs are mutually exclusive/polar opposites here.
I have the same thought but from a more negative angle. A vast share of new information in the near future will be just a repeat of whatever data the LLMs were trained on.
There is a tiny sliver of LLM usage that will not be a transformation of existing data (e.g. make me a chart of this data, write me an essay) but rather ”help me create a new tool that will solve a novel problem”.
I believe that’s what the person interviewed is saying in their own words. It’s hard to imagine something other than a brute-force hypothesis machine that starts brute-forcing solutions, but it will not be as effective as we wish if we can’t figure out how to come up with hypotheses for everything.
None of what I’m saying is that insightful and I’m sure people have thought of this already.
I wonder if ever there will be a Hitchhiker’s style revelation that we have had all the answers for all of our problems already, but the main issue is just incentives. Curing most cancers is probably just a money question, as is solving climate change.