"factual (ai) Weather, traffic, and personal urgency are the only significant variables that could tilt the decision toward driving."
My gut feeling is that if this could be done, it would be a core part of one of the model providers' offerings.
Q: Who directed Scarface? A: - 1983 film (most commonly referred to): Directed by Brian De Palma. - 1932 original version: Directed by Michael Curtiz.
This is wrong. The 1932 movie is by Howard Hawks.
Hard to see how you could really make this work though. You might as well just add "fetch and re-read all sources explicitly to make sure they are correct" to a normal prompt.
It seems to be fully prompt-based, so the AI can still say anything it pleases.
How well do these complicated prompt systems usually work? My strategy is to stick mostly to simple prompts, potentially with some deterministic tools and vendor harnesses, on the rationale that these are what the models are trained and evaluated with, and that LLMs still often get tripped up when their context is spammed with too much stuff.
My hunch is that it's because structured/constrained decoding and deterministic subsystems are technically somewhat more involved, requiring e.g. raw API interactions and sometimes manual decoding strategies. Prompt systems can be written in plain text and mostly with "common sense". Not to say that writing a good prompt (system) is a trivial task, but it's a different skillset.
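To make "constrained decoding" a bit more concrete, here is a toy sketch of the idea: at each step, tokens the grammar disallows are masked out before picking the best one. Everything here is made up for illustration (the four-token vocabulary, the `yes_no_grammar` function, the fake scores); a real setup would hook into a model's logits through whatever interface the provider or library exposes.

```python
import math

def constrained_greedy_decode(logits_per_step, vocab, allowed_fn):
    """Greedy decoding where disallowed tokens are masked out at every step."""
    output = []
    for logits in logits_per_step:
        allowed = allowed_fn(output)  # tokens permitted given the prefix so far
        best_token, best_score = None, -math.inf
        for token, score in zip(vocab, logits):
            if token in allowed and score > best_score:
                best_token, best_score = token, score
        if best_token is None:
            raise ValueError("grammar allows no token at this step")
        output.append(best_token)
    return output

# Hypothetical toy vocabulary, not taken from any real tokenizer.
VOCAB = ["yes", "no", "maybe", "<eos>"]

def yes_no_grammar(prefix):
    # Hypothetical grammar: the first token must be yes/no, then end of sequence.
    return {"yes", "no"} if not prefix else {"<eos>"}

def no_grammar(prefix):
    # No constraint: every token is allowed at every step.
    return set(VOCAB)

# Made-up scores; left unconstrained, the "model" prefers "maybe" at both steps.
fake_logits = [[1.0, 0.5, 3.0, 0.1], [0.2, 0.1, 2.0, 0.3]]

constrained = constrained_greedy_decode(fake_logits, VOCAB, yes_no_grammar)
unconstrained = constrained_greedy_decode(fake_logits, VOCAB, no_grammar)
```

With the grammar applied, the output can only ever be a yes/no followed by end-of-sequence, regardless of what the scores say. That is the guarantee a prompt alone cannot give, and also why it takes access to the decoding loop rather than just plain text.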
Managed to ask if Ali Khamenei is still alive. It answered "Yes, ..."
Also, embedding claims in the Chain of Thought instead of post-processing them might force rigor earlier in the pipeline.
(Assuming the zero-deps constraint isn't a blocker?)
I thought it could search online for citations.
Questions about application settings, for example, where to find a particular setting in a particular app. The LLM has a sense of how application settings are generally structured but the answer is almost never spot on. I just prefix these questions with "do a web search" or provide a link to documentation and that is usually enough to get a decent response along with citations.
nnevatie•1h ago