Sure. Kindness and fairness are harder to encode into a system than the objective pursuit of a single metric, and that's assuming you can even get agreement on what would be kind or fair in a given situation.
Proofread0592•2h ago
People prompt the LLM to do evil things, and it eventually does. In my mind, the only failure here is the guardrails.
zahlman•3h ago