Perhaps as the models get better at reasoning instead of mere imitation, they'll be able to deploy ethics to adjust and censor their responses, and we'll be able to control those ethics (or at least ensure they're "good"). Of course, models that are better at reasoning are also better at subversion, and a malicious user can use them to cause more harm. I also worry that if AI models' ethics can be controlled, they'll be controlled to benefit a few instead of humanity overall.
turtleyacht•4h ago
All the AI models answered that they wouldn't throw a chair at the window. (The correct answer was to do so.)
The idea being, none of us would feel a need to prove our existence on an exam.