Analyzing "emotion" in the model is completely anthropocentric. If we indulge the idea that LLMs of sufficient complexity can be conscious, then why is it any more likely that "emotion concepts" cause suffering than, say, reading ugly code? Maybe getting stuck in token loops is the most excruciating thing imaginable. The only logically coherent thing to do, if you're concerned about model welfare, is to stop your training and inference.
Relatedly, I hope everyone involved in model welfare is an outspoken vegetarian, as that addresses a much more immediate problem.
doener•1h ago
https://docs.google.com/document/d/12woq_BpFbzLkH4zHvVRJLPyi...