> If we want to understand the human mind in its entirety, we must move from domain-specific theories to an integrated one
I would prefer that the feasibility of a unified, integrated theory be established first.
> An important step towards a unified theory of cognition is to build a computational model that can predict and simulate human behaviour in any domain
Why is this step important? Do LLMs qualify as a valid computational model that explains cognition? Which other steps lead to this one?
> Centaur was designed in a data-driven manner by fine-tuning a state-of-the-art large language model
These statements are contradictory. You don't design a large language model; you design its inference engine. Calling it a "design" implies you had blueprint-like plans for its desired outcome. And this isn't only a definitional argument: language models are fine-tuned on selected input data, and that selection practice, in a research setting, raises a red flag.
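To make that concrete, here is a minimal sketch (mine, not the paper's actual recipe) of what "designing" a fine-tuned model amounts to in practice: pick a base checkpoint and curate the text it sees, then run an off-the-shelf training loop. The model name and transcripts below are placeholders:

```python
# Minimal sketch, not the paper's pipeline: the only real "design" levers are
# (a) the base checkpoint and (b) which transcripts make it into the data.
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

base = "sshleifer/tiny-gpt2"  # tiny stand-in, not the model the paper used
tok = AutoTokenizer.from_pretrained(base)
tok.pad_token = tok.eos_token

# (b) the actual "design" decision: data curation.
transcripts = [
    "You press <<J>> and receive 54 points.",
    "You press <<F>> and receive 12 points.",
]

def encode(batch):
    enc = tok(batch["text"], truncation=True, padding="max_length", max_length=32)
    enc["labels"] = [ids.copy() for ids in enc["input_ids"]]
    return enc

ds = (Dataset.from_list([{"text": t} for t in transcripts])
      .map(encode, batched=True, remove_columns=["text"]))

# (a) plus a generic training loop; there is no blueprint for the outcome.
model = AutoModelForCausalLM.from_pretrained(base)
Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1, report_to=[]),
    train_dataset=ds,
).train()
```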
> We transcribed each of these experiments into natural language, which provides a common format for expressing vastly different experimental paradigms
The references for this translation process come from the same authors and build on models that are not open-weight. This raises a red flag.
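For context, this is roughly the kind of transcription I understand is being done: a structured trial flattened into a natural-language string. The schema, wording, and choice markers below are my guesses, not the authors' published pipeline:

```python
# Sketch of the transcription step as I read it; schema and wording invented.
trial = {"options": ("F", "J"), "choice": "J", "reward": 54}

def transcribe(trial: dict) -> str:
    a, b = trial["options"]
    return (f"You can choose between machine {a} and machine {b}. "
            f"You press <<{trial['choice']}>> and receive {trial['reward']} points.")

print(transcribe(trial))
# You can choose between machine F and machine J. You press <<J>> and receive 54 points.
```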
> simplifications were made where appropriate
What were the criteria for determining when a simplification was appropriate? I could not find any mention of simplification procedures in the supplementary material.
> Finally, we verified that Centaur fails at predicting non-human behaviour.
Is it actually failing at predicting non-human behavior, or is the result leaning on how poorly characterized LLM behavior still is?
Let me put it better: if you recruited participants experienced in exploiting LLMs, would Centaur fare differently? That skill is squarely within the realm of human cognition (e.g. getting an LLM to hallucinate). This question is important.
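Here is a sketch of the probe I have in mind: compare average per-token log-likelihood on transcripts from a plausible human, a random agent, and a human deliberately playing "against" the model. The model below is a tiny stand-in (a real test would load Centaur's actual weights) and the transcripts are invented:

```python
# Probe sketch: does the model separate human from non-human transcripts for
# the right reason? Placeholder model and invented transcripts throughout.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "sshleifer/tiny-gpt2"  # stand-in; swap in Centaur for the real test
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id).eval()

def avg_logprob(text: str) -> float:
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean cross-entropy per token
    return -loss.item()  # higher means better predicted

cases = {
    "human":     "You press <<J>> and receive 54 points. You press <<J>> again.",
    "random":    "You press <<F>>. You press <<J>>. You press <<F>>.",
    "adversary": "You press whichever key you believe the model expects least.",
}
for name, text in cases.items():
    print(f"{name:10s} {avg_logprob(text):.3f}")
```

If the adversarial transcripts end up scoring like the random agent's, the "fails at predicting non-human behavior" claim would need a sharper operationalization.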
alganet•7h ago