nabla9•34m ago
There is so much variance depending on prompting that a single-task test does not measure anything. Give 5 different tasks to 5 different programmers, each free to write their own prompts, and maybe you start to see a pattern emerge.