A Long-Tail Professional Forum-Based Benchmark for LLM Evaluation
https://arxiv.org/abs/2511.06346
1 • wslh • 2mo ago