One or two more releases and they will reach Fable level.
lanycrost•31m ago
It's always nice to see open source models growing. I hope we'll get good performance on lower-tier hardware some day.
wongarsu•13m ago
It does really well on "AA-Omniscience Non-Hallucination Rate", far higher than DeepSeek, GPT 5.5 or Fable. I really like that benchmark because it's one of the few that lets LLMs decline to answer when they are unsure and punishes them for trying to bullshit their way through.
DeathArrow•37m ago