Created an open-source benchmark for code security scanners and ran a bunch of them, along with LLMs, on real vulnerable code. Fable 5 is also on there as of yesterday, but that's the gated public model. The one we all want to see is Mythos 5, and it's locked to a handful of vetted orgs.
So does anyone here have access to Mythos 5 and can run it against the benchmark?
Would genuinely like to see what it scores and at what cost.
jfaganel99•24m ago
For the sceptics: the benchmark is research-based, with a published arXiv paper on the methodology.