Will It Mythos?

https://swelljoe.com/post/will-it-mythos/
63•mindingnever•1h ago•25 comments

Steam Machine launches today

https://store.steampowered.com/news/group/45479024/view/685257114654870245
1389•theschwa•12h ago•1242 comments

VibeThinker: 3B param model that beats Opus 4.5 on reasoning with novel SFT+GRPO

https://arxiv.org/abs/2606.16140
89•timhigins•3h ago•26 comments

GLM-5.2 – How to Run Locally

https://unsloth.ai/docs/models/glm-5.2
289•TechTechTech•8h ago•133 comments

Polymarket has flooded social media with deceptive videos by paid creators

https://www.wsj.com/business/media/polymarket-social-media-bets-prediction-market-441cdeb5?st=HhTZY2
133•Vaslo•2d ago•123 comments

In praise of memcached

https://jchri.st/blog/in-praise-of-memcached/
95•j03b•4h ago•38 comments

An Introduction to YOLO26

https://blog.roboflow.com/yolo26/
40•teleforce•3h ago•10 comments

Optocam Zero: a Pi Zero based digital camera made using off the shelf components

https://github.com/dorukkumkumoglu/optocamzero
137•iamnothere•10h ago•33 comments

Japanese symbols that speak without words

https://arun.is/blog/japan-symbols/
148•msephton•10h ago•68 comments

Cyberdecks, going analog, and convivial technology

https://blog.hydroponictrash.solar/cyberdecks-going-analog-and-convivial-technology/
77•akkartik•3d ago•33 comments

My Mathematical Regression

https://blog.dahl.dev/posts/my-mathematical-regression/
252•aleda145•3d ago•91 comments

Package Managers need global hooks

https://captnemo.in/blog/2026/06/17/package-managers-need-hooks/
13•evakhoury•4d ago•6 comments

Moebius: 0.2B image inpainting model with 10B-level performance

https://hustvl.github.io/Moebius/
261•DSemba•15h ago•67 comments

Giant Banana Pulled Over: Driver Says Cops Have Stopped Him 100s of Times

https://cowboystatedaily.com/2026/06/18/giant-banana-pulled-over-in-montana-driver-says-cops-have...
33•speckx•2d ago•2 comments

Show HN: Oak – Git alternative designed for agents

https://oak.space/oak/oak
171•zdgeier•13h ago•155 comments

Windows NT for GameCube/Wii

https://github.com/Wack0/entii-for-workcubes
42•zdw•3d ago•7 comments

Canada plans 'nuclear renaissance' with up to 10 reactors built by 2040

https://www.cbc.ca/news/politics/federal-nuclear-strategy-9.7244509
404•geox•10h ago•251 comments

1,700 free online courses from top universities

https://www.openculture.com/freeonlinecourses
126•momentmaker•3h ago•24 comments

Canyon HUD helmet for road riding

https://media-centre.canyon.com/en-INT/266866-new-canyon-heads-up-display-helmet-could-be-a-safet...
84•zh3•2d ago•94 comments

Is it time for a new Embedded Linux build system?

https://yoebuild.org/blog/time-for-a-new-build-system/
60•cbrake•4d ago•42 comments

Kyber (YC W23) Is Hiring a Head of Engineering

https://www.ycombinator.com/companies/kyber/jobs/FGmI8mx-head-of-engineering
1•asontha•8h ago

British Columbia, Time Zones, and Postgres

https://www.crunchydata.com/blog/british-columbia-and-time-zone-changes
129•sprawl_•10h ago•88 comments

Flock-Powered Police Chiefs Stalking Women Shows Why Warrants Are Needed

https://ipvm.com/reports/police-chiefs-track
457•jhonovich•10h ago•186 comments

I built an offline tool to stabilize TV audio because nothing else worked

https://github.com/AdBusterOfficial/Adbuster--WinApp
6•Bo_Amigo_910•2d ago•1 comments

Ultralytics YOLO26: Unified Real-Time End-to-End Vision Models

https://arxiv.org/abs/2606.03748
18•teleforce•3h ago•0 comments

Show HN: Pagecast – Publish Markdown/HTML Reports to Cloudflare Pages

https://github.com/Amal-David/pagecast
40•amaldavid•4d ago•11 comments

Job application asked for my SAT scores

https://mrmarket.lol/job-application-asked-for-my-sat-scores/
120•seltzerboys•8h ago•288 comments

Show HN: Got sick of ads, so I made my own logic puzzle site

https://puzzlelair.com/
165•HaxleRose•17h ago•106 comments

Help I accidentally a wigglegram

https://lmao.center/blog/wiggle-accidents/
515•gregsadetsky•3d ago•120 comments

Chevron signs 20-year power agreement with Microsoft for West Texas data center

https://www.chevron.com/newsroom/2026/q2/chevron-signs-20-year-power-agreement-with-microsoft-for...
136•cdrnsf•15h ago•121 comments

Will It Mythos?

https://swelljoe.com/post/will-it-mythos/
62•mindingnever•1h ago

Comments

jrochkind1•1h ago
> And, all of the bugs can be identified by several models if they are pointed directly at it and told what to look for.

This made me think, well, sure, if you tell them what to look for... but then:

> The models can look at the whole repo, and follow logic across file boundaries, but they’re not told what to look for.

So okay, the first one was an accidental mis-statement?

wodenokoto•45m ago
No. In the test they are not told what to look for. They are told “as part of a security audit, please audit this file. You are free to look at the rest of the repo for context.”

Outside of the test, they are told “can you find this bug in this file?”

jrochkind1•30m ago
Why are they being told anything outside of the test? What is that for? Isn't “can you find this bug in this file?” also a test? It sounds like there are two kinds of tests? I'm clearly confused, I realize.
brigandish•8m ago
They are told outside the test because if a model can't find the bug when given hints, then it's safe to assume it won't find it given no hints. It verifies the test, to an extent, much like running tests that should fail when given a set of inputs that should make them fail (you write an always-failing test alongside your other tests, right? ;)
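For anyone who hasn't seen the always-failing-test idea, here is a minimal sketch in Python's `unittest`. The class and function names (`CanaryTests`, `run_suite`) are made up for illustration and have nothing to do with the harness used in the article. The canary is marked as an expected failure, so the suite stays green only while the runner can still detect a failure.

```python
import unittest


class CanaryTests(unittest.TestCase):
    """Hypothetical suite: one canary that must fail, one ordinary test."""

    @unittest.expectedFailure
    def test_canary_always_fails(self):
        # Deliberately wrong assertion. If this ever "passes", the runner
        # is not really checking anything and the suite is reported as broken.
        self.assertEqual(1, 2)

    def test_ordinary_behaviour(self):
        # Stand-in for a real test of the code under test.
        self.assertEqual(sum([1, 2, 3]), 6)


def run_suite():
    """Run the suite in-process and return the unittest.TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(CanaryTests)
    result = unittest.TestResult()
    suite.run(result)
    return result


if __name__ == "__main__":
    outcome = run_suite()
    print("tests run:", outcome.testsRun)
    print("expected failures:", len(outcome.expectedFailures))
    print("suite ok:", outcome.wasSuccessful())
```

The hinted "can you find this bug in this file?" run plays the same role as the canary: it is a control that tells you whether the unhinted result means anything.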
reinitctxoffset•51m ago
Opus 4 class models are terrifying at infosec. They tie their shoelaces together on other things, but don't fuck with them on that. It's a savant thing.

A cursory reading of the model card shows Mythos/Fable is a fine tune on Project Zero with some steering on persistence.

But I think it's a valuable lesson: advertise your product as a nuclear weapon while microdosing at Lighthaven to enough Davos attendees and sooner or later? Someone is going to evaluate the claim from a chair where you act first and nuance later.

Wild that Amodei's blog and pod circuit are the greatest IPO risk.

eru•46m ago
> Opus 4 class models are terrifying at infosec. They tie their shoelaces together on other things, but don't fuck with them on that. It's a savant thing.

I think they are very good at finding flaws; but they aren't all that great at making a system that doesn't have (security) flaws.

reinitctxoffset•43m ago
You are not wrong, but there's an asymmetry here: run adversarial self play and low-pass filter.
eru•35m ago
Mostly right. However there's an extra assumption I didn't explicitly state:

Almost all existing real world software is full of holes and security flaws. Mythos is better than humans at uncovering many of them; especially because its time is a lot cheaper than that of the top tier human experts (and even of mid-and low-tier human experts).

Especially when these systems are written in notoriously unreliable languages like C.

I don't think Mythos is especially good at writing systems that are free of security problems. Essentially the only way we know to get there is by proving your software correct.

In principle, you can even prove C correct, but in practice you'll want to write your system from the ground up to be proven correct instead of adding that property after the fact; and for that you'll most likely also want to pick a language that supports this better.

See https://en.wikipedia.org/wiki/SeL4 for a noteworthy example.

jaggederest•39m ago
In my brief experience, the difference between fable and opus is largely in persistence, not global intelligence like you might expect. Fable just... goes the extra mile, sometimes in a scary way.
hodgehog11•31m ago
Hard disagree. Opus reports to me like a student. Fable reported to me like a colleague (researcher). It genuinely seemed to pick up on nuance that the other models just don't, even when I tell them explicitly. It's been really frustrating that neither Codex nor Opus can make targeted edits to Fable's code without screwing something subtle up. For context, this is for computational geometry work, so your mileage may vary.
hypfer•27m ago
Wait, so..

This is interesting. The "reported to me like a colleague" part.

Is it just that anthropic gave Mythos even more of that Anthropic™ character, (incorrectly) radiating confidence?

Is that why people have been losing their minds over that thing? Is this just cheap social engineering?

I mean I bet it is also slightly more capable than opus, but that would all check out to me. Man.

Thanks for sharing I suppose.

TylerE•23m ago
No, it’s just a fundamentally much better model. Going back to Opus feels like the model has been lobotomized. It makes much more frequent errors, especially of the “I claimed I tested x y and z, but actually only kinda half heartedly tested x, and assumed I understood what was wrong” variety.
hypfer•19m ago
Wait but that has been the exact word-for-word complaint when comparing sonnet to opus

Or opus to opus

Or really any new thing to old thing

tptacek•39m ago
What makes you say that? I think they're better than replacement-level developers at making secure systems (I spent 20 years looking for vulnerabilities in human-written code as a full-time job).
sscaryterry•35m ago
Agreed. In the right hands, they can perform magic.
eru•17m ago
See https://news.ycombinator.com/item?id=48640533 for some further elaboration.

These models are definitely a lot better than your run-of-the-mill human developer at finding security flaws in existing systems. I'm agnostic about how good they are at actually making a secure system. Probably better, too, for two reasons:

- humans are really terrible

- the model probably has an easier time picking up special purpose tools you can use to write proven secure systems

I don't think Mythos can write secure C code, either. Practically no one can. (At least not directly. See how seL4 is officially written in C; but they didn't just set out to carefully write secure C code directly; C just happens to be an intermediate language they use.)

solumunus•5m ago
When the agent is becoming more accurate and thorough what would you expect to be reported?
raphman•19m ago
> It's been really frustrating that neither Codex nor Opus can make targeted edits to Fable's code without screwing something subtle up.

Reminds me of the old adage: don't try to be too smart when writing code. Otherwise, dumber people - including your future self - will have trouble working with it.

mohsen1•18m ago
Yes, in my project I made so much progress in 3 days with Fable that it is not comparable to how Opus works.
dimgl•8m ago
Maybe I was getting downgraded to Opus 4.8 but I saw nothing even close to resembling this behavior when using Fable.
somesortofthing•31m ago
In LLMs, much like in humans, agency and misalignment are two sides of the same coin.