Why Claude's Comment Paper Is a Poor Rebuttal

https://victoramartinez.com/posts/why-claudes-comment-paper-is-a-poor-rebuttal/
6•vectorhacker•7h ago

Comments

low_tech_love•14m ago
A fundamental problem that we’re still far from solving is not necessarily that LLMs/LRMs cannot reason the same way we do (which I guess should be clear by now), but that they might not have to. They generate slop so fast that, if one can benefit a little from each output, i.e. if you can find a bit of use hidden beneath the mountain of meaningless text they create, then this might still be more valuable than taking the time up front to create something more meaningful. I can’t say for sure what the reward system behind LLM use in general is, but given how much money people are willing to spend on models even in their current deeply flawed state, I’d say it’s clear that the time savings are outweighing the mistakes and shallowness.

Take the comment paper, for example. Since Claude Opus is the first author, I’m assuming that the human author took a back seat and let the AI build the reasoning and do most of the writing. Unsurprisingly, it is full of errors and contradictions, to the point where it looks like the human author didn’t bother much to check what was being published. One might say that the human author, in trying to build some reputation by showing that their model could answer a scientific criticism, actually did the opposite: they provided more evidence that their model cannot reason deeply, and maybe hurt their reputation even more.

But the real question is, did they really? How much backlash can they possibly get from submitting this to arXiv without checking? Would that backlash keep them from submitting 10 more papers next week with Claude as the first author? If one weighs the amount of slop one can put out (with a slight benefit) against the bad reputation one gets from it, I cannot say that “human thinking” is actually worth it anymore.

An Architectural Approach to Decentralization

https://www.infocentral.org/
1•Bogdanp•48s ago•0 comments

SelfDB: The last Back end as a service you will pay for

https://selfdb.io
1•selfdb_io•3m ago•1 comment

Can shoes be made in the US without cheap labour?

https://www.bbc.com/news/articles/cr4zvezn5nlo
1•dabinat•4m ago•0 comments

Dart and WebAssembly with JavaScript Interop

https://nick-fisher.com/articles/dart-javascript-interop-web-assembly/
1•nmfisher•6m ago•0 comments

How I Passed the AWS Certified Security – Specialty (SCS-C02) Exam in 2025

https://thehiddenport.dev/posts/aws-scs-c02-exam-experience/
1•ejher•7m ago•0 comments

Show HN: Mockstar – AI mock interviews and feedback for jobseekers

https://mockstar.co/
1•mattdotam•8m ago•0 comments

Jio and Jio-Fiber Down in Parts of India

1•saharshpruthi•11m ago•0 comments

What is cosh(List(Bool))? Or beyond algebra: analysis of data types

http://cofault.com/aodt.html
1•fanf2•15m ago•0 comments

Google Is Scamming Users with VEO 3, While Delivering VEO 2 Instead

3•machmadera•15m ago•0 comments

The right way to make AI part of your tech strategy

https://leaddev.com/technical-direction/right-way-make-ai-part-your-tech-strategy
1•argoeris•16m ago•0 comments

SAZ Caption AI

https://reach-boost-captions-craft.lovable.app
2•sigma-male•18m ago•2 comments

Show HN: Compiler for Writing Ethereum Smart Contracts with TypeScript

2•chase-manning•20m ago•0 comments

Show HN: Better Docx Import and Export Support for Tiptap Editor

7•philipisik•21m ago•0 comments

Timdle

https://www.timdle.com/
2•kaharvi•23m ago•0 comments

Choosing where to spend my team's effort

https://frederickvanbrabant.com/blog/2025-06-13-choosing-where-to-spend-my-teams-effort/
2•TheEdonian•24m ago•0 comments

A Systematic Review and New Analyses of the Gender-Equality Paradox

https://journals.sagepub.com/doi/10.1177/17456916231202685
2•mpweiher•25m ago•0 comments

Jordan's black refugees

https://weeklygazette.substack.com/p/jordans-black-refugees
2•progju•28m ago•0 comments

Apple quietly makes running Linux containers easier on Macs

https://www.zdnet.com/article/apple-quietly-makes-running-linux-containers-easier-on-macs/
2•abricq•30m ago•0 comments

Best Antidetect Browser Setups for Social Media Marketers

1•RainbowJ•31m ago•0 comments

The Gnarly Man

https://en.wikipedia.org/wiki/The_Gnarly_Man
1•nobody9999•31m ago•0 comments

Show HN: Shame Meter

https://twitter.com/the2ndfloorguy/status/1929074655517610073
3•madinmo•34m ago•0 comments

Technical co-founder, built everything. Offered 4%. Oof

3•cabbagepancakes•43m ago•2 comments

Show HN: Gifty – A real-world gift hunt you play with your feet

https://gifty-en.vercel.app/
1•mrtranlyvu•45m ago•0 comments

Show HN: A Chrome extension that highlights one sentence at a time while reading

https://github.com/hamsteak1488/focus-anchor
1•hamsteak•46m ago•0 comments

.NET Performance Testing: What Is Important to Know in 2025?

https://belitsoft.com/net-performance-testing
1•Aninay•47m ago•0 comments

Use Copilot Agent Mode in Visual Studio (Preview)

https://learn.microsoft.com/en-us/visualstudio/ide/copilot-agent-mode?view=vs-2022
1•nsoonhui•48m ago•0 comments

Warner Bros: fright night for bondholders

https://bondvigilantes.com/blog/2025/06/warner-bros-fright-night-for-bondholders/
2•Ozarkian•49m ago•0 comments

Google Chrome Music Video

https://www.youtube.com/watch?v=F50bCrlTbIs
1•ankitrgadiya•50m ago•0 comments

Founders: How do you audit code quality, infra costs, and dev team efficiency?

1•satya9099•51m ago•0 comments

Show HN: Life Anti-Checklist

https://antichecklist.com
1•alvinunreal•52m ago•0 comments