VibeCodingBench: Benchmark Vibe Coding Models for Fun

https://twitter.com/yq_acc/status/2016201908181205358
2•jiayaoqijia•1w ago

Comments

jiayaoqijia•1w ago

VibeCodingBench: We benchmarked 15 AI coding models on what developers actually do

Current benchmarks have an ecological validity crisis. Models score 70%+ on SWE-bench but struggle in production. Why? They optimize for bug fixes in Python repos, not the auth flows, API integrations, and CRUD dashboards that occupy 80% of real dev work.

So we built VibeCodingBench: 180 tasks across SaaS features, glue code, AI integration, frontend, API integrations, and code evolution.

Multi-dimensional scoring: Functional (40%) + Visual (20%) + Quality (20%) - Cost/Speed penalties. Security gate: Any OWASP Top 10 vuln = automatic 0.
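
For readers who want to see how a rubric like this could combine, here is a minimal sketch, not the benchmark's actual implementation. The three weights and the security gate come from the post. Everything else is an assumption for illustration: the `TaskResult` fields, the single pre-computed `penalty` value (the post does not say how cost and speed penalties are calculated), and clamping the result at zero. The quoted weights sum to 0.8, and the sketch leaves that as-is rather than guessing where the remaining 0.2 goes.

```python
from dataclasses import dataclass

# Weights quoted in the post. They sum to 0.8; the post does not say
# how the remainder is used, so this sketch does not normalize them.
WEIGHTS = {"functional": 0.40, "visual": 0.20, "quality": 0.20}


@dataclass
class TaskResult:
    """Hypothetical per-task record; field names are illustrative only."""
    functional: float        # component score in [0, 1]
    visual: float            # component score in [0, 1]
    quality: float           # component score in [0, 1]
    penalty: float = 0.0     # assumed combined cost/speed penalty, in score units
    owasp_findings: int = 0  # number of OWASP Top 10 issues detected


def task_score(result: TaskResult) -> float:
    """Weighted sum minus penalty, with the security gate applied first."""
    # Security gate from the post: any OWASP Top 10 finding zeroes the task.
    if result.owasp_findings > 0:
        return 0.0
    weighted = (
        WEIGHTS["functional"] * result.functional
        + WEIGHTS["visual"] * result.visual
        + WEIGHTS["quality"] * result.quality
    )
    # Clamping at zero is an assumption, not something the post states.
    return max(0.0, weighted - result.penalty)


if __name__ == "__main__":
    clean = TaskResult(functional=1.0, visual=0.5, quality=0.5, penalty=0.05)
    gated = TaskResult(functional=1.0, visual=1.0, quality=1.0, owasp_findings=1)
    print(f"clean task: {task_score(clean):.2f}")
    print(f"gated task: {task_score(gated):.2f}")
```

Applying the gate before the weighted sum means a task that is functionally perfect still scores zero if it ships a vulnerability, which matches the "automatic 0" wording.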
                                                                                                                                                          
Top 5 Results (Jan 2026), as score | cost | time:

1. Claude Opus 4.5: 89.2% | $12.31 | 44s
2. Claude Haiku 4.5: 89.0% | $3.03 | 22s
3. Grok 4 Fast: 88.8% | $0.21 | 70s
4. OpenAI GPT-5.2: 88.8% | $5.01 | 28s
5. Qwen3 Max: 88.6% | $5.42 | 45s
                                                                                                                                                          
The real story? Cost varies about 60x between similar performers ($0.21 to $12.31). Grok 4 Fast matches GPT-5.2's score at roughly 1/24th the cost. Claude Haiku 4.5 delivers near-Opus quality for about $3 total.
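
The cost comparisons can be checked directly from the five rows listed above. The snippet below is only a quick sanity check on that arithmetic. The `results` dictionary and the `cost_ratio` helper are illustrative names, and the numbers are copied from the post rather than pulled from the live dashboard.

```python
# (score %, cost in USD, time in seconds) as listed in the post's top five.
results = {
    "Claude Opus 4.5": (89.2, 12.31, 44),
    "Claude Haiku 4.5": (89.0, 3.03, 22),
    "Grok 4 Fast": (88.8, 0.21, 70),
    "OpenAI GPT-5.2": (88.8, 5.01, 28),
    "Qwen3 Max": (88.6, 5.42, 45),
}


def cost_ratio(expensive: str, cheap: str) -> float:
    """How many times more the first model costs than the second."""
    return results[expensive][1] / results[cheap][1]


# Cheapest and most expensive entries by cost.
cheapest = min(results, key=lambda name: results[name][1])
priciest = max(results, key=lambda name: results[name][1])

# Gap between the best and worst score in the top five, in percentage points.
score_spread = max(r[0] for r in results.values()) - min(r[0] for r in results.values())

if __name__ == "__main__":
    print(f"{priciest} vs {cheapest}: {cost_ratio(priciest, cheapest):.1f}x")
    print(f"GPT-5.2 vs Grok 4 Fast: {cost_ratio('OpenAI GPT-5.2', 'Grok 4 Fast'):.1f}x")
    print(f"score spread: {score_spread:.1f} points")
```

On these figures the spread in cost is about 59x and GPT-5.2 costs about 24x what Grok 4 Fast does, while the scores differ by well under one percentage point. That is close to the rounded figures quoted in the post.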
                                                                                                                                                          
Live dashboard: https://vibecoding.llmbench.xyz/
GitHub repo: https://github.com/alt-research/vibe-coding-benchmark-public
Thesis: https://github.com/alt-research/vibe-coding-benchmark-public/blob/main/docs/THESIS.md
                                                                                                                                                          
The ultimate test isn't fixing a bug in scikit-learn. It's shipping a feature your users need, safely and efficiently, before the sprint ends.

Open source. Contributions welcome.