NanoGPT Slowrun: Language Modeling with Limited Data, Infinite Compute

https://qlabs.sh/slowrun

45•sdpmas•1h ago

Comments

suddenlybananas•1h ago

Reminds me a fair bit of the BabyLM challenge. It would be good to give them a shout-out and see how this challenge differs.

sdpmas•51m ago

hey, it's Samip (behind the Slowrun repo). yeah that's a fair point, we will mention them in the blog. but there are a couple of major differences: 1. our emphasis is on using more compute to get better data efficiency. this is important because there are lots of hacky changes that will get lower loss, but when compared to general methods that leverage a lot of compute, they don't do so well. and you can already see how this emphasis on compute leads to different methods from BabyLM! 2. our reasoning behind the repo has nothing to do with how much data a child sees, and our dataset is not tailored towards that either. it's simple pretraining on a random subset of the internet. we know there are better training algorithms that get lower loss on that data, and we are finding those.

soraki_soladead•47m ago

also, BabyLM is more of a conference track / workshop than an open-repo competition which creates a different vibe

archermarks•27m ago

Very cool idea. Interested to see how this progresses. One question: how worried are you about over-training on this particular dataset, i.e. leaning more toward memorization instead of generalizing? Obviously you leave out a validation set, but since you're meta-optimizing the model itself by its performance on the validation dataset, you're still at risk of over-fitting.

sdpmas•20m ago

yes, good point. right now, it's somewhat hard to overfit because the meta-optimization extracts only tiny bits of information. but over time, we will switch the validation set to some other random subset of FineWeb or even entirely OOD datasets!
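
To make the concern in this exchange concrete, here is a toy sketch (not from the Slowrun repo) of how picking the best of many submissions on one fixed validation set can give an optimistic score, and how re-scoring the winner on fresh held-out subsets exposes the gap. Everything in it is an assumption for illustration: the `noisy_eval` helper, the loss value 3.30, the noise level, and the premise that all candidates are equally good, so any difference on the validation set is pure noise.

```python
import random


def noisy_eval(true_loss, noise_std, rng):
    """Hypothetical stand-in for measuring loss on one finite validation subset."""
    return true_loss + rng.gauss(0.0, noise_std)


def select_and_recheck(n_candidates=200, n_fresh=50, true_loss=3.30,
                       noise_std=0.02, seed=0):
    rng = random.Random(seed)
    # Score every candidate on the same fixed validation set.
    # All candidates share the same true loss, so differences are noise only.
    val_scores = [noisy_eval(true_loss, noise_std, rng)
                  for _ in range(n_candidates)]
    # Meta-optimization step: keep whichever candidate looks best.
    selected_score = min(val_scores)
    # Re-score the winner on several fresh, independent subsets and average.
    fresh_scores = [noisy_eval(true_loss, noise_std, rng)
                    for _ in range(n_fresh)]
    fresh_score = sum(fresh_scores) / n_fresh
    return selected_score, fresh_score


selected, fresh = select_and_recheck()
gap = fresh - selected  # how optimistic the leaderboard number was
print(f"selected on fixed val set: {selected:.4f}")
print(f"same model on fresh data:  {fresh:.4f}")
print(f"selection gap:             {gap:.4f}")
```

In this toy setup the gap grows with the number of candidates tried and with the noise of the validation estimate, which is one way to read the point about the meta-optimization extracting only a little information so far, and why rotating to a new random subset would reset it.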
