Show HN: Run 30B model in 4GB Active Memory

https://github.com/NimbleEdge/sparse_transformers
4•vkkhare•1d ago
We have built fused operator kernels for structured contextual sparsity. They avoid loading and computing with feed-forward layer weights whose outputs the activation function would zero out anyway.
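To make the idea concrete, here is a minimal NumPy sketch of contextual sparsity in a feed-forward block. It is not the fused kernel from the repository. It assumes a plain ReLU MLP (Llama's actual feed-forward layer is gated), and the names `dense_mlp`, `sparse_mlp`, and `active` are hypothetical. The mask here is an "oracle" computed from the full pre-activation, which a real system would replace with a cheap predictor.

```python
import numpy as np

def dense_mlp(x, w_up, w_down):
    """Reference feed-forward pass: every hidden neuron is computed."""
    h = np.maximum(x @ w_up, 0.0)  # ReLU zeroes out inactive neurons
    return h @ w_down

def sparse_mlp(x, w_up, w_down, active):
    """Feed-forward pass that only touches weights of the active neurons."""
    h = np.maximum(x @ w_up[:, active], 0.0)  # load only active columns
    return h @ w_down[active, :]              # and the matching rows

rng = np.random.default_rng(0)
d_model, d_ff = 8, 32
x = rng.standard_normal(d_model)
w_up = rng.standard_normal((d_model, d_ff))
w_down = rng.standard_normal((d_ff, d_model))

# Oracle mask (illustration only): indices of neurons that survive the ReLU.
# Computing it this way costs a full up-projection, so it saves nothing here;
# it only shows that skipping the zeroed neurons leaves the output unchanged.
active = np.flatnonzero(x @ w_up > 0)

dense_out = dense_mlp(x, w_up, w_down)
sparse_out = sparse_mlp(x, w_up, w_down, active)
print(f"active neurons: {active.size} of {d_ff}")
print("outputs match:", np.allclose(dense_out, sparse_out))
```

The two outputs agree because a neuron whose activation is zero contributes nothing to the down-projection, so its weights never need to be read.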

The result? We are seeing 5x faster MLP layer performance in transformers with 50% less memory consumption, by skipping the sleeping (inactive) nodes in every token prediction. For Llama 3.2, feed-forward layers accounted for 30% of total weights and forward-pass computation, resulting in a 1.6-1.8x increase in throughput:

Sparse LLaMA 3.2 3B vs. LLaMA 3.2 3B (Hugging Face implementation):

- Time to First Token (TTFT): 1.51× faster (1.209s → 0.803s)
- Output Generation Speed: 1.79× faster (0.7 → 1.2 tokens/sec)
- Total Throughput: 1.78× faster (0.7 → 1.3 tokens/sec)
- Memory Usage: 26.4% reduction (6.125GB → 4.15GB)

Find the operator kernels with differential weight caching, open-sourced at github.com/NimbleEdge/sparse_transformers. Let's get LLMs sprinting!

Comments

nrjpoddar•1d ago
Link github/sparse_transformers seems to be broken
vkkhare•22h ago
updated the link

Show HN: AI game animation sprite generator

https://www.godmodeai.cloud/ai-sprite-generator
55•lyogavin•9h ago•46 comments

Show HN: Air Lab – A portable and open air quality measuring device

https://networkedartifacts.com/airlab/simulator
456•256dpi•1d ago•180 comments

Show HN: Mustardwatch: Detect what files a program uses, rerun when they change

https://github.com/shachaf/mustardwatch
7•shachaf•6h ago•0 comments

Show HN: Ask-human-mcp – zero-config human-in-loop hatch to stop hallucinations

https://masonyarbrough.com/blog/ask-human
112•echollama•1d ago•53 comments

Show HN: SQLAlchemy just the core – a better way

https://github.com/sayanarijit/sqla-fancy-core
4•sayanarijit•5h ago•0 comments

Show HN: Bridgit – In-Person-First Networking

https://www.bridgitsocial.com/
3•amfooladgar•5h ago•3 comments

Show HN: Lambduck, a Functional Programming Brainfuck

https://imjakingit.github.io/lambduck/
64•jorkingit•1d ago•26 comments

Show HN: Claude Composer

https://github.com/possibilities/claude-composer
150•mikebannister•1d ago•85 comments

Show HN: Televyze, Your IPTV OS

https://televyze.com
3•1mbsite•7h ago•0 comments

Show HN: iOS Screen Time from a REST API

https://www.thescreentimenetwork.com/api/
99•anteloper•1d ago•50 comments

Show HN: Container Use for Agents

https://github.com/dagger/container-use
72•aluzzardi•1d ago•17 comments

Show HN: ClickStack – Open-source Datadog alternative by ClickHouse and HyperDX

https://github.com/hyperdxio/hyperdx
230•mikeshi42•1d ago•65 comments

Show HN: Arxivlens – Save Hours Researching Scientific Literature on ArXiv

https://arxivlens.com/
2•Ava234•8h ago•0 comments

Show HN: GPT image editing, but for 3D models

https://www.adamcad.com/
177•zachdive•2d ago•84 comments

Show HN: Kobuddy – Send Articles to Your Kobo E-Reader

https://kobuddy.app/
2•No-Arugula5818•10h ago•1 comment

Show HN: CensorIt – Remove vocals or music from uploaded audio files

https://censorit.org/signin
3•mr_aaron•10h ago•0 comments

Show HN: EndBOX – A toy-like retro computer for EndBASIC

https://www.endbasic.dev/2025/06/unveiling-the-endbox.html
4•jmmv•11h ago•0 comments

Show HN: Solomon's Agent - a CLI to simplify the web

https://github.com/jadbox/solomonagent
3•jadbox•11h ago•1 comment

Show HN: Lightweight Durable Workflows Built on Postgres

https://github.com/dbos-inc/dbos-transact-python
5•qianli_cs•11h ago•0 comments

Show HN: Tape/Z – a toolkit for analysing z/OS assembler (HLASM) code

https://github.com/avishek-sen-gupta/tape-z
2•armorer•12h ago•0 comments

Show HN: String Flux – Simplify everyday string transformations for developers

https://stringflux.io
16•eaglepeak•1d ago•7 comments

Show HN: I made a 3D SVG Renderer that projects textures without rasterization

https://seve.blog/p/i-made-a-3d-svg-renderer-that-projects
206•seveibar•2d ago•70 comments

Show HN: An open-source browser extension that integrates difftastic into GitHub

https://andersonaddo.github.io/amadiff/
3•WiggleGuy•14h ago•0 comments

Show HN: Grab a Random ArXiv Paper

https://jepedersen.dk/arxiv.html
13•jegp•1d ago•2 comments

Show HN: Instant video edits with local Whisper models (macOS)

https://cutword.com
5•jelled•15h ago•0 comments

Show HN: Posture Correction Using AirPods Motion Sensors

https://github.com/wizenheimer/workwell
7•tinylm•22h ago•1 comment

Show HN: Book to help you build a PostgreSQL-like database server from scratch

https://technicaldeft.com/build-a-database-server
4•zetter•17h ago•0 comments

Show HN: App.build, an open-source AI agent that builds full-stack apps

https://www.app.build/
89•davidgomes•2d ago•13 comments

Show HN: Kan.bn – An open-source alternative to Trello

https://github.com/kanbn/kan
503•henryball•4d ago•218 comments

Show HN: I wrote a Java decompiler in pure C language

https://github.com/neocanable/garlic
168•neocanable•3d ago•94 comments