We quantized OpenAI’s Whisper model to 1.58 bits using Quantization-Aware Training (QAT) to run speech recognition on resource-constrained embedded CPUs. Post-Training Quantization (PTQ) failed below 4 bits, so we turned to QAT with a replicated dataset. To make inference feasible on-device, we also implemented custom low-bit kernels optimized for edge deployment. This post walks through the technical challenges of extreme quantization in a real-world setting and how we addressed them.
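The post itself doesn't show code, but the core of 1.58-bit quantization can be sketched. "1.58 bits" refers to ternary weights (log2(3) ≈ 1.58), as popularized by BitNet b1.58: each weight is mapped to {-1, 0, +1} with a per-tensor scale, and during QAT the quantized weights are used in the forward pass while gradients flow through via a straight-through estimator. The function below is a minimal illustrative sketch of the absmean ternary scheme, not the authors' actual implementation:

```python
import numpy as np

def ternary_quantize(w: np.ndarray, eps: float = 1e-8):
    """Quantize weights to {-1, 0, +1} with a per-tensor absmean scale
    (BitNet b1.58-style; names here are illustrative, not from the post)."""
    scale = np.mean(np.abs(w)) + eps           # per-tensor absmean scale
    q = np.clip(np.round(w / scale), -1, 1)    # ternary codes in {-1, 0, +1}
    return q, scale                            # dequantized weight ~ q * scale

# Example: quantize a small weight matrix.
w = np.array([[0.9, -0.05, -1.1],
              [0.4,  0.0,  -0.6]])
q, s = ternary_quantize(w)
w_hat = q * s  # dequantized approximation used in the QAT forward pass
```

In QAT, `w_hat` replaces `w` in the forward pass while the optimizer updates the full-precision `w`; at deployment only `q` and `s` are stored, which is what makes the custom low-bit kernels worthwhile.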
coolhanhim•6h ago