frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

The better you get at something, the harder it becomes to do

https://seekingtrust.substack.com/p/improving-at-writing-made-me-almost
1•FinnLobsien•27s ago•0 comments

Show HN: WP Float – Archive WordPress blogs to free static hosting

https://wpfloat.netlify.app/
1•zizoulegrande•1m ago•0 comments

Show HN: I Hacked My Family's Meal Planning with an App

https://mealjar.app
1•melvinzammit•2m ago•0 comments

Sony BMG copy protection rootkit scandal

https://en.wikipedia.org/wiki/Sony_BMG_copy_protection_rootkit_scandal
1•basilikum•4m ago•0 comments

The Future of Systems

https://novlabs.ai/mission/
2•tekbog•5m ago•1 comments

NASA now allowing astronauts to bring their smartphones on space missions

https://twitter.com/NASAAdmin/status/2019259382962307393
2•gbugniot•10m ago•0 comments

Claude Code Is the Inflection Point

https://newsletter.semianalysis.com/p/claude-code-is-the-inflection-point
2•throwaw12•11m ago•1 comments

Show HN: MicroClaw – Agentic AI Assistant for Telegram, Built in Rust

https://github.com/microclaw/microclaw
1•everettjf•11m ago•2 comments

Show HN: Omni-BLAS – 4x faster matrix multiplication via Monte Carlo sampling

https://github.com/AleatorAI/OMNI-BLAS
1•LowSpecEng•12m ago•1 comments

The AI-Ready Software Developer: Conclusion – Same Game, Different Dice

https://codemanship.wordpress.com/2026/01/05/the-ai-ready-software-developer-conclusion-same-game...
1•lifeisstillgood•14m ago•0 comments

AI Agent Automates Google Stock Analysis from Financial Reports

https://pardusai.org/view/54c6646b9e273bbe103b76256a91a7f30da624062a8a6eeb16febfe403efd078
1•JasonHEIN•17m ago•0 comments

Voxtral Realtime 4B Pure C Implementation

https://github.com/antirez/voxtral.c
2•andreabat•20m ago•0 comments

I Was Trapped in Chinese Mafia Crypto Slavery [video]

https://www.youtube.com/watch?v=zOcNaWmmn0A
1•mgh2•26m ago•0 comments

U.S. CBP Reported Employee Arrests (FY2020 – FYTD)

https://www.cbp.gov/newsroom/stats/reported-employee-arrests
1•ludicrousdispla•28m ago•0 comments

Show HN: I built a free UCP checker – see if AI agents can find your store

https://ucphub.ai/ucp-store-check/
2•vladeta•33m ago•1 comments

Show HN: SVGV – A Real-Time Vector Video Format for Budget Hardware

https://github.com/thealidev/VectorVision-SVGV
1•thealidev•35m ago•0 comments

Study of 150 developers shows AI generated code no harder to maintain long term

https://www.youtube.com/watch?v=b9EbCb5A408
1•lifeisstillgood•35m ago•0 comments

Spotify now requires premium accounts for developer mode API access

https://www.neowin.net/news/spotify-now-requires-premium-accounts-for-developer-mode-api-access/
1•bundie•38m ago•0 comments

When Albert Einstein Moved to Princeton

https://twitter.com/Math_files/status/2020017485815456224
1•keepamovin•39m ago•0 comments

Agents.md as a Dark Signal

https://joshmock.com/post/2026-agents-md-as-a-dark-signal/
2•birdculture•41m ago•0 comments

System time, clocks, and their syncing in macOS

https://eclecticlight.co/2025/05/21/system-time-clocks-and-their-syncing-in-macos/
1•fanf2•42m ago•0 comments

McCLIM and 7GUIs – Part 1: The Counter

https://turtleware.eu/posts/McCLIM-and-7GUIs---Part-1-The-Counter.html
2•ramenbytes•45m ago•0 comments

So whats the next word, then? Almost-no-math intro to transformer models

https://matthias-kainer.de/blog/posts/so-whats-the-next-word-then-/
1•oesimania•46m ago•0 comments

Ed Zitron: The Hater's Guide to Microsoft

https://bsky.app/profile/edzitron.com/post/3me7ibeym2c2n
2•vintagedave•49m ago•1 comments

UK infants ill after drinking contaminated baby formula of Nestle and Danone

https://www.bbc.com/news/articles/c931rxnwn3lo
1•__natty__•50m ago•0 comments

Show HN: Android-based audio player for seniors – Homer Audio Player

https://homeraudioplayer.app
3•cinusek•50m ago•2 comments

Starter Template for Ory Kratos

https://github.com/Samuelk0nrad/docker-ory
1•samuel_0xK•52m ago•0 comments

LLMs are powerful, but enterprises are deterministic by nature

2•prateekdalal•55m ago•0 comments

Make your iPad 3 a touchscreen for your computer

https://github.com/lemonjesus/ipad-touch-screen
2•0y•1h ago•1 comments

Internationalization and Localization in the Age of Agents

https://myblog.ru/internationalization-and-localization-in-the-age-of-agents
1•xenator•1h ago•0 comments
Open in hackernews

Ask HN: How to extract structured information from captured audio?

1•sandreas•9mo ago
Hey HN,

I would like to extract structured information from captured audio on a device that is not too expensive (a small LLM would be an option, I got an old NVidia 1660 Super with 6GB VRAM).

OpenAI Whisper could be used to get the audio contents as text, but I don't really know how I could reliably extract the information in a structured way. There is always a "purpose", which is selected out of let's say 10 possible purposes and "required data", which is depending on the purpose and composed by key value pairs, that also have predefined values.

An example (spoken text):

  Please apply for leave from 1st November to 8th november.
Result (structured data):

  {
     purpose: "apply for leave",
     data: {
        start: "2025-11-01",
        end: "2025-11-08"
     }
  }

What are my options to do this in a reliable way that can match different purposes with different data by "best match" approach?

Comments

sargstuff•9mo ago
Related OpenAI forum topic(s) that covers related issues[0].

Old school, mark 'paragraph'/sentence, regular expression out miscellaneous info (using language linguistics / linguistic 'typing' aka noun, verb, etc) , then dump relevent remaining info in json/delimited format & normalize data (aka 1st november to 11/01). multi-pass awk script(s) / pearl / icon are languages with appropriate in-language support. use regular expressions/statistics to detect 'outliers'/mark data for human review.

multi-pass awk would require a codex/phrases related to a delimited/json tag. so first pass, identify phrases (perhaps also spell correct), categorize phrases related to given delimited field (via human intervention), then rescan, check for 'outliers'/conflicting normalizations & have script do corrects per human annotations.

Note: Normalized phonetic annotations bit easer to handle than common dictionary spelling.

[0] : https://community.openai.com/t/summarizing-and-extracting-st...

sandreas•9mo ago
Thanks, I'm going to read through the link. I also found some python libs, that do this, so since I need to run Whisper on the backend to transfer the speech to text, I think it would be suitable to use python also for tokenization - maybe spaCy (https://www.geeksforgeeks.org/tokenization-using-spacy-libra...).
sargstuff•9mo ago
Very less tramatic programming exercise than using awk. :-) aka realistic programming tool(s) for required task.