
Ask HN: How to extract structured information from captured audio?

1•sandreas•9mo ago
Hey HN,

I would like to extract structured information from captured audio on inexpensive hardware. A small LLM would be an option: I have an old NVIDIA GTX 1660 Super with 6 GB of VRAM.

OpenAI Whisper could transcribe the audio to text, but I don't know how to reliably extract the information in a structured way. There is always a "purpose", selected from, say, 10 possible purposes, and "required data", which depends on the purpose and consists of key-value pairs that also have predefined values.

An example (spoken text):

  Please apply for leave from 1st November to 8th November.

Result (structured data):

  {
     "purpose": "apply for leave",
     "data": {
        "start": "2025-11-01",
        "end": "2025-11-08"
     }
  }

What are my options for doing this reliably, so that different purposes are matched with their different data on a "best match" basis?

Comments

sargstuff•9mo ago
An OpenAI forum topic covers related issues [0].

The old-school approach: mark paragraphs and sentences, use regular expressions to strip out miscellaneous information (based on linguistic typing, i.e. noun, verb, etc.), then dump the relevant remaining information in JSON or delimited format and normalize the data (e.g. "1st November" to 11/01). Multi-pass awk scripts, Perl, and Icon are languages with appropriate built-in support. Use regular expressions or statistics to detect outliers and mark data for human review.

Multi-pass awk would require a codex of phrases, each related to a delimited field or JSON tag. On the first pass, identify phrases (perhaps also correcting spelling) and categorize them by field, with human intervention. Then rescan, check for outliers and conflicting normalizations, and have the script apply corrections according to the human annotations.

Note: normalized phonetic annotations are a bit easier to handle than common dictionary spelling.

[0] : https://community.openai.com/t/summarizing-and-extracting-st...
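For illustration, the regular-expression pass with normalization and a human-review flag could look roughly like this in Python. This is only a sketch under assumptions: the names `extract_dates` and `extract_leave_request` are made up for this example, the pattern only covers the "1st November" word order, and the year has to be supplied by the caller because the spoken text does not contain one.

```python
import re
from datetime import date

# Month names mapped to month numbers (English only, an assumption).
MONTHS = {name: number for number, name in enumerate(
    ["january", "february", "march", "april", "may", "june", "july",
     "august", "september", "october", "november", "december"], start=1)}

# Matches spoken dates such as "1st November" or "8th november".
DATE_RE = re.compile(
    r"\b(\d{1,2})(?:st|nd|rd|th)?\s+(" + "|".join(MONTHS) + r")\b",
    re.IGNORECASE,
)

def extract_dates(text, year):
    """Return ISO dates found in text, in order of appearance.

    The year is an assumption passed in by the caller.
    Impossible dates (e.g. "31st February") are returned as None.
    """
    found = []
    for day, month in DATE_RE.findall(text):
        try:
            found.append(date(year, MONTHS[month.lower()], int(day)).isoformat())
        except ValueError:
            found.append(None)
    return found

def extract_leave_request(text, year):
    """Hypothetical extractor for the "apply for leave" purpose."""
    dates = extract_dates(text, year)
    # Anything other than exactly two valid dates is an outlier for review.
    needs_review = len(dates) != 2 or None in dates
    return {
        "purpose": "apply for leave",
        "data": {
            "start": None if needs_review else dates[0],
            "end": None if needs_review else dates[1],
        },
        "needs_review": needs_review,
    }

spoken = "Please apply for leave from 1st November to 8th november."
print(extract_leave_request(spoken, 2025))
```

Whisper may well transcribe dates in other forms ("November 1st", "the first of November"), so a real codex of phrases would need more patterns than this one.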

sandreas•9mo ago
Thanks, I'm going to read through the link. I also found some Python libraries that do this. Since I need to run Whisper on the backend to convert the speech to text anyway, I think it would make sense to use Python for tokenization as well, maybe spaCy (https://www.geeksforgeeks.org/tokenization-using-spacy-libra...).

sargstuff•9mo ago
A much less traumatic programming exercise than using awk. :-) In other words, realistic programming tools for the task.
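
As a rough idea of what "best match" over a fixed set of purposes could look like in Python, here is a minimal sketch using only the standard library. Everything in it is an assumption for illustration: the purposes and example phrases in `PURPOSES` are invented, the scoring is plain word overlap rather than anything spaCy or an LLM would do, and the 0.5 threshold is arbitrary.

```python
import re

# Hypothetical purposes, each with a few example phrasings (made up).
PURPOSES = {
    "apply for leave": ["apply for leave", "request vacation", "take days off"],
    "report sickness": ["report sick", "i am ill", "call in sick"],
    "book meeting room": ["book a meeting room", "reserve a room"],
}

def tokens(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))

def best_purpose(transcript, threshold=0.5):
    """Return (purpose, score) for the best-matching purpose.

    The score is the fraction of an example phrase's words that occur
    in the transcript. Below the threshold, purpose is None so the
    transcript can be handed to a human or a heavier model.
    """
    words = tokens(transcript)
    best, best_score = None, 0.0
    for purpose, phrases in PURPOSES.items():
        for phrase in phrases:
            phrase_words = tokens(phrase)
            score = len(phrase_words & words) / len(phrase_words)
            if score > best_score:
                best, best_score = purpose, score
    if best_score < threshold:
        return None, best_score
    return best, best_score

print(best_purpose("Please apply for leave from 1st November to 8th november."))
```

Once the purpose is known, a purpose-specific extractor can pull out the required key-value pairs. Swapping the word-overlap score for spaCy similarity or a small local LLM would keep the same overall shape.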