Show HN: Turn native language audio into flashcards and shadowing practice

https://lingochunk.com/try
33•alder•4h ago
Here is a tool I initially built for myself to help with my German and Greek studies. It started as a hack for creating Anki cards from native-language audio: it extracts the words, finds their base forms (lemmas), and groups the examples by lemma. At some point I realised that I had a transcription with word-level timestamps, which opens up a lot of other opportunities. So I added a mode where you click the first and last word in the transcript and it starts looping that fragment with the right gap and repeat count.
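A minimal sketch of the two ideas described above: grouping example sentences by lemma, and turning a clicked first/last word into a loop schedule. This is not the app's actual code. The `Word` type, the function names, and the fixed `gap` in seconds are all assumptions for illustration, and the lemmas are assumed to come from an upstream tagger.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Word:
    text: str     # surface form as it appears in the transcript
    lemma: str    # base form, assumed to be supplied by an NLP tagger
    start: float  # start time in seconds
    end: float    # end time in seconds


def group_by_lemma(sentences):
    """Map each lemma to the text of every sentence it occurs in."""
    examples = defaultdict(list)
    for sentence in sentences:
        text = " ".join(w.text for w in sentence)
        # Use a set so a sentence is listed once per lemma, even if
        # the lemma occurs in it several times.
        for lemma in {w.lemma for w in sentence}:
            examples[lemma].append(text)
    return dict(examples)


def loop_schedule(words, first, last, repeats=3, gap=1.0):
    """Return (play_from, play_to, pause_after) tuples for a shadowing loop."""
    if first > last:
        first, last = last, first  # allow clicking the words in either order
    play_from, play_to = words[first].start, words[last].end
    return [(play_from, play_to, gap) for _ in range(repeats)]
```

A player would walk the schedule, play each span, then stay silent for the pause so the learner can repeat the fragment aloud.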

Another feature I use a lot is selecting an audio fragment and sending a predefined prompt to an AI to "explain grammar" or "explain nuances of meaning". I am still experimenting with the prompts.

And because shadowing is so easy I also use it as a player to improve my English pronunciation. (I am not a native English speaker.)

I made a quick video showing the workflow for creating Anki cards and shadowing: https://youtu.be/TaR58uuDBvU?si=o5aGLAi2S-BZ7Zy9

The app supports 15 input languages (Japanese and Chinese are the latest experimental additions), and more than 30 output languages.

I would really appreciate it if you could try it: https://lingochunk.com/try. I know there are other tools with similar functionality, but I created something that fits my workflow, and it is fun to build.

Also I struggled to find public domain audio for the try page. I'd be grateful if anyone could point me to public domain sources (I used LibriVox, Wikimedia and FSI courses), or if you're a creator, let me feature some of your own recordings with credits and links.

Comments

3stacks•1h ago
This is awesome! I’ll be lurking for new data sources. I’m working on a self-hosted language app more focused around cloze and sentence mining into Anki. I love seeing more stuff happening in this space
alder•1h ago
Thanks! I am glad you like it! I essentially mine the source audio, and all examples have cloze-style gaps (blurring, in my case) that are revealed on the back of the card. I also beep out the word in the sentence when you play it on the front of the card in the built-in SRS system. Unfortunately that is not implemented in the Anki export, but it is technically possible.
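The cloze-style gap mentioned above can be sketched roughly as follows. This is an illustration only: it masks the target word with a text placeholder rather than blurring it, and `make_cloze` and its `(surface, lemma)` input format are hypothetical.

```python
def make_cloze(words, target_lemma, mask="____"):
    """Build (front, back) card text from a list of (surface, lemma) pairs.

    Every word whose lemma matches target_lemma is hidden on the front
    and shown in full on the back.
    """
    front = " ".join(
        mask if lemma == target_lemma else surface
        for surface, lemma in words
    )
    back = " ".join(surface for surface, _ in words)
    return front, back
```

For the sentence "Wir gehen heim" with the target lemma "gehen", the front would read "Wir ____ heim" and the back the full sentence.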
__float•1h ago
I don't know what resolution or display you built this on, but a heads up: the initial impression on my 4K monitor is that everything is incredibly tiny.
alder•1h ago
To be honest I haven't tested it on a 4K monitor yet, so I am not surprised. There are two controls above the transcript that change the font size and the line spacing, which should help a bit for now. Something to fix, thanks!
hiAndrewQuinn•1h ago
Very nice work. I'm going for a different thing, but my audio2anki tool [1] is about as streamlined as I could make it to turn a YouTube URL I want to learn into a stack of Anki flashcards, purely locally.

[1]: https://github.com/hiAndrewQuinn/audio2anki

jrrv•1h ago
Is it possible to add traditional characters for mandarin?

Also, the pinyin for 誰/谁 is coming through as shuí. Whilst this character has two pronunciations, I believe shéi is the more common one.

alder•46m ago
Thanks! Chinese and Japanese as source languages are still experimental. I did my best to support them, but I have to rely on people who actually know the languages, and this kind of feedback is really useful. I'll look into adding traditional characters and fixing the pinyin.
jrrv•40m ago
No worries, I appreciate the effort. I did go back and listen, and they are indeed pronouncing it shéi in the audio too.

I use a firefox extension to convert simplified to traditional, looks like it's open source so that may be of some use to you: https://github.com/tongwentang/tongwentang-extension.

Although there are some clashes that it does not handle, e.g. 隻 and 只 are both 只 in simplified, you just have to know which one it is from context, but the extension fails to convert to 隻 where appropriate.
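To illustrate why 只 needs context, here is a rough heuristic sketch; it is not how the extension works. It assumes that 只 directly after a numeral or demonstrative is the measure word 隻, and leaves it as 只 otherwise. The `COUNT_CONTEXT` set and `convert_zhi` are made up for this example, other characters are left unconverted, and the rule will misfire on some sentences.

```python
# Characters that commonly precede a measure word (an illustrative,
# incomplete set chosen for this sketch).
COUNT_CONTEXT = set("一二三四五六七八九十百千两兩几幾这這那")


def convert_zhi(text):
    """Convert simplified 只 to traditional 隻 only in measure-word position."""
    out = []
    for i, ch in enumerate(text):
        if ch == "只" and i > 0 and text[i - 1] in COUNT_CONTEXT:
            out.append("隻")  # e.g. 一只 -> 一隻 (measure word)
        else:
            out.append(ch)    # 只 meaning "only" stays as 只
    return "".join(out)
```

A real converter would need word segmentation or part-of-speech tags instead of a single preceding character.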

Koaisu•1h ago
Just tried it with an unsupported language and it still worked. I set it to Chinese and inputted the audio, and still got correct results.
dirteater_•57m ago
What are you doing for Chinese word segmentation/pinyin?
alder•15m ago
For segmentation and POS I rely on spaCy's zh_core_web_sm, and the pinyin comes from the pypinyin library, with a small correction layer on top. But I am not a Chinese language expert, so I can't judge whether it really works, and I'll rely on feedback from users to improve it.
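A correction layer of the kind described above could look something like this sketch. It does not call spaCy or pypinyin: `BASE_READINGS` is a hypothetical stand-in for whatever the pinyin library returns by default, and `OVERRIDES` is an assumed table of fixes gathered from feedback such as the shuí/shéi report in this thread.

```python
# Stand-in for a pinyin library's default output (hypothetical data).
BASE_READINGS = {"他": "tā", "是": "shì", "谁": "shuí"}

# Corrections applied on top of the base readings (assumed to be
# collected from user feedback).
OVERRIDES = {"谁": "shéi"}


def reading(token):
    """Return the corrected reading if one exists, else the base reading."""
    if token in OVERRIDES:
        return OVERRIDES[token]
    # Fall back to the token itself when no reading is known.
    return BASE_READINGS.get(token, token)


def annotate(tokens):
    """Pair each segmented token with its reading."""
    return [(token, reading(token)) for token in tokens]
```

Keeping the overrides in a separate table means a fix from a native speaker is a one-line data change and does not touch the segmentation or lookup code.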
jcg591•50m ago
Very cool! I'm also learning Greek and it's amazing how many resources are becoming available.
alder•34m ago
Thanks! Yes, it's getting better for Greek but still not on par with other languages. I completed the only 2 Greek levels on Duolingo and they are really boring compared to the German one I am doing now. Easy Greek is a bit above my level, and the number of YouTubers in Greek is tiny compared to German.
pzagor2•25m ago
I also built a tool to help me study Spanish. I really like the idea of shadowing, so I built a tool that lets you take any YouTube video and generate a sentence-by-sentence exercise to help you repeat the speaker's phrases.

https://talkhabit.com/shadow Or, for an example of one exercise: https://talkhabit.com/shadow?videoUrl=https%3A%2F%2Fwww.yout...

Stuff I need to work on:
- It only works with videos that have auto-generated captions
- It works best with monologue videos

deaton•13m ago
This is really cool. I'm just getting towards the back end of the Kaishi 1.5k deck, so this will be perfect for my Japanese studies. Thanks for sharing.
