What makes comprehensible input comprehensible?

https://cij-analysis.streamlit.app
35•surprisetalk•7mo ago

Comments

joshdavham•7mo ago
Oh wow! I’m surprised to see someone post my analysis haha

Happy to answer any questions here. I kept my analysis really high level for a general audience but since this is HN, we can get a bit nerdy :D

flippyhead•7mo ago
I love this. I made a totally free, just-for-fun tool for learning Japanese via YouTube using the CI approach: https://seikai.tv The trick is finding content that is at the right level but that you also find interesting. Great article, thank you!
joshdavham•7mo ago
Thanks for the kind words!
ragazzina•7mo ago
> Word length - At least in English and French (the languages I know best), longer words are generally considered harder.

I think in a language with a lot of similar sounds or even homophones, longer words are easier. For a beginner Chinese speaker that knows both words, hearing "chē" will probably be ambiguous, but "chūzūchē" will be parsed immediately.

joshdavham•7mo ago
That’s a good point.

I don’t think the ‘longer equals harder’ pattern holds for every language. I actually reached out to the head teacher at CIJ when I first made this analysis and she said the same.

kazinator•7mo ago
This is mainly resolved by context. "Penultimate" is a harder word than "pen". Now that could also mean "penitentiary" in North American vernacular, or a box in which a pig is kept, but not in a sentence like "Can I borrow your pen?"
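For anyone who wants to poke at the word-length heuristic discussed in this subthread, here is a minimal sketch in Python. It is not taken from the analysis's repo; the function name `mean_word_length` and the two sample sentences are made up for illustration, and it assumes space-delimited text, so it would not apply as-is to Chinese or Japanese.

```python
import re


def mean_word_length(text: str) -> float:
    """Average number of letters per word; 0.0 if the text has no words."""
    # [^\W\d_]+ matches runs of Unicode letters (no digits, no underscores).
    words = re.findall(r"[^\W\d_]+", text)
    if not words:
        return 0.0
    return sum(len(w) for w in words) / len(words)


# Hypothetical sample sentences, not drawn from the CIJ transcripts.
easy = "Can I borrow your pen?"
hard = "Penultimate chapters frequently disappoint."

print(mean_word_length(easy))
print(mean_word_length(hard))
```

A score like this only ranks texts by surface length. As the comments above point out, it says nothing about homophones or context, so it is best treated as one rough signal among several.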
EdiX•7mo ago
I don't think this captures the whole situation. Much of what makes comprehensible input comprehensible, at lower levels, is the presence of visual hints.
joshdavham•7mo ago
That's exactly right.

Many of the beginner videos make use of visual hints, as you say (images, props, etc.), and none of these were taken into account in my analysis.

I do think it could be cool to do a 'visual' analysis of CI in the future where you attempt to measure how much context is present (or not) in each video and see what insights you could draw from that.

joshdavham•7mo ago
Here's the source code for this analysis for those interested: https://github.com/joshdavham/cij-analysis

I will note that the transcripts (and parsing scripts) are not included in the repo. The transcripts are not my intellectual property so I can't share them (and the parsing scripts are a bit of a dumpster fire).

kazinator•7mo ago
What makes comprehensible input comprehensible? Is that a trick question?

Avoiding unknown vocabulary, or including just a small amount that can be inferred from context; avoiding rare grammatical rules; avoiding stuffing too many clauses into sentences, keeping them short.

Just like a language has a large vocabulary of words of which only a subset is common, a similar observation holds for the grammar rules. Some are used only in very formal/erudite speech or writing. Also, just like your active vocab is not as large as the vocab you understand, the same goes for grammar: you don't wield as many constructs as you know.

Semantically, avoiding obscure cultural references, culturally rooted unstraightforward metaphors, figures of speech or idioms.

Avoiding difficult topics. E.g. "I have a pen" vs. explaining Karl Popper's critique of logical positivism.

It's much easier to acquire the "household" dialect of a language than to be able to understand news about politics, scientific papers, or literary essays.
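Two of the criteria listed above (few unknown words, short sentences) are easy to approximate in code. The sketch below is only an illustration under stated assumptions: the helper names, the `known_words` set, and the sample text are all hypothetical, the sentence splitter is a naive punctuation split, and nothing here is taken from the linked analysis.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naive sentence split on ., ! and ? (an assumption, not a real parser)."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def tokenize(text: str) -> list[str]:
    """Lowercased runs of letters, keeping simple apostrophe forms like don't."""
    return re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)?", text.lower())


def comprehensibility_features(text: str, known_words: set[str]) -> dict[str, float]:
    """Share of words outside the learner's vocabulary, and words per sentence."""
    words = tokenize(text)
    if not words:
        return {"unknown_ratio": 0.0, "mean_sentence_length": 0.0}
    sentences = split_sentences(text)
    unknown = [w for w in words if w not in known_words]
    return {
        "unknown_ratio": len(unknown) / len(words),
        "mean_sentence_length": len(words) / len(sentences),
    }


# Hypothetical learner vocabulary and sample text.
known = {"i", "have", "a", "pen", "it", "is", "red"}
sample = "I have a pen. It is red. Popper rejected verificationism."
print(comprehensibility_features(sample, known))
```

Grammar rarity, idioms, cultural references and topic difficulty are much harder to measure this way, so a score like this covers only the first part of the list.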
