It all started way back when with this guy named Khatzumoto who did his own guerilla academic research, for lack of a better word - and proved it by getting fluent in 18 months. The theories were based in comprehensible input (from Stephen Krashen) and nonstop immersion, which do carry weight and are great ways - if not the only way - to learn ANY language. He created AJATT, All Japanese All The Time, and basically created a mini cult that legitimately got people to fluency. It might sound extreme but on the other end I heard of someone who did duolingo (100% slop product btw) for 6 months and didn't know how to say "Thank You" in Japanese. Khatzumoto's ideas were manic, strange, but sometimes truly brilliant. I've never found a blog quite like it ever since - writings that an LLM really can't emulate. The original blog 404'd but it's been revived by a community member here - https://alljapanesealltheti.me/index.html - and I go back to them every now and then when I want some crazy motivation.
Nobody knows what has happened to Khatzumoto, he basically just dropped off the face of the internet - I wonder if he's doing alright.
That's a very interesting definition of fluent.
Interestingly, that's the "trick" behind a lot of the seemingly magic skill of geo guessers. The best players have played so much, that they now "see" things that a regular person wouldn't even consider to look for, like the camera quality, what year the car was from, and so they narrow down the possible countries by those aspects, before even looking at the "picture".
Slightly OT, but this happens constantly with ML classifiers on any highly multi-dimensional problem. At first it seems like magic, and then someone digs into the principal components of the prediction, and finds a mixture of a few highly specific factors that -- in the worst case -- is an artifact of the dataset itself (image blur or color bias, for example).
Also common is that the predictive factors aren't pathological -- they're just "boring" -- and therefore the performance of the model is dismissed by the practitioner ("oh, I'd have thought of that, since it's only using a few common traits that are well-understood.")
If I could speak a foreign language as well as an 8B parameter LLM, hallucinations and all, I'd be immensely ahead of where I am now. It's not like second languages aren't themselves often broken in somewhat similar ways.
* Confused where in the original series Spock goes to school.
* Watches the video and sees 2009 "Star Trek"
* "as a kid"...
* feels old
Anki is already extremely extendable, so I would think that with not too much work, deep LLM integration could be implemented in Anki. Like, instead of showing static content for a card, have Anki call an LLM to create the daily iteration of a given prompt.
I could see it coming from 2 directions (since I think most people agree language learning and SRS go together):
- anki on steroids: a dynamic way for me to do SRS that feels more natural
- a way to do natural language interaction with the LLM (chat, voice, etc.), with SRS as an added feature or integrated more subtly
I think it'd be cool to then use that "{word} {part of speech} {example sentence}" info to generate more example sentences with an LLM.
That way you can grind your problem words with real sentences and cement it quickly, otherwise this process happens too slowly through regular reading.
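A rough sketch of that idea, in Python: turn one "{word} {part of speech} {example sentence}" record into a prompt you could hand to whatever LLM you use. The `VocabEntry` class, the `build_sentence_prompt` function, and the prompt wording are all made up for illustration; no particular LLM API is assumed, so the actual model call is left out.

```python
from dataclasses import dataclass


@dataclass
class VocabEntry:
    # Hypothetical record mirroring "{word} {part of speech} {example sentence}".
    word: str
    part_of_speech: str
    example_sentence: str


def build_sentence_prompt(entry: VocabEntry, target_language: str, count: int = 3) -> str:
    """Build a prompt asking for fresh example sentences for one problem word."""
    return (
        f"Write {count} new, short {target_language} sentences that use "
        f"the {entry.part_of_speech} \"{entry.word}\" naturally.\n"
        f"Known example: {entry.example_sentence}\n"
        "Keep every other word as simple as possible."
    )


entry = VocabEntry("anguria", "noun", "Ho comprato un'anguria al mercato.")
prompt = build_sentence_prompt(entry, "Italian")
print(prompt)  # send this string to the LLM of your choice
```

Asking for simple surrounding words is one way to keep the generated sentences close to i+1, so the only hard thing in each sentence is the word you are grinding.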
Yet somehow a German word is impossible to remember.
LLMs suck, because their goal is not to improve learning. Same way Duolingo sucks, the goal is to optimize global metrics over a massive userbase, not optimize individual metrics, where each individual has its own context.
All the conversations, KPIs are "What gets the most users, makes them stay on the app longest, paying the most money". 4 years and not a single conversation about how to improve dating.
I haven't used Anki for language learning, but I imagine that if I did, it would be to add some new vocabulary I had just learned from a book, conversation, film, etc. I don't think it would help me learn a language from zero though- that would require practicing it.
In summary, Anki is great for reinforcing something you've just learned, but you can't reinforce your way into the context that is necessary to truly understand something.
It didn't. They wrote "Anki is dead" because it brings clicks.
Writing my own cards as I'm learning is the only way I've found it effective.
With tools like Google and Microsoft's neural TTS and Anki's AwesomeTTS add-on my cards have audio that is so realistic that I am also constantly exposed to near-native listening. I do 3-way cards (Writing only -> English, Audio only -> English, and English -> Other language) so I'm actually getting a reasonable simulation of real life practice (reading, listening, speaking) on an individual sentence basis. My process is: (1) find a high quality sentence from a book / app / website / ChatGPT (with verification from a native speaker); preferably one that is fairly simple apart from a single word or verb conjugation that I haven't learned yet, in keeping with the i+1 rule, (2) create an Anki card for that sentence using my own custom note templates, (3) add audio with AwesomeTTS. Creating a card like this takes me perhaps 10-20 seconds as it's mostly just copy-pasting and clicking a few buttons.
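For anyone who wants to picture the 3-way layout, here is a minimal Python sketch of the three card variants built from one sentence. The `three_way_cards` function and the dict layout are hypothetical (they are not the custom note templates mentioned above), and the audio file is assumed to exist already; the only Anki-specific detail used is the `[sound:...]` field syntax.

```python
def three_way_cards(sentence: str, english: str, audio_file: str) -> list:
    """Return the three front/back pairs for one sentence (illustrative layout only)."""
    audio = f"[sound:{audio_file}]"  # Anki's syntax for playing a media file
    return [
        # Writing only -> English (reading practice)
        {"front": sentence, "back": english},
        # Audio only -> English (listening practice)
        {"front": audio, "back": english},
        # English -> other language (speaking practice), answer shown with audio
        {"front": english, "back": f"{sentence} {audio}"},
    ]


cards = three_way_cards(
    "Ho comprato un'anguria al mercato.",
    "I bought a watermelon at the market.",
    "anguria_mercato.mp3",  # hypothetical file name
)
for card in cards:
    print(card)
```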
Of course to become truly fluent you need practice. But when I practice I'm already able to follow the gist of conversations and I can stumble my way through speaking in most situations: I've got a huge head start thanks to all the latent vocabulary and grammar that my brain knows thanks to Anki, instead of having to constantly look blankly at the other person while I pull out Google Translate.
I couldn't give you a percentage, but I made most of my own cards, including all of those 2000+ kanji cards. There's lots of debate in the language learning community about vocab cards or sentence cards, and generally the ideal is the sentence cards, as it provides the context that helps you use it naturally (as opposed to literal translations from your native language).
> I still need to work out different variations of the concept to understand it, and that's not something that Anki can help with.
But imagine if it could!
For example if the English prompt is "watermelon" - are you supposed to recall the Italian word cocomero, anguria, or melone d'acqua (all of which mean watermelon)? If the English prompt is "bank" - is that a place you deposit money, a river bank, to bank (turn) a plane, or to bank (count) on something happening? You end up having to build in messy hacks like giving clues in the prompt as to which translation is intended (which means you memorise the clue instead of the word) or having cards for bank(1), bank(2), bank(3), and bank(4) which becomes very tedious for recall. Sentences mitigate these problems somewhat.
I now only use vocab cards for object nouns where there's only one important translation, and mainly because I can put pictures on these cards so that I'm learning from e.g. the concept of an orange instead of the English word for orange (which saves you the step of mentally translating when you aren't yet fluent with the word).
I guess the best way to start is just to create a new deck in it with one card and then go from there. I already have a daily review habit, which is the most important part.
I'm not sure if it is efficient, mind you, but I suppose it's effective because I can recall information later when relevant, and I believe that like exercising just being able to stick to a study routine ends up being more important than picking the best routine
I did not do this with many cards though, hoping that they would eventually stick.
I think in general the more you engage with the thing you are doing, the better you remember. Even when reading or listening to a lecture or whatever. Maybe what I'm proposing here is that by making it dynamic you create a system where deeper engagement is necessary.
It helps you discover reading material suited to your current knowledge. It’s better to acquire new words within colorful contexts, and then use flashcards to review them after learning the material. (It has its own flashcard companion app or you can use Anki.)
Soon I'll be working on making the activity of reading words in native texts also count as reviewing those words in current and future flashcards, using FSRS. That way you can spend more time reading and not see it as detracting from catching up on your flashcard review workload. And because the reader tracks every word and kanji you come across, it can start to find and suggest the most effective passages to revisit or read for the first time from the personal corpus it accumulates from what you load in.
But next I am working on adding manga and video to enhance the fun of it, as OP mentions being important too.
I’ve recently managed to go full-time on this project and hope to bring it to more languages and platforms before long.
I have not yet found a really good tool for learning Mandarin, except for classes and actually talking with people and doing the hard work of writing the characters again and again, for which I rarely have energy or patience.
One thing I did notice in a course was, that writing an article about a topic helped a lot. It needs to be something where you use the same new vocabulary many times. But the problem with that is, that it makes my hands and wrist hurt after a couple of writings.
I also find Du Chinese and The Chairman's Bao quite useful, although indeed for the writing, nothing seems to substitute actually writing. Right now I can read much more Mandarin than I can write.
- The twenty rules of formulating knowledge (https://www.supermemo.com/en/blog/twenty-rules-of-formulatin...) ; old but really solid advice on how to actually write cards that stick.
- The book Fluent Forever; it’s meant for languages, but the general principles carry over to learning basically anything.
For example I use LLMs to generate cards for me, and Anki's algorithm to make them stick.
Similarly an LLM plugin could easily present a fresh sentence each time you review a particular vocab item.
1- front: image+subtitle (in TL) back: word in TL.
2- fill-in-the-blank phrases for the word, fully in TL, translation shows up after completion (you HAVE to use the function where you actually type it out)
3- front: word in TL back: translation in NL with an image, also the inverse, but the image is always on the back. Making it a different picture from the one for 1 is essential
So each new word would generate me about 6 to 8 new cards. At a fast enough rate of card creation, you won't run into the problem of memorizing each card because you will be creating like 100 cards in a day. The "fatal mistake" (to quote the author) of this article is underestimating how much this process of card creation and organization aids in learning. Creating your own study material IS studying in itself.
That is, assuming the strategy being compared to LLMs here is the correct one of actually studying the language and creating your own Anki deck while you study the material, instead of the incorrect strategy of downloading a deck
Thanks for the laugh, I like your writing style but to echo others, I think you went a little extreme on the Anki.
On a semi-related note, I'm currently working on a language app (gengengo.com) if anyone wants to check it out.
Anki (US: /ˈɑːŋki/, UK: /ˈæŋki/; Japanese: [aŋki]) is a free and open-source flashcard program. It uses techniques from cognitive science such as active recall testing and spaced repetition to aid the user in memorization.[4][5] The name comes from the Japanese word for "memorization" (暗記).[6]
The SM-2 algorithm, created for SuperMemo in the late 1980s, has historically formed the basis of the spaced repetition methods employed in the program. Anki's implementation of the algorithm has been modified to allow priorities on cards and to show flashcards in order of their urgency. Anki 23.10+ also has a native implementation of the Free Spaced Repetition Scheduler (FSRS) algorithm, which allows for more optimal spacing of card repetitions.[7]
Anki is content-agnostic, and the cards are presented using HTML and may include text, images, sounds, videos,[8] and LaTeX equations. The decks of cards, along with the user's statistics, are stored in the open SQLite format.
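To make the SM-2 reference above concrete, here is a small Python sketch of the textbook SM-2 update rule. It is an illustration only: it is not Anki's modified variant and not FSRS, and the `CardState` and `sm2_review` names are invented for this example. Implementations also differ on whether the new interval uses the ease before or after the update; this sketch uses the updated value and leaves the ease untouched on a failed review.

```python
from dataclasses import dataclass


@dataclass
class CardState:
    repetitions: int = 0    # consecutive successful reviews
    interval_days: int = 0  # days until the next review
    ease: float = 2.5       # SM-2 "E-Factor", starts at 2.5


def sm2_review(state: CardState, quality: int) -> CardState:
    """Return the next state after a review graded 0 (blackout) to 5 (perfect)."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")

    if quality < 3:
        # Failed recall: restart the repetition sequence, keep the ease as-is.
        return CardState(repetitions=0, interval_days=1, ease=state.ease)

    # Successful recall: adjust the ease, never letting it drop below 1.3.
    ease = state.ease + (0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02))
    ease = max(1.3, ease)

    if state.repetitions == 0:
        interval = 1
    elif state.repetitions == 1:
        interval = 6
    else:
        interval = round(state.interval_days * ease)

    return CardState(state.repetitions + 1, interval, ease)


state = CardState()
for grade in (5, 4, 5):
    state = sm2_review(state, grade)
    print(state)
```

The growing gaps (1 day, 6 days, then the previous interval times the ease) are what "spaced repetition" refers to; a low grade resets the sequence so the card comes back the next day.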
I want to use my app mainly for language learning, but as a demo, I also have some geography cards that zoom in on a country on the world map, for the front side.
I totally believe this. It's 100% in line with my observations about Duolingo. I know people who put a lot of time into Duolingo and learned nothing. I gave it a try and despite the fact that I put maybe 50 hours into Italian (estimate) I learned nothing. I could get all the cards right, but I didn't learn any Italian. (Over the years I've learned 3 languages in addition to my native language, so I know it's not my problem.) Eventually I realized that I did learn something but it wasn't the language. Somehow I just knew the right answers.
8<----
I think I've already built what you want:
https://api-dev.laleolanguage.com/v1/docs
I started working on this system before LLMs were a thing, but its purpose was specifically to address the problem described in this article -- "flashcard blindness", I've heard it called.
The idea was to solve this, instead of with an LLM, but with a giant corpus of native input. The algorithm tracks all the "language building blocks" separately, assigns each of them a difficulty and a study value, and then calculates the total difficulty and total study value of each selection in your corpus. Using that you can find material to read (or listen to, but I haven't gotten that far yet) that balances difficulty and impact on your learning. This way you're actually reading new material, rather than "memorizing rectangles".
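A toy Python sketch of that kind of selection, to show the shape of the idea rather than the actual system: every name here (`Block`, `score_passage`, `pick_passage`), the 0-to-1 scales, and the "sum over distinct tokens" rule are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Block:
    difficulty: float   # how hard this word/grammar point currently is for the learner
    study_value: float  # how much the learner gains from meeting it right now


# Assumption: a never-seen token counts as hard and gives no immediate study value.
UNKNOWN = Block(difficulty=1.0, study_value=0.0)


def score_passage(tokens, blocks):
    """Sum difficulty and study value over the distinct building blocks in a passage."""
    distinct = set(tokens)
    total_difficulty = sum(blocks.get(t, UNKNOWN).difficulty for t in distinct)
    total_value = sum(blocks.get(t, UNKNOWN).study_value for t in distinct)
    return total_difficulty, total_value


def pick_passage(passages, blocks, max_difficulty):
    """Return the highest-value passage whose total difficulty stays under the cap."""
    best, best_value = None, float("-inf")
    for tokens in passages:
        difficulty, value = score_passage(tokens, blocks)
        if difficulty <= max_difficulty and value > best_value:
            best, best_value = tokens, value
    return best


blocks = {
    "the": Block(0.0, 0.0),
    "cat": Block(0.2, 0.5),
    "sat": Block(0.3, 0.4),
    "perambulated": Block(0.9, 0.1),
}
passages = [["the", "cat", "sat"], ["the", "cat", "perambulated"]]
choice = pick_passage(passages, blocks, max_difficulty=0.6)
print(choice)
```

A real system would presumably update each block's difficulty and study value after every reading session, which is where the spaced-repetition part comes back in.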
There's a public beta for Biblical Greek [1]; I learned Koine Greek entirely through my own system. But I initially developed it for myself for Mandarin; and it's got experimental ports to Korean and Japanese (all three of which are not yet public).
But yes, this could definitely be integrated with an LLM:
1. Using the API, the LLM could ask for the top 40 words to learn or review
2. The LLM could then generate something using some of those words
3. The LLM could send the generated content to the API, to have it graded for difficulty. If the overall difficulty was too high, it could rephrase things to make them simpler (or perhaps even rephrase things to make them more complex, if the difficulty were too low).
4. The LLM could then show the content to the user, and log that the user had seen it.
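The four steps above could be wired together roughly like this in Python. Since the API isn't public, nothing here calls it: the four callables are hypothetical stand-ins that you would implement yourself, and the "simpler" / "more complex" hints, the tolerance, and the round limit are assumptions.

```python
from typing import Callable, List


def generate_graded_text(
    fetch_words: Callable[[int], List[str]],      # hypothetical, step 1: words to learn/review
    write_text: Callable[[List[str], str], str],  # hypothetical LLM call, step 2
    grade_difficulty: Callable[[str], float],     # hypothetical, step 3: difficulty score
    log_seen: Callable[[str], None],              # hypothetical, step 4: record the exposure
    target: float,
    tolerance: float = 0.1,
    max_rounds: int = 3,
) -> str:
    """Generate text from the top words, regrading and rewriting until it is near target."""
    words = fetch_words(40)
    text = write_text(words, "normal")
    for _ in range(max_rounds):
        difficulty = grade_difficulty(text)
        if abs(difficulty - target) <= tolerance:
            break
        # Too hard: ask for a simpler rewrite. Too easy: ask for a more complex one.
        hint = "simpler" if difficulty > target else "more complex"
        text = write_text(words, hint)
    log_seen(text)
    return text
```

Passing the pieces in as functions keeps the loop testable with stubs and avoids baking in any guess about what the real endpoints look like.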
The API isn't public yet, but if you're interested in trying it out, drop me a line:
contact@laleolanguage.com
Those 'useless' static cards are extremely efficient for learning the 1st 2000-3000 words, which is key to start reading. After about 4000 there's little sense in using SRS anymore, and then I'd rather spend more time with an actual book, but getting there with anki felt like using a cheat code compared to how I learnt my first foreign language. It's not exciting, it's pure toil, but it does work.
And when it comes to the next stage, I can't imagine how random llm-generated texts are better than, say, graded readers or real books. Most people would likely find it more interesting to spend an hour or two a day following an exciting story and characters they care about, and it's (based on a sample of one) way easier to memorise all of those new words when there's an emotional connection for each one (just how we form associations between words and experiences while growing up).
As for the app itself - I have tried it with my native language, and at the advanced level it produced a sterile and slightly unnatural text with the complexity of typical fiction. If someone could read this, then I don't see why they would bother. At the beginner level the app generated a couple of news stories which, though simple grammatically, had vocab that I would never have recommended to a novice. Local news of the "a firefighter saved a kitten stuck in a tree" variety is much more useful for that kind of learning, and you get this from any free newspaper.
LLMs are extremely useful for learning foreign languages, but I feel like this isn't the way to go
Frankly, who wants to learn something the old way when you can ask an LLM to do it for you?
All this reminds me of my home country when the Ponzi schemes were at their highest point (2008). There were small and medium cities where a sizeable percent of the population actually stopped working at all! Why? Because the Ponzi schemes were promising up to 300% returns within 6 months! And the Ponzi schemes were actually delivering (so much that the banks actually pressured the government to intervene). So people put all their money in the Ponzis and simply waited a bit to cash in part of their returns. No one opened their small grocery stores, nor small pubs, nor did anyone want to do chores or manual labor. Until everything collapsed.
> Language learners chase something called i+1 material
I really dislike this traditional language teaching. It didn't work for me. It's too abstract, boring, and you end up memorizing stuff the wrong way. And usually later you need to un-learn/re-learn things properly.
What worked for me (and fellow struggling students I taught) was normal text about topics I find interesting. Like boats? Pick boating magazines, books, and documentaries/movies (turn subtitles on). From "WTF" to "I know some of those words" to "is that a pattern?" And only then go for the actual rules of the language. This way you are engaged and learn real-world things.
And for dull learning, it's better to spend time in i-1. Miyagi-style practice: repetition of the basic things to the point you can't fail even if you are tired. Then move to games like finding rhymes or tongue-twisters. [Ironically, AFAIK this is the Japanese way to learn calligraphy, Judo, etc]
Same applies to many other disciplines. YMMV
rsanek•3h ago
A piece of feedback: one of the common issues I've found with AI-generated questions & answers based on an article is that they will often hone in on testing values. It looks like incontextlearning.com often suffers from this same issue (over half my comprehension questions were "how many"-style). I can easily answer these types of questions even if I don't know what the content is about.
dothereading•2h ago