Edit: Thanks, @Bengalilol.
The 1.7B one looks meh.
But really solid numbers on the 9B! Props to the team!
"all official 24 EU languages" to "all 24 official EU languages"
It seems the new version is called Horizon Europe
I predict at some point countries will get CIA'ed when they publish plans to build a large data center.
Similar to the time when they got CIA'ed when announcing plans for new nuclear plants.
Europe has about 1.3 times the population of the USA and about 75% of the GDP yet EU tech output is a very small percentage of US tech output. We are not talking about 70, 50, 30, or even 20%. It's a drop in the bucket.
>The seven largest U.S. tech companies, Alphabet (Google), Amazon, Apple, Meta, Microsoft, Nvidia, and Tesla, are 20 times bigger than Europe’s seven largest, and generate 10 times more revenue.
https://eqtgroup.com/thinq/technology/why-is-europes-tech-in...
"Why" is a good question, but I definitely wouldn't expect significant competition in LLMs from Europe based on the giant tech disparity. Having 1 non-cutting edge model that isn't really competitive is pretty much what I would expect.
I only use open source LLMs for writing (Qwen 32b from Groq) and open source editor of course, Emacs.
If some people can write better using commercial LLMs (and commercial editors), by all means, but they put themselves at a disadvantage.
The next step for me is to use something open source for translation (I use Claude for the moment) and something open source for programming (I use GPT currently). In less than a year I will find a satisfying solution to both of these problems. I haven't looked deep enough yet.
I'm going to guess that this part is intentional. Europe tends to be more aggressive in enforcing antitrust laws. Economically, Europe's goal isn't to have the biggest companies but to have more smaller companies.
So you're not going to get companies like Google, but you will get companies like Proton, Spotify, Tuta, Hetzner, Mistral, Threema, Filen, Babbel, Nextcloud, CryptPad, DeepL, Vivaldi, and so on.
So is your hypothesis that the total market cap of EU tech companies is something like 50, 60, or 70% of total US tech market cap? Something significantly different from the ~10% implied by that figure (largest US companies 10x largest EU companies)? And it's just more broadly distributed?
Hard to find data on this but this is showing EU tech market cap at 3.2T. https://www.stateofeuropeantech.com/chapters/outcomes
Whereas this is saying the US "megacaps" ($200B+) are at 21T. https://www.cnbc.com/2025/09/05/tech-megacaps-worth-market-c...
Which puts the entire EU tech market at 15% of the US megacaps. Not even the entire market.
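As a quick sanity check of the ratio above, here is a minimal sketch that just redoes the division. The two figures are the ones quoted from the linked sources (3.2T and 21T); the variable names are made up for illustration.

```python
# Figures as quoted in the thread, in trillions of US dollars.
eu_tech_market_cap = 3.2      # all EU tech, per the State of European Tech link
us_megacap_market_cap = 21.0  # US tech companies worth $200B+, per the CNBC link

# Share of the US megacap total that the whole EU tech market represents.
ratio = eu_tech_market_cap / us_megacap_market_cap
print(f"EU tech is about {ratio:.0%} of US megacaps")
```

Note that the denominator covers only the US megacaps, so the share against the full US tech market would be lower still.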
I don't see any sense in which the EU has fewer capabilities. It has, say, a smaller number of businesses with smaller market dominance.
It isn't clear to me what capability the EU would gain by having a monopolist social network, a monopolist search engine, a monopolist advertising trader.
This AI law is a clear example of that. Pencil pushers creating more obstacles for the sake of creating more obstacles rather than actually taking a pragmatic approach.
which sank to the bottom thanks to HN's invisible hand
Oh wait, one's not supposed to notice
But they were not trained on government-sanctioned homegrown EU data.
If you think none of the LLM makers used the very large corpus of EU multilingual data, I have an EU regulation bridge to sell you.
OK, what are you implying by this?
Plus all your T&S/AI Safety is not solved with translation, you need lexicons and data sets of examples.
Like, people use someone in Malaysia to label the Arabic spoken by someone playing a video game in Doha - the cultural context is missing.
The best proxy to show the degree of lopsidedness was from this: https://cdt.org/insights/lost-in-translation-large-language-...
Which in turn had to base it on this: https://stats.aclrollingreview.org/submissions/linguistic-di...
From what I am aware of, LLM capability degrades once you move out of English, and many nation states are either building, or considering the option of building their own LLMs.
In recent LLMs, filtered internet text is at the low end of the quality spectrum. The higher end is curated scientific papers, synthetic and rephrased text, RLHF conversations, reasoning CoTs, etc. English/Chinese/Python/JavaScript dominate here.
The issue is that when there's a difference in training data quality between languages, LLMs likely associate that difference with the languages if not explicitly compensated for.
IMO it would be far more impactful to generate and publish high-quality data for minority languages for current model trainers, than to train new models that are simply enriched with a higher percentage of low-quality internet scrapings for the languages.
ChatGPT, for example, tends to start emails with "ich hoffe, es geht dir gut!", which means "I hope you are well!". In English (especially American) corporate emails this is a really common way to start an email. In German it is not, as "how are you" isn't a common phrase used here.
But also European culture could maybe make a difference? You can already see big differences between Grok and ChatGPT in terms of values.
European culture is already embedded in all the models, unless the people involved in this project have some hidden trove of private data that they're training on which diverges drastically from things Europeans have published publicly (I'm 99.9% positive they don't...especially given Europe's alarmist attitude around anything related to data).
I think people don't understand that a huge percentage of the employees at OpenAI, Anthropic, etc. are non-US born.
>Europe is the only continent in the world to have a large public network of supercomputers that are managed by the EuroHPC Joint Undertaking (EuroHPC JU). As soon as we received the EuroHPC JU access to the supercomputer, we were ready to roll up our sleeves and get to work. We developed the small model right away and in less than 6 months the second model was ready.
[1] https://www.eurohpc-ju.europa.eu/eurohpc-success-story-speak...
Repurposing some of that physics sim compute
Just the base model and a template like "English: {text}\n{language}:" can also work with a bit of filter and retry logic
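A minimal sketch of what that template plus "filter and retry" loop could look like. Everything here is an assumption for illustration: `generate` stands in for whatever function calls the base model and returns its raw continuation, and the filter rules (non-empty, not a copy of the source, not absurdly long) are just plausible heuristics, not anything EuroLLM documents.

```python
from typing import Callable, Optional


def build_prompt(text: str, language: str) -> str:
    # The template from the comment above: the base model is expected to
    # continue the text after "{language}:" with a translation.
    return f"English: {text}\n{language}:"


def looks_valid(source: str, candidate: str) -> bool:
    # Hypothetical filter: reject empty output, verbatim copies of the
    # source, and runaway generations.
    if not candidate:
        return False
    if candidate.lower() == source.strip().lower():
        return False
    return len(candidate) <= 4 * max(len(source), 1)


def translate(
    text: str,
    language: str,
    generate: Callable[[str], str],  # hypothetical: prompt in, raw completion out
    max_attempts: int = 3,
) -> Optional[str]:
    prompt = build_prompt(text, language)
    for _ in range(max_attempts):
        raw = generate(prompt).strip()
        # Base models tend to keep going with more "English: ..." pairs,
        # so keep only the first line of the continuation.
        candidate = raw.splitlines()[0].strip() if raw else ""
        if looks_valid(text, candidate):
            return candidate
    return None  # every attempt was filtered out
```

With sampling enabled on the model side, each retry gets a fresh draw, which is what makes retrying worthwhile; with greedy decoding the loop would just repeat the same output.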
The US and China are running rings around Europe.
Mistral is an exception as it was funded by US VCs, and they are a great example showing that without VC funding, Mistral would have been begging the EU for a microscopic grant to train an LLM worse than Llama.
There are a few variables here but at this point in time, private-funded innovation isn't different by much and all things considered, the difference isn't in its favor.
My experience with government funding is that they apply something and won't even try to sell it because selling is hard: you don't want to know that the thing you built is lacking nor that the competition is better. Especially the academic types don't. Yet I'm paying for these guys. Also, by funding the academics they won't even need to go to the job market. But as I paid for their education I thought I was buying people who create value.
Perhaps the above is rather harsh and it's "not that bad", my subjective experience nevertheless.
Vaswani is an Indian born computer scientist, Shazeer is US, Parmar was born in India, Uszkoreit was born in Germany, Jones was born in the UK, Gomez is British-Canadian, Kaiser is a Polish computer scientist, and Polosukhin is Ukrainian.
Almost all of these people have PhDs and master's degrees. The ROI on academia is vast for society, including European universities. The thing the US does well is capitalize on that education, and sadly also try to steal credit for it as "American exceptionalism." If Europe and other countries learn how to keep their academics and get them working in local industries, America's edge will evaporate overnight.
The wider availability of capital is a bigger deal though. "Attention is all you need" is available to people on other continents to read, but a computer scientist in Europe who understood exactly how big transformers were going to be and why had less chance of funding than a webdev in California with a pitch deck full of cliches and a me-too GPT wrapper for an industry they'd barely touched does today.
It seems like, in most ways, it would be bad to train on 24 separate languages. That's just 24 partitions to the data. Seems really inefficient and better to simply train in the biggest (English) and translate.
I do think this will introduce some biases that correlate with the English language. It would be interesting to see more specifically what this means. But regardless, I don't think you can produce a competitive model with such a large subdivision of training data.
2. A credible scale effort for EU own silicon for AI Compute, wouldn't hurt either.
3. And this can only be achieved by vertical integration to combat fragmentation.
Yep, the US-government sponsored, open-weight LLM is miles ahead of EuroLLM
Would you prefer European AI sovereignty with 15% overhead costs from geographic distribution, or 100% dependence on Nvidia/OpenAI with zero European industrial base?
EuroAI: Europe’s Moonshot to AI Sovereignty
https://open.substack.com/pub/ifiwaspolitical/p/euroai-europ...
Also funny to read this narrative of how access to the European 'supercomputer' cluster is going. https://x.com/levelsio/status/1981485945745788969
Genuinely repugnant. At least the Trump admin has the decency to pump everyone's 401k...
I'm trying to figure out why it bothers me so much. I think it's because the EU are such unbelievable losers in everything they do. They can't even grift, that's how useless they are. They can't even steal properly. It's so undignified, and offensive to the senses.
The EU is such a bizarre place because they treat capital and entrepreneurs with such massive distrust, but never really bothered getting rid of the quasi-static entrenched hierarchies from feudalism? Like I'll go to the UK or France and there will just be massive swathes of land owned by the nobility or 'former' nobility? Maybe start there but let your high-value human capital earn a good wage?
Yeah, no, this isn't even remotely true.
When you talk to most EU business owners, even in tech, the limiting factor isn't regulations. This being the #1 reason is such a tired trope.
Ironically, China has in some ways a bigger regulatory burden when it comes to software, as there, if the government doesn't approve, the business is dead in the water. I doubt that Klarna would've gotten off the ground there, for one; I could see them being shut down much earlier there. In the EU, only now are some governments very slowly even starting to talk about some weak measures around their business model. But I've never, not once in my life, heard "Chinese software companies can't get off the ground due to the regulatory burden".
The same people who clamor about the EU regulations are the ones who hate on the EU for their protectionist measures against US tech. Yet another bout of irony here - China's software industry has flourished exactly thanks to 10 times stronger protectionist measures against US tech. So has Korea's, and their protectionism has never even been anywhere on the China level, more in between EU and China. No, if there's anything that would help, it's much more tech protectionism in the EU.
Pieter Levels is at the end of the day an influencer, not a serious founder.
Okay, what is the limiting factor? Because when I talk to EU business owners (admittedly, very few) - they point to lack of big EU capital markets, which is directly downstream of the policy environment. And when I talk to top EU human capital, they all point to the lack of competitive wages. There's a real difficulty in allocating capital to talented humans.
And, at least in Southern Europe, the income tax schedule is so aggressive it's hard to justify continuing working in many of these countries if you are highly talented.
Like, if you can tell me what the induced operator norm from l_2 -> l_2 is - probably you should come to the US and work at a biglab and make bank. What can you do in Portugal, Italy, Spain, etc.??
> Pieter Levels is at the end of the day an influencer, not a serious founder.
Sure, agreed.
I think it is a complete misreading to point to protectionism as the reason for Chinese success, but having a big unified domestic market for consumers along with massive saving rates and capital controls probably does help.
Why work in the "europoor" countries when you can go to America and earn megabucks?
All of these purported EU-specific reasons completely ignore that things are the same elsewhere. It's the US that is the outlier.
Capital controls are protectionist measures, but anyway, no.
> Okay, what is the limiting factor?
Let's look at which countries have a significant local software industry compared to population size.
- China
- US
- Korea
- You can argue for Japan and India but that's already starting to stretch.
- Yup, effectively nowhere else. Even in an "out of the way" place like Myanmar everyone uses Meta, with a nice little genocide to show for it. Sure, in Vietnam they use Zalo, and other places have a few other local players. But most of the famous US tech apps are dominant.
Is the EU the outlier here? No. Everywhere else US tech dominates. Meta, Netflix, Apple, Google, Uber, Spotify, Microsoft, Match Group, Paypal, Amazon, and on and on. They don't just dominate the EU, they dominate the world.
Except for the countries I named above, where at least some of the markets that US big tech competes in, instead have bigger local players. And even there, guess what?
Their market share is almost 1:1 linearly correlated to the degree of protectionism in those countries, all the way from China, then Korea, then India/Japan, and then everywhere else! Who woulda thought!
Why does Korea have much less US tech dominance than, say, Germany? Despite German companies theoretically having a big advantage: the German public is 100x more privacy conscious than the Korean one, and much less trusting of US companies.
I can tell you that it's not fewer regulations; Korea's GDPR is much more onerous than the EU's and so are investment regulations. On every single regulatory aspect, German software startups have it easier. But they were never protected. US tech was allowed to waltz in and dump their products - that's what they did, and it's hilarious how China "dumping" EVs and solar is suddenly an issue now, when it's exactly the strategy that US tech continues to this day; the AI companies are doing it right now! And the Korean companies were protected, both by the rules burden that local companies had to deal with too, and by intentional protectionism.
When it comes to solar and EVs, we all understand that a foreign country dumping their goods kills local industry. It's the exact same with software.
But then half of HN has millions in the bank exactly thanks to the above - this is where all those fat SV salaries have come from - so I do get the lack of desire to understand it.
Seems like you actually believe this. I think our starting points on reality are different enough that we are not going to have a productive conversation. I wish you and other Europeans the best of luck in your protectionism-led growth strategy. Make sure to not discuss it with any pesky macroeconomists who might lead you astray. Take care.
A few.
A big part is that the EU is a collection of countries that (with very few exceptions) have different languages and laws. For a company to serve Spain and France, for instance, it would need to translate everything, hire local lawyers and customer support agents. Considering the much smaller size of the countries (biggest one is 70 million vs 330 million in the US), the opportunity for "unlimited" growth is limited.
This also rebounds in the fact that when an American company makes it big, they have the resources to flood other EU markets and be cheaper/better than the local competition due to economies of scale and money based on their big successful US market. A French company making it big is still small compared to a US equivalent.
Then, there's the capital markets, no denying that. The money being thrown around the US is like nowhere else on the planet. Some of it definitely a bubble / unrealistic, but that doesn't matter. But in part it's because of the size of the total potential market that this is justified.
Education / national mythology also plays a part, I think (this is pure conjecture now). In the US, the "American Dream", "everyone can make it" etc is heavily ingrained. It propagates through the world with the help of Hollywood and other American cultural exports. In most EU countries, there isn't such a heavy emphasis on independence and "pulling yourself up by your bootstraps". "Hustle culture" isn't a thing. So for most people, it isn't something that comes naturally to them to start a company and work 100 hour weeks to be big and rich and successful and famous.
That's not to say there aren't such people. I went to 42 and have been to Station F and know some people in that universe. A decent proportion of my classmates wanted to make their startup and make it big, and some did end up starting their own companies.
Ding ding ding! When China does it with solar and EVs we call it "dumping". When Uber, OpenAI and Anthropic do it, that term is never ever used. VC-funded US tech dumps harder than any Chinese industry ever has.
Which part is easier? That you have 50 different states with slightly varying laws to consider (e.g. Californian Data protection)? That you have a byzantine system of "benefits" to choose and manage?
And compared to where? Germany or Estonia or Sweden or Spain? The complexities will vary wildly depending on the country (kind of like in the US, where lots of companies pick the state to base themselves in based on the combination of favourable laws and precedents and taxes).
there are certain sentences you can just tell would never be written by an American lol
California Consumer Privacy Act is a thing you need to take into account for Californian customers.
Illinois has a Biometric Privacy Act.
And who knows what Wyoming or South Dakota or Oregon have that you might take into account if your business falls under any of them.
most laws like CCPA also have some threshold where you already need to be pretty successful for it to apply to you.
for some select industries (biometrics & healthcare), yes you have a patchwork of laws.
Secondly, what's easier besides VC funding? If it's VC funding, the disparity there has nothing to do with regulations - guess how much VC funding the non-EU rest of the world gets.
I have a tech startup in Estonia and I agree. To me the biggest limiting factor is lack of funding.
And you register online.
What it's terribly good at is adding burdens that the US giants don't face early on, slowing down the early growth between 28 fragmented markets. I don't know specifically about how China works, but the question is proving product-market fit, and for that, you need a lot of users fast.
In the EU, it's a different battle country to country as the media environment, the markets, the regulation etc. are all fractured.
While he is great at converting his influencer status to income in his micro-SaaS projects, I don't think running ad-fueled browser games on a state-sponsored supercomputer should really be the aim of these grant programs.
All these while the EU is running out of funds and in a process of de-industrialization. There should be an independent corruption investigation on Brussels.
It's bureaucracy, often bordering on stupidity. You may need advisors to navigate all their forms & processes. But it certainly isn't a "pals-only" type of deal.
On the other hand - is it harder than getting VC funding? For a seasoned founder with a reputation - probably. For a fresh startup - probably not.
Highly doubt it; the whole thing about the success of the US west coast is that they are and were willing to fund unproven upstarts.
The point being that, as soon as public dollars are on the table, people expect perfection. Anything less is waste, fraud, and abuse.
There's literally no winning. Want to make sure the money is allocated right? Bureaucracy. Want to not do that? Waste, fraud, and abuse.
Except that Apple, Intel, Tesla, etc have all received US government investment [1]. TSMC is a product of the Taiwanese state! Government investment can be done well, and seeds excellent companies.
[1]: https://www.sba.gov/blog/2024/2024-02/white-house-sba-announ...
https://www.politico.eu/article/ombudsman-slams-commission-f...
Yes, some of the questions are weird, but I'd really rather write a bit confirming that the AI system being developed isn't going to be racist or Skynet than jump through some other hoops that exist (and that absolutely includes VC due diligence). The actual biggest issue with European funds is they get way more competent applications than they can fund anyway.
And frankly, the dream scenario that Pieter describes where he somehow would qualify for these resources also wouldn't help kickstart the tech industry, and it's also not how it works in the states.
What does help, and what European governments (at least the one in The Netherlands that Pieter is from) actually do, is more funding for startups. If you're a startup founder in NL almost every angel you talk to has a matched funding deal with the government. That's such a smart way of keeping up with the US. Do you think US startups get free compute from the government? They don't even get subsidies most of the time. What they get is better funding because there's more capital available, and helping investors with that is exactly how you solve that.
Does government offering matched funding to investors actually help startups who are struggling to find (any) funding? If a startup can't find (any) funding, matching is irrelevant.
> Do you think US startups get free compute from the government? They don't even get subsidies most of the time. What they get is better funding because there's more capital available, and helping investors with that is exactly how you solve that.
Umm. I'm not really convinced that the political elites in Europe understand how to do any of this stuff well.
See also: https://www.eib.org/en/publications/online/all/the-scale-up-...
In the US, some ex-Googler might found a startup. Europe doesn't have the equivalent of FAANG. (Europe-wide companies are not quite as easy as US-wide)
Even if the super computer itself "fails", is the goal actually the secondary impacts to the economy?
(And in the US, we do our own fair share of picking winners / losers, especially in the current regime)
Cluster: for public benefit, cutting edge research in biotech, medical, robotics.
Levels: I want to create AI photos of people for my AI Slop startup
That's not what the quoted paragraph says and you can read the whole release if you want: https://ec.europa.eu/commission/presscorner/detail/en/ip_25_...
--- start quote ---
Apply AI Strategy
The Apply AI Strategy aims to harness AI's transformative potential by driving adoption of AI across strategic and public sectors including healthcare, pharmaceuticals, energy, mobility, manufacturing, construction, agri-food, defence, communications and culture. It will also support small and medium-sized enterprises (SMEs) with their specific needs and help Industries integrate AI into their operations.
--- end quote ---
I also quoted a paragraph from a document I will find when I'm not on mobile.
Levels literally wants to train AI Slop: https://x.com/levelsio/status/1981499900266193028
--- start quote ---
Train a foundational model for AI photos of people
--- end quote ---
My quote: Cluster: for public benefit, cutting edge research in biotech, medical, robotics.
Literal quote from your link: The Apply AI Strategy aims to harness AI's transformative potential by driving adoption of AI across strategic and public sectors including healthcare, pharmaceuticals, energy, mobility, manufacturing, construction, agri-food, defence, communications and culture.
You: your quote was misleading.
I'm sorry, I don't have the time or the patience with willfully ignorant and blind people getting their interpretations from AI slop engagement farmers.
Adieu
Now let's wait for the people saying "Spain" could change this. Hypocrites.
Cultural genocide at its best.
The big win for accessibility has already been won...3 years ago.
I used the 9B Instruct version; among the small models, it was the one with the best Latvian knowledge out there, bar none. GPT-OSS 20B and Qwen3 30B A3B and similar ones weren't even close.
That said, the model itself was a little bit dumb and not something you'd really use for programming/autocomplete or tool calling or anything like that, which also presented some problems - even for processing text, if you need RAG or tool server calls, you need to use something like Qwen3 for the actual logic and then pass the contents to EuroLLM for translation/formatting with the instructions, at which point your n8n workflow looks a bit messy and also you have to run those two models instead of only one.
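The two-model split described above could be sketched roughly like this. It is only an illustration of the shape of the workflow: `reason` and `render` are hypothetical callables standing in for the logic model (e.g. Qwen3 doing RAG and tool calls) and the language model (e.g. EuroLLM doing translation and formatting), and the instruction wording is made up.

```python
from typing import Callable


def answer_in_language(
    question: str,
    language: str,
    reason: Callable[[str], str],   # hypothetical: logic/tool-calling model
    render: Callable[[str], str],   # hypothetical: translation/formatting model
) -> str:
    # Step 1: let the stronger general model do retrieval, tool calls and
    # reasoning, and produce a draft answer.
    draft = reason(question)

    # Step 2: hand only the finished draft to the language-specialist model,
    # with explicit instructions, so it never has to do the logic itself.
    instruction = (
        f"Translate the following text into {language}, "
        f"keeping the formatting:\n\n{draft}"
    )
    return render(instruction)
```

The cost the comment points out is visible here too: every request goes through two model calls, and both models have to be hosted.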
Meanwhile, the best cloud model for Latvian that I've found so far was Google Gemini 2.5 Pro, but obviously you can't use cloud models in certain on-prem use cases.
I have to specifically tell it something like “do you know Lithuanian language”, and then it starts replying in Lithuanian.
They almost exclusively compare their model to prior models from 2024 or older and brag about "results comparable to Gemma-2-9B". I'm not sure what I expected. The eurollm.io homepage states "EuroLLM outperforms similar-sized models", which just seems like a lie for all practical purposes
An overly charitable interpretation is that EuroLLM isn't a reasoning model and has minimal post-training, so they sought out comparisons to such models (they're still ignoring reasoning models that have non-reasoning modes)
As another comment here noted, the title is missing (2024) - this model was released almost a year ago, last December, so it's not surprising that those are the models they compare to.
>You need to agree to share your contact information to access this model
Is this common? I've never seen it on the site before, and it isn't on the smaller model. What are they collecting this information for?
Still, two months earlier a 19-language European model with 30B parameters got almost no mention:
https://huggingface.co/TildeAI/TildeOpen-30b
Mind you, that is another open model that is begging for fine-tuning (it is not very good out of the box).
As the aim of EuroLLM is to provide EU citizens with powerful and useful AI tools, it is critical that the model can also translate and answer questions in other European and non-European languages. With this in mind, we added support for 11 additional languages (Arabic, Catalan, Chinese, Galician, Hindi, Japanese, Korean, Norwegian, Russian, Turkish, and Ukrainian).
I suppose that's a typo and I found a technical report here: https://arxiv.org/abs/2506.04079
Comparison with similar EU models + 600 other highlights:
We support your work and offer backup and distribution. Here a copy just in case: https://hugston.com/uploads/llm_models/EuroLLM-22B-Instruct-...
This model was released in 2024, and I couldn't find any links to the training data - is it just an open weights model?
adzm•5h ago
Maltese, interestingly, is the only Afro-Asiatic derived language.
Hungarian, Finnish, and Estonian are the three Uralic languages.
All the others are Indo-European, Greek being the only Hellenic one, Irish the only Celtic, the rest are Baltic, Slavic, Italic, or Germanic.
(I originally used the term Balto-Slavic, though I was unaware of some of the connotations of that term until just now. Baltic and Slavic do share a common origin, but that was a very very long time ago)
purrcat259•5h ago
ebb_earl_co•5h ago
kridsdale3•5h ago
ggsp•5h ago
arbuge•4h ago
Source: I'm also Maltese.
Raed667•5h ago
arbuge•4h ago
cenamus•12m ago
purrcat259•2h ago
adzm•5h ago
purrcat259•2h ago
There was a point about 7 years ago when the Overton window shifted to "speak English to strangers first" because of a large influx of foreigners who did not know the language. Since then I've met foreigners who have better Maltese than some natives.
Older folks & geriatrics will sometimes be surprised when they assume someone is foreign and they turn out to be Maltese. "int Malti??" is a statement I get often because I don't look Mediterranean despite being born here.
nxor•5h ago
JAlexoid•3h ago
I was surprised to hear Maltese radio stations played in taxis while visiting Malta just a few weeks back.
nxor•1h ago
purrcat259•2h ago
Businesses do work in Maltese and English. Both are official languages. It's quite rare to encounter a business that deals near exclusively in Maltese. Many prefer Maltese but will fall back to English where necessary.
Regarding monolingual speakers, I think there are a lot of stereotypes for Maltese only, English only and code switchers. I think it's all a bit silly... So as long as communication can happen I don't fuss.
On Maltese music... There's a lot of low-ish quality music, then there are a few absolute gems. Look up The Travellers, Lapes, Jon Mallia on YouTube/Spotify.
nxor•1h ago
cm2012•4h ago
purrcat259•2h ago
Tade0•3h ago
purrcat259•2h ago
cess11•2h ago
franklin_p_dyer•2h ago
https://tatoeba.org/
runarberg•1h ago
How much do you consider Maltese its own language (as opposed to a dialect of Arabic)?
notahacker•1h ago
I don't think anyone would seriously consider it a dialect of Arabic though with its completely different alphabet and half the vocabulary and morphology coming from Italian languages/dialects, even if Malta hadn't spent the best part of a millennium trying very hard not to become part of the Arab world
barrell•20m ago
I do wonder what natives think and feel about the longevity of their language. What is taught in schools, and at what ages (assuming English is in the mix somewhere)? Is there enough media in Maltese for Malti to go about modern life fully in Maltese? It’s shockingly hard to find any information on Maltese, and even harder to find content.
I’m not sure if it’s dying out, or in danger thereof; if there are preservation efforts, or if there is no need.
jim180•5h ago
Telaneo•5h ago
asveikau•5h ago
I think some people get touchy about them being lumped together if their last period of commonality (per the article) was 1400 BCE. For comparison, I believe all the Slavic languages were mutually intelligible around 1200 AD. But much more recently than this, in the last few centuries, there have been notable attempts by East Slavs to absorb the Baltic language cultures and deny them.
krzyk•4h ago
I doubt West and East Slavic were. But inside those geographic groups they probably were (Czech and Polish AFAIR were around that time).
actionfromafar•4h ago
kaato137•5h ago
kreetx•5h ago
d1sxeyes•5h ago
kreetx•5h ago
dragonwriter•5h ago
lo_zamoyski•2h ago
NicuCalcea•4h ago
Well, that and Romanian. And Hungarian. And outside the EU, Albanian. And Georgian, Azeri and Armenian if you consider those Eastern Europe.
ardit33•4h ago
NicuCalcea•2h ago
Some of my fellow Romanians will also claim they're Central European, but in my mind, all the ones I listed are Eastern European countries. I'd even include Turkey and Kazakhstan in there, part of the latter is to the West of the Urals, which is what we normally consider the border between Europe and Asia.
kreetx•3h ago
In my mind, I was thinking of the belt of countries between Russia and Central Europe, starting from the Baltics down to the Balkans (excluding Greece).
NicuCalcea•2h ago
rich_sasha•4h ago
There is a branch that contains both Baltic and Slavic languages, but there's also one that contains Albanian and Greek.
ardit33•4h ago
There have been some attempts to tie Albanian to Germanic, or Greek, or other branches, but they all have failed.
At some point they are all Indo-European, but they split a long time ago.
pqtyw•3h ago
and
> only Estonian is not a Slavic language.
So following this logic saying "in Eastern Europe, only Estonian is not a Baltic language" would make as much sense?
sublimefire•4h ago
pqtyw•3h ago
The fact that they are the closest surviving relatives doesn't on its own mean it makes sense to group them together (e.g. Italo-Celtic is also a theorized subgroup in a similar way, but nobody disputes that Celtic and Italic languages evolved into distinct groups).
Then there is a huge number of missing links and unknown unknowns, e.g. Thracian and Dacian were probably also pretty close to Baltic or Slavic (maybe even closer to Baltic than Slavic is, but we don't know enough about them to make any conclusive claims at all... we at least know these languages existed).
Tade0•3h ago
adzm•5h ago
I updated my original comment, and learned a good amount about that dispute as a result, so thanks for calling it out.
Vinnl•5h ago
Best get to retraining those models.
przemub•5h ago
rsynnott•4h ago
outside1234•4h ago
hebelehubele•4h ago
umanwizard•3h ago
geretnal•3h ago
layer8•3h ago
piltdownman•4h ago
https://www.reddit.com/r/northernireland/comments/1fivtob/no...
pqtyw•3h ago
AlecSchueler•1h ago
Levitz•3h ago
https://www.politico.eu/article/catalan-basque-galician-boos...
runarberg•1h ago
handelaar•1h ago
Muvasa•1h ago
sigmar•5h ago
Vinnl•1h ago
mikrl•4h ago
“Brea, bûter en griene tsiis is goed Ingelsk en goed Frysk”
RobotToaster•4h ago
lawlessone•2h ago
tirant•4h ago
tannhaeuser•3h ago
> Some of these [Old Saxon] speakers took part in the Germanic conquest of England in the fifth century AD. While it is not true that English and Plattdeutsch derive completely from the same source, the Old Saxon input into Anglo-Saxon was of primary importance and this linguistic group contributed greatly to the Anglo-Saxon dialects which our English forefathers spoke.
[1]: http://www.plattmaster.de/plattoew.htm
tecleandor•4h ago
They get certain recognition, but they are not official in Europe. For example, just from Spain there are 13 languages on that list.
ginko•3h ago
Vinnl•1h ago
ChrisMarshallNY•4h ago
What about Basque? Is that too controversial?
[0] https://en.wikipedia.org/wiki/Hotel_Beau_Séjour
td540•4h ago
ChrisMarshallNY•4h ago
It's all Greek, to me...
mytailorisrich•4h ago
So for instance, Basque is not an official language of any country (only French in France and Spanish/Castilian in Spain). Belgium's official languages are French, Dutch, and German; "Flemish" is only a local variant of Dutch (Belgian French is also only a local variant of French).
ChrisMarshallNY•4h ago
In the US, people will resort to fisticuffs, over variants of Spanish. I usually translate into Castilian Spanish, because that seems to be the equivalent of "Vanilla" Spanish. No one is really happy (except the Spaniards), but I'm not accused of favoritism.
contravariant•4h ago
rags2riches•3h ago
tirant•4h ago
mytailorisrich•3h ago
Section 3
(1) Castilian is the official Spanish language of the State. All Spaniards have the duty to know it and the right to use it.
(2) The other Spanish languages shall also be official in the respective Autonomous Communities in accordance with their Statutes.
(3) The richness of the different linguistic modalities of Spain is a cultural heritage which shall be specially respected and protected.
[1] https://www.senado.es/web/conocersenado/normas/constitucion/...
tirant•4h ago
embedding-shape•4h ago
Levitz•3h ago
From https://european-union.europa.eu/principles-countries-histor... we can find an excerpt relating to the policy and its purpose:
>One of the EU’s founding principles is multilingualism.
>This policy aims to:
>communicating with its citizens in their own languages
>protecting Europe’s rich linguistic diversity
>promoting language learning in Europe
With this in mind, the first aim fails by an enormous margin, given that 95%+ of Spain doesn't speak an iota of Basque. The second is met handily, given the long history of the language. I'm not sure what to think about the third; any language whatsoever would serve that purpose.
yvdriess•3h ago
OptionOfT•1h ago
Now, being from Belgium, even within that small part of the country where everybody is supposed to speak Dutch, I genuinely don't understand people from near the coast, which was about 150 miles from where I used to live.
punnerud•4h ago
arbuge•4h ago
It's Semitic, to be precise.
https://en.wikipedia.org/wiki/Semitic_languages
UebVar•2h ago
fsckboy•3h ago
JAlexoid•3h ago
PS: Gaelic is a more general term covering both Irish and Scottish. Ireland specifically brings the Irish language (Gaeilge in Irish).
rags2riches•3h ago
ginko•3h ago
layer8•2h ago
[0] https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:01...
raattgift•2h ago
ChocolateGod•2h ago
rcbdev•1h ago
threesmegiste•3h ago
runarberg•49m ago
sva_•2h ago
> as well as some additional relevant languages (Arabic, Catalan, Chinese, Galician, Hindi, Japanese, Korean, Norwegian, Russian, Turkish, and Ukrainian).
https://arxiv.org/pdf/2409.16235
The paper also goes into detail on training set sources; I feel like the curation of those might be considered the main contribution of this publication?
_kidlike•2h ago
ranadomo•2h ago
https://en.wikipedia.org/wiki/Graecians
3836293648•33m ago
ks2048•2h ago
Arabic, Catalan, Chinese, Galician, Hindi, Japanese, Korean, Norwegian, Russian, Turkish, and Ukrainian.
amarant•2h ago
I have often joked that Norwegian is just a dialect of Swedish, but I never expected to get official validation like this!
bdhtu•2h ago
emil-lp•2h ago
rcbdev•1h ago
jenadine•31m ago