Why We Need Arabic Language Models

https://www.natureasia.com/en/nmiddleeast/article/10.1038/nmiddleeast.2025.142
21•thinkingemote•2h ago

Comments

sarabande•2h ago
Does anyone know if they published the dataset?
nakamoto_damacy•2h ago
I wonder, if you pretrain on Hebrew and Arabic, whether it will find the similarities beyond the shared RTL writing direction. So many similar words. I guess both came from Aramaic? If so, how about the trifecta of ancient languages: Aramaic, then Hebrew, then Arabic?
ch4s3•1h ago
They don’t come from Aramaic. Arabic is a Southwestern Semitic language, while Aramaic and Hebrew are Northwestern Semitic languages. Aramaic and Hebrew are sort of cousins on the tree, with Hebrew splitting off from southern Canaanite, which was sort of a sibling language to an older form of Aramaic.
nakamoto_damacy•1h ago
They = Hebrew, Aramaic and Arabic

---

## 1. “They all kept the triconsonantal root system — where word meaning is based on three core consonants (like K-T-B = “write” → Hebrew katav, Arabic kataba, Aramaic ktav).”

*Source evidence:*

* The article “Triliteral Roots / Consonantal Roots” states that many Semitic languages (including Arabic, Hebrew) have roots typically made of three consonants (triliteral) and that words are formed by inserting vowels, etc. ([Transparent Blogs][1])
* A source says: “Both Hebrew and Arabic rely on a triliteral root system, meaning words are formed from three core consonants. Example of the root K-T-B…” ([Biblical Hebrew][2])
* Another general description: “The roots of verbs and most nouns in the Semitic languages are characterized as a sequence of consonants ... such abstract consonantal roots are used…” ([Wikipedia][3])

So this claim is well supported.

*Arabic translation of the claim:*

> احتفظت جميعها بنظام الجذر الثلاثي الحروف — حيث يعتمد معنى الكلمة على ثلاثة حروف صامتة أساسية (مثل ك-ت-ب = “كتب/يكتب” → العبرية כתב (katav)، العربية كتب (kataba)، الآرامية כתב (ktav)).

*Hebrew translation of the claim:*

> כולן שמרו על שיטת השורש התלת-עיצורי — שבה משמעות המילה מבוססת על שלושה עיצורי יסוד (למשל כ־ת־ב = “כתב” → עברית כתב (katav), ערבית كتب (kataba), ארמית כתב (ktav)).

*Citations (for this claim):*

* Semitic linguistics: “The roots of verbs and most nouns in the Semitic languages are characterized as a sequence of consonants …” ([Wikipedia][3])

* “Both Hebrew and Arabic rely on a triliteral root system, meaning words are formed from three core consonants.” ([Biblical Hebrew][2])

* Description of the K-T-B root being used in both Arabic and Hebrew. ([Wikipedia][4])
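To make the root-and-pattern idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the `apply_pattern` helper and the digit-based template notation are hypothetical, the forms are romanized, and the templates ignore sound changes (for example, the b → v shift that gives Hebrew katav). The Arabic glosses are standard dictionary forms, not taken from the sources cited above.

```python
def apply_pattern(root, pattern):
    """Fill a vowel template with a three-consonant root.

    Digits 1-3 in the pattern stand for the first, second, and third
    root consonant; every other character is copied through unchanged.
    (Hypothetical notation, used here only for illustration.)
    """
    if len(root) != 3:
        raise ValueError("expected a triconsonantal root")
    return "".join(root[int(ch) - 1] if ch in "123" else ch for ch in pattern)


ROOT_KTB = ("k", "t", "b")  # the K-T-B "write" root, romanized

# Simplified Arabic templates; real morphology has many more patterns.
ARABIC_PATTERNS = {
    "1a2a3a": "he wrote",
    "1i2ā3": "book",
    "1ā2i3": "writer",
    "ma12a3": "office, desk",
}

for pattern, gloss in ARABIC_PATTERNS.items():
    print(apply_pattern(ROOT_KTB, pattern), "=", gloss)
```

The same three consonants survive in every derived word while the vowels and affixes change, which is the property the claim above describes.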

---

## 2. “They share similar grammar and sound systems, just evolved differently.”

*Source evidence:*

* A blog post on Duolingo says: “Because Arabic and Hebrew are part of the same large language family, their grammars often ‘work’ in similar ways.” ([Duolingo Blog][5])
* A site “Arabic and Hebrew Compared” states: “Arabic and Hebrew morphology … is based on the consonant root system. …” ([Google Sites][6])
* The Wikipedia article on Semitic languages states that the Semitic languages share many grammatical features (word order, non-concatenative morphology, etc.) ([Wikipedia][7])

So yes, there is support for similar grammar and sound (phonological) systems.

*Arabic translation of the claim:*

> إنهما تشتركان في نحو وصوتيات متشابهة، فقط تطورتا بشكل مختلف.

*Hebrew translation of the claim:*

> הן חולקות דקדוק ומערכות צלילים דומות, רק שהתפתחו באופן שונה.

*Citations (for this claim):*

* “Because Arabic and Hebrew … their grammars often ‘work’ in similar ways.” ([Duolingo Blog][5])
* “Arabic and Hebrew morphology … is based on the consonant root system.” ([Google Sites][6])
* “Semitic languages share a number of grammatical features …” ([Wikipedia][7])

---

## 3. “Many religious and cultural interactions over millennia reinforced overlap (borrowed or re-borrowed vocabulary).”

*Source evidence:*

* The article “Similarities Between Hebrew and Arabic” mentions: “Many Hebrew and Arabic words are cognates, retaining similar meanings and sounds.” ([Biblical Hebrew][2])
* A blog “Halal, Hillul, and the Shared Meanings of Hebrew and Arabic” discusses relationships between similar sounding words (cognates) due to shared roots. ([Hebrew College][8])
* Comparative grammar sources mention that because Hebrew, Arabic and Aramaic are closely related, there has been lexical borrowing and shared vocabulary. ([semiticroots.net][9])

So your statement about religious/cultural interaction reinforcing overlap (vocabulary) is broadly supported.

*Arabic translation of the claim:*

> العديد من التفاعلات الدينية والثقافية عبر الألفيات عزَّزت التداخل (استعارت أو أعادت استعارة مفردات).

*Hebrew translation of the claim:*

> אינספור אינטראקציות דתיות ותרבותיות לאורך אלפי השנים חיזקו את ההשתלבות (השאלה או השאלה מחדש של אוצר מילים).

*Citations (for this claim):*

* “Many Hebrew and Arabic words are cognates …” ([Biblical Hebrew][2])
* “The relationships between similar-sounding words … in the case of the Semitic languages, similar roots.” ([Hebrew College][8])
* “Hebrew, Arabic, and Aramaic … than between Hebrew and any other language …” ([semiticroots.net][9])

---

[1]: https://blogs.transparent.com/hebrew/hebrew-grammar-consonan... "Hebrew Grammar: Consonantal Roots - Transparent Language Blog"
[2]: https://biblicalhebrew.org/similarities-between-hebrew-and-a... "Similarities Between Hebrew and Arabic"
[3]: https://en.wikipedia.org/wiki/Semitic_root?utm_source=chatgp... "Semitic root - Wikipedia"
[4]: https://en.wikipedia.org/wiki/K-T-B?utm_source=chatgpt.com "K-T-B"
[5]: https://blog.duolingo.com/are-arabic-hebrew-persian-related/... "Dear Duolingo: Are Arabic, Hebrew, and Persian related?"
[6]: https://sites.google.com/site/mopclanguages/arabic-and-hebre... "MOPC Languages - Arabic and Hebrew Compared"
[7]: https://en.wikipedia.org/wiki/Semitic_languages?utm_source=c... "Semitic languages"
[8]: https://hebrewcollege.edu/blog/halal-hillul-and-the-shared-m... "Halal, Hillul, and the Shared Meanings of Hebrew and Arabic"
[9]: https://www.semiticroots.net/downloads/Comparative%20Grammar... "Comparative Grammar of the Semitic Languages"

binarymax•1h ago
Do we need language-specific LLMs? I can’t vouch for the data coverage or accuracy of Arabic in the leading models today, but I do know them to be highly capable across languages.
readthenotes1•1h ago
Makes me wonder who did the translation

"This is a translation of the Arabic article published on 3rd August 2025"

Full irony would be from an LLM

tokai•1h ago
Such declarations have become pretty useless without any indicator of the translation method.
nzeid•1h ago
> Clear examples emerge when global language models address culturally sensitive issues, such as social relationships or political debates. They often adopt ambiguous positions that overlook the Arab cultural context, creating a gap between these digital tools and the values and lived experiences of Arab users.

Well I have bad news, my friend. English language models are also terrible at this.

This whole article seems to stem from the premise that it's important for LLMs to engage cultural issues competently. But... should they even?

Fade_Dance•33m ago
>But... should they even?

I don't see why not?

Also, while I don't have access to this perspective myself, I'd imagine this is an unending annoyance in many areas of the world, since people there are often consuming quite America-centric offerings where localization is an afterthought and contracted out.
