
Ask HN: Best foundation model for CLM fine-tuning?

2•philomath868•6h ago
Hi,

I have a largish (2 GB) corpus of curated, high-quality text in a low-resource language, and I want to build a model that provides an advanced autocomplete service for writers.

I'm thinking of taking a decoder-only model such as Llama, Mistral or Gemma, slicing off the embedding layers (which were trained for languages I don't need), and creating new ones (perhaps initialized from a FastText model trained on the corpus), paired with a tokenizer newly built from my corpus. I would then train the model on my corpus until convergence.
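
To make the embedding swap concrete, here is a minimal sketch of building a fresh embedding table for a new vocabulary. It assumes the pretrained vectors (for example, exported from a FastText model) already have the same dimension as the base model's hidden size; if they don't, some projection step would be needed first. The function name `init_embeddings`, the toy vocabulary, and the 0.02 noise scale are all hypothetical choices for illustration.

```python
import random

def init_embeddings(vocab, pretrained, dim, seed=0):
    """Build one embedding row per token in `vocab`.

    Tokens found in `pretrained` with the right dimension copy that vector;
    all other tokens get small Gaussian noise (std 0.02, an assumed value).
    Returns the rows and the number of tokens that reused a pretrained vector.
    """
    rng = random.Random(seed)
    rows, hits = [], 0
    for token in vocab:
        vec = pretrained.get(token)
        if vec is not None and len(vec) == dim:
            rows.append(list(vec))
            hits += 1
        else:
            rows.append([rng.gauss(0.0, 0.02) for _ in range(dim)])
    return rows, hits

# Toy data: a 3-token vocabulary and 4-dimensional vectors (hypothetical).
vocab = ["<unk>", "hello", "world"]
pretrained = {"hello": [0.1, 0.2, 0.3, 0.4], "world": [0.5, 0.6, 0.7, 0.8]}
rows, hits = init_embeddings(vocab, pretrained, dim=4)
print(hits, len(rows), len(rows[0]))
```

In a real setup the resulting rows would be copied into the model's input embedding (and, if untied, the output projection) after resizing both to the new vocabulary size.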

Additional potential details include: a custom loss function for synonym-aware training (based on a custom high-quality thesaurus), where synonyms of the "correct" word are somewhat rewarded; and POS-tagging the corpus with a language-specific POS tagger, then adding a POS-tagging head to the model as a multi-task learning objective, to force grammatical generation.
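
One possible reading of the synonym-aware loss is a soft-target cross-entropy: most of the target probability stays on the correct token, and a small share is spread over its thesaurus synonyms. The sketch below is only that interpretation, in plain Python; `synonym_mass=0.2` and `pos_weight=0.3` are assumed values, and `multitask_loss` simply adds a weighted POS-tagging loss to the language-modeling loss.

```python
import math

def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def synonym_soft_targets(target_id, synonym_ids, vocab_size, synonym_mass=0.2):
    """Put 1 - synonym_mass on the true token and share synonym_mass
    evenly among its synonyms; fall back to a one-hot target if none."""
    synonyms = sorted(set(synonym_ids) - {target_id})
    targets = [0.0] * vocab_size
    if not synonyms:
        targets[target_id] = 1.0
        return targets
    targets[target_id] = 1.0 - synonym_mass
    for s in synonyms:
        targets[s] = synonym_mass / len(synonyms)
    return targets

def soft_cross_entropy(logits, targets):
    """Cross-entropy between a target distribution and softmax(logits)."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(targets, probs) if t > 0.0)

def multitask_loss(lm_loss, pos_loss, pos_weight=0.3):
    """Weighted sum of the language-modeling loss and the POS-head loss."""
    return lm_loss + pos_weight * pos_loss

# The model strongly prefers token 1, a synonym of the true token 0.
logits = [0.0, 3.0, 0.0, 0.0]
hard = soft_cross_entropy(logits, synonym_soft_targets(0, [], 4))
soft = soft_cross_entropy(logits, synonym_soft_targets(0, [1], 4))
print(f"hard={hard:.3f} soft={soft:.3f}")
```

With these toy logits, predicting the synonym is penalized less under the soft targets than under the one-hot target, which is the "somewhat rewarded" behavior described above.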

To use a good model as the base, I will probably have to use PEFT (LoRA). My current setup is whatever is available on Colab Pro+, so I can probably use models in the 7B-12B range?
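
A rough back-of-the-envelope sketch of the parameter counts involved may help with sizing. It assumes LoRA is applied to four square hidden-by-hidden projections per layer (a simplification; models with grouped-query attention have smaller key/value projections), and that the freshly created embedding and output layers are trained in full because they start from scratch. The shape used (hidden size 4096, 32 layers), the rank of 16, and the 32,000-token vocabulary are hypothetical values roughly matching a 7B Llama-style model. Activations, gradients, and optimizer state are not counted.

```python
def lora_params(hidden, rank, n_layers, matrices_per_layer=4):
    """Trainable LoRA parameters: each adapted hidden x hidden matrix
    gets two low-rank factors, A (rank x hidden) and B (hidden x rank)."""
    return n_layers * matrices_per_layer * 2 * rank * hidden

def new_embedding_params(vocab_size, hidden, tied=False):
    """Parameters in a new input embedding, plus a separate output
    projection unless the two are tied."""
    n = vocab_size * hidden
    return n if tied else 2 * n

def weight_memory_gib(n_params, bytes_per_param):
    """Memory for the weights alone, in GiB."""
    return n_params * bytes_per_param / 2**30

# Hypothetical 7B-class shape and settings.
adapters = lora_params(hidden=4096, rank=16, n_layers=32)
embeddings = new_embedding_params(vocab_size=32_000, hidden=4096)
print(f"LoRA adapter params:   {adapters:,}")
print(f"New embedding params:  {embeddings:,}")
print(f"7B weights at 16-bit:  {weight_memory_gib(7e9, 2):.1f} GiB")
print(f"7B weights at 4-bit:   {weight_memory_gib(7e9, 0.5):.1f} GiB")
```

Under these assumptions the new embedding and output layers account for far more trainable parameters than the adapters, so they are worth including in any memory estimate.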

My main question is, which base model would be best for this task? (Again, for completion of general writing of all kinds, not programming or advanced reasoning).

Also, will the synonym and POS additions help or hurt?

Anything else I might be missing?

Thanks!