Fine-tuning an LLM to write docs like it's 1995

25•taubek•1h ago

Comments

mock-possum•26m ago

> we’re not there yet, in part because of how much more powerful connected frontier models are

Is that why though? You need a beast of a machine to run a functional local model in my experience.

I think the big part is there’s significant sticker shock to buying capable hardware.

That said,

> weekend. I chose to try fine-tuning on two models, Llama 3.1 8B Instruct and Qwen 2.5 7B Instruct. At their size (around 8B) they run comfortably on a MacBook Air

Perhaps I spoke too soon?

Anyway

> I chose the Microsoft collection as the source of training materials. The collection contains out-of-print docs published between 1977 and 2005: more than 37 million words, covering old systems and SDKs

this strikes me as a very specific brand of 1995’s prose, spanning about 30 years. It’s a cool article though, so maybe that’s a forgivably clickbaity title.

mschild•14m ago

Running models locally is surprisingly easy and possible even on older hardware.

Obviously not the largest, most up-to-date models, but for what I expect most people use them for, even on HN, there are some shockingly good models that don't require €4k machines.

I have a desktop with an AMD 6900XT and a 5600 with 32GB RAM. Obviously no slouch, but it's several years old at this point. I can comfortably run Qwen 3.5 9B and get a speedy 60 tokens/sec output with decent results.

mock-possum•11m ago

idk I can barely field a 14b on my desktop, and it’s rough trying to replicate the agentic pair programming experience I’m accustomed to with Claude. And I don’t mean it doesn’t work as well, I mean it doesn’t work.

Is there some secret I’m missing? I’ve tried rolling my own harness, and tried a few of the ones the cool kids use - I think pi was the most recent. Not quite my tempo, I’m afraid.

OJFord•11m ago

> this strikes me as a very specific brand of 1995’s prose, spanning about 30 years.

It's probably a fair approach to say the significant influence (training dataset) on writing at a particular time is the preceding 30 years' material? It's certainly not only what's already written that year (nor anything since).

vintagedave•24m ago

I love old-school docs, and this was a fantastic read. But I couldn't see the three generated doc pages linked anywhere. Did I miss something?

I'd really like to see the Win2K-style docs on REST, for example.

