Invoking the smarter-than-thou effect is not a great starting point.
See e.g. https://www.sciencedirect.com/science/article/abs/pii/S01602...
If we’re considering a library, it would be prudent of us to take a look at the source code to see what exactly we’re pulling in. In the process, we would learn about the lay of the land, the API and the internals, and get at least an overview of the complexity of the problem it solves.
Anyway... I've had a few recurring issues with libraries. Note that these are framed on a case-by-case basis, not as general rules.
1. The essential implementation is a small amount of code, wrapped in structures that exist just to package that essential code. The wrapping code can be larger and more complex than the essential code.
2. There are small differences between what's needed and what's provided, which require workarounds to get the desired outcome. These workarounds muddy the logic and can be pervasive at scale (see the sketch at the end of this comment).
3. There can be dissonance between the app architecture and the library API.
4. Popular libraries in particular create a culture of thinking in terms of the library/framework, leading to resource inefficiencies and to outright dismissal of solutions that are a better match for the domain. In short, the library/framework API frames the problem and solution, which may not match the actual problem and the optimal solution.
5. The library/framework authors are concerned with promoting the library/framework, not with solving your actual problem. Many problems need to be solved; the library/framework may just be the "Golden Hammer" used to pound in your screw.
With all that being said, there are many useful libraries that define and solve problems in their particular domain, particularly those with common, well-defined, appropriately scoped requirements.
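To make point 2 concrete, here is a minimal Python sketch of workaround glue; format_price is a stand-in for a hypothetical third-party formatter that almost fits the requirement:

    def format_price(amount: float) -> str:
        # Stand-in for the hypothetical library function: it returns US style
        # ("$12,345.60"), but the spec calls for "12 345,60 EUR".
        return f"${amount:,.2f}"

    def format_price_eur(amount: float) -> str:
        # Workaround glue: re-shape the library's output to match the spec.
        s = format_price(amount).lstrip("$")       # drop the hardcoded symbol
        s = s.replace(",", " ").replace(".", ",")  # swap separators for EU style
        return s + " EUR"

    print(format_price_eur(12345.6))  # 12 345,60 EUR

Multiply that by every call site and the workaround, not the library, becomes the real implementation.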
Though the addition of pipes to the base language is helping fix that.
I don't think Dunning-Kruger has anything to do with people releasing libraries that nobody should use.
It's kinda like project-specific semantic monomorphization.
> If there is a battle tested, well known package that can help us, then recommend it BEFORE implementing large swaths of custom code.
Monomorphization means taking a generic function and generating a version specific to the type being used, e.g. a Rust function

    fn identity<T>(x: T) -> T {
        x
    }

can be compiled into one version for i32 and one for String, which is more efficient since the compiler knows the concrete types:

    fn identity_i32(x: i32) -> i32 { x }
    fn identity_string(x: String) -> String { x }
Semantic monomorphization could mean extracting the parts of the library that are meaningful, to generate problem-specific concrete code. Instead of importing pandas to do

    import pandas as pd

    df = pd.DataFrame([{"a": 1}, {"a": 2}])
    total = df["a"].sum()

the LLM might skip the import entirely and generate only:

    data = [{"a": 1}, {"a": 2}]
    total = sum(d["a"] for d in data)
If I understood right, the parent found it funny that a comment suggesting we might never need libraries, because we can concretize the specific relevant code, would be responded to with a CLAUDE.md that essentially said "always use libraries instead of concrete relevant code". I missed it because I didn't stop to look up "monomorphization", so I hope this helps anyone else like me get the joke.

Why does this reduce your attack surface? Can the functions in the library, unrelated to the ones you're using, be triggered by user input somehow?
Languages and domains that have leaned too far into package managers and small libraries are prone to fragility and security nightmares.
For any "serious" application of critical code, every library used needs to be vetted and verified to be maintained and secure.
I'd much rather deal with a bug in our code than a deprecated library or a breaking version update.
If we are to use a library outside of standard Unix or the stdlib within my field, better expect a nightmarish code review and a meeting.
Besides being fun, implementing it ourselves improves our skill level for the future, something vibe coding itself goes against as well.
A project only becomes serious once legal is breathing down engineering's neck. Before that, it's usually the Wild West. After, it becomes a security circus trying to patch the technology deficiency (custom registries, complex linting and other analysis tooling, ...).
And yes, I agree.
https://www.npmjs.com/package/boolean
>converts lots of things to boolean.
>3 million weekly downloads
This is insane.
    ['yes', 'y', '1'].indexOf(input.toLowerCase()) !== -1
People adding a dependency to avoid writing one line of code...

My Celery/RabbitMQ-based web crawler failed because of the Cloudflare CAPTCHAs, so I figured it was best to empty out the queue and archive it. I asked Copilot what to do and it told me to use a CLI program. "Does that come with RabbitMQ?" "No, you download it from GitHub." It offered to write me a Python script instead, but the CLI program did exactly what I needed. It got an option wrong, but I'd expect the same if I asked a friend for help.
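For reference, the script route really is only a few lines. A minimal sketch using the pika client; the broker URL and queue name here are placeholders:

    import pika  # RabbitMQ client library

    # Connect to the broker and drop every pending message from the queue.
    connection = pika.BlockingConnection(pika.URLParameters("amqp://localhost"))
    channel = connection.channel()
    frame = channel.queue_purge(queue="crawler_tasks")
    print(f"Purged {frame.method.message_count} messages")
    connection.close()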
This is analogous to folks who claim nobody is going to be able to learn software engineering any more. I think it is just the opposite. LLMs can be an awesome tool for learning.
On the article: some use cases, e.g. handling dates or fault-tolerant queues, have so many edge cases and are so mission-critical that relying on a battle-tested tool makes a lot of sense.
However, in my career I've seen a lot of examples of a package being installed to avoid 40-50 lines of well-thought-out code, and now a dependency is forever embedded in the system.
I think there is a catch with replacing libraries with LLM-generated code. Part of the benefit of skipping third-party libraries is the domain knowledge that gets built up: this is potentially lost with LLM-generated code.
In my experience the big problem is that the documentation is always terrible; you can't ask open-ended questions on Stack Overflow, the library's subreddit (if any) has zero users, and anything asked on their Discord is not searchable.
It's incredible that we still don't have a stack overflow that is just a forum.
I want people in the company to use it, but it's big and complicated (lots of chipsets and Bluetooth to boot).
I'm trying to design the library so the MCP can tell the LLM to pull it from our repo, read the prompt file for instructions and automatically integrate with the code.
I can't get it to do it consistently. There is a big gap in the current LLM tech where there is no standard/consistent way to tell an LLM how to interface with a library (C/Python/Java/etc.)
The LLM more often than not will read the library and then start writing duplicate code.
Maddening.
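The kind of prompt file I mean might look something like this; purely a hypothetical sketch, with the file name and the btlib API made up for illustration:

    # llm-instructions.md (hypothetical, shipped at the repo root)
    #
    # Before writing any Bluetooth code in this project:
    # 1. Read docs/api-overview.md for the public API surface.
    # 2. Use btlib.scan() and btlib.connect() from this library;
    #    do NOT reimplement scanning, pairing, or chipset handling.
    # 3. If a needed function is missing, report it instead of
    #    writing a duplicate.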
I'm still not clear on what the best patterns for this are myself. I've been experimenting with dumping my entire documentation into the model as a single file - see https://github.com/simonw/docs-for-llms and https://github.com/simonw/llm-docs - but I'd like to produce shorter, optimized documentation (probably with a whole bunch of illustrative examples) that use fewer tokens and get better results.
I'm a library author myself, so publishing LLM-enhanced versions of the docs to help other people use my library more effectively feels like a sensible use of my time.
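For the single-file approach, the mechanical part is small. A minimal Python sketch, assuming the docs live under docs/ as Markdown and with an arbitrary output name:

    from pathlib import Path

    # Concatenate every Markdown file under docs/ into one file an LLM can
    # ingest in a single pass, with a comment marking where each file starts.
    parts = []
    for path in sorted(Path("docs").rglob("*.md")):
        parts.append(f"<!-- {path} -->\n{path.read_text()}")
    Path("docs-for-llms.txt").write_text("\n\n".join(parts))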
(The quote comes from a different context, but it works quite well here as well.)
(or "naïve")