Fundamental Problems of Lisp, the Cons Cell (2024)

http://xahlee.info/comp/lisp_cons_problem.html

66•gudzpoz•3d ago

Comments

alphazard•3h ago

Cons, car, cdr are fine for putting things together and breaking them apart.

The problem is everything is boxed, and the indirection mechanism is conflated with the concatenation mechanism. What you really want is the ability to compose a value (concatenate 2 things together), but store it contiguously in memory. If cons did that then you could create true products of values, without any indirection.

Then you need a different method for creating indirection. A function for each of C's malloc and '*' respectively. That would give you the control needed to create trees with arbitrary branching factors.

Jtsummers•3h ago

If you don't want cons, you can use vectors and arrays in Lisp. Cons isn't the only data structure available.

johnisgood•3h ago

And often you want to avoid heavy cons-ing, too.

behnamoh•3h ago

One Lisp that doesn't have this con (pun intended!) is Janet. I really like the language, but the documentation is lacking. Otherwise, it's a nice little language which compiles to small binaries we can share. It's image-based (!) which is refreshing given how Racket, Clojure, etc. don't support images (at least not that I'm aware of).

pbohun•3h ago

I recently started using Janet and enjoy it too. I think Rich Hickey got it right when he created a single set of functions to operate on all collections, which fixes most of what's mentioned in the article. Janet does a great job of copying what Clojure got correct.

I've found the documentation for Janet to be fairly adequate. What are some parts where the documentation could be better?

wffurr•1h ago

Took me some digging to figure out what you meant by “image”. Seems like a snapshot of the state after loading the file but before executing main.

kazinator•1h ago

In the mainstream, what adopted the image-based approach is office suite application. What is a document, such as a spreadsheet or wordprocessor (or integrationo of those?) It's a bunch of objects and code saved to disk, revived when you open the document.

If you need to explain image-based to people unfamiliar, that's not a bad analogy.

DonaldFisk•2h ago

The cons is integral to Lisp, and is used to construct the list, which is a purely functional data structure. Other languages also have lists, e.g. Prolog and Haskell (as well as similar pure functional languages).

Saying that they're a fundamental problem in Lisp is like saying that objects are a fundamental problem in Smalltalk or that arrays are a fundamental problem in APL.

If Xah doesn't like cons cells, there are other languages he can use instead. Lisp (as well as Prolog, Haskell, ML, etc.) programmers won't be persuaded to abandon cons cells or lists as they find them extremely useful.

astine•2h ago

Xah's complaint appears to be that cons cells are too low level a construct to expose to the user. He seems fine with lists in general but just dislikes how they are implemented in Lisp. If I were to implement a list in a non-Lisp language (like C or whatever,) it would likely internally include a struct called "Node" that filled the same purpose as cons cells do in Lisp. However, users of this hypothetical 'list' usually wouldn't directly interact with these "Nodes"; they would mostly be an implementation detail, as appears to be the case for most list implementations other than Lisp's. I think that Xah is wishing that was the case for Lisp as well.

adastra22•1h ago

I’m confused as to what that would look like though. If you implement a linked list in C/C++, it ends up looking like cons. What is Xah imagining it would be otherwise?

astine•1h ago

You'll have an equivalent of a cons cell in any linked list implementation, but you won't necessarily interact with it as a user. Look at the standard library implementation of a linked list in C# for example (https://learn.microsoft.com/en-us/dotnet/api/system.collecti...). You see methods for looping over a list, for accessing elements, for insertions and removals, but not much for directly accessing a list's nodes and manually constructing a list out of those nodes. That part is abstracted away.

DonaldFisk•1h ago

> Xah's complaint appears to be that cons cells are too low level a construct to expose to the user

On his web page, he writes: > Lisp's cons forces programer (sic) to think of list in a low-level nested of 2-item construction, with explicit functions like “cddr”, “cadaar” etc, and even a special notation “(A . B)”.

Except, most of the time, it doesn't. For example, I think (a b c) is a list of three elements. If I want c, I call (third '(a b c)), not (caddr '(a b c)). I know it's stored as (a . (b . (c . NIL))) and that third is just an alias for caddr, but I'm not forced to think that way.

So why not just think in terms of lists, which can be nested, rather than cons-cells, which are only rarely used for anything other than to construct lists? The cons function itself would still be needed, as adding a value to a list without modifying the original list is extremely useful. Or you could think in terms of the more abstract data structures (parse trees, Lisp functions, etc.) you construct from lists.

When learning and using a language, it's important to think in that language.

astine•1h ago

So why not just think in terms of lists, which can be nested, rather than cons-cells, which are only rarely used for anything other than to construct lists?

Because, as he continues:

Worse, any proper list can have improper list as elements. So, you can have a list of cons, or cons of lists, cons of cons, list of lists, or any mix. The overall effect of the cons is that it prevents lisp to have a uniform high level treatment of lists, with the result that development of functions that work on tree are inconsistent and few.

In other words, the fact that the implementation details for lists are so exposed means that you have to be careful when interacting with 'lists' that turn out not to be actual lists. There is no type information, either at compile or runtime, that ensures that what you're dealing with is actually a list and not something else. So you can't _actually_ think in terms of lists; you have to think in terms of cons cells which are probably lists but might not actually be so.

dragonwriter•1h ago

> So why not just think in terms of lists, which can be nested, rather than cons-cells, which are only rarely used for anything other than to construct lists?

Cons cells are used to construct other higher-level data structures, like trees, not just lists.

But this whole discussion seems to just be a version of the usually static vs. dynamic typing discussion, where the “you have to think in terms of cons not lists” side is the static typing side saying that without handcuffs making it statically impossible to do the wrong thing you can't ever stop thinking about the things that could be done even if you aren't doing them and the “you can think in terms of lists without a problem” is the dynamic typing side saying “no, we actually don’t need to think about other possibilities unless we actively choose to use them”.

noosphr•2h ago

cons is not purely functional since set-car! and set-cdr! have existed just as long as cons had.

DonaldFisk•2h ago

You mean Lisp isn't purely functional, as it has rarely used functions which destructively modify cons cells and lists. The cons function itself has no side effects, so is purely functional. As long as you avoid list-modifying functions, the cons data structure and any lists built from it are purely functional data structures (i.e. their contents never change), just like in Prolog and Haskell.

noosphr•2h ago

Cons cells are mutable.

Saying that cons is functional as long as you don't modify the result later is like saying that variable assignment in C is functional.

There are lisp dialects that have immutable cons: https://docs.racket-lang.org/reference/pairs.html

lelanthran•1h ago

> Lisp (as well as Prolog, Haskell, ML, etc.) programmers won't be persuaded to abandon cons cells or lists as they find them extremely useful.

Who said anything about abandoning lists? The argument appears to be that lists don't need to be built up out of two-element cells (i.e cons cells), and that Lisp, by enforcing `cons` cells as a way to construct lists, is seeing some limitations.

TBH, I agree a little that that PoV - there is no need for exposing the implementation of the list to the programmer, much less forcing the programmer into using the implementation.

The other problem with enforcing lists to be constructed of cons cells is one of performance; the default list in Lisp is a cache-invalidating monster.

dwringer•1h ago

Also to be fair to Common Lisp, it has lots of support for sequences that aren't functional lists. Cons cells just happen to be convenient for representing things like the source code so they are prevalent. They're hardly a requirement for any algorithm's logical implementation.

dragontamer•2h ago

Wait, what's the problem with Cons?

This seems to just complain that cons is hard to understand or something, without actually saying what makes it hard or unintuitive.

--------

Cons is fundamentally a linked list. The "pair" is a (value+pointer), and sometimes "value" can itself be a pointer. So when you have (pointer, pointer), you now can create trees.

As such, "Cons" is a powerful abstraction for trees, pairs, and linked lists. That's all it is. If you need "more" than this (ie: graphs), then perhaps its insufficient.

---------

"Cons" itself is a malloc(sizeof(pair)) operation. While car/cdr depend on whatever you're doing or however you've designed the program.

If I were to criticize "Cons", its that high performance modern programming needs to use arrays because the cost of pointers has grown. Latency of memory lookups is relatively slower on modern machines (or really, latency hasn't improved in like 20 years), so you need to consolidate more members together.

But in terms of a "simple" data-structure, "cons" cells and list-programming is a thing for a reason.

astine•2h ago

I think that the author's complaint is that cons cells are too low level a construct to be exposed so prominently to the user. I think he would prefer that it was abstracted over with some higher level interface the way a lot of other language do it. He mentions Clojure as a Lisp that gets it right in his opinion and that's exactly what Clojure does. Cons cells are still available as a construct, but the default method for interacting with lists is a set of higher level functions that also work on arrays and other sequences and don't require that you think as much about the implementation of the lists.

KerrAvon•2h ago

> If I were to criticize "Cons", its that high performance modern programming needs to use arrays

This has been true of linked lists since the advent of memory caches; it’s not just the last 20 years. They’re simple to implement, but they’re usually very bad for performance; you really want better locality of reference in a primitive data structure. You almost always should be using an array of some kind instead.

dragontamer•2h ago

Arrays are much worse at insertions, especially at the middle or beginning.

There's a reason why the Linux Kernel continues to use Red-Black trees today for a huge number of its data-structures. When you need reliable performance on all operations (insert, delete, reorderings, reallocations), its hard to beat a tree (or "cons" for that matter).

If you can use an array, then use an array. But making the tree a "good default" seems like the right move. Lisp does that with cons/cdr/car and (a . b), or (a b c d e) syntax.

Turning (a b c d e) into (a b b1 c d e), especially for larger sizes, will likely be faster than arrays (which innately needs to memcpy ~half the array on the average).

adastra22•1h ago

There isn’t a consistent single way to implement a red black tree with cons though, which I think is OP’s gripe.

Jtsummers•1h ago

Why does there need to be a single consistent way to implement something with cons? That's an odd complaint. What language would the OP be happy with if they want one singular, canonical way to implement everything?

adastra22•45m ago

To be clear, I'm steal manning his argument as I'm not sure I agree with it. But I think the point is summarized best with an example: "Here's an API that takes a red-black tree as an argument." Okaaay, what format is the red-black tree? How do I construct one?

In Lisp you'd pretty much have to ship your module with its own tree-construction logic, or use some common (but importantly, not technically standard) implementation, and document that's what you did, so people know how to structure the data they pass to you.

In C++, you simply take a std::map, which implements a red-black tree for you, and being part of the standard library it is accepted and commonly used nearly everywhere.

I think OP wishes Lisp had something like std::map.

Jtsummers•23m ago

Cons cells aren't the only data structures in Lisp. And there are libraries, packages, that you can use if you want a common structure across multiple projects.

> To be clear, I'm steal manning his argument as I'm not sure I agree with it. But I think the point is summarized best with an example: "Here's an API that takes a red-black tree as an argument." Okaaay, what format is the red-black tree? How do I construct one?

If the API calls for taking a red-black tree as an argument, it will also provide either the specific red-black tree implementation itself or it will refer you to an existing package to include as a dependency. This is not dissimilar from C++. In C++ if something takes a std::map, you use a std::map. If it takes another data structure that's not in the STL, then it will provide it or tell you how to obtain it as a dependency.

BTW: steel, not steal, manning.

dragontamer•50m ago

I mean, in Lisp your go-to structure will likely be unbalanced trees

Which like Quicksort (nearly the same analysis btw), has exceptional average case performance (even faster than RB trees on the average !!!!) but the downside of horrific O(n^2) performance on the worst case.

RBTrees or AVL trees all help at improving those worst cases (AVL strangely has its best case be the worst case of unbalanced trees lol).

In any case, there's a lot of tree flavors. Lisp is certainly sufficient and simple when it's exploring these different details.

Unbalanced trees, in any case, is actually a reasonably sane default. RB Tree or AVL is probably a better default but don't knock the unbalanced tree before you try it!!

------------

I've heard of highly optimized lisps where the typical list is turned into an array or at least a rope in the average case btw.

kazinator•58m ago

When some random tech person says "modern", the dividing line is usually the date when they started programming. Everything which was already available at that time is "modern", and none of it existed the year before that, since it has no associations with their pre-programming life.

kazinator•1h ago

> high performance modern programming needs to use arrays

What is "modern"? How far back do we have to go to "pre-modern" such that dependent loads through pointers are not a performance issue?

dreamcompiler•48m ago

> The "pair" is a (value+pointer), and sometimes "value" can itself be a pointer. So when you have (pointer, pointer), you now can create trees.

Pointers don't always imply trees. Any Lisp object that won't directly fit in half a cons cell gets allocated separately and a pointer to it is placed in that half of the cons cell. Examples: Strings, symbols, bignums, etc.

> If you need "more" than this (ie: graphs), then perhaps its insufficient.

You can certainly represent cyclic graphs with cons cells. The *print-circle* mechanism in Common Lisp handles printing them, for example.

noosphr•2h ago

>Lisp at core is based on functional programing on lists.

Lisp is not, nor has it ever been, a functional programming language. It had lambda decades before other languages figured out it was a good idea, but it had set! for just as long.

It is also not a list based language, but a tree based one. The fundamental point of lisps is that you can manipulate the abstract syntax tree directly since it's pretty much the same thing as the parse tree.

>Deep Nesting is Rare

>The lisp's cons [...] works just fine when data structure used is simple. [...] vast majority of list usage is just simple flat list, sometimes 2 level of nesting [...] 3 levels of nesting is seldom used [...] Greater than 3 level is rarely seen. Systematic manipulation and exploitation of nested list, such as mapping to leafs, to particular level, transposition by permutation on level, or list structure pattern matching in today's functional langs, etc is hardly ever to be seen.

I guess the author has never used lisp to implement a DSL in lisp. Writing a lisp interpreted in lisp is the first real program you write if you take a lisp course at university and you need to handle arbitrarily nested expressions.

Real world usage also involves writing a specialist language to solve your problem for anything more complex than a utility script, e.g. Emacs is a text editing DSL implemented in elisp.

golly_ned•2h ago

The author is reknowned for eMacs lisp.

noosphr•2h ago

The author is renowned for flame wars on comp.lang.lisp which boil down to "Hey guys, why isn't lisp mathematica?".

rurban•1h ago

The author is used be in the killfile, so you don't have endure his endless wrongs

Ygg2•1h ago

What is a killfile?

Jtsummers•1h ago

From the old Usenet days, a killfile told your client to ignore all messages from particular people (those in the killfile). A block list, but with a more aggressive name.

agumonkey•1h ago

I'm not sure but I kinda recall some old lisper saying they added imperative features due to idioms outside of the lisp lab. `progn` was a way to appeal to fortran coders, i believe `set!` was in the same bag..

Jtsummers•1h ago

set! is from Scheme. Lisp has had set and setq since the beginning along with prog (I don't know when progn was added).

The LISP I manual describes the program feature as being "like a FORTRAN program with LISP statements", but that's not the same as saying it is there to appeal to Fortran programmers.

https://www.mirrorservice.org/sites/www.bitsavers.org/pdf/mi...

chubot•1h ago

Why not Clojure? It's based on immutable maps and vectors, so I feel like cons should be used much less in idiomatic Clojure code

kazinator•1h ago

Pretty sure this is not 2024.

Oh look, it says so right near the top: 2008.

Jtsummers•1h ago

It also says 2024. The new addition is the "Emacs Lisp Manual on Cons" section at the bottom.

lisper•1h ago

This author [1] has a fundamental misunderstanding of cons cells and what their purpose is. Cons cells are a primitive. They are best understood in contrast to the C model where memory is an array, i.e. an associative map from integers (or at least some abstract data type with a total ordering) onto values which are also integers (or at least sets of objects that can be mapped into finite sets of integers).

The key difference between the C primitive and the Lisp primitive is that the Lisp primitive does not require anything to be mapped onto integers whereas C does. You have to be able to increment and decrement pointers in C. You can have a fully functional Lisp with no numbers at all. In fact, this was the whole point of Lisp when it was invented: providing a model for computing on symbolic expressions rather than numbers. There is a reason that the original Lisp paper was called, "Recursive functions of symbolic expressions and their computation by machine" rather than "Cons cells: a new computational primitive".

All of the "problems" with cons cells are not problems with cons cells at all, they are problems with punning cons cells, i.e. using cons cells to implement higher-level abstractions without properly hiding the fact that the underlying implementation is a cons cell. The exact same thing happens in C when you run off the end of an array or cast a pointer to (void *). These are not problems with pointers per se, they are problems with using pointers to implement higher level abstractions (like finite arrays) without properly hiding the underlying implementation.

Punning cons cells and abusing pointers have similar consequences: segfaults in the case of C, and (typically) "NIL is not a valid argument to..." errors in Lisp. Both of these are the result of bad abstraction discipline, not a problem with the primitives.

---

[1] The author is Xah Lee, who is pretty well known in the Lisp community. He is widely considered a troll, an opinion which I personally share. I've tried to make my assessment of this post as dispassionate as I can, but I thought I should disclose the fact that I have a pretty strong bias.

kazinator•12m ago

In TXR Lisp, I abstracted car/cdr to work with non-conses.

  $ txr
  This is the TXR Lisp interactive listener of TXR 300.
  Quit with :quit or Ctrl-D on an empty line. Ctrl-X ? for cheatsheet.
  TXR is written, directed and produced by, not to mention starring, Kaz.
  1> [mapcar chr-toupper "abc"]
  "ABC"

There is an important function and concept called nullify. This deals with the fact that empty nonlists are not nil. nullify returns its argument (like identity) if the argument is a non-empty thing. If the argument is an empty sequence, it translates it to nil:

  2> (nullify "")
  nil

Thus:

  3> (cdr "a")
  nil
  4> (car "abc")
  #\a
  5> (car "a")
  #\a
  6> (cdr "a")
  nil

Here, nullify at work. (cdr "a") wants to be "", but gets nullified.

  7> (car "ab")
  #\a
  8> (cadr "ab")
  #\b
  9> (cddr "ab")
  nil

You can define a structure with car and cdr methods. Heck, rplaca and rplacd too!

  10> (defstruct kons ()
        bling blong
        (:method car (me) me.bling)
        (:method cdr (me) me.blong)
        (:method rplaca (me splat) (set me.bling splat))
        (:method rplacd (me plonk) (set me.blong plonk)))
  #<struct-type kons>
  11> (new kons bling 1 blong 2)
  #S(kons bling 1 blong 2)
  12> (car *11)
  1
  13> (cdr *11)
  2
  14> (rplaca *11 42)
  #S(kons bling 42 blong 2)

You can have a nullify method which informs the language how to nullify your object. Our above kons structure needs nullify if it is to be a complete emulation of conses that work as sequences. https://www.nongnu.org/txr/txr-manpage.html#N-F001A7B8

This is not the only way to make an object which serves as a sequence; you can instead (or in addition!) implement array-like indexing via the lambda and lambda-set methods to have it vector-like rather than list-like.

The empty function and nullify are almost inverses; anyway they are implemented in the same spot of a 16000+ line file. The difference is that empty errors out on objects that are not iterable, whereas nullify takes objects of all types:

  val nullify(val obj)
  {
    val self = lit("nullify");
    seq_info_t si = seq_info(obj);

    if (seq_iterable(si))
    {
      seq_iter_t iter;
      val elem;
      seq_iter_init_with_info(self, &iter, si);
      return if2(seq_peek(&iter, &elem), obj);
    }

    return si.obj;
  }

  val empty(val seq)
  {
    val self = lit("empty");
    val elem;
    seq_iter_t iter;
    seq_iter_init(self, &iter, seq);

    return tnil(!seq_peek(&iter, &elem));
  }

But, anyway, conses are good. TXR doubles down on conses.

Airpass – easily overcome WiFi time limits

Behind the scenes: Redpanda Cloud's response to the GCP outage

Scaling our observability platform by embracing wide events and replacing OTel

Using Microsoft's New CLI Text Editor on Ubuntu

Tell HN: Beware confidentiality agreements that act as lifetime non competes

Samsung embeds IronSource spyware app on phones across WANA

Microsoft suspended the email account of an ICC prosecutor at The Hague

Delta Chat is a decentralized and secure messenger app

Weave (YC W25) is hiring a founding AI engineer

Phoenix.new – Remote AI Runtime for Phoenix

YouTube's new anti-adblock measures

Harper – an open-source alternative to Grammarly

'Gwada negative': French scientists find new blood type in woman

AbsenceBench: Language models can't tell what's missing

Plastic bag bans and fees reduce harmful bag litter on shorelines

Life as Slime

Show HN: MMOndrian

Captain Cook's missing ship found after sinking 250 years ago

The Nyanja new PC-Engine/TurboGrafx 16-bit console game in development

Cosmoe: BeOS Class Library on Top of Wayland

Sega mistakenly reveals sales numbers of popular games

Visualizing environmental costs of war in Hayao Miyazaki's Nausicaä

Augmented Vertex Block Descent (AVBD)

Show HN: lambda-nat-proxy – Serverless proxy using Lambda and UDP NAT punching

Unexpected security footguns in Go's parsers

Show HN: Nxtscape – an open-source agentic browser

Tiny Undervalued Hardware Companions (2024)

Wiki Radio: The thrilling sound of random Wikipedia

College baseball, venture capital, and the long maybe

Learn you Galois fields for great good (2023)

Airpass – easily overcome WiFi time limits

Behind the scenes: Redpanda Cloud's response to the GCP outage

Scaling our observability platform by embracing wide events and replacing OTel

Using Microsoft's New CLI Text Editor on Ubuntu

Tell HN: Beware confidentiality agreements that act as lifetime non competes

Samsung embeds IronSource spyware app on phones across WANA

Microsoft suspended the email account of an ICC prosecutor at The Hague

Delta Chat is a decentralized and secure messenger app

Weave (YC W25) is hiring a founding AI engineer

Phoenix.new – Remote AI Runtime for Phoenix

YouTube's new anti-adblock measures

Harper – an open-source alternative to Grammarly

'Gwada negative': French scientists find new blood type in woman

AbsenceBench: Language models can't tell what's missing

Plastic bag bans and fees reduce harmful bag litter on shorelines

Life as Slime

Show HN: MMOndrian

Captain Cook's missing ship found after sinking 250 years ago

The Nyanja new PC-Engine/TurboGrafx 16-bit console game in development

Cosmoe: BeOS Class Library on Top of Wayland

Sega mistakenly reveals sales numbers of popular games

Visualizing environmental costs of war in Hayao Miyazaki's Nausicaä

Augmented Vertex Block Descent (AVBD)

Show HN: lambda-nat-proxy – Serverless proxy using Lambda and UDP NAT punching

Unexpected security footguns in Go's parsers

Show HN: Nxtscape – an open-source agentic browser

Tiny Undervalued Hardware Companions (2024)

Wiki Radio: The thrilling sound of random Wikipedia

College baseball, venture capital, and the long maybe

Learn you Galois fields for great good (2023)

Fundamental Problems of Lisp, the Cons Cell (2024)

Comments