What is better: a lookup table or an enum type?

https://www.cybertec-postgresql.com/en/lookup-table-or-enum-type/
52•todsacerdoti•2mo ago

Comments

sublinear•2mo ago
Basically ugly no matter what.

In a lot of web apps this need tends to be related to validation, so many just do these lookups and simple comparisons in their app logic, based on static values from config files, long before any DB query is made. Sometimes you just don't need to involve the database, and performance is better for it anyway.

systems•2mo ago
well, uniformity and homoiconicity are very important. in an ideal db management system (a.k.a. a true RDBMS), everything should be represented as a relation and manipulated with the same set of operators

separation of types and relations should be limited to core atomic types: string, int, date, etc. (although date is debatable, as it is not usually atomic in most cases, and many dbs end up with one or more date relations)

anyway, always use a table .. when it's a choice

netcraft•2mo ago
couldn't have said it better myself.

Data should be data, queryable, relational. So often I have had to change enums into lookup tables - or worse, duplicate them into lookup tables - because now we need other information attached to the values. Labels, descriptions, colors, etc.

My biggest recommendation though is that if you have a lookup table like this, make the value you would have made an enum not just unique, but _the primary key_. Now all the places where you would be putting an ID have the value, just like they would with an enum, and oftentimes you won't need to join. The FK makes sure it's valid. The other information is a join away if you need it.
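A minimal sketch of that layout, using Python's built-in `sqlite3` as a stand-in for PostgreSQL purely so the example is self-contained. The table and column names (`state`, `person`, `label`) are hypothetical and not taken from the article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked

conn.executescript("""
    -- Lookup table keyed by the value itself, not by a surrogate ID.
    CREATE TABLE state (
        name  TEXT PRIMARY KEY,
        label TEXT NOT NULL  -- extra attribute an enum could not carry
    );
    CREATE TABLE person (
        person_id INTEGER PRIMARY KEY,
        state     TEXT NOT NULL REFERENCES state (name)
    );
    INSERT INTO state VALUES ('Burgenland', 'Burgenland'), ('Wien', 'Vienna');
    INSERT INTO person VALUES (1, 'Wien'), (2, 'Burgenland');
""")

# No join needed to read the value: it is stored directly in person.state.
rows = conn.execute(
    "SELECT person_id, state FROM person ORDER BY person_id"
).fetchall()

# The foreign key rejects values that are missing from the lookup table.
try:
    conn.execute("INSERT INTO person VALUES (3, 'Atlantis')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# The extra information is one join away when it is needed.
labels = conn.execute(
    "SELECT p.person_id, s.label FROM person p "
    "JOIN state s ON s.name = p.state ORDER BY p.person_id"
).fetchall()
```

The trade-off is that the referencing column stores the full text value rather than a small integer, so this trades some storage for fewer joins.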

I do wish though that there were more ways to denote certain tables as configuration data vs domain data, besides naming conventions or schemas.

Edit to add: I will say there is one place where I have begrudgingly used enums, and that's where we have used something like Prisma to get TypeScript types from the schema. It is useful to have types generated for these values. Of course you can do your own generation of those values based on data, but there is a fundamental difference there between "schema" and "data".

systems•2mo ago
well, if DDL (data definition language) and DML (data manipulation language) were unified and both operated on relations, manipulating metadata would have been a lot simpler and more dynamic

you can always create a data dictionary relation where you store the code for table creation, add metadata, and use dynamic SQL to execute the DML code stored in the DB. i worked somewhere where they did this ... sort of

mamcx•2mo ago
Yeah, that is my thinking behind https://tablam.org, where I consider that everything could be a relation, so something like

    "hello world" ? where #chars != " " == ["h", "e", ...]
9rx•2mo ago
> everything should be represented as a relation

> always use a table .. when it's a choice

Everything should be represented as relations (sets of tuples) but you should always use tables (multisets of tuples) when possible? That seems a little contradictory.

systems•2mo ago
how do you want to represent relations in a DBMS: with an enum or with a table?
psychoslave•2mo ago
with foreign keys?
9rx•2mo ago
If said DBMS is relational, with relations.

If said DBMS is tablational, like SQL, then you would have to approximate them using tables and constraints.

If said DBMS is of another paradigm, like a document database, there may be no way to represent relations within the DBMS.

An enum is a construct that numbers things. There is no way to represent a set of tuples with an integer[1]. I'm not sure where you are trying to go with that one. Conversely, you could hold an enum-generated value within a relation. Is that what you mean?

[1] Yes, technically you could break up the individual bits such that they form a set of tuples, but that wouldn't be useful beyond a very narrow use-case and doesn't generalize the way relation implies.

CuriouslyC•2mo ago
From a maintainability standpoint lookup tables are miles ahead, but from a DX perspective there are a few cases where enums are nice. Honestly I probably would never use enums again, I feel like it's caused pain every time I've done it.
tucnak•2mo ago
Enums are great if you're into json/jsonb custom logic and aggregates. It's quite cool to use the constraint system to impose checks on various JSON fields, especially if you're doing extension development, or packaging up procedures for downstream consumption.
aksss•2mo ago
Table with a thread-safe read-through cache in code, imo. But there are places where enums make sense. For instance, things that are specifically in the code's domain.
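A rough sketch of what a thread-safe read-through cache over a lookup table could look like, assuming Python with the built-in `sqlite3` module in place of a real database. The `LookupCache` class, the `state` table, and its columns are hypothetical names for illustration only.

```python
import sqlite3
import threading


class LookupCache:
    """Read-through cache: serve from memory, fall back to the table on a miss."""

    def __init__(self, conn):
        self._conn = conn
        self._lock = threading.Lock()  # guards both the dict and the connection
        self._values = {}
        self.misses = 0

    def label_for(self, name):
        with self._lock:
            if name not in self._values:
                self.misses += 1
                row = self._conn.execute(
                    "SELECT label FROM state WHERE name = ?", (name,)
                ).fetchone()
                if row is None:
                    raise KeyError(name)
                self._values[name] = row[0]
            return self._values[name]


# check_same_thread=False lets other threads use the connection;
# the lock above serializes that access.
conn = sqlite3.connect(":memory:", check_same_thread=False)
conn.execute("CREATE TABLE state (name TEXT PRIMARY KEY, label TEXT NOT NULL)")
conn.execute("INSERT INTO state VALUES ('Wien', 'Vienna')")

cache = LookupCache(conn)
first = cache.label_for("Wien")   # miss: reads the table
second = cache.label_for("Wien")  # hit: served from memory
```

This version never invalidates entries and holds the lock during the query, which keeps it simple but serializes misses; a real implementation would need a refresh strategy for rows that change.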
nlitened•2mo ago
I also love the approach of ClickHouse with LowCardinality(String). Flexible, clear semantics, high performance
veltas•2mo ago
Who was child 12
teddyh•2mo ago
Who was child 12̣
unwind•2mo ago
I don't database, but I like to think I have some kind of intuition for storage space requirements, and this article was very confusing.

Ignoring the indexes and just focusing on the main table sizes reported, we have:

- String ("The frequent repetition of these names inflates the size of the table"): 392 MB

- Enum data type ("Internally, an enum type is stored as four-byte floating point number. So it saves space in the table [...]"): 338 MB

- Lookup table ("Also, since a smallint only occupies two bytes, the person_l table can potentially use less storage space than the other solutions"): 338 MB.

I just can't make sense of the numbers, especially given the author's comments that I've quoted.

Is this some kind of typo/editing fail?

leononame•2mo ago
I'm also wondering about that. But maybe this could be it?

> Surprisingly, the table is just as big as with the enum type above, even though an enum uses four bytes. The reason is that each table row is aligned at a memory address divisible by eight, so PostgreSQL will add six padding bytes after the smallint. If we had more columns and could arrange them carefully, we could see a difference.

This could be the explanation. If each row is padded to a multiple of 8 bytes and the bigint takes 8, then the smallint or enum column effectively takes 8 as well. The entries in the string table will take 8 or 16 bytes depending on the string length. So one row in person_e and person_l is 16 bytes, and one row in person_s could be about 20 on average. That is a bit closer to reality than my intuition, although the storage savings are still less than what I would have expected.
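The arithmetic can be checked with a quick sketch. The assumptions here are mine: each row holds a bigint followed by the state column, row data is padded to a multiple of 8 bytes, the per-row header is ignored because it is the same for all three tables, and a short string costs its byte length plus one header byte.

```python
MAXALIGN = 8  # assumed alignment boundary, per the quoted explanation


def padded(nbytes, align=MAXALIGN):
    """Round nbytes up to the next multiple of align."""
    return (nbytes + align - 1) // align * align


BIGINT, ENUM, SMALLINT = 8, 4, 2

row_enum = padded(BIGINT + ENUM)        # 12 bytes of data, padded to 16
row_lookup = padded(BIGINT + SMALLINT)  # 10 bytes of data, padded to 16


def row_string(name):
    # Assumption: byte length of the string plus a 1-byte length header.
    return padded(BIGINT + 1 + len(name.encode("utf-8")))


short_name = row_string("Wien")        # 8 + 1 + 4 = 13, padded to 16
long_name = row_string("Burgenland")   # 8 + 1 + 10 = 19, padded to 24
```

Under these assumptions the enum and smallint rows come out identical, and only the longer names push the string rows into the next 8-byte step.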

edit:

I did also try out the test and dropped the primary key on the table to compare only enum and string size:

  SELECT PG_SIZE_PRETTY(PG_RELATION_SIZE('person_e')), PG_SIZE_PRETTY(PG_RELATION_SIZE('person_s'));

  277 MB, 330 MB

Does not look like an amazing saving either.
gdevenyi•2mo ago
> Enum type 4-byte floating point number

This is why the storage is weird. Why would you use a float for distinct number storage!

Joker_vD•2mo ago
Honestly, storage use would probably be the last thing on my mind when designing for "what should state/region/district/Bundesland/etc. be modelled as". Sometimes those things get renamed, sometimes they are merged, and sometimes they are split. This means you may end up in an awkward state when e.g. Mecklenburg-Vorpommern gets split back into Mecklenburg and Western Pomerania, and some of your customers have updated their addresses and some haven't. You have to store all of that anyway because, remember, your DB doesn't represent the current state of the world; it represents your knowledge about the current state of the world (which is where the whole impetus for NULL originated: "I know that the customer has an address, I just don't know what it is", and all the related problems with it: compare "I know that the customer actually does not have any address at all" and "I know that this address can't be correct any longer, but I have no new knowledge about what it can be").
Backslasher•2mo ago
To me, since the DB is there to serve the app (which is there to serve the user), the lookup/enum decision mostly depends on whether the list is defined before build time (> enum) or after (> lookup). US states are probably a solid "before", so you get the added value of easily materializing a validator in the app code. Children IDs sound a bit more dynamic.
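For the "before build time" case, a validator in app code can be as small as this sketch. It assumes Python's standard `enum` module; the `UsState` class, its abbreviated member list, and `parse_state` are hypothetical.

```python
from enum import Enum


class UsState(Enum):
    # Hypothetical, abbreviated list that is fixed at build time.
    CA = "CA"
    NY = "NY"
    TX = "TX"


def parse_state(raw):
    """Return the matching UsState, or raise ValueError for unknown input."""
    return UsState(raw.strip().upper())


ok = parse_state(" ny ")

try:
    parse_state("ZZ")
    unknown_rejected = False
except ValueError:
    unknown_rejected = True
```

Adding a member means shipping a new build, which is exactly the property that separates this case from a lookup table.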
pavel_lishin•2mo ago
Not super familiar with the internals of postgres, but what sort of performance would something like this have?

  SELECT person_id
  FROM person_l
  WHERE state_id = (SELECT id FROM state_l WHERE name = 'Burgenland');
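One way to start poking at this is to run the subquery form next to the equivalent join and confirm they return the same rows. The sketch below uses Python's built-in `sqlite3` as a stand-in, so it says nothing about PostgreSQL's actual plans; the schema (including the index and the unique constraint on `name`) is my assumption. In PostgreSQL itself, prefixing either query with `EXPLAIN (ANALYZE)` would show the real plan and timings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed schema, reconstructed from the column names in the query.
    CREATE TABLE state_l (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE person_l (
        person_id INTEGER PRIMARY KEY,
        state_id  INTEGER NOT NULL REFERENCES state_l (id)
    );
    CREATE INDEX person_l_state_idx ON person_l (state_id);
    INSERT INTO state_l VALUES (1, 'Burgenland'), (2, 'Wien');
    INSERT INTO person_l VALUES (10, 1), (11, 2), (12, 1);
""")

# Scalar subquery: the lookup is resolved to a single id first.
subquery_rows = conn.execute("""
    SELECT person_id FROM person_l
    WHERE state_id = (SELECT id FROM state_l WHERE name = 'Burgenland')
    ORDER BY person_id
""").fetchall()

# Equivalent join form.
join_rows = conn.execute("""
    SELECT p.person_id FROM person_l p
    JOIN state_l s ON s.id = p.state_id
    WHERE s.name = 'Burgenland'
    ORDER BY p.person_id
""").fetchall()
```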
mannyv•2mo ago
Whatever needs the least number of joins is the best.
