The One Where I Abandon UUID Identifiers

I’m in the early phase of a coding project where I need unique IDs that can be exposed to end users. They must not reveal the number of objects in the database, the way a monotonically incrementing integer does, and they must remain unchanged for the life of the object (unlike, say, the email address on a user object). A common solution to this problem is to use a randomly generated UUID (typically UUIDv4) as the public ID.
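As a rough illustration of that common approach, here is a minimal Python sketch. The `User` class and its field names are hypothetical, made up for this example rather than taken from my project.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class User:
    # Hypothetical model: an internal integer key plus a random public ID.
    id: int
    email: str
    # uuid4() is built from random bits, so it says nothing about row counts.
    public_id: str = field(default_factory=lambda: str(uuid.uuid4()))


user = User(id=1, email="a@example.com")
original_public_id = user.public_id
user.email = "b@example.com"  # the email can change; public_id is untouched
print(user.public_id)
```

The internal `id` never has to leave the database; only `public_id` would appear in URLs or API responses.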

Today I stumbled across a T3 video, almost precisely a year old, in which he reviews a PlanetScale blog post about the harms of using UUIDs as primary keys in database tables. Serendipity!

For those who don’t know, T3 (Theo; T3 is a nickname) is a software dev who makes and posts YouTube videos about software development, mostly from a web development perspective. I don’t personally get much from his web-specific content, but he covers a lot of things generally applicable to writing software, and I feel like he’s got a viewpoint I can learn from.

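The TL;DR of it all is that random UUIDs are particularly bad for the performance of updating B+ Tree data structures, which many databases use for their indexes. As I understand the argument, each new random key lands at an essentially arbitrary position in the index rather than at the end, so inserts touch pages all over the tree and trigger more page splits.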
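To get a feel for why random keys hurt, here is a small sketch that uses a plain sorted Python list as a stand-in for the key order of an index. It is not a B+ Tree and it measures nothing about a real database; it only counts how many inserts land somewhere other than the end. The function name `non_append_inserts` is my own.

```python
import bisect
import uuid


def non_append_inserts(keys):
    """Count inserts that land somewhere other than the end of a sorted list."""
    ordered = []
    count = 0
    for key in keys:
        pos = bisect.bisect_left(ordered, key)
        if pos != len(ordered):
            count += 1  # this key had to go into the middle of the ordering
        ordered.insert(pos, key)
    return count


n = 2_000
sequential = list(range(n))                        # like an auto-increment key
random_ids = [uuid.uuid4().hex for _ in range(n)]  # like UUIDv4 keys

seq_misses = non_append_inserts(sequential)
rand_misses = non_append_inserts(random_ids)
print(seq_misses, rand_misses)
```

Sequential keys always append, while almost every random key ends up somewhere in the middle. In a real index, those middle inserts are the ones that scatter writes across pages.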

You can watch the video here or read the PlanetScale blog post.

I think what I’m probably going to do instead is generate the public_id of my database objects with something like cuid2, or more likely nanoid. They both look like they suffer from similar B+ Tree efficiency problems to UUID, but at least they’re smaller. There’s a Python implementation of nanoid that looks useful … it hasn’t been updated in a while, but it’s the sort of code that should be very stable, so that doesn’t bother me much.
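For a sense of what a nanoid-style ID looks like, here is a minimal sketch built only on Python's standard `secrets` module. It is not the nanoid library and does not use its API; `make_public_id` is a hypothetical helper, and the 64-character alphabet and default length of 21 are chosen to mirror what I understand nanoid's defaults to be.

```python
import secrets
import uuid

# URL-safe 64-character alphabet, similar in spirit to nanoid's default.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz_-"


def make_public_id(size: int = 21) -> str:
    """Return a random URL-safe ID (hypothetical helper, not the nanoid API)."""
    # secrets.choice draws from the operating system's secure random source.
    return "".join(secrets.choice(ALPHABET) for _ in range(size))


new_id = make_public_id()
print(new_id, len(new_id))     # 21 characters
print(len(str(uuid.uuid4())))  # 36 characters for a textual UUID
```

With 64 symbols each character carries 6 bits, so 21 characters give about 126 bits of randomness, in the same range as UUIDv4's 122 random bits but in a shorter string.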

#100DaysToOffload article 2 of 100
