What's the benefit of having the primary key before the row is stored in the database?
Maybe concurrent inserts to multiple data stores, so you don't need to wait for an initial ID from the database. You'd have to trust the client to give you a “good” UUID, and you'd still have the usual distributed-systems problem of one of the RPCs failing. I know Twitter (Snowflake, iirc) and Google both have unique ID services for this use case.
> What’s the benefit of having the primary key before the row is stored in the database?
If I have a set of linked records in a relational schema representing a complex object, I can create them all with one round trip rather than multiple.
Also, large numbers of clients can insert in that way without contention, whereas if you use a sequence generator for PKs, it becomes a resource around which there is contention when creating rows.
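A minimal sketch of the linked-records case, assuming a hypothetical `orders` / `order_items` schema and using an in-memory SQLite database as a stand-in for a real server. Every key is generated on the client before any insert, so the parent and its children can be sent together in one transaction without waiting for the database to hand back an ID:

```python
import sqlite3
import uuid


def build_order(customer, items):
    """Generate every primary key client-side, before touching the database."""
    order_id = str(uuid.uuid4())
    order_row = (order_id, customer)
    # Child rows can reference the parent key immediately; no round trip needed.
    item_rows = [(str(uuid.uuid4()), order_id, name) for name in items]
    return order_row, item_rows


# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id TEXT PRIMARY KEY, customer TEXT NOT NULL);
    CREATE TABLE order_items (
        id TEXT PRIMARY KEY,
        order_id TEXT NOT NULL REFERENCES orders(id),
        name TEXT NOT NULL
    );
    """
)

order_row, item_rows = build_order("alice", ["widget", "gadget"])

# One transaction carries the whole object graph.
with conn:
    conn.execute("INSERT INTO orders VALUES (?, ?)", order_row)
    conn.executemany("INSERT INTO order_items VALUES (?, ?, ?)", item_rows)
```

With a networked database the same rows could go out as a single batched statement; the point is only that nothing in the batch depends on a value the server has yet to return.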
> You’d have to trust the client to give you a “good” UUID
Sure. This lets you scale, e.g., backend service instances (which are db clients) with less contention in the db layer than you would otherwise have. It's not usually something you would do with external, untrusted clients.
> I know both Twitter (called Snowflake iirc) and Google have unique ID services for this use case.
Snowflake was one of the inspirations for the newer (draft) UUID versions [0], though unlike them it had a design constraint of fitting into 64 bits instead of 128.
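To make the 64-bit constraint concrete, here is a rough sketch of a Snowflake-style generator. It assumes the commonly described layout (41 bits of millisecond timestamp, 10 bits of machine ID, 12 bits of per-millisecond sequence) and the custom epoch usually cited for Twitter's implementation. The class and its names are hypothetical, not Twitter's code:

```python
import threading
import time

# Assumed values: the layout and epoch usually quoted for Snowflake.
EPOCH_MS = 1288834974657
WORKER_BITS = 10
SEQUENCE_BITS = 12
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


class SnowflakeLikeGenerator:
    """Hypothetical generator producing time-ordered IDs that fit in 63 bits."""

    def __init__(self, worker_id):
        if not 0 <= worker_id < (1 << WORKER_BITS):
            raise ValueError("worker_id does not fit in WORKER_BITS")
        self.worker_id = worker_id
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    @staticmethod
    def _now_ms():
        return int(time.time() * 1000)

    def next_id(self):
        with self._lock:
            now = self._now_ms()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted for this millisecond; wait for the next.
                    while now <= self._last_ms:
                        now = self._now_ms()
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                | (self.worker_id << SEQUENCE_BITS)
                | self._sequence
            )


gen = SnowflakeLikeGenerator(worker_id=7)
ids = [gen.next_id() for _ in range(1000)]
```

Uniqueness here depends on each process getting a distinct `worker_id`, which is the coordination the scheme trades for not needing a central sequence.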
> I can create them all with one round trip rather than multiple.
I see. I'd lean towards using common table expressions. The first insert statement returns the primary key and other inserts can depend on the key. I do understand it's not a panacea and composing queries can be problematic.
> a sequence generator for PKs, it becomes a resource around which there is contention
As I understand it, Postgres sequences don't block, which leads to a different problem [1].