Binary Large OBject (BLOB) storage is a topic that comes up in almost every conversation about Oracle to PostgreSQL migration. I will discuss it here with enough detail to help you make architectural choices when coming from the Oracle perspective.
A basic understanding of the datatype is available in the PostgreSQL documentation, and a somewhat more detailed discussion is there as well. I suggest you skim those articles, then come back here for further discussion.
Ready? Ok, let’s continue.
The articles linked above touch on why binaries are placed in the database in the first place. In addition to the argument that binaries may need atomicity with other data, I’d also like to suggest some practical reasons.
- They get backed up with appropriate transaction integrity.
- They get transported with physical streaming replication.
- They do not require a separate connection stream to the caller.
So, that’s settled then, right? We’ll just stuff all the binaries into the database.
Whoa, hoss.
Here are a few reasons why that may not be such a great idea.
- PostgreSQL does not actually UPDATE rows. Due to its MVCC concurrency semantics, it writes a new row version with the new data and marks it as the current one; you can watch this happen in the sketch just after this list. Whoopsie, you just doubled your storage requirement for blobs.
- PostgreSQL can’t really do any rational SQL things with the blob, other than store and retrieve it. None of the operators, castings, aggregations… well, you get the idea. INSERT and SELECT are about all you get, and UPDATE is effectively limited to wholesale replacement.
- Blobs can have a terrible impact on garbage collection operations (VACUUM).
- Blobs have the highest potential to make you unexpectedly run out of space on a volume, which is insanely critical for PostgreSQL.
- Blob storage functions and libraries stink a little bit like 1965.
- Blobs make other operations on the row run slowly, even if they don’t have anything to do with the blob data. This is mostly due to how PostgreSQL likes to ferry things around in 8k blocks from memory to disk to memory.
- Blobs don’t cache, and will discourage row caching.
- Blobs don’t participate in logical replication.
- Blobs make the backups slow.
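A quick way to see that row duplication for yourself is a throwaway sketch like the one below; it uses a plain text column to keep things short, but the mechanics are identical for a bytea column holding megabytes.

```sql
-- Disposable table to show that UPDATE writes a whole new row version.
CREATE TEMP TABLE mvcc_demo (id int PRIMARY KEY, payload text);

INSERT INTO mvcc_demo VALUES (1, repeat('x', 1000));

-- ctid is the physical location of the current row version.
SELECT ctid FROM mvcc_demo WHERE id = 1;   -- e.g. (0,1)

UPDATE mvcc_demo SET payload = repeat('y', 1000) WHERE id = 1;

-- The ctid changed: a new row version was written, and the old one is
-- now a dead tuple that VACUUM has to clean up later.
SELECT ctid FROM mvcc_demo WHERE id = 1;   -- e.g. (0,2)
```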
Ok, so by now you’ve decided that blobs are horrendous, and should never be used for anything.
Hang on there, pardner.
These arguments basically boil down to the intellectual vs. the practical. In an ideal world, we would just store everything in PostgreSQL, and the world would be at peace again. Selah.
But the truth is that there are physical limitations on how this storage should be handled to provide the expected experience. Let me give you a few more suggestions beyond what the official answers to the question provide.
- Store blobs in associative tables using a referential id to the corresponding row data. Use the additional columns recommended in the article linked above in that table. Put the blob column last in this table. Do this even if it is a 1:1 cardinality table; a sketch of this layout follows the list. Just do it. Don’t argue with me or yourself. You’ll thank me later.
- In fact, always put the blob and clob columns dead last in any table, ever.
- Don’t update the blob unless it actually changes. PostgreSQL will obediently duplicate the row due to the concurrency semantics, and will not be intelligent enough to do a “no-op” update on its own; see the guarded UPDATE after this list.
- Provide a huge amount of temp space, and put it on an amazingly fast NVMe LVM drive array or SAN. Put the write-ahead log (WAL) files on this storage also. Make it big. Did you wince when you opened your wallet? No? Make it bigger.
- Increase your `work_mem` setting on a per-connection basis to help prevent thrashing blobs to disk (a sketch follows this list).
- Increase the amount of RAM until you can handle all the `work_mem` multiplied by the number of connections you plan to support.
- Write your application to do separate calls to the database for the row data and the blob data, which can then be loaded lazily (see the two-step fetch after this list). It doesn’t really help, but it makes people feel like it helped, because they get a little instant gratification.
- Be realistic about whether the atomicity of the blob data is truly critical in a practical way for your application.
- Use XFS/ZFS or a similar filesystem that handles large files well.
- Keep hoping that Moore’s Law holds out for about 2 or 3 more iterations. With enough hardware, a lot of expensive things become trivial.
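To make the associative-table advice concrete, here is a minimal sketch. The table and column names are mine, and the extra metadata columns (mime type, size, checksum) simply stand in for whatever the article linked above recommends.

```sql
-- Row data stays in its own narrow, cacheable table.
CREATE TABLE documents (
    doc_id     bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    title      text NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now()
);

-- The blob lives in a separate table keyed by the same id, even though
-- the cardinality is 1:1, and the bytea column comes dead last.
CREATE TABLE document_blobs (
    doc_id    bigint PRIMARY KEY REFERENCES documents (doc_id),
    mime_type text NOT NULL,
    byte_size bigint NOT NULL,
    sha256    bytea,
    content   bytea NOT NULL   -- the blob itself, last column on purpose
);
```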
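For the “don’t update unless it actually changes” item, a guarded UPDATE against the hypothetical table above keeps PostgreSQL from writing a new row version when the incoming bytes are identical.

```sql
-- $1 = new content, $2 = document id (bind parameters from the app).
-- IS DISTINCT FROM is NULL-safe; when the bytes are unchanged the WHERE
-- clause is false and no new row version is written at all.
UPDATE document_blobs
SET    content = $1
WHERE  doc_id  = $2
  AND  content IS DISTINCT FROM $1;
```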
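The `work_mem` suggestion can be applied per session or per role rather than globally; the sizes below are placeholders, not tuning advice.

```sql
-- Raise work_mem for the current session only.
SET work_mem = '256MB';

-- Or pin it to the role the blob-handling application connects as,
-- so every new connection under that role inherits the setting.
ALTER ROLE blob_worker SET work_mem = '256MB';
```

When budgeting RAM, remember that a single query can use several `work_mem` allocations, so `work_mem` times the planned connection count is a floor, not a ceiling.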
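And the lazy-loading item boils down to two round trips instead of one, continuing with the hypothetical tables above: show the row immediately, and fetch the blob only when somebody actually asks for it.

```sql
-- First call: cheap metadata for the listing or detail view.
SELECT d.doc_id, d.title, b.mime_type, b.byte_size
FROM   documents d
JOIN   document_blobs b USING (doc_id)
WHERE  d.doc_id = $1;

-- Second call, issued only when the user opens or downloads the file.
SELECT content
FROM   document_blobs
WHERE  doc_id = $1;
```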
Now, with all of these mitigating strategies in place, a moderate amount of blob retention can be handled well by PostgreSQL. This requires that you think like the trickster, able to hold opposing points of view in your own head at the same time without going crazy.
The alternative is another kind of database. We call it a file system, and it’s been around for a very long time. PostgreSQL does have server-side functions for direct file operations, such as pg_read_binary_file() and the large object import/export functions. You could use those in a transaction to handle the binary files with a degree of transactional integrity. Some (a lot) of assembly required, batteries not included. The files don’t participate in the backup, unless you customize it. May I suggest Barman?
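To make that concrete, here is a minimal sketch under my own assumptions: the document_files table and the paths are hypothetical, and pg_read_binary_file() reads from the database server’s filesystem and normally requires superuser rights or membership in pg_read_server_files.

```sql
-- Metadata lives in the database; the bytes live on the filesystem.
CREATE TABLE document_files (
    doc_id    bigint PRIMARY KEY,
    file_path text NOT NULL,
    byte_size bigint NOT NULL
);

BEGIN;
INSERT INTO document_files (doc_id, file_path, byte_size)
VALUES (42, '/var/lib/app/blobs/42.bin', 1048576);
-- The application writes /var/lib/app/blobs/42.bin as part of the same
-- unit of work and rolls this INSERT back if the file write fails.
COMMIT;

-- If you want the database server itself to do the file read:
SELECT pg_read_binary_file(file_path)
FROM   document_files
WHERE  doc_id = 42;
```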
I hope this puts a few more ideas into your thinking about how to handle binary data in PostgreSQL effectively. This is one of those areas where having a database directly tied to a piece of hardware would help in a practical sense. I’m confident that Moore’s Law will catch up, and this article will eventually become stale. Somebody please wake me up when that happens, and give me my evening whiskey.