The MERGE Command Has Finally Arrived in Postgres 15!

October 27, 2022

MERGE is the kind of thing we heard about for a long time before it was really implemented into PostgreSQL. According to Postgres wiki, the first time a contributor asked about this feature is as old as November 2005! The topic was also suggested by a student who worked during Google Summer of Code 2010. We thought for a while we would see it in Postgres 14, but it was finally pushed back to Postgres 15.

 

How does it work?

The MERGE statement is used to, well, merge one table into another. The basic syntax is:

MERGE INTO target
USING source ON <condition>
…

But it goes deeper. MERGE can mean different things, depending on what you would like to do. For example, what do we do with the rows when ids match? Do we want to update the other values? Do we want to remove them? Do we want something specific? Do we want special treatment when the ids do not match? That's the WHEN clause.

Let's create the following two tables and have a look.

CREATE TABLE target (id bigint, value text);
​
INSERT INTO target (id, value)
    VALUES (1, 'aspect'),
           (2, 'concept'),
           (5, 'mall'),
           (6, 'storage');;
CREATE TABLE source (LIKE target);
​
INSERT INTO source (id, value)
    VALUES (1, 'height'),
           (2, 'response'),
           (3, 'fishing'),
           (4, 'sister');

We want to merge the source into the target. The first thing we need to know is how they match up based on their ids. We can write the following query for that.

SELECT t.id, t.value, s.id, s.value
FROM target AS t
RIGHT OUTER JOIN source AS s ON s.id = t.id
ORDER BY COALESCE(t.id, s.id);

And the result will be something like this:

1   aspect  1   height
2   concept 2   response
        3   fishing
        4   sister

Notice that this merging process is not symmetrical. Only the source data is being merged into the target data and therefore the target's ids 5 and 6 are not present. We can see that ids 1 and 2 are *matched* and ids 3 and 4 are *not matched*.

MERGE INTO target AS t
USING source AS s ON s.id = t.id
​
WHEN NOT MATCHED THEN
	INSERT (id, value) VALUES (s.id, s.value)
;

We could have done that with this query:

​INSERT INTO target (id, value)
	SELECT s.id, s.value
	FROM source AS s
	WHERE NOT EXISTS (
    	    SELECT 1
    	    FROM target AS t
    	    WHERE t.id = s.id);

​

Or, we could want to remove everything from the target that exists in the source.

MERGE INTO target AS t
USING source AS s ON s.id = t.id
​
WHEN MATCHED THEN
	DELETE
;

That would traditionally be done like so:

​DELETE FROM target AS t
WHERE t.id IN (SELECT id FROM source);

​

More likely, we would want to update the values when we find a match, so let's do that.

MERGE INTO target AS t
USING source AS s ON s.id = t.id
​
WHEN MATCHED THEN
	UPDATE SET value = s.value
;

After the MATCHED (or NOT MATCHED) clauses, we can also add any boolean expression we like. We can also have as many of these clauses as we wish. Only one will be executed, in the order they are written.

 

Conclusion

The MERGE command is something we have waited for a long time and we are very excited to finally get it!

The SQL Standard (and therefore PostgreSQL's implementation of it) missed a trick. It is not possible to delete rows from the target based on their absence in the source. A second, traditional DELETE query is required for that.

Let's change the Standard first and we will be able to implement that change into Postgres right after!
 

Share this