PostgreSQL Internals - Understanding exception handling with PG_TRY/PG_CATCH

September 22, 2025

Recently, while working on edb_otel, a PostgreSQL extension for OpenTelemetry, I came across an interesting problem: the server crashed when the extension's APIs were called from a PL/pgSQL block in such a way that one call succeeded before a subsequent call failed. The extension exposes user-defined functions that send metrics and traces to an OpenTelemetry collector endpoint. These functions are implemented as C external functions.

Let me show a simple reproduction of the issue. Here is an example program that implements a PostgreSQL C external function that adds 1 to a given value:

#include "postgres.h"
#include "fmgr.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(add_one);

Datum
add_one(PG_FUNCTION_ARGS)
{
   PG_TRY();
   {
       int32 arg = 0;
       if (PG_ARGISNULL(0))
           PG_RETURN_NULL();
       arg = PG_GETARG_INT32(0);
       PG_RETURN_INT32(arg + 1);
   }
   PG_CATCH();
   {
       ereport(LOG,
               (errmsg("add_one failed")));
       PG_RE_THROW();
   }
   PG_END_TRY();
}

 

If you build a shared object from this program and copy it to the PostgreSQL lib folder:

cc -fPIC -c foo.c -I<POSTGRESQL_INCLUDE_DIR>
cc -shared -o foo.so foo.o
cp foo.so $(pg_config --pkglibdir)

 

and define the equivalent SQL function:

CREATE FUNCTION add_one(integer) RETURNS integer
     AS 'foo', 'add_one'
     LANGUAGE C;


then you can call add_one:

postgres@ 420745 =# select add_one(1);
 add_one 
---------
       2
(1 row)

 

Now if you try the following anonymous block:

postgres@ 420745 =# DO $$
DECLARE
        result int := 0;
BEGIN
        select add_one(NULL) into result;
        select add_one(1);
END $$;
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
The connection to the server was lost. Attempting reset: Failed.

Taking a core dump and looking at the stack trace did not reveal anything useful; on the contrary, it created more confusion. A debugging session eventually identified the root cause, and I take the opportunity to explain the details here, since they also shed light on the fundamentals of exception handling inside PostgreSQL.

Let’s take a look at the macros PG_TRY, PG_CATCH and PG_END_TRY.

Here’s the code for PG_TRY:

#define PG_TRY(...)  \
   do { \
       sigjmp_buf *_save_exception_stack##__VA_ARGS__ = PG_exception_stack; \
       ErrorContextCallback *_save_context_stack##__VA_ARGS__ = error_context_stack; \
       sigjmp_buf _local_sigjmp_buf##__VA_ARGS__; \
       bool _do_rethrow##__VA_ARGS__ = false; \
       if (sigsetjmp(_local_sigjmp_buf##__VA_ARGS__, 0) == 0) \
       { \
           PG_exception_stack = &_local_sigjmp_buf##__VA_ARGS__

Here we see references to sigjmp_buf and sigsetjmp. Together with siglongjmp, these are the POSIX C facilities for performing a non-local jump, which is a way to transfer control from one function back to another without going through the normal function return mechanism.

sigsetjmp returns zero when it is called directly, so execution enters the body of the if statement:

if (sigsetjmp(_local_sigjmp_buf##__VA_ARGS__, 0) == 0)

However, if code executed inside the body of the if statement calls siglongjmp with the same buffer, control returns to the sigsetjmp call, which this time returns a non-zero value, so execution continues in the else branch, if there is one. PostgreSQL calls siglongjmp when an error is reported through ereport with severity ERROR.

 

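PG_exception_stack is a global variable that points to the jump buffer of the innermost active handler. Each PG_TRY block saves the current value of PG_exception_stack (along with error_context_stack) and then points it at the block's own local buffer, so that an error raised inside the block jumps to this block's handler.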

 

Now let’s look at PG_CATCH. It supplies the else branch of the if statement that calls sigsetjmp, so a siglongjmp from within the if body lands directly in this else block, as explained earlier. The first thing the block does is restore the exception stack to its previous state by assigning the saved values back to PG_exception_stack and error_context_stack.

#define PG_CATCH(...)   \
       } \
       else \
       { \
           PG_exception_stack = _save_exception_stack##__VA_ARGS__; \
           error_context_stack = _save_context_stack##__VA_ARGS__

PG_END_TRY performs the same restore, which is the one that matters when the body completes without an error, and it also rethrows the error if required.

#define PG_END_TRY(...)  \
       } \
       if (_do_rethrow##__VA_ARGS__) \
               PG_RE_THROW(); \
       PG_exception_stack = _save_exception_stack##__VA_ARGS__; \
       error_context_stack = _save_context_stack##__VA_ARGS__; \
   } while (0)


Now let’s look at these details in the context of the failing program introduced at the beginning.

When the first SELECT statement in the PL/pgSQL body was executed

select add_one(NULL) into result;

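PG_TRY saved the previous environment and installed a new one. However, since NULL was passed to add_one, PG_RETURN_NULL() returned from the function from inside the PG_TRY block, so neither PG_CATCH nor PG_END_TRY ran to restore the previous environment. PG_RETURN_INT32 has the same problem, because both macros expand to a return statement. As a result, PG_exception_stack is left pointing at _local_sigjmp_buf, a local variable in a stack frame that no longer exists.

The subsequent statement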

select add_one(1);

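is an invalid statement within a PL/pgSQL block since it does not have the INTO clause. So the PL/pgSQL layer reports an error ("query has no destination for result data"), and the error reporting infrastructure calls siglongjmp on the buffer that PG_exception_stack points to, expecting to land in the innermost active PG_TRY/PG_CATCH block. That buffer belongs to an add_one call that has already returned, so control is sent to a PG_CATCH body whose stack frame is gone. The handler then works with invalid memory, including the saved PG_exception_stack value it tries to restore, and the server crashes.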

There is a pgsql-hackers thread that discusses this issue very briefly here.

This post goes into the internal workings of PostgreSQL exception handling in more detail, and shows why it’s not a good idea to exit a PG_TRY block with a return statement.

 

This example code illustrates how the issue can be resolved by moving the return statements outside of the exception block.

#include "postgres.h"
#include "fmgr.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(add_one);

Datum
add_one(PG_FUNCTION_ARGS)
{
   bool        is_null = false;
   int32       arg = 0;
   PG_TRY();
   {
       if (PG_ARGISNULL(0))
           is_null = true;
       else
           arg = PG_GETARG_INT32(0);
   }
   PG_CATCH();
   {
       ereport(LOG,
               (errmsg("add_one failed")));
       PG_RE_THROW();
   }
   PG_END_TRY();

   if (is_null)
       PG_RETURN_NULL();
   PG_RETURN_INT32(arg + 1);
}