We all love PgBouncer. It’s a great way to multiplex tens, hundreds, or even thousands of client connections to a small handful of Postgres sessions. What isn’t necessarily so endearing, is that it can’t pass authentication from itself to Postgres, as each Postgres session may exist before the connection to PgBouncer is established. Or can it? Let’s explore how things have evolved, but the news never really got out.
Tell you what I got in mind
As a proxy, PgBouncer authenticates on a per user/database basis. Since Postgres authenticates sessions before they connect, PgBouncer used to have no way to re-auth its own connections. The old, and probably most prevalent way to circumvent this, was to build a userlist.txt
file that contained every user and password hash that should be allowed to connect through PgBouncer.
But is that what we really want? This means there’s a static record of every username and password combination on disk. It also means any time a password in the list changes, we must regenerate that file. In a world of databases, this kind of denormalization isn’t ideal.
Luckily we can fix it, though there are several steps involved. We should also note that the default value of auth_query
is a direct query to pg_shadow
. This is generally bad practice, and the official documentation includes a more secure example using a callable function. We will be using a derived example for this demonstration.
What I have, I knew was true
To keep things simple, we’ll assume there’s a local pgbouncer running as the postgres
OS user. One benefit to this, is that we can lock down Postgres itself to only allow local connections as well. This prevents users from connecting to Postgres directly, even if they had the full connection string to do so.
For a setup like this, we might find this line or something similar in the pgbouncer.ini
file, under the [databases]
heading:
[databases]
* = host=localhost auth_user=pgbouncer
This particular line means any PgBouncer session will connect to the Postgres server running on 127.0.0.1
or ::1
only. It also makes use of the new auth_user
syntax that makes all of this magic work properly. With that enabled, there’s only one more change we need to make to PgBouncer itself.
Won’t you ever set me free?
In the [pgbouncer]
section of the pgbouncer.ini
file, we need to specify a query. In the documentation, they use an example that directly interrogates the pg_shadow
view. This view, and the underlying pg_authid
table where Postgres stores authentication, are only available to superusers by default.
Do we really want PgBouncer to operate as a superuser? Of course not! If we use a set-returning function instead, PgBouncer can obtain credentials for comparison without being a superuser. This is how we set it up in our configuration under the [pgbouncer]
section of the config:
[pgbouncer]
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
auth_query = SELECT * FROM pgbouncer.get_auth($1)
Once we’ve made these changes, we just need to supply two things:
- The
pgbouncer
Postgres user. - The
get_auth
set-returning function.
But to me there’s no surprise
First, let’s start by creating the pgbouncer
Postgres user. This allows PgBouncer to connect as itself and invoke the get_auth
function.
CREATE USER pgbouncer WITH PASSWORD 'changeme';
Note how we did nothing but create the user and give it a password. It has no special privileges, and currently no extra grants. Now as the postgres
OS user, we can export this information to the userlist.txt
file that used to be required for all users.
cat <<EOF | psql -q -d postgres
COPY (SELECT usename, passwd
FROM pg_shadow
WHERE usename='pgbouncer')
TO '/etc/pgbouncer/userlist.txt'
WITH (FORMAT CSV, DELIMITER ' ', FORCE_QUOTE *)
EOF
We also need to ensure this line is somewhere near the top of our pg_hba.conf
file for Postgres itself:
host all pgbouncer 127.0.0.1/32 md5
This ensures PgBouncer can only connect locally, and only via the password we created.
This waitin’ ’round’s killin’ me
This still leaves the necessary authentication retrieval function. This is the tricky part that can easily go wrong. Let’s start with the function itself:
CREATE SCHEMA pgbouncer AUTHORIZATION pgbouncer;
CREATE OR REPLACE FUNCTION pgbouncer.get_auth(p_usename TEXT)
RETURNS TABLE(username TEXT, password TEXT) AS
$$
BEGIN
RAISE WARNING 'PgBouncer auth request: %', p_usename;
RETURN QUERY
SELECT usename::TEXT, passwd::TEXT FROM pg_catalog.pg_shadow
WHERE usename = p_usename;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;
To keep things organized, we placed the function in its own schema. This will need to exist in any database where PgBouncer will proxy clients. Note that we also specifically prepended the pg_shadow
view with the pg_catalog
schema. This is necessary to prevent an attacker from supplying a substitute search path and obtaining data from an arbitrary table. Even though the pgbouncer
user is not a superuser, it’s good to be thorough.
We even raise a warning any time PgBouncer authenticates instead of Postgres. This will log the attempt to the Postgres logs in most cases for auditing purposes. We use WARNING
here because that’s the default setting for log_min_messages
. In systems that use a lower value like NOTICE
, our function could follow suit.
Things go wrong, they always do
Despite these safeguards, eagle-eyed readers are probably already cringing. Why? Let’s connect as the pgbouncer
user and call our function.
SELECT * FROM pgbouncer.get_auth('postgres');
usename | passwd
----------+-------------------------------------
postgres | md54aeec1a9950d60e0d3e98a5b136222f0
Isn’t that what we wanted? Yes and no. We created the function as the postgres
superuser, and we never granted use of the function to anyone. Yet the pgbouncer
user can invoke it. This is due primarily to how Postgres implements function security. While functions can execute SQL, they’re more commonly associated with calculations. Think set theory: f(x) = y. Thus functions are automatically granted access to PUBLIC
, an alias for all users.
So the function that circumvents the security of pg_shadow
is now callable by every user in the database. Oops!
To really lock this down, we need to explicitly revoke permissions, and then grant them only to the pgbouncer
user.
REVOKE ALL ON FUNCTION pgbouncer.get_auth(p_usename TEXT) FROM PUBLIC;
GRANT EXECUTE ON FUNCTION pgbouncer.get_auth(p_usename TEXT) TO pgbouncer;
Once we’ve done that, this is what we should see if we use any user other than pgbouncer
:
SELECT * FROM pgbouncer.get_auth('postgres');
ERROR: permission denied FOR FUNCTION get_auth
That was a close one!
Everything that’s serious lasts
Once we reload Postgres and PgBouncer, all of these changes should activate. Now we never have to update userlist.txt
unless we change the pgbouncer
password itself. We also log any password auth attempt to Postgres in case the pgbouncer
user is compromised and someone invokes a dictionary attack against our function. Can we go even further? Sure:
- Exclude superusers from being returned in the auth function at all. Elevated privilege users or roles should not connect through the proxy. This prevents accidentally or maliciously exposing superuser password hashes to PgBouncer.
- Add a
WHERE
clause so the function can only auth against specific users. This ensures only front-end connections can operate through the proxy and are thus compatible with the auth function. - Join against
pg_auth_members
and only allow users within certain groups. This is the same as the previous point, but more generally useful. Users within ause_proxy
group for example, would allow us to control authentication with a simple GRANT statement, rather than modifying the function.
With all of those in mind, a more secure version of our query might look like this:
SELECT u.rolname::TEXT, u.rolpassword::TEXT
FROM pg_authid g
JOIN pg_auth_members m ON (m.roleid = g.oid)
JOIN pg_authid u ON (u.oid = m.member)
WHERE NOT u.rolsuper
AND g.rolname = 'use_proxy'
AND u.rolname = p_username;
There really is no limit to how we can secure or otherwise enhance this function. However, if we truly had our preferences, PgBouncer would call a function and pass in the username and password values as parameters. This would allow the function to return merely a True or False answer if authentication is allowed, and not directly expose password hashes to a tertiary layer.
Unfortunately that by itself would be a security flaw. In many production systems full query logging is enabled, which writes a query and its parameters to the Postgres log. Giving an in-memory binary access to hashes is much different than directly exposing unsalted passwords in a plain-text log. Even if the function accepted a salted and hashed password, having these in the log would still be unnecessarily permissive.
But beyond that, we still have access to a far more versatile approach than before. Hopefully it starts to catch on instead of languishing in relative obscurity. PgBouncer is too good of a tool to allow one of its best features to go unused. If you’re not already using auth_user
and auth_query
, why not give it a try?