Using Docker Hub PostgreSQL images

May 17, 2019

Docker Hub carries Docker images for PostgreSQL, based on Debian Stretch or Alpine Linux. These are not official PostgreSQL Development Group (PGDG) images from postgresql.org; they’re maintained in the Docker Library on GitHub. But as Docker adoption grows, they’re becoming more and more people’s first exposure to PostgreSQL.

I tried these images out as I needed some convenient and simple Docker images for PostgreSQL in a continuous integration job. It wasn’t entirely fuss-free, but overall the images worked well. I’d like to share some tips.

Ephemeral by default

WARNING: by default these images make it very easy to lose your data forever. Personally I think that’s a significant defect: they should refuse to run without either a volume mounted for the data directory or an explicit env var passed to “docker run”. The issue is discussed under “Caveats” in the image documentation, but IMO not nearly prominently enough.

If you want persistent storage you really need to supply a Docker volume to use when you first run the new container:

docker run -v my_pgdata:/var/lib/postgresql/data postgres:11

Docker will auto-create the volume my_pgdata if it doesn’t exist. Or you can pass a host path instead to get a bind mount; see the postgres docker image documentation under “Where to Store Data”.

The Docker postgres images, by default, create a PostgreSQL instance under /var/lib/postgresql/data, which is in the container’s private storage unless mapped to a volume. When the container exits you lose access to the container storage. The data directory is not immediately deleted, but will be lost forever if you clean up exited docker containers with docker rm, docker system prune, etc.

To recover access to the container’s data once the container exits, but before such permanent removal has occurred, find the container ID with docker ps -a. You might find docker ps -f ancestor=postgres:11 -a useful. Once you have the container ID you can docker start the container to start it again, perhaps to pg_dump its data. It’ll start with any port bindings preserved. Alternatively, you can docker cp the data directory in /var/lib/postgresql/data out of the stopped container so you can copy it into a persistent Docker volume and attach that to a new instance of the container.
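For example, a rescue session might look like the following sketch; the container ID is illustrative, substitute the one docker ps -a reports:

# find the stopped container
docker ps -a -f ancestor=postgres:11

# option 1: start it again and dump its contents
docker start 1d2e3f4a5b6c
docker exec 1d2e3f4a5b6c pg_dumpall -U postgres > dump.sql

# option 2: copy the data directory out of the stopped container
docker cp 1d2e3f4a5b6c:/var/lib/postgresql/data ./rescued_pgdata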

Data safety

WARNING: Docker’s default storage configuration on most installs may perform poorly and isn’t necessarily crash-safe. The Docker documentation says as much.

If you intend to use anything like this in production, make sure you use one of the production-supported Docker storage backends.

This is particularly important for PostgreSQL because of issues relating to how Linux handles failure of flush (fsync) requests to full thin-provisioned volumes. Avoid putting a PostgreSQL data directory on dm-thin / lvmthin backed storage. You can check your storage backend with:

docker info | grep -i '^Storage Driver:'

Locale, encoding and collation

The Debian-based images default to the en_US.UTF-8 locale. If you want a different locale, see the image documentation under “Initialization scripts”.
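The documented pattern for adding another locale on the Debian-based image is a tiny derived image that generates the locale before initdb runs; a sketch for de_DE.utf8:

FROM postgres:11
RUN localedef -i de_DE -c -f UTF-8 -A /usr/share/locale/locale.alias de_DE.UTF-8
ENV LANG de_DE.utf8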

Root by default

Docker likes to run everything as root and the postgres images are no exception. However, when run as root they switch to the unprivileged postgres user created inside the image (uid 999 in the Debian-based images) to run postgres.

If you try to run with a different --user with a Docker-managed volume you will usually get an error like:

fixing permissions on existing directory /var/lib/postgresql/data ... initdb: could not change permissions of directory "/var/lib/postgresql/data": Operation not permitted

and will need to use the workaround given in the documentation under “arbitrary users”.

This issue won’t arise with a directory bind-mounted from the host, so long as the user you run as has appropriate permissions on the data directory, e.g.:

cd $HOME
mkdir pgdata
chmod 0700 pgdata
docker run --user "$(id -u)":"$(id -g)" -v "$(pwd)/pgdata":/var/lib/postgresql/data postgres:11

Connections between containers

The docs cover how to connect to the container from the Docker host well enough, using published ports. But often you’ll want to use the postgres container from another container and that’s rather less simple. Especially as the container lacks a convenient way to set up pg_hba.conf.

For inter-container connections you can create a user-defined network and connect between containers by name (covered below), or use the default bridge network and connect by IP address. It’s not necessary to override the default pg_hba.conf, since the one the image generates accepts connections from any address. For example, to start the server on the default bridge network, with an IP arbitrarily assigned by Docker and a datadir bind-mounted into the container from the working directory:

mkdir resources pgdata
chmod 0700 resources pgdata
echo 'secretpassword' > resources/pgpassword

# launch postgres container to run in the background
#
# note that we do NOT need to "--publish" any ports here
#
PGCONTAINER_ID=$(docker run \
   --network bridge \
   --name postgres \
   --volume "$(pwd)/pgdata":/var/lib/postgresql/data:rw \
   --volume "$(pwd)/resources":/resources:ro \
   --env POSTGRES_INITDB_ARGS="--auth-host=md5 --auth-local=peer" \
   --env POSTGRES_USER="postgres" \
   --env POSTGRES_PASSWORD_FILE="/resources/pgpassword" \
   --user "$(id -u)":"$(id -g)" \
   --detach \
   postgres:11)

# Did it start up OK?
docker logs $PGCONTAINER_ID

# there is no port-mapping to the host configured here
docker port postgres

Then with the bridge network you can get the postgres container’s IP address and connect to it from your other container. Or use the --add-host option to docker run to bind it to a hostname in Docker’s managed DNS like in the following example:

# How to re-find the container-id later:
PGCONTAINER_ID=$(docker ps --filter name=postgres -q)

# Container address for mapping
pgcontainer_ip=$(docker inspect -f '{{.NetworkSettings.IPAddress}}' "$PGCONTAINER_ID")

echo "postgres is running on $pgcontainer_ip on default docker bridge"

# Make a simple image to test the container link
mkdir pgtest
cat > pgtest/Dockerfile <<'__END__'
FROM debian:stretch
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get -y -qq update && apt-get -y -qq install postgresql-client
CMD psql "host=postgres user=postgres password=secretpassword" -c "SELECT 'successfully queried postgres container';"
__END__

docker build -t pgtest:latest pgtest

# Run the test container. It runs as root for simplicity; if you just
# pass --user then libpq will fail, because it doesn't like running as
# a user id that doesn't exist inside the container.
docker run \
  --name throwaway \
  --network bridge \
  --add-host postgres:${pgcontainer_ip} \
  pgtest:latest

This should emit:

                ?column?                 
-----------------------------------------
 successfully queried postgres container
(1 row)

Later you can docker stop postgres. Or you can docker kill postgres; docker rm postgres and then re-docker run a new copy of the container with the same data directory later. Either is fine. Note, though, that initialization-time parameters like POSTGRES_PASSWORD and POSTGRES_INITDB_ARGS have no effect on subsequent runs: they only apply when the data directory is first created.
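For example, to replace the container while keeping the bind-mounted data directory from the earlier example (note the consistent --user; see below):

docker kill postgres
docker rm postgres
# no POSTGRES_* env vars needed; the datadir is already initialized
docker run \
   --network bridge \
   --name postgres \
   --volume "$(pwd)/pgdata":/var/lib/postgresql/data:rw \
   --user "$(id -u)":"$(id -g)" \
   --detach \
   postgres:11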

Don’t forget to docker rm the container(s) you created for testing, then docker volume rm any volume(s) you created, if you don’t want them anymore, e.g.

docker kill postgres
docker rm postgres throwaway
docker rmi pgtest

Container link with a user-defined network

Instead of using the default bridge network you can docker network create mynetwork and use --network mynetwork with both containers.

If you do so, you don’t have to look up the postgres container’s IP address and --add-host it: Docker automatically resolves container names and container IDs through its managed DNS on user-defined networks.
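A minimal sketch, assuming the containers from the earlier example have been removed first (the network name is arbitrary):

docker network create mynetwork

docker run --detach --name postgres \
   --network mynetwork \
   --volume "$(pwd)/pgdata":/var/lib/postgresql/data:rw \
   --user "$(id -u)":"$(id -g)" \
   postgres:11

# "host=postgres" in the pgtest CMD now resolves via Docker's DNS;
# no --add-host needed
docker run --network mynetwork pgtest:latest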

Troubleshooting

  • "no route to host": make sure the postgres container is still running. Check your iptables rules.
  • "local user with ID 1001 does not exist": libpq needs the current user-id to correspond to a properly defined user name, i.e. id -n must report a valid result. Verify that id -un reports a username not an error like id: cannot find name for user ID 1001.

Basic configuration only

The container supports limited setup via a few env vars like --env POSTGRES_PASSWORD=foobar at docker run time, but not much more than that. The entrypoint script adds a wildcard pg_hba.conf entry for all users and DBs to accept connections from all addresses. This is set to trust (for all users) if no POSTGRES_PASSWORD is set, otherwise it’s set to md5.
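At the time of writing the appended entry looks roughly like this, i.e. all databases, all users, all addresses (md5 here because a password was set):

host  all  all  all  md5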

WARNING: This means that if you don’t specify a POSTGRES_PASSWORD, setting a password for the postgres user later will have no practical effect: the trust entry lets anyone connect without a password anyway. Similarly, passwords will be ignored for any other users you create. You really should specify POSTGRES_PASSWORD at docker run time.

WARNING: If you use POSTGRES_PASSWORD the password will be visible in docker inspect output for the container, so consider using a temporary one and changing it once the container is launched. Or use POSTGRES_PASSWORD_FILE to point to a file you bind-mount into the container or add with a derived image.
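For example, the password is plainly visible to anyone who can inspect the container, and can be changed once the server is up (container name as in the example below):

# the password appears in the container's environment
docker inspect -f '{{.Config.Env}}' my_postgres

# change it after startup
docker exec -u postgres my_postgres \
   psql -c "ALTER USER postgres PASSWORD 'newsecretpassword';"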

It’d be really nice if the entrypoint script provided a simple way to append to or replace configuration files. For example, it could enable directory includes for postgresql.conf so you could just drop your own snippets in. It could support generating a simple pg_hba.conf or, for more complex needs, copying one from a path specified as an env var, probably pointing to a read-only host bind mount. It could copy SSL files etc. in a similar manner. For now, though, you’ll need a setup hook script and/or a derived image, as discussed below.

Configuration hook scripts

If you want to provide your own non-trivial pg_hba.conf, append some entries to postgresql.conf, add a pg_ident.conf, drop in SSL certificates and keys, etc, there’s a hook for that. Provide a setup script that the existing entrypoint can run after initdb and after postgres has been started. To avoid creating a derived container you can read-only bind-mount the script directory into the container. The same approach works for things like SSL certificate files.

For example I have a docker-entrypoint-initdb.d/001-setup.sh script. I bind-mount this directory into /docker-entrypoint-initdb.d/ in the container at docker run time, and I also bind-mount a directory containing SSL certificates the script needs to copy into the data directory. When the container first starts the script copies everything into place.

I chose to put my pg_hba.conf and postgresql.conf changes inline in the setup script, but you could just as well ADD them to a derived image and copy or append them from your script. Here’s an abbreviated version of mine, which relies on the SSL cert, key and root cert being available in a directory bind-mounted at /docker-entrypoint-initdb-resources:

#!/bin/sh
#
# First run setup script, run after initdb + pg start by Docker
# entrypoint script.
#

set -e -u

if [ "$(id -un)" != "postgres" ]; then
  echo "Expected to run as user 'postgres' but got user '$(id -un)' instead"
  exit 1
fi

# replace pg_hba.conf entirely.
cat > "$PGDATA/pg_hba.conf" <<'__END__'
# note that the container runs on a private network so we don't do
# address filtering here. If you wanted you could detect the subnet
# and add it instead, or pass a --env to Docker to substitute, but
# it's pretty pointless to do so.
#
# 
# TYPE      DATABASE       USER           ADDRESS              METHOD
local       all            all                                 peer
host        all            all            0.0.0.0/0            md5
host        all            all            ::0/0                md5
__END__

# Append to postgresql.conf
cat >> "$PGDATA/postgresql.conf" <<'__END__'
# For logical rep protocol tests
wal_level = 'logical'
max_replication_slots = 8
max_wal_senders = 8
ssl = on
ssl_cert_file = 'server.crt'
ssl_key_file = 'server.key'
ssl_ca_file = 'root.crt'
__END__

# create regression test user
psql -c "create user test with password 'test'"
psql -c "create database test with owner 'test';"

# install SSL certs
for f in root.crt server.crt server.key; do
  cp "/docker-entrypoint-initdb-resources/$f" "${PGDATA}"
  chmod 0600 "${PGDATA}/$f"
  chown postgres:postgres "${PGDATA}/$f"
done

I can then invoke this without needing to docker build any derived image using something like:

docker run \
  --name=my_postgres \
  --env POSTGRES_PASSWORD=mysecretpassword \
  --volume "$(pwd)/docker-entrypoint-initdb.d":/docker-entrypoint-initdb.d:ro \
  --volume "$(pwd)/docker-entrypoint-initdb-resources":/docker-entrypoint-initdb-resources:ro \
  --volume my_postgres_data:/var/lib/postgresql/data:rw \
  --detach \
  postgres:11

… where the current directory contains:

  docker-entrypoint-initdb.d/
    001-setup.sh
  docker-entrypoint-initdb-resources/
    root.crt
    server.crt
    server.key

You could use a similar model for copying over a new pg_hba.conf etc on first run.

Note that after initial docker run these files have no effect, and only the ones copied to the Docker volume my_postgres_data (mounted as /var/lib/postgresql/data within the container) matter.

Use a consistent --user

If you use a different uid each time you run a container against the same data volume, the postgres images may fail in confusing ways, because the entrypoint does not do any permissions fixups. Use a consistent uid, or just live with Docker’s bizarre indifference to running everything as root; the container will still switch to an unprivileged user to run postgres.
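If in doubt, check which numeric uid owns the existing data directory before re-running, and pass that to --user; a sketch, using the directory and volume names from the earlier examples:

# a bind-mounted datadir: see the numeric owner uid/gid
ls -ldn pgdata

# a named volume: check from a throwaway container
docker run --rm -v my_pgdata:/var/lib/postgresql/data postgres:11 \
   ls -ldn /var/lib/postgresql/data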

Beware of libc collations when moving data directories around

This isn’t a Docker-specific issue but becomes more prominent with the portability of Docker containers.

Don’t try to run a data directory created with the Alpine-based images on a Debian-based image. Similarly, avoid mounting and running a volume used by these Debian-based containers on some other OS or another Debian version. That’s because the GNU C library occasionally updates its collation (sort order) definitions for some languages. When that happens, the on-disk structure of PostgreSQL’s indexes no longer matches the ordering the C library produces at runtime, so PostgreSQL may stop looking for data prematurely, or exhibit a variety of other surprising behaviours relating to index searches and maintenance.
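There’s no foolproof detection for this on PostgreSQL 11 (libc collation versions aren’t tracked until later releases), but the bundled amcheck extension can at least detect B-tree indexes whose on-disk order no longer matches what the runtime C library produces. A sketch, run against the postgres container from the earlier examples:

docker exec -u postgres postgres \
   psql -c "CREATE EXTENSION IF NOT EXISTS amcheck;"
docker exec -u postgres postgres \
   psql -c "SELECT c.relname, bt_index_check(c.oid)
            FROM pg_class c JOIN pg_am am ON c.relam = am.oid
            WHERE am.amname = 'btree' AND c.relkind = 'i'
              AND c.relpersistence = 'p';"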

Stick to one image, on one base OS, for any given data directory.
