SUMMARY: This article describes how to save PostgreSQL data in a Docker Image using a Dockerfile.

1. Environment details

2. Considerations

3. Dockerfile

4. Image creation using the Dockerfile

5. Container initialization

 

Docker is a containerization tool, a lightweight form of virtualization. It consists of a command-line client and a daemon that runs in the background. With Docker, resources can be isolated and resource utilization can be limited. An image is built from the instructions specified in the Dockerfile.

A Dockerfile holds the software configurations, permissions, dependencies, and data volumes.

An image captures everything you would like to have in the environment of the container.

 

Environment details

For this demonstration we will be using the following software versions: 

Host platform: CentOS 7.x

Docker: docker-engine 1.13

Docker image: CentOS 7

PostgreSQL version 12

 

Considerations 

The Dockerfile for a PostgreSQL database deserves careful thought. Volumes should be created so that the data is persistent, and they should be planned up front, because volumes cannot be resized afterwards. As a best practice, separate volumes should be created for the archives and the backups. For an operations or production deployment, the IP address should be static; otherwise, a separate network should be configured. If the requirement is for a single Docker container, a single Dockerfile is enough to build and run it. When running multiple containers, Docker Compose can help with orchestration.
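The volume and network considerations above might translate into commands along the following lines. This is a minimal sketch, not a definitive setup: the volume names (pgdata, pgarchive, pgbackup), the network name pgnet, the subnet, and the address are all hypothetical, and pgdg12.2 is the image tag used in the build step of this article. Docker only honors --ip on a user-defined network that was created with an explicit subnet. The snippet prints each command by default; set DOCKER=docker to execute them.

```shell
# Dry run by default: each command is printed instead of executed.
# Set DOCKER=docker in the environment to run the commands for real.
DOCKER="${DOCKER:-echo docker}"

# Hypothetical names and addresses, chosen for illustration only.
PG_NET=pgnet
PG_SUBNET=172.28.0.0/16
PG_IP=172.28.0.10

# One named volume each for the data, the WAL archives, and the backups.
for vol in pgdata pgarchive pgbackup; do
  $DOCKER volume create "$vol"
done

# A user-defined network with an explicit subnet, which is what allows
# a fixed address to be requested with --ip.
$DOCKER network create --subnet "$PG_SUBNET" "$PG_NET"

# Start the container on that network with a static address.
$DOCKER run -d --name pgdg12 --privileged \
  --network "$PG_NET" --ip "$PG_IP" \
  -p 5432:5432 pgdg12.2
```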

 

Dockerfile

FROM centos:7

ENV PATH $PATH:/usr/pgsql-12/bin

ENV PGDATA /var/lib/pgsql/12/data

ENV POSTGRES_USER postgres

ENV POSTGRES_PASSWORD postgres

ENV POSTGRES_DB postgres

ENV LANG en_US.utf8

ENV LANGUAGE="en_US.UTF-8"

ENV LC_ALL="en_US.UTF-8"

ENV CONF_FILE=/var/lib/pgsql/12/data/postgresql.conf

RUN mkdir -p /etc/selinux/targeted/contexts/

RUN echo '<busconfig><selinux></selinux></busconfig>' > /etc/selinux/targeted/contexts/dbus_contexts

# Import the keys for CentOS and PostgreSQL, then install the base packages

RUN \
  rpm --import /etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7                 && \
  rpm --import http://yum.postgresql.org/RPM-GPG-KEY-PGDG-12         && \
  yum -y updateinfo                                                  && \
  yum -y install \
    yum-utils \
    yum-plugin-ovl \
    httpd \
    openssh-server \
    epel-release                                                     && \
  yum clean all

# Postgresql Repo for CentOS7

RUN yum -y install https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm

#Expose Ports

EXPOSE 80 22 5432

#Install postgresql 

RUN yum install -y postgresql12-server

# Password reset for root and postgres

RUN echo "root" | passwd --stdin root

RUN mkdir -p /var/lib/pgsql/12/data/ && chmod 700 /var/lib/pgsql/12/data/

RUN echo "postgres" | passwd --stdin postgres

# Initialize the database cluster with postgres user and start the database cluster with some desired configuration parameter values

RUN su - postgres -c "/usr/pgsql-12/bin/initdb -D /var/lib/pgsql/12/data/ -E 'UTF-8' \
--lc-collate='en_US.UTF-8' \
--lc-ctype='en_US.UTF-8';"

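# Start the cluster once to verify that it comes up, then stop it cleanly, because a server started in a RUN step does not keep running in later layers or in the container

RUN su - postgres -c "/usr/pgsql-12/bin/pg_ctl -D /var/lib/pgsql/12/data/ -l logfile start && /usr/pgsql-12/bin/pg_ctl -D /var/lib/pgsql/12/data/ stop"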

# Match each parameter whatever its shipped value is; the commented defaults in PostgreSQL 12 (for example wal_level = replica) differ from those of older releases

RUN sed -i -E "s/^#?max_wal_senders = .*/max_wal_senders = 5/" $CONF_FILE; \
    sed -i -E "s/^#?wal_level = .*/wal_level = replica/" $CONF_FILE; \
    sed -i -E "s/^#?hot_standby = .*/hot_standby = on/" $CONF_FILE; \
    sed -i -E "s/^#?archive_mode = .*/archive_mode = on/" $CONF_FILE; \
    sed -i -E "s:^#?archive_command = .*:archive_command = '/bin/true':" $CONF_FILE

#Allow remote connections to database

RUN echo "host all  all    0.0.0.0/0  trust" >> /var/lib/pgsql/12/data/pg_hba.conf

RUN echo "host all all    172.17.0.1/32 trust" >> /var/lib/pgsql/12/data/pg_hba.conf

RUN echo "host all all    127.0.0.1/32 trust" >> /var/lib/pgsql/12/data/pg_hba.conf

RUN echo "host replication       all             ::1/128         trust" >> /var/lib/pgsql/12/data/pg_hba.conf

RUN echo "host replication all ::/0  trust"  >> /var/lib/pgsql/12/data/pg_hba.conf

RUN echo "listen_addresses='*'" >> /var/lib/pgsql/12/data/postgresql.conf

# Add volumes

VOLUME ["/usr/pgsql-12/bin/", "/var/lib/pgsql/12/data", "/data02/archive/", "/data01/backup/"]

# Postgresql to be started on each container initialization

RUN systemctl enable postgresql-12.service

CMD ["/usr/sbin/init"]

 

 

Image creation using the Dockerfile

[root@vm-test]# docker build --no-cache -t pgdg12.2 -f ./pg12.dockerfile .
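The container is started from an image ID in the next section. As a small sketch, the ID that belongs to the tag given to docker build could be looked up as follows. The snippet prints the commands by default; set DOCKER=docker to execute them.

```shell
# Dry run by default: each command is printed instead of executed.
# Set DOCKER=docker in the environment to run the commands for real.
DOCKER="${DOCKER:-echo docker}"

# List the image built above, using the tag passed to `docker build -t`.
$DOCKER images pgdg12.2

# Print only the short image ID for that tag.
$DOCKER images -q pgdg12.2
```

The tag itself can also be passed to docker run in place of the ID.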

 

Container initialization

[root@vm-test]# docker run -p 5432:5432 --name pgdg12 --privileged -it dc0915fc2b86  /bin/bash 
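Passing /bin/bash as above replaces the image's default command, so /usr/sbin/init and the enabled PostgreSQL service are not started automatically in that session. A possible alternative, sketched here under the assumption that systemd comes up inside a privileged container on the host in use, is to start the container detached with its default command and with named volumes on the data, archive, and backup paths. The volume names are hypothetical, and the data path is the PGDATA value from the Dockerfile. The snippet prints the commands by default; set DOCKER=docker to execute them.

```shell
# Dry run by default: each command is printed instead of executed.
# Set DOCKER=docker in the environment to run the commands for real.
DOCKER="${DOCKER:-echo docker}"

# Start detached with the image's default command (/usr/sbin/init).
# pgdata, pgarchive and pgbackup are hypothetical volume names; Docker
# creates a named volume on first use if it does not exist yet.
$DOCKER run -d --name pgdg12 --privileged -p 5432:5432 \
  -v pgdata:/var/lib/pgsql/12/data \
  -v pgarchive:/data02/archive \
  -v pgbackup:/data01/backup \
  pgdg12.2

# Ask the server for its version from inside the container.
$DOCKER exec pgdg12 su - postgres -c "psql -c 'SELECT version();'"
```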

 

For databases, high availability is of the utmost importance: the data must be available again within the specified Recovery Time Objective (RTO), the amount of time it takes to restore regular business processes after a natural disaster or emergency situation.

A Docker container does not persist data on its own. Data written inside a container lives only in that container's writable layer. It survives a stop and restart of the same container, but when the container is deleted, or when a fresh container is created from the image, that data is gone. We have to explicitly tell Docker that we want this data to be retained by creating volumes.
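The effect of a volume on persistence can be checked with a short experiment, sketched below. The container names pg_a and pg_b, the volume name pgdata, and the table t are hypothetical; the image tag and the data path come from this article. In a real run, each psql call would have to wait until the server accepts connections. The snippet prints the commands by default; set DOCKER=docker to execute them.

```shell
# Dry run by default: each command is printed instead of executed.
# Set DOCKER=docker in the environment to run the commands for real.
DOCKER="${DOCKER:-echo docker}"
PGDATA_DIR=/var/lib/pgsql/12/data   # PGDATA from the Dockerfile

# First container: the data directory is backed by the named volume pgdata.
$DOCKER run -d --name pg_a --privileged -v pgdata:"$PGDATA_DIR" pgdg12.2
$DOCKER exec pg_a su - postgres -c "psql -c 'CREATE TABLE t (id int);'"

# Remove the container; a named volume is not removed along with it.
$DOCKER rm -f pg_a

# Second container on the same volume: table t should still be there.
$DOCKER run -d --name pg_b --privileged -v pgdata:"$PGDATA_DIR" pgdg12.2
$DOCKER exec pg_b su - postgres -c "psql -c 'SELECT count(id) FROM t;'"
```

Without the -v option, the second container would start from the data directory baked into the image, and table t would be missing.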

Defining volumes gives the user the freedom to mount the desired volumes from the host and from remote machines. When attached to the PostgreSQL container, volumes retain the data. Keeping backups on separate volumes helps ensure that the data is available for restoration. Volumes can be updated and shared across containers. Volumes exist for use by containers and are managed only by Docker, whereas bind mounts can also be modified or updated by other processes.
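The difference between a volume and a bind mount, and one way of getting a backup out of a volume, can be sketched as follows. The host directory /srv/pg-backups and the volume name pgbackup are assumptions made for illustration, and the directory must already exist on the host for the bind mount to work. The snippet prints the commands by default; set DOCKER=docker to execute them.

```shell
# Dry run by default: each command is printed instead of executed.
# Set DOCKER=docker in the environment to run the commands for real.
DOCKER="${DOCKER:-echo docker}"
HOST_DIR=/srv/pg-backups   # hypothetical host directory

# Named volume: created and managed by Docker.
$DOCKER volume create pgbackup
$DOCKER volume inspect pgbackup

# The same container with one volume and one bind mount. The bind mount
# maps a host directory that other host processes can also change.
$DOCKER run -d --name pgdg12 --privileged \
  --mount type=volume,source=pgbackup,target=/data01/backup \
  --mount type=bind,source="$HOST_DIR",target=/data02/archive \
  pgdg12.2

# Copy the volume's contents into a tarball on the host by way of a
# throwaway container that mounts the volume read-only.
$DOCKER run --rm \
  -v pgbackup:/data01/backup:ro \
  -v "$HOST_DIR":/host \
  centos:7 tar czf /host/pgbackup.tar.gz -C /data01/backup .
```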