How to Get the Best Out of PostgreSQL Logs

September 03, 2020

As a modern RDBMS, PostgreSQL comes with many parameters for fine-tuning. One of the areas to consider is how PostgreSQL should log its activities. Logging is often overlooked in Postgres database management, and when it is not ignored outright, it is usually configured incorrectly. This happens because most of the time the purpose of logging is unclear. The fundamental reason for logging is well known, of course, but what is sometimes lacking is an understanding of how the logs will be used.

Each organization’s logging requirements are unique, and therefore how PostgreSQL logging should be configured will be different as well. What a financial service company needs to capture within its database logs will be different from what a company dealing with critical health information needs to record. And in some cases, they can be similar too.

In this article, I will cover some fundamental practices to get the best out of PostgreSQL logs. This blog is not a hard and fast rule book; readers are more than welcome to share their thoughts in the comments section. To get the best value out of it, though, I ask the reader to think about how they want to use their PostgreSQL database server logs:

  • Legal compliance, where specific information needs to be captured
  • Security auditing, where specific event details need to be present
  • Performance troubleshooting, where queries and their parameters are to be recorded
  • Day-to-day operational support, where a set number of metrics are to be monitored

With that said, let’s start.

Don’t Make Manual Changes to postgresql.conf

Any changes in the postgresql.conf file should be made using a configuration management system like Puppet, Ansible, or Chef. This ensures changes are traceable and can be safely rolled back to a previous version if necessary. This holds true when you are making changes to the logging parameters.

DBAs often create multiple copies of the postgresql.conf file, each with slightly different parameters, each for a different purpose. Manually managing different configuration files is cumbersome, if not outright error-prone. A configuration management system, on the other hand, can be made to rename and use different versions of the postgresql.conf file based on a parameter passed to it. This parameter dictates the purpose of the current version. When the need has passed, the previous config file can be put back by changing the same input parameter.

For example, if you want to log all statements running on your PostgreSQL instance, a config file with the parameter value “log_statement=all” can be used. When there is no need to record all statements – perhaps after a troubleshooting exercise – the previous config file could be reinstated.
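
For instance, the configuration management system could keep two variants of the file, differing only in this parameter (the file names below are purely illustrative):

# postgresql.conf.troubleshooting
log_statement = 'all'

# postgresql.conf.normal
log_statement = 'mod'

The input parameter passed to the playbook or manifest then decides which variant is deployed as postgresql.conf.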

Use Separate Log Files for PostgreSQL 

I recommend enabling PostgreSQL’s native logging collector during normal operations. To enable PostgreSQL native logging, set the following parameter to on:

logging_collector = on

There are two reasons for this:

First of all, on busy systems the operating system may not consistently record PostgreSQL messages in syslog (assuming a *nix-based installation) and can drop messages. With native PostgreSQL logging, a separate daemon takes care of recording the events. This daemon is designed never to lose messages, which means that when PostgreSQL is very busy, server processes can be blocked until the log event is written. It is therefore useful to record less verbose messages in the log (as we will see later) and use shortened log line prefixes.

Secondly – and as we will see later – logs should be collected, parsed, indexed, and analyzed with a Log Management utility. Having PostgreSQL record its events in syslog will mean creating an extra layer of filtering and pattern-matching in the Log Management part to filter out all the “noise messages”. Dedicated log files can be easily parsed and indexed for events by most tools.

Set Log Destination to stderr

Let’s consider the “log_destination” parameter. It can have four values:

log_destination = 'stderr' | 'syslog' | 'csvlog' | 'eventlog'

Unless there is a good reason to save log events in comma-separated format or to the Windows event log, I recommend setting this parameter to stderr. This is because with a CSV file destination, a custom "log_line_prefix" parameter value will not have any effect, and yet the prefix can be made to contain valuable information.
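
A minimal example of this recommendation, together with the native logging collector enabled earlier:

logging_collector = on
log_destination = 'stderr'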

On the flip side though, a CSV log can be easily imported to a database table and later queried using standard SQL. Some PostgreSQL users find it more convenient than handling raw log files. As we will see later, modern Log Management solutions can natively parse PostgreSQL logs and automatically create meaningful insights from them. With CSV, the reporting and visualization has to be manually done by the user. 

Ultimately it comes down to your choice. If you are comfortable creating your own data ingestion pipeline to load the CSV logs into database tables, cleanse and transform the data, and create custom reports that suit your business needs, then make sure the "log_destination" is set to 'csvlog'.

Use Meaningful Log File Names

When PostgreSQL log files are saved locally, following a naming style may not seem necessary. The default file name style is “postgresql-%Y-%m-%d_%H%M%S.log” for non-CSV formatted logs, which is sufficient for most cases.

Naming becomes important when you are saving log files from multiple servers to a central location like a dedicated log server, a mounted NFS volume, or an S3 bucket. I recommend using two parameters in such cases:

log_directory
log_filename

To store log files from multiple instances in one place, first, create a separate directory hierarchy for each instance. This can be something like the following:

/<Application_Name>/<Environment_Name>/<Instance_Name>

Each PostgreSQL instance’s “log_directory” can then be pointed to its designated folder.

Each instance can then use the same “log_filename” parameter value. The default value will create a file like 

postgresql-2020-08-25_235900.log

To use a more meaningful name, set the “log_filename” to something like this:

log_filename = 'postgresql_%A-%d-%B_%H%M'

The log files will then be named like:

postgresql_Saturday-23-August_2230
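
Putting the two parameters together for a hypothetical order-processing application running in production (the directory path below is only an example of the hierarchy described above), the configuration could look like this:

log_directory = '/pglogs/orderapp/production/pg-instance-01'
log_filename = 'postgresql_%A-%d-%B_%H%M'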

Use Meaningful Log Line Prefix

PostgreSQL log line prefixes can contain the most valuable information besides the actual message itself. The Postgres documentation shows several escape characters for log event prefix configuration. These escape sequences are substituted with various status values at run time. Some applications like pgBadger expect a specific log line prefix. 

I recommend including the following information in the prefix:

  • The time of the event (without milliseconds): %t
  • Remote client name or IP address: %h
  • User name: %u
  • Database accessed: %d
  • Application name: %a
  • Process ID: %p
  • Stop here for non-session (background) processes, suppressing the rest of the prefix for them: %q
  • The log line number for each session or process, starting at 1: %l

To understand what each field in the prefix is about, we can add a small literal string before the field. So, the process ID value can be preceded by the literal "PID=", the database name can be prefixed with "db=", and so on. Here is an example:

log_line_prefix = 'time=%t, pid=%p %q db=%d, usr=%u, client=%h, app=%a, line=%l '

Depending on where the event is coming from, the log line prefix will show different values. Both background processes and user processes will record their messages in the log file. For system processes, I have specified %q, which will suppress any text after the process ID (%p). Any other session will show the database name, user name, client address, application name, and a numbered line for each event.

Also, I included a single space after the log line prefix. This space separates the log event prefix from the actual event message. It does not have to be a space character; anything like a double colon (::), hyphen (-), or another meaningful separator can be used.

Also, set the “log_hostname” parameter to 1:

log_hostname = 1

Without this, only the client IP address will be shown. In production systems, this will typically be the address of the proxy, load balancer, or connection pooler. Unless you know the IP addresses of these systems by heart, it may be worthwhile to log their hostnames. Bear in mind, however, that the DNS lookup required for each connection adds extra overhead.

Another parameter that should be set along with "log_line_prefix" is "log_timezone". Setting this to the local timezone of the server ensures logged events are easy to follow from their timestamps without converting them to local time first. In the snippet below, we set log_timezone to the Australian Eastern Standard Time zone:

log_timezone = 'Australia/Sydney'

Log Connections Only

Two parameters control how PostgreSQL records client connections and disconnections. Both parameters are off by default. Based on your organization’s security requirements, you may want to set one of these to 1 and the other to 0 (unless you are using a tool like pgBadger – more on that later).

log_connections = 1
log_disconnections = 0

Setting log_connections to 1 will record all authorized connections as well as attempted connections. This is obviously good for security auditing: a brute force attack can be easily identified from the logs. However, with this setting enabled, a busy PostgreSQL environment with hundreds, or even thousands, of short-lived valid connections could see the log file getting inundated.

Nevertheless, it can also identify application issues that may not be obvious otherwise. A large number of connection attempts from many different valid client addresses may indicate the instance needs a load balancer or connection pooling service in front of it. A large number of connection attempts from a single client address may uncover an application with too many threads that need some type of throttling.

Log DDL and DML Operations Only

There is a lot of debate around what should be recorded in the Postgres log – i.e., what should be the value of the "log_statement" parameter. It can have four values:

log_statement = 'none' | 'ddl' | 'mod' | 'all'

It may be tempting to set this to “all” to capture every SQL statement running on the server, but this may not always be a good idea in reality.

Busy production systems mostly run SELECT statements, sometimes thousands of those per hour. The instance might be running perfectly well, without any performance issues. Setting this parameter to “all” in such cases would unnecessarily burden the logging daemon as it has to write all those statements to the file.

What you do want to capture, though, are data modifications and changes in the database structure that cause some type of issue. Unwanted or unauthorized database changes cause more application issues than selecting data; that's why I recommend setting this parameter to "mod". With this setting, PostgreSQL will record all DDL and DML changes in the log file.
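
In postgresql.conf, that recommendation is simply:

log_statement = 'mod'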

If your PostgreSQL instance is moderately busy (dozens of queries per hour), feel free to set this parameter to “all”. When you are troubleshooting slow-running SELECT queries or looking for unauthorized data access, you can also set this to “all” temporarily. Some applications like pgBadger also expect you to set this to “all”.

Log “Warning” Messages and Up

While the "log_statement" parameter decides what type of statements will be recorded, the following two parameters dictate which messages are recorded based on their severity:

log_min_messages
log_min_error_statement

Each PostgreSQL event has an associated message level. The message level can be anything from verbose DEBUG to terse PANIC. The lower the level, the more verbose the message is. The default value for the "log_min_messages" parameter is "WARNING". I recommend keeping it at this level unless you also want informational messages to be logged.

The "log_min_error_statement" parameter controls which SQL statements throwing errors will be logged. As with "log_min_messages", any SQL statement failing with a severity level equal to or above the value specified in "log_min_error_statement" will be recorded. The default value is "ERROR", and I recommend keeping the default.
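
To make these recommendations explicit in postgresql.conf (both values are also the defaults):

log_min_messages = warning
log_min_error_statement = error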

Keep Log Duration Parameters to Default

Then we have the following two parameters:

log_duration
log_min_duration_statement

The "log_duration" parameter takes a boolean value. When it is set to 1, the duration of every completed statement will be logged. If set to 0, statement durations will not be logged. This is the default value, and I recommend keeping it at 0 unless you are troubleshooting performance problems. Calculating and recording statement durations makes the database engine do extra work (however small), and when that is extrapolated to hundreds or thousands of queries, the savings from leaving it off can be significant.

Lastly, we have the “log_min_duration_statement” parameter. When this parameter is set (without any units, it’s taken as milliseconds), the duration of any statement taking equal to or longer than the parameter value will be logged. Setting this parameter value to 0 will record the duration of all completed statements. Setting this to -1 will disable statement duration logging. This is the default value, and I recommend keeping it so.

The only time you want to set this parameter to 0 is when you want to create a performance baseline for frequently run queries. Bear in mind, though, that if the "log_statement" parameter is set, the statements that are logged will not be repeated in the log messages showing durations. So you will need to load the log files into a database table, then join the process ID and session line number values from the log line prefixes to identify related statements and their durations.

Whatever the means, once you have a baseline for each frequently run query, you can set the “log_min_duration_statement” parameter to the highest of the baseline values. Now, any query running longer than the highest baseline value will be a candidate for fine-tuning.
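
For example, if the slowest of your baselined queries normally completes in around 750 milliseconds (a purely illustrative figure), the setting would be:

log_min_duration_statement = 750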

Keep Error Message Verbosity to Default

The “log_error_verbosity” parameter can have three possible values:

log_error_verbosity = terse | default | verbose

This parameter controls the amount of information PostgreSQL will record for each event written to the log file. Unless you are debugging a database application, this parameter is best kept at "default". The verbose mode is useful when you need to capture the source file name, function name, and line number that generated the error. Setting this to "terse" will suppress logging of the query, which may not be useful either.
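
To state the recommendation explicitly in postgresql.conf:

log_error_verbosity = default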

Find a Log Rotation Policy That Works for Your Use Case

I recommend creating a log rotation policy based on either the size or age of the log file, but not both. Two PostgreSQL configuration parameters dictate how old logs are archived and new logs are created:

log_rotation_age = <number of minutes>
log_rotation_size = <number of kilobytes>

The default value for "log_rotation_age" is 24 hours, and the default value for "log_rotation_size" is 10 megabytes.

A size-based rotation policy does not guarantee that the last log message in an archived log file is completely contained in that file alone.

If "log_rotation_age" is kept at its default value of 24 hours, each file can be easily identified and individually examined, as each file will contain a day's worth of events. However, this, too, does not guarantee that each file will be a self-contained unit of the last 24 hours of logs. Sometimes a slow-performing query can take more than 24 hours to finish, and events can still be happening when the old file is closed and the new one is created. This can be the case during a nightly batch job, resulting in some parts of the queries being recorded in one file and the rest in another.

My recommendation is to find a log rotation period that works for your use case. Check the time difference between two consecutive periods of lowest activity (for example, between one Saturday and the next). You can then set the "log_rotation_age" value to that time difference and, during one of those lull periods, restart the PostgreSQL service. That way, PostgreSQL will rotate the current log file when the next lull period happens. However, if you need to restart the service between these periods, the log rotation will again be skewed and you will have to repeat the process. As with many other things in PostgreSQL, trial and error will dictate the best value. Also, if you are using a log management utility, the log rotation age or size will not matter much, because the log manager agent will ingest each event from the file as it is added.
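
For example, to rotate once a week by age only and disable size-based rotation (the 7-day value is purely illustrative; pick the period that matches your own lull in activity):

log_rotation_age = 10080    # 7 days, in minutes
log_rotation_size = 0       # disable size-based rotation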

Use Tools Like pgBadger

pgBadger is a powerful PostgreSQL log analyzer that can create very useful insights from Postgres log files. It is an open-source tool written in Perl with a very small footprint on the machine where it runs. The tool is run from the command line and comes with a large number of options; a typical invocation is shown after the list below. It takes one or more log files as input and can produce an HTML report with detailed statistics on:

  • Most frequently waiting queries
  • Queries generating the most temporary files or the largest temporary files
  • Slowest running queries
  • Average query duration
  • Most frequently run queries
  • Most frequent errors in queries
  • Users and applications that run queries
  • Checkpoint statistics
  • Autovacuum and autoanalyze statistics
  • Locking statistics
  • Error events (panic, fatal, error, and warning)
  • Connection and session profiles (by database, user, application)
  • Session profiles
  • Query profiles (query types, queries by database/application)
  • I/O statistics
  • etc.
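
As mentioned above, a typical invocation might look like the following (the log file path and output file name are only examples, and the --prefix value should match your own "log_line_prefix"):

pgbadger --prefix 'time=%t, pid=%p %q db=%d, usr=%u, client=%h, app=%a, line=%l ' -o postgres_report.html /var/log/postgresql/postgresql-*.log

Here, -o names the HTML report to produce, and --prefix tells pgBadger how each log line is structured.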

As I mentioned before, some of the log-related configuration parameters have to be enabled to capture all log events so pgBadger can effectively analyze those logs. Since this can produce large log files with many events, pgBadger should only be used to create benchmarks or troubleshoot performance issues. Once the detailed logs have been generated, the configuration parameters can be changed back to their original values. For continuous log analysis, it's best to use a dedicated log management application.

If you are more comfortable working from the command line and making use of system views, you will want to use pg_stat_statements. In fact, this should be enabled in any production PostgreSQL installation.

pg_stat_statements is a PostgreSQL extension that ships with the default installation. To enable it, the "shared_preload_libraries" configuration parameter should have pg_stat_statements as one of its values. It can then be installed like any other extension using the "CREATE EXTENSION" command. The extension creates the pg_stat_statements view, which provides valuable query-related information.
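
A minimal setup sketch is shown below. Note that the column names in the view vary slightly between versions: total_exec_time applies to PostgreSQL 13 and later, while older releases call it total_time.

# in postgresql.conf (requires a server restart)
shared_preload_libraries = 'pg_stat_statements'

-- in the target database
CREATE EXTENSION pg_stat_statements;

-- top ten statements by cumulative execution time
SELECT query, calls, total_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;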

Use a Log Management Application to Gain Insight

There are many log management utilities in the market, and most organizations use one or more of them these days. Whatever tool is in place, I recommend making use of it to collect and manage PostgreSQL logs.

There are a few reasons for it:

It is much easier to parse, analyze, and filter out noise from log files with an automated tool. Sometimes, an event can span multiple log files, depending on the duration of the event and the log rotation age or size. Having a log manager makes it much simpler to have this information presented as a whole.

Log management solutions today typically come with built-in ability to parse PostgreSQL logs. Some also come with dashboards that can show the most common metrics extracted from these logs.

Most modern log management applications also offer powerful search, filter, pattern matching, event correlation, and AI-enabled trend analysis features. What’s not visible to ordinary eyes can be easily made evident by these tools.

Finally, using a log manager to store PostgreSQL logs will also mean the events are saved for posterity, even if the original files are deleted accidentally or maliciously.

Although there are obvious advantages of using a log management application, many organizations have restrictions about where their logs can live. This is a typical case with SaaS-based solutions where logs are often saved outside an organization’s geographic boundary – something that may not be compliant with regulatory requirements.

In such cases, I recommend choosing a vendor with a local data center presence – if possible – or using a self-managed log manager hosted in the organization’s network, such as an ELK stack.

Final Words

PostgreSQL server logs can be a gold mine of information when appropriately configured. The trick is to determine what to log and how much to log, and more importantly, to test whether the logs can deliver the right information when needed. It will be a matter of trial and error, but what I have discussed here should give you a pretty decent starting point. As I said at the beginning, I would be more than happy to hear about your experience of configuring PostgreSQL logging for optimal results.
