Contributed by Douglas Hunley
A well-prepared DBA thinks about database strategy sooner rather than later when planning a Postgres rollout. Working through the key database variables and questions in advance gives you the inside knowledge you need to be the IT hero when you reveal a solid deployment plan that directly supports the goals of the business.
Part 2 of this two-part blog series addresses the second set of 5 in the top 10 list of things to consider before undertaking a Postgres rollout.
6) How much data can I afford to lose?
If you wait until everything is up and running before you start thinking about recovery, you may find that you’ve painted yourself into a corner. You need to ask yourself two key things: 1) What is the loss to the business per minute that the application is down, and 2) What is the loss to the business for every transaction I can’t recover from my database? The answers to these questions will tell you if you need to architect for a highly available system or a highly recoverable system.
If it’s HA, you need to start thinking about the things that will keep the dataset functioning and the application up and running even if an entire site is wiped out in a disaster: offsite replication, automatic failover, and multi-master replication. If, on the other hand, you’re more concerned with recovering transactions, and it’s acceptable for the app to go down for five or ten minutes while you recover data, then you should instead start thinking about synchronous replication, continuous WAL archiving, and point-in-time recovery (PITR).
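To make that second path concrete, continuous WAL archiving and PITR come down to a handful of settings. Here is a minimal sketch for a 9.x cluster; the archive directory and the recovery target time are placeholders you would replace with your own values.

    # postgresql.conf on the master: continuous WAL archiving
    wal_level = hot_standby                # 'archive' is sufficient if you never need a readable standby
    archive_mode = on
    archive_command = 'test ! -f /mnt/wal_archive/%f && cp %p /mnt/wal_archive/%f'

    # recovery.conf on the server being restored: point-in-time recovery
    restore_command = 'cp /mnt/wal_archive/%f %p'
    recovery_target_time = '2015-01-15 09:30:00'   # the moment just before the data loss you want to undo

On top of a base backup (taken with pg_basebackup or a filesystem-level copy), those two recovery settings let you roll the cluster forward to any moment covered by the archived WAL.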
7) Is ASYNC enough or do I need SYNC?
When building a replication system, beyond determining availability vs. recoverability, you also have to consider what type of replication you’ll need. Postgres replication is asynchronous by default, so a replica may be one or more transactions behind the master. In practice, replicas usually stay up to date, but if they’re geographically dispersed and the network between them has some amount of latency, or if you’re doing reporting on the replica, you can end up with a replica that is a second, or a fraction of a second, behind production.
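If you want to see how far behind an asynchronous replica actually is, you can query the master’s pg_stat_replication view. This sketch uses the 9.x function and column names:

    -- Run on the master: per-replica lag in bytes (Postgres 9.2 through 9.6 naming)
    SELECT application_name,
           state,
           sync_state,
           pg_xlog_location_diff(pg_current_xlog_location(), replay_location) AS replay_lag_bytes
    FROM pg_stat_replication;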
If that scenario is not acceptable, you can reconfigure your replication to be synchronous, which is immediately consistent. The trade-off is a measurable performance impact on the master: when you or the application commits anything to the master’s database, the transaction is passed over to the replica before that commit is returned to you as completed. The replica has to commit it and send a message back confirming it has been committed and verified before the master acknowledges the commit to you. We may be talking about milliseconds, but that can add up over the life of a page or screen in your application and become a noticeable throughput impact. Another thing to keep in mind with synchronous replication is that only one synchronous replica can be active at a given time.
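If you decide you do need synchronous replication, the switch is a configuration change rather than a rebuild. A minimal sketch for a 9.x cluster, assuming a single standby whose application_name is 'standby1' (the names and hosts here are placeholders):

    # postgresql.conf on the master
    synchronous_standby_names = 'standby1'   # must match the standby's application_name
    synchronous_commit = on                  # each commit waits for standby1 to confirm the WAL write

    # recovery.conf on the replica
    standby_mode = 'on'
    primary_conninfo = 'host=master.example.com port=5432 user=replicator application_name=standby1'

You can list several names in synchronous_standby_names, but only the first connected standby in that list acts as the synchronous one at any given moment; the others are standby candidates that take over the role if it disconnects.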
8) Cascaded or Fanned Replication?
There are two ways to structure the replication: cascaded and fanned. Cascaded replication is essentially multi-level replication: you have a master that feeds a replica, and that replica in turn feeds its own replica. The nice thing about cascaded replication is that if the master should die, you still have a replication tier. The downside is that the second-level replica can potentially fall further behind the master. Fanned replication doesn’t have that problem, but if the master should go down, all of the fanned replicas stop replicating; they essentially sit there with a snapshot of the data as it looked the moment the master died.
Fortunately, Postgres doesn’t impose any limitations on mixing and matching these two structures, and the same is true for mixing synchronous and asynchronous replication, so you can arrange them in any combination that meets your needs.
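Configuration-wise (again assuming a 9.x cluster, where standby settings live in recovery.conf), a cascaded replica looks just like any other standby; the only difference is where primary_conninfo points. A rough sketch with placeholder hostnames:

    # recovery.conf on a second-level (cascading) replica:
    # point at the first-level replica instead of the master
    standby_mode = 'on'
    primary_conninfo = 'host=replica1.example.com port=5432 user=replicator application_name=replica2'

    # For a fanned layout, every replica's primary_conninfo would instead point at the master itself.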
9) How do I upgrade?
You might be thinking, “Why am I thinking about upgrades? I haven’t even deployed yet!” But upgrades are a fact of life, and they will potentially cause you some amount of downtime, so you need to architect things to make upgrades as smooth a process as possible. When a new major version of Postgres is released (9.2 to 9.3, 9.3 to 9.4, etc.), the Postgres community does not guarantee that the data in the database is stored in a backward-compatible representation. So for every major version change, you’re going to have to take a cycle where you extract your data, do the upgrade, and re-insert your data. Minor version upgrades, on the other hand, do guarantee backward compatibility: all you need to do for a minor upgrade (9.3.5 to 9.3.6, etc.) is replace the binaries on your system and restart your database. Knowing this allows you to architect the system to work around the limitations of the upgrade process.
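In practice, that difference looks roughly like the commands below. These are illustrative only; the ports, paths, and package name are placeholders for whatever your platform actually uses.

    # Major version upgrade (e.g. 9.3 -> 9.4): dump from the old cluster, reload into the new one
    pg_dumpall -p 5432 > /backups/full_dump.sql       # old 9.3 cluster still running on port 5432
    # ...install 9.4, initdb a new data directory, start it on port 5433...
    psql -p 5433 -f /backups/full_dump.sql postgres

    # Minor version upgrade (e.g. 9.3.5 -> 9.3.6): swap the binaries and restart
    yum update postgresql93-server                    # or your platform's equivalent
    pg_ctl -D /var/lib/pgsql/9.3/data restart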
10) Why upgrade?
With Postgres, minor version upgrades (9.3.5 to 9.3.6, etc.) contain only three kinds of changes: data corruption fixes, security fixes, and fixes for bugs that are highly visible across the Postgres install base. You won’t find changed or new functionality in a minor release. It’s in your best interest to keep up with minor releases; in fact, you’re at more risk by not doing so, because you’re missing security patches and corruption fixes. Because the community understands that downtime is involved in major version upgrades, each major version is supported by the community for five years. If you're running EDB's Postgres Advanced Server product, you have a three-year support window for any major release (9.2, 9.3, etc.). In either case, you have plenty of time to plan for major version upgrades.
Now that you have all of this figured out and know what you’re going to deploy, you still need to deal with the fact that things change over time. Your load, hardware, software, business needs, and user base will all change. As the architect of the system, you need to periodically revisit these questions. At a minimum, we suggest doing so annually, though you may need to do it more frequently depending on your specific objectives and needs.
Please visit EnterpriseDB for more information about using Postgres in the enterprise, or contact EDB to send an inquiry.
Doug Hunley is a Certified System Engineer and Certified Postgres Trainer at EnterpriseDB.