Headaches will come with just about any new project, but pausing to consider the following steps before getting started with a new Postgres deployment should save you quite a few. Even seasoned DBAs can benefit from a refresher on some of these topics. So, in order to reduce risk, minimize potential post-deployment problems, and head off long-term maintenance challenges, let’s consider the following top 10 steps to a smoother Postgres deployment.
These topics are explored in greater detail in the webcast 10 Tips for an Effective Postgres Deployment, which was recorded on Nov. 19 and can be viewed here.
1. New vs. Existing Hardware
Servers are becoming more powerful even as the price of RAM falls. Should you focus on CPUs or cores? What about I/O? Can you really extend the useful life of your existing hardware and still keep up with the demands of the business? In many cases, the answer is yes. There are many variables to consider, and the right “mix” can yield some impressive results on a limited budget. We can offer some guidelines and suggestions to get you started.
2. I/O Partitioning
Once you have the hardware, think about using tablespaces in your database. Hybrid storage solutions that combine SSDs and HDDs work very well, but they don’t “understand” the concept of a database running on top of them, so it falls to you to direct each kind of I/O to the right device. Additionally, avoid any storage system that doesn’t create a local device node on your system.
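As a minimal sketch, I/O partitioning with tablespaces might look like the following; the mount points and object names here are hypothetical, and each directory must already exist and be owned by the postgres OS user:

    -- Hot, frequently updated data goes to SSD-backed storage;
    -- colder archive data goes to the HDD pool.
    CREATE TABLESPACE fast_ssd LOCATION '/mnt/ssd/pgdata';
    CREATE TABLESPACE bulk_hdd LOCATION '/mnt/hdd/pgdata';

    -- Direct I/O by placing each object in the appropriate tablespace.
    CREATE TABLE orders (id bigint, placed_at timestamptz) TABLESPACE fast_ssd;
    CREATE INDEX ON orders (placed_at) TABLESPACE fast_ssd;
    CREATE TABLE orders_archive (LIKE orders) TABLESPACE bulk_hdd;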
3. Application Needs
It is very easy to design a database solution that is optimal from your point of view but not optimal for the application, so it is vital to understand what the application wants. How does the application behave? Does it run a few long transactions that crunch lots of data? Or does it constantly open small connections, fire off quick little queries, and close the connection again? Does it do more reads or more writes? All of these answers should inform how you tune the system upon deployment.
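As a rough illustration of how those answers steer tuning, consider these postgresql.conf settings; the values are placeholders for discussion, not recommendations:

    # A workload of many short-lived connections firing quick queries:
    max_connections = 300          # expect lots of concurrent sessions
    work_mem = 4MB                 # keep per-operation memory modest

    # A workload of a few long-running, data-crunching transactions
    # would invert those choices, e.g.:
    # max_connections = 50
    # work_mem = 256MB             # big sorts and joins benefit
    # maintenance_work_mem = 1GB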
4. Application-Database Compatibility
You can build a database system that is highly available and redundant, with transparent failover, but what happens if the application does not know how to recover properly? There is no such thing as a truly instant failover, so there will always be some period of time, perhaps less than a second, when your database is temporarily unavailable. Can your application handle this? How do you plan for it?
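One common mitigation is client-side retry. The shell sketch below, using a hypothetical hostname, shows the idea with psql; a real application would implement the same loop in its own connection layer, ideally with a backoff and a retry limit:

    # Keep trying until the database accepts connections again.
    until psql "host=db.example.com dbname=app" -c 'SELECT 1' >/dev/null 2>&1; do
        echo "database unavailable, retrying in 1s..." >&2
        sleep 1
    done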
5. High Availability vs. High Recoverability
When most people say “availability,” they really mean “recoverability,” but the two are distinct. High availability (HA) simply means that the application doesn’t go down, or if it does, it is down for as little time as possible. With highly recoverable systems, what matters most is that every transaction committed to the database stays committed and is recoverable. In that scenario, you’ll want to think about things like warm standby, WAL archiving, and point-in-time recovery (PITR). How do you determine what you need, and whether you need a combination of the two?
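As a minimal sketch, these postgresql.conf settings enable the WAL archiving that PITR depends on; the /archive path is a placeholder and should live on durable storage separate from the database server:

    wal_level = replica        # use 'hot_standby' or 'archive' on pre-9.6 releases
    archive_mode = on
    # Copy each completed WAL segment, refusing to overwrite an existing file.
    archive_command = 'test ! -f /archive/%f && cp %p /archive/%f'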
6. Plan Ahead for Data Loss
If you wait until everything is up and running before you start thinking about recovery, you may find that you’ve painted yourself into a corner. You need to ask yourself two key questions: 1) What does it cost the business for every hour, minute, or second that the application is down? 2) What does it cost the business for every transaction that can’t be recovered from the database? We can help you answer these questions and architect a highly available system, a highly recoverable system, or a combination of the two.
7. Synchronization vs. Performance
When building a replication system, you have to consider what type of replication you need. Postgres replication is asynchronous by default. Geographically dispersed systems have latency, and if you’re doing reporting on a replica, that replica could be slightly behind the master. If that is not acceptable, you can switch to synchronous replication, which is immediately consistent but carries a measurable performance cost, because every commit must wait for the standby’s acknowledgment. So how do you determine the business need and priority regarding replication?
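As a sketch, turning on synchronous replication is a configuration change on the master; 'standby1' is a placeholder that must match the application_name the standby uses when it connects:

    # postgresql.conf on the master
    synchronous_standby_names = 'standby1'
    synchronous_commit = on    # the default; commits now wait for the standby

A useful middle ground is that synchronous_commit can be turned off per session or per transaction, so non-critical writes can skip the wait while critical ones keep the guarantee.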
8. Cascaded vs. Fanned Replication
There are two modes for building the replication structure: cascaded and fanned. Cascaded replication is essentially multi-level replication. The nice thing about it is that if the master dies, you still have an intact replication tier below the surviving standby. The downside is that a second-level replica can lag even further behind the master. Fanned replication doesn’t have that problem, but if the master goes down, all of the fanned replicas stop replicating. Why not have your proverbial cake and eat it too? The two topologies can be combined, fanning standbys off the master and cascading deeper tiers from them.
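As a sketch of the cascaded case, a second-level standby simply points its replication connection at a first-level standby instead of at the master; the hostname is a placeholder, and on the 9.x releases this goes in the standby’s recovery.conf:

    standby_mode = 'on'
    primary_conninfo = 'host=standby1.example.com port=5432 user=replicator'

A fanned replica differs only in that primary_conninfo points at the master itself.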
9. Planning Ahead for Upgrades
Upgrades are a fact of life, so planning up front for the downtime they cause, by architecting things properly, will spare you a headache or three. For major Postgres releases, the Postgres community does not guarantee that data is stored on disk in a backwards-compatible representation, so moving to a new major version requires converting or migrating the data. Minor releases, on the other hand, do guarantee backwards compatibility. Knowing this allows you to architect the system to work around the upgrade process.
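For example, one common approach to an in-place major upgrade is the pg_upgrade utility; the version numbers and paths below are examples for a typical packaged installation:

    # Dry run first; re-run without --check (optionally with --link) to upgrade.
    pg_upgrade --check \
        --old-bindir  /usr/pgsql-9.2/bin \
        --new-bindir  /usr/pgsql-9.3/bin \
        --old-datadir /var/lib/pgsql/9.2/data \
        --new-datadir /var/lib/pgsql/9.3/data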
10. Don’t Skip the Upgrades
With Postgres, minor releases fix bugs, security issues, and data corruption problems, so it is obviously important to keep up to date. Because the Postgres community understands that downtime is involved with major version upgrades, each major version is supported by the community for five years. EDB likewise provides five years of support for each major release of its Postgres Plus Advanced Server product (9.2, 9.3, etc.). In either case, you have plenty of time to plan for major version upgrades.
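Because minor releases never change the on-disk format, applying one is usually just a package update and a restart. This sketch assumes a Red Hat-style system with PGDG packages; package and service names vary by platform:

    sudo yum update postgresql92-server    # package name is an example
    sudo service postgresql-9.2 restart    # brief restart, no data conversion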
To learn more about these topics, plus additional tips and recommendations, view the webcast 10 Tips for an Effective Postgres Deployment. You may also contact us directly for more information on how Postgres can help transform your IT.
Phil Vacca is a Senior Database Consultant at EnterpriseDB.