5 Common PostgreSQL Challenges in the Cloud: AWS RDS & Aurora

Jamie Watt

August 18, 2020

“The cloud” feels like it’s been a thing for about, well, forever at this point. Yet we see many customers who are just beginning to explore their options with the most popular providers. The catalyst is rarely a burning desire to move just your databases to a cloud provider; more often than not, it’s the result of a corporate initiative to move applications and infrastructure to a public cloud, and the database is just along for the ride.

#1: Is DBaaS really PostgreSQL?

Customers come to EDB for PostgreSQL needs across the spectrum, including those on a Database-As-A-Service platform. When a DBA transitions a classic PostgreSQL estate to one of these providers, there are a few things to take into consideration, and the first question to ask is, “Where am I going?”

When it comes to installing PostgreSQL directly onto cloud compute, you escape the “Is it PostgreSQL or is it PostgreSQL?” identity crisis; there are infrastructure and design considerations to take into account, but we can exclude the database from those challenges.

However, when it comes to DBaaS offerings, the answer depends on the provider and the product.

What does Amazon Web Services (AWS) offer as a PostgreSQL database service?

AWS offers PostgreSQL in two major flavors:

A basic form with some adjustments for their infrastructure (Amazon RDS for PostgreSQL)
A more robust form with major differences for their architecture (Amazon Aurora with PostgreSQL compatibility).

The former offers much of what you need: a quick-to-configure, easy-to-manage implementation of PostgreSQL. Bit-for-bit, it’s not the same as PostgreSQL, but functionally speaking, it’s a strong equivalent for the common use case. Features like Multi-AZ and snapshots are accomplished easily and out-of-the-box here. Performance is consistent. IOPS make throughput needs straightforward for low-to-mid demand platforms.

Amazon RDS for PostgreSQL

There are, of course, some caveats to consider. RDS does not allow you to directly edit the postgresql.conf or pg_hba.conf files. It does allow you to use DB parameter groups and database grants to handle these respective needs, but that means that any automation you use for a database in any other environment will require changes. Performance tuning is also more limited, even if RDS does “make it easy”, because tweaks like WAL storage changes, pgpool, or pgbouncer are out if you’re on RDS. Finally, and more commonly known, you don’t have access to the compute itself. For many, these aren’t showstoppers, but they’re worth knowing before you make your choice.
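To make the automation change concrete, here is a minimal Python sketch of translating postgresql.conf-style settings into the list-of-dictionaries shape that RDS DB parameter groups accept. The function names and the RESTART_REQUIRED set are illustrative assumptions, the parser handles only simple "name = value" lines, and some RDS parameters expect different units than postgresql.conf does, so treat the output as a starting point rather than something to apply blindly.

```python
# Hypothetical helper: turn postgresql.conf-style settings into the
# parameter list shape used by RDS DB parameter groups.

# Assumption: a partial, hand-maintained set of settings that only take
# effect after a restart. The parameter group itself is the authority.
RESTART_REQUIRED = {"shared_buffers", "max_connections", "wal_buffers"}


def parse_conf(text):
    """Parse simple 'name = value' lines, skipping comments and blanks.

    Simplification: values containing '#' or lines without '=' are not
    handled the way PostgreSQL itself would handle them.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        name, _, value = line.partition("=")
        settings[name.strip()] = value.strip().strip("'")
    return settings


def to_parameter_list(settings):
    """Build one parameter entry per setting, sorted by name."""
    return [
        {
            "ParameterName": name,
            "ParameterValue": value,
            "ApplyMethod": (
                "pending-reboot" if name in RESTART_REQUIRED else "immediate"
            ),
        }
        for name, value in sorted(settings.items())
    ]
```

The resulting list could then be passed as the Parameters argument of boto3's modify_db_parameter_group call for a parameter group you own; that step is left out so the sketch runs without AWS credentials.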

Amazon Aurora with PostgreSQL Compatibility

Aurora’s PostgreSQL-Compatible Edition is a bit different. It’s highly modified for performance and resilience. The advantages over plain RDS PostgreSQL are straightforward: storage is highly resilient, because even a single Aurora instance carries six copies of the data, and these copies are distributed across three availability zones. You retain the benefits of RDS PostgreSQL, but gain a version of PostgreSQL with a heavily modified storage model that strongly favors applications performing a high volume of queries and transactions at the same time. This is because the engine itself does not push modified pages down to the storage layer, and instead relies on an in-memory page caching model. This is often touted as a stronger model for recovery times as well as a way to reduce IOPS traffic, but it’s worth noting that the cost saved by this change is often overshadowed by the cost of the traffic for the six copies themselves. Aurora’s PostgreSQL flavor also carries the same caveats we noted for RDS PostgreSQL.

What does Microsoft Azure offer as a PostgreSQL database service?

Azure Database for PostgreSQL has features and caveats similar to Amazon’s offerings, though it also has Hyperscale as a deployment option. The product of Microsoft’s acquisition of Citus Data, Hyperscale allows for a more powerful growth model with horizontal scalability built in.

One key consideration that every DBA should listen to intently, though, is that none of these options is purely community-distributed PostgreSQL, nor are they solely database-focused versions. They are modified for, and are dependent on, their environments. In some cases, these modifications may mean that PostgreSQL itself is forked to support optimizations that make the best use of the cloud provider's architecture. If there is a fork, then any defect fixes or changes made in the core distribution of PostgreSQL will often take an unknown period of time to be merged from the community version into the provider's fork. While an option like EDB’s Advanced Server addresses these changes in step with the community, this may lag by months in the cloud. This is crucial for environments that can’t tolerate weeks or months of additional exposure to defects that have already been fixed elsewhere. Which leads us to #2.

#2 Ground Control To Major Versions

If you’re still undecided on the advantages of DBaaS vs IaaS, be sure to consider your need for features in later versions of PostgreSQL, because their availability “in the cloud” is a classic YMMV situation.

Amazon RDS PostgreSQL does a fair job, lagging only about five months behind the community’s release in the latest round with Version 12. Amazon Aurora, however, lags a year or more behind: PostgreSQL Version 11 debuted in October 2018, but didn’t reach Aurora until November 2019.

Azure Database for PostgreSQL fared slightly better, releasing support for Version 11 in June of 2019.

Google’s Cloud SQL fared poorly in the past: 11 months to GA for Version 11, and more dubiously, 29 months for Version 10. However, they’ve gotten faster in recent times, with Version 12 support launched in May of 2020.

While it seems like just a numbers comparison, take into consideration the changes that came in these versions and evaluate whether or not it’s meaningful for you to wait a year or more to have them available. Version 11 brought improvements to partitioning performance, transaction control in stored procedures, query parallelism, and JIT compilation. Version 12 brought advancements in JSON, indexing performance and functionality, collations, LDAP supportability, and the merging of recovery.conf into postgresql.conf. Pretty big stuff to just sit around and wait for, isn’t it?
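If you want to track this lag for the versions you care about, the arithmetic is simple. The sketch below uses only the Version 11 dates quoted above, at month granularity; the function and variable names are illustrative.

```python
def months_between(start, end):
    """Whole months from a (year, month) start to a (year, month) end."""
    return (end[0] - start[0]) * 12 + (end[1] - start[1])


# Community release of PostgreSQL 11, as cited in this article.
COMMUNITY_V11 = (2018, 10)

# Provider availability of Version 11, as cited in this article.
PROVIDER_V11 = {
    "Amazon Aurora": (2019, 11),
    "Azure Database for PostgreSQL": (2019, 6),
}


def release_lag(community, providers):
    """Map each provider name to its lag, in months, behind the community."""
    return {
        name: months_between(community, available)
        for name, available in providers.items()
    }


if __name__ == "__main__":
    for name, lag in release_lag(COMMUNITY_V11, PROVIDER_V11).items():
        print(f"{name}: {lag} months behind")
```

Adding a row per provider and major version gives you a quick view of how long a feature you depend on is likely to stay out of reach.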

#3 IaaS Changes Everything

Oftentimes, we see customers who have paid handsomely to assess their cloud transition, so while I’ve seen horror stories involving application performance failures or a failure to account for the geographical distance between application and database, we won’t focus on those here.

Instead, we’ll highlight some of the more common database failures we see customers experience in the most basic of cloud strategies: moving a virtual or physical “on premise” design to a cloud provider’s infrastructure.

Firstly, know the limitations of your provider.

What are the Amazon Web Services storage limitations?

Amazon, as an example, has storage limitations related both to the volume size itself and to the CLI features you can make use of:

16TB maximum volume size on EBS
5TB maximum size for a single stored object in S3
5GB maximum size for any single PUT of an object
Storage performance and availability will vary greatly based on Storage Classes
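These limits are easy to check before a backup job or a data load runs into them. The following Python sketch encodes the three size limits listed above and picks an upload approach for a backup file. The names are illustrative, the limits are treated as powers of 1024 (an assumption), and the 20% headroom is an arbitrary example value.

```python
TIB = 1024 ** 4
GIB = 1024 ** 3

# Limits as listed in this article, assumed here to be binary units.
EBS_MAX_VOLUME = 16 * TIB
MAX_OBJECT = 5 * TIB
MAX_SINGLE_PUT = 5 * GIB


def upload_strategy(size_bytes):
    """Choose how a backup file of the given size can be stored as objects."""
    if size_bytes <= MAX_SINGLE_PUT:
        return "single PUT"
    if size_bytes <= MAX_OBJECT:
        return "multipart upload"
    return "split into multiple objects"


def fits_on_one_volume(data_bytes, headroom=0.2):
    """True if the data plus a growth margin fits on one EBS volume.

    headroom is a hypothetical safety margin, expressed as a fraction.
    """
    return data_bytes * (1 + headroom) <= EBS_MAX_VOLUME


if __name__ == "__main__":
    print(upload_strategy(200 * GIB))
    print(fits_on_one_volume(14 * TIB))
```

A check like this belongs in the same place as your backup scheduling, so that a database growing past a limit shows up as a warning instead of a failed job.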

What are the Microsoft Azure VM limitations?

Similarly, Azure VMs have their own limitations based on their design principles:

IOPS are directly tied to the VM itself
VIPs require special handling due to Azure’s IP model
Use of Azure’s load balancer is strongly recommended (and documented)

Going a layer deeper to the compute itself, remember to consider whether you want to use an industry-wide distribution, like RHEL or Debian, or plan to make use of the provider’s own distribution, such as Amazon Linux.

What are the benefits or disadvantages of using a cloud provider’s OS like Amazon Linux?

Using the provider’s OS flavor often means more seamless upgrades and less demand for a typical Linux administration role, but it also means that libraries and functionality can be upgraded or deprecated out of step with what you might expect. It also means that your method of installation and upgrades may be equally sensitive to this difference; as an example, we’ve seen customers who are trying to make use of Amazon Linux have yum installs just plain fail as a result of these library differences.

#4 Deployment and Automation

Many IT and DevOps teams have made the shift from basic shell and Perl scripts to the modern deployment and automation tools now prevalent: Chef, Puppet and Ansible are those we see most frequently in traditional environments. However, when you begin to extend your domain to the cloud, you have to consider whether or not what works today will work tomorrow.

Chances are, it won’t. APIs and CLIs vary between the providers. If you’re not careful, you multiply your effort, both in creating scripts that work best for a given environment and in maintaining those scripts. You also see the need for pre- and post-promotion scripts, which serve as glue where there are more crucial differences from one platform to the next. Examples in the past have included prewarmed EBS volumes on AWS, or VIP handling for failover in Azure VMs.
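One way to keep that glue from sprawling is to register provider-specific steps against a shared promotion workflow. The Python sketch below is an illustration only: the hook names, provider keys, and node name are hypothetical, and each hook simply returns a description of what a real implementation would do.

```python
from collections import defaultdict

# Registry of hooks, keyed by (provider, stage).
_HOOKS = defaultdict(list)


def hook(provider, stage):
    """Decorator that registers a function for a provider and stage."""
    def register(func):
        _HOOKS[(provider, stage)].append(func)
        return func
    return register


@hook("aws", "pre")
def prewarm_ebs_volume(node):
    # Placeholder: a real hook would read the volume's blocks here.
    return f"prewarm EBS volume on {node}"


@hook("azure", "post")
def repoint_load_balancer(node):
    # Placeholder: a real hook would update the load balancer backend.
    return f"repoint load balancer to {node}"


def run_hooks(provider, stage, node):
    """Run every hook registered for the provider and stage, in order."""
    return [func(node) for func in _HOOKS.get((provider, stage), [])]


if __name__ == "__main__":
    print(run_hooks("aws", "pre", "pg-node-2"))
    print(run_hooks("azure", "post", "pg-node-2"))
```

The promotion logic itself stays identical across providers; only the registered hooks differ, which keeps the per-provider maintenance cost visible and contained.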

Using Terraform to Deploy Across Cloud Providers

Terraform has become a dominant player for these transitions and extensions, and it’s easy to see why: with extensibility for existing automation platforms like those named above, you can use Terraform to translate deployment across a number of providers. You do have to take on the burden of a new tool, but this tool allows you to traverse not only cloud providers, but also VMware. A newer player in the market, Pulumi, has begun to take it a step further: it lets you define infrastructure in general-purpose programming languages, whether you’re used to Go, Python, or TypeScript.

Taking these tools and marrying them to your traditional PostgreSQL needs takes work, and if you aren’t careful, you’ll end up with a suite of manual actions instead of a smooth, efficient operation. Cluster creation, backups, replication, monitoring configuration, and other activities are all important considerations. Luckily, there are script projects out there, such as EDB’s deployment project, that help to take the guesswork out of your efforts.

#5 Easy Entry & Hard Departure

Wherever you plan to take your databases, remember: it’s easier to get where you’re going than it is to leave where you’ve gone. Migration tools are readily available, such as Amazon’s DMS tool or Azure’s Database Migration Service. But if you find yourself moving again, either to protect against single-provider failure or as a consequence of the “next big thing”, your effort to leave varies greatly based on your prior decisions.

Database Migration Tools or Services When Moving Out from Cloud Providers

As an example, DBaaS options don’t have a tool or service to leave the platform. All of the hard work you’ve done to address the migration to your original target will now have to be done all over again, and it won’t end there. You may need to redesign your architecture and data structure depending on whether you intend to span multiple providers or migrate entirely; and where you intend to span more than one, remember that the “ease of adoption” isn’t just a technical event.
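In the absence of a provider-supplied exit tool, one fallback is PostgreSQL’s own logical dump tooling. The Python sketch below only builds a pg_dump command line and does not run it; the host, user, database, and file names are hypothetical, and whether a logical dump is practical depends on your data volume and downtime tolerance.

```python
def build_pg_dump_command(host, dbname, user, outfile, port=5432):
    """Build a pg_dump argument list for a portable logical dump.

    The custom format can be restored with pg_restore. Ownership and
    privileges are skipped because provider-specific roles are unlikely
    to exist on the target.
    """
    return [
        "pg_dump",
        f"--host={host}",
        f"--port={port}",
        f"--username={user}",
        f"--dbname={dbname}",
        "--format=custom",
        "--no-owner",
        "--no-privileges",
        f"--file={outfile}",
    ]


if __name__ == "__main__":
    # Hypothetical endpoint and names, for illustration only.
    command = build_pg_dump_command(
        host="mydb.example.internal",
        dbname="appdb",
        user="migrator",
        outfile="appdb.dump",
    )
    print(" ".join(command))
```

Rehearsing this against a copy of production early on tells you how long an exit would really take, which is far better to learn before you need it.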

Consider the Cost of Outbound and Inbound Traffic when Using Cloud Providers

The cost of outbound traffic, as an example, is massively higher than that of inbound traffic. If you design a DR solution that allows for a warm failover capability, the replication traffic that keeps it consistent appears to be free while you’re doing it all within one cloud. Moving to a multi-cloud design means that all of that traffic becomes outbound, and it amounts to a lot of Lego pieces you’ll step on as soon as you see your monthly bill.
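A back-of-the-envelope estimate makes the point before the bill does. In the Python sketch below, the daily replication volume and the per-GB price are made-up example inputs, not quotes from any provider; substitute your own measured WAL volume and your provider’s published egress rate.

```python
def monthly_egress_cost(wal_gb_per_day, price_per_gb, replicas=1, days=30):
    """Estimate the monthly cost of shipping replication traffic out.

    wal_gb_per_day: replication volume generated per day, in GB.
    price_per_gb:   outbound transfer price per GB (an input you supply).
    replicas:       number of standbys outside the provider.
    """
    return wal_gb_per_day * days * replicas * price_per_gb


if __name__ == "__main__":
    # Hypothetical inputs: 50 GB/day of WAL and an example rate of 0.09 per GB.
    same_cloud = monthly_egress_cost(50, 0.0)
    multi_cloud = monthly_egress_cost(50, 0.09)
    print(f"within one cloud: {same_cloud:.2f}")
    print(f"across clouds:    {multi_cloud:.2f}")
```

The estimate ignores base backups and re-synchronization after a failover, both of which add to the outbound total.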

Some may opt for containerized designs, which can be the most portable and the easiest to move from one environment to another. If you’re already carrying this knowledge within your teams, this is an option strongly worth considering, provided your databases are designed to be performant in a container deployment model.

So now what?

These examples are challenges we’ve seen many customers face, and the outcome is most often successful if these considerations are faced early and often. Facing these challenges as problems and symptoms once the solution is already in place is a far harsher place to do discovery. We strive to enable users of PostgreSQL to be successful, and while we focus on guidance and help every day with our customers in Technical Support and Remote DBA, when it comes to problems, the best experience is no experience. If you’re moving to a cloud provider, or incorporating another provider into your existing design, make sure not only to design a comprehensive solution, but to test it early and often.

Want to learn more? Tune in to our recent Postgres Pulse podcast recording to hear the experts discuss how to overcome challenges if you're new to PostgreSQL in the cloud.
