This blog was co-written with Ahsan Hadi.
The idea of "horizontal scalability" was discussed in detail at PGCon 2014, held in Ottawa this year. It is great to see the community debating ideas for how PostgreSQL could build horizontally scalable features into the core of the database. While it is still early days, there is talk of extending the Foreign Data Wrapper infrastructure and enhancing Postgres's partitioning capabilities to provide sharding. The road to a finished product will undoubtedly be a long one, but it is encouraging that things have started moving in the right direction.
In this post, we are going to discuss pgpool-II’s load balancing / statement routing and high availability features, explore how suitable these features are for production usage and examine some real world scenarios.
Pgpool-II allows users to create a master/slave configuration, with a single master and multiple slaves. In a master/slave configuration, pgpool-II routes all write queries to the master server and scales the read queries across the slave nodes. Each node's share of the read traffic is controlled by its "backend_weight" parameter; setting the master's weight to 0 keeps read queries off the master entirely. While the writes are being sent to the master, the read queries are load balanced across the slave nodes, providing scalability with mixed and read-intensive workloads.
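As a concrete sketch, the routing setup described above maps onto a handful of pgpool.conf parameters. The hostnames and ports below are placeholders; setting the master's backend_weight to 0 keeps read queries off it entirely:

```
# pgpool.conf (fragment) -- hostnames and ports are illustrative
master_slave_mode = on
master_slave_sub_mode = 'stream'      # slaves fed by streaming replication
load_balance_mode = on                # spread read queries across nodes

# Node 0: the master -- weight 0 means it receives no read queries
backend_hostname0 = 'master.example.com'
backend_port0 = 5432
backend_weight0 = 0

# Node 1: a slave -- read queries are load balanced here
backend_hostname1 = 'slave1.example.com'
backend_port1 = 5432
backend_weight1 = 1
```

With more slaves, the relative weights determine each node's share of the read traffic.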
In order to test the system’s scalability, we created a pgpool-II cluster configuration with one master and five slave nodes using large-size instances in the Amazon cloud. The purpose of the test was to see how the system scaled compared to standalone Postgres as read replicas were added to pgpool-II.
Pgbench was used as the benchmarking tool for conducting the stress test with a read-intensive workload. The performance (measured in TPS) of pgpool-II with up to five slave nodes was compared with Postgres running on a single machine, as well as Postgres with pgpool-II but no replicas. The tests were carried out with PostgreSQL 9.3.2 and pgpool-II 3.3.3 for scale factors of 100, 300, 1,000 and 2,000 and for 100, 300 and 1,000 concurrent clients, with the test duration held constant at 20 minutes. The results of the tests are listed below.
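For reference, read-intensive runs of this shape can be driven with stock pgbench roughly as follows; the host, user and database names are placeholders, and 9999 is pgpool-II's default listen port:

```
# Initialize the benchmark tables at a given scale factor (e.g. 1000)
pgbench -i -s 1000 -h pgpool.example.com -p 9999 -U bench benchdb

# Read-intensive run: built-in SELECT-only script (-S), 100 clients,
# 20 minutes (-T 1200), routed through pgpool-II rather than Postgres directly
pgbench -S -c 100 -T 1200 -h pgpool.example.com -p 9999 -U bench benchdb
```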
In summary, we see that with a relatively low number of concurrent clients and workloads that fit in memory, Postgres performs better in a standalone environment. However, when we increase the number of concurrent clients and the scale factor such that the tables are larger than the memory of the single machine, we can achieve good scalability by adding pgpool-II and replicas to the environment.
As you can see in the line graphs for 100 and 300 concurrent clients, the lines for 1,000 and 2,000 scale factor have a positive slope. The same holds true with 1,000 concurrent clients, but we also notice a slight dip when going from four to five replicas. All the graphs show the decrease in performance when going from “PG only” to “pgpool-II no replicas,” which is expected, as there is overhead from having pgpool-II with no replicas as compared to standalone Postgres.
The conclusion from the tests is that we can scale the system for larger workloads and a high number of concurrent clients with read-intensive or mixed workloads. As we look ahead to pgpool-II 3.5, we plan to explore the areas of query latency and performance with prepared queries to see where there may be room for more improvement.
Now let’s switch focus to look at another important area where Postgres doesn’t have an answer out of the box: High availability (HA) and single point of failure. HA is an important aspect of mission critical systems, ensuring no single point of failure and automatic failover for clients connecting to the application.
Postgres doesn’t have any built-in features for HA and automatic failover, so users have to rely on external tools for this functionality. EnterpriseDB provides the EDB Failover Manager product, currently supported on Red Hat Enterprise Linux and CentOS platforms, to provide HA. EDB has also worked with customers who configure other cluster solutions like Red Hat Cluster Suite or Veritas Cluster Server.
Pgpool-II also provides a capability in this area known as the ‘watchdog’ feature. It provides high availability for Postgres by using a virtual IP. Users can configure watchdog for a pgpool-II cluster through the watchdog parameters in the pgpool.conf configuration file. The main functionalities provided by watchdog include:
- Life checking of pgpool service
- Mutual monitoring of watchdog processes
- Changing active/standby state when certain faults are detected
- Automatically reassigning the virtual IP address in sync with server switching
- Automatically registering a recovered server as a standby
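As a sketch of how this is wired up, a minimal watchdog section of pgpool.conf for a two-node cluster might look like the following (addresses are placeholders; the second node mirrors this configuration with the hostnames swapped):

```
# pgpool.conf watchdog fragment on node 1 -- addresses are illustrative
use_watchdog = on
wd_hostname = 'pgpool1.example.com'       # this node
wd_port = 9000

# Virtual IP held by whichever node is active
delegate_IP = '192.168.10.100'

# The peer pgpool-II node to monitor
other_pgpool_hostname0 = 'pgpool2.example.com'
other_pgpool_port0 = 9999
other_wd_port0 = 9000

# Life-check settings
wd_lifecheck_method = 'heartbeat'
wd_interval = 10                          # seconds between life checks
heartbeat_destination0 = 'pgpool2.example.com'
```

If node 1 fails its life checks, node 2's watchdog takes over the delegate IP, so clients reconnect without any configuration change.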
Here at EDB, we have tested the watchdog feature for the following failover scenarios using two pgpool-II nodes both configured with watchdog:
- Disconnecting the network cable from one of the servers
- Shutting down pgpool-II service on one of the servers
- Abruptly powering down a pgpool-II server
For each of the above scenarios, the secondary node became the primary node by reassigning the virtual IP. Any new client connections would now seamlessly connect to the secondary node. In addition to our internal testing, the PostgreSQL consortium in Japan has carried out extensive tests of the watchdog feature and found it works as expected.
EDB is planning to announce official support for the watchdog feature in pgpool-II 3.3+ to provide high availability.
Pgpool-II 3.4 will be released later this fall, and it is expected to deliver additional performance enhancements by integrating the Postgres memory and exception managers into pgpool-II. This work was done by EDB’s Muhammad Usama, who heads EDB’s pgpool-II initiative and supports the open source PostgreSQL project as a pgpool-II committer.
Ahsan Hadi is Senior Director, Product Development, EnterpriseDB.
Jason Davis is Director, Product Management at EnterpriseDB.