This blog post discusses the storage and RAID options that can be used with PostgreSQL and EDB Postgres Advanced Server (EPAS).
In this post, we’ll discuss the following options:
- Direct Attached Storage (DAS)
- Fibre Channel - Storage Area Network (SAN)
- iSCSI
- Network Attached Storage (NAS) and Network File System (NFS)
How does Postgres use storage systems to store data?
Before we start discussing storage options, we need to understand some of the technical background of how PostgreSQL and EDB Postgres Advanced Server use storage systems.
Almost every Postgres transaction is written to disk, at minimum to the Write Ahead Log (WAL). Writes to the WAL are sequential in nature, and it is generally only read during recovery or when needed for replication. These operations need to be synced to disk in a way that guarantees data integrity is not lost. Postgres lets you specify which method it should use when writing to a WAL file through the wal_sync_method configuration parameter.
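PostgreSQL ships with the pg_test_fsync utility, which benchmarks the available sync methods against your actual WAL storage. Here is a minimal sketch, assuming a hypothetical data directory at /var/lib/pgsql/data:

```bash
# Benchmark the wal_sync_method candidates (open_datasync, fdatasync,
# fsync, ...) against the filesystem that will hold the WAL.
# The test file path below is an assumption; point it at your WAL volume.
pg_test_fsync --filename=/var/lib/pgsql/data/pg_wal/pg_test_fsync.out

# Set the fastest safe method in postgresql.conf, e.g.:
#   wal_sync_method = fdatasync
# then reload the server to apply it:
pg_ctl reload -D /var/lib/pgsql/data
```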
Storage for WAL should typically be optimized for sequential writing and reading, and because its speed directly affects the time it takes for a transaction to commit, it should have the lowest latency possible.
Postgres writes its data to the heap and indexes after it has been written to the WAL, usually during a checkpoint operation. The heap is also where data is read from. Heap and index data is cached in the shared buffers, and possibly by the operating system kernel. Heap and index I/O are often random in nature, unless sequential scans of data are required.
As a general rule, storage for the heap should be optimized for fast reads, particularly with large databases that cannot fit into the shared buffers. This will help ensure that read queries (e.g. SELECT) can execute as quickly as possible. Postgres data files are a group of files of 1GB or smaller, which are randomly accessed; this is another point to consider when tuning the file system.
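Since the WAL wants low-latency sequential writes and the heap wants fast random reads, a common layout is to put pg_wal on its own device. Here is a minimal sketch, assuming hypothetical mount points /ssd-wal (a low-latency device) and /data:

```bash
# Initialize a new cluster with WAL on a separate device (PostgreSQL 10+);
# both paths are assumptions for illustration.
initdb -D /data/pgdata --waldir=/ssd-wal/pg_wal

# For an existing cluster, stop the server first, then relocate
# pg_wal and leave a symlink behind:
# mv /data/pgdata/pg_wal /ssd-wal/pg_wal
# ln -s /ssd-wal/pg_wal /data/pgdata/pg_wal
```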
Over the past 10 years, storage systems have become more intelligent, which results in improved performance with less manual design work. However, storage systems may still be a pain point if not correctly chosen, since there are some trade-offs with each solution.
Postgres RAID Levels for Optimized Performance
RAID is used with the vast majority of storage options with Postgres, as it can provide redundancy, protecting against the failure of a disk, and increased I/O throughput by striping data across multiple drives in chunks that can be accessed in parallel across the devices. Conventional wisdom with Postgres has been to use RAID 10 (mirrored and striped) for the best performance and redundancy. However, that option can be expensive, as you need twice the number of disks in order to mirror each one. RAID 5 (N disks + 1 parity disk) and RAID 6 (N disks + 2 parity disks) have been found to offer poor performance in the past, though more recently much better results have been reported by users. The advantage of RAID 5 and 6 is that fewer drives are typically required.
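To make this concrete, here is a minimal sketch of building such arrays with Linux software RAID (mdadm); the device names are assumptions, and a hardware RAID controller would be configured through its own utility instead:

```bash
# RAID 10: mirrored and striped; best performance and redundancy,
# but usable capacity is half of the raw disk space.
mdadm --create /dev/md0 --level=10 --raid-devices=4 \
      /dev/sdb /dev/sdc /dev/sdd /dev/sde

# RAID 5 alternative: only one disk's worth of capacity lost to parity.
# mdadm --create /dev/md0 --level=5 --raid-devices=4 \
#       /dev/sdb /dev/sdc /dev/sdd /dev/sde

# Create a filesystem and mount it for Postgres (XFS is a common choice).
mkfs.xfs /dev/md0
mount -o noatime /dev/md0 /var/lib/pgsql
```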
Postgres Physical Storage Options: Compare Costs and Performance
DAS
DAS (Direct Attached Storage) refers to connecting your drives to the server, usually using the slots integrated into the server's drive cage. Unlike SAN (discussed below), there is usually no external cable involved - all drives are attached directly to the motherboard or a RAID adaptor within the chassis. The most basic example of DAS is a desktop machine at home or work. There are exceptions: some systems have external drive enclosures that are directly attached to the server via SCSI or Fibre Channel, though these are less commonly seen.
One of the major advantages of DAS is cost: it is cheaper than the networked alternatives discussed below.
If a DAS unit has more than one interface, its data can be shared with multiple servers, though this is easier to do with SAN and NAS. DAS can usually provide redundancy as well; most modern servers with internal storage include a RAID adaptor that can be configured with mirroring.
For DAS (and hardware RAID cards in general), RAID 1 or, preferably, RAID 10 provides much better performance, especially when there are significantly more reads than writes. If that is not the case, a Storage Area Network should be evaluated as well.
Storage Area Network (SAN)
A Storage Area Network (SAN) is essentially a group of drives connected to one or more servers over a dedicated network, providing those servers simultaneous access to the storage. The major use case of a SAN is combining a large number of drives in one or more “backplanes”. One of the major benefits of using a SAN is the total number of drives that can be attached: even though modern servers can hold up to 24 drives (DAS), with a SAN you can use many more spindles to handle the data.
SANs typically also provide redundancy options for the entire storage system - drives, power supplies, network interfaces and controllers - making them suitable for enterprise use. A SAN can provide highly reliable, centralized storage for many of your servers.
Now, let's discuss the features of a SAN:
- When compared to DAS, SANs have slightly higher access latency, even when Fibre Channel is used. This might result in a write bottleneck. However, modern SAN solutions like EMC offer extremely high performance in general. For performance reasons, iSCSI solutions should not be considered for PostgreSQL; we cannot afford any network latency or breakage here. This will be discussed further in the NAS/NFS section. Please note that adding more spindles to a SAN will solve almost all of these problems, albeit at a potentially significant cost. If using a SAN, technologies such as Fibre Channel (FC), Fibre Channel over Ethernet (FCoE), or InfiniBand should be considered for optimal performance.
- SANs typically have more cache than the RAID controllers used with DAS. EMC (or HP, or other) storage systems have hundreds of gigabytes of cache, whereas a decent RAID controller has around 2 GB. This gives a performance boost, especially on write operations when there is a large write-back cache. Care must be taken to ensure that the cache is non-volatile, however; when the database server acknowledges that a transaction has been committed, it must have been written to some form of non-volatile storage where we can be certain it won't be lost in the event of a power failure, for example.
- SANs are intelligent: they can optimize read/write ordering themselves. From a Linux point of view, for example, it is better to use the none I/O scheduler with major SAN vendors, instead of the mq-deadline scheduler (see the sketch after this list).
- Unlike DAS, SANs can mirror data to another location or share backup/snapshot data with another server. This is a useful technique for ensuring that a backup is readily available in the event of storage failure, but it is not a substitute for regular backups; if data is incorrectly written or modified on the primary storage, the problem will be blindly mirrored to the secondary copies.
- The management features of SANs are well beyond those of DAS: Using their built-in interfaces, one can manage and control the redundancy of the SAN relatively easily, as well as make use of APIs to automate the provisioning and management of storage.
- The Host Bus Adapter (HBA) cards used to interface with the SAN must be of good quality to ensure performance and reliability.
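On the scheduler point above, here is a minimal sketch of switching a SAN-backed block device to the none scheduler on Linux; the device name sdb is an assumption:

```bash
# List the available schedulers; the active one is shown in brackets.
cat /sys/block/sdb/queue/scheduler

# Switch to "none" at runtime (this does not survive a reboot).
echo none > /sys/block/sdb/queue/scheduler

# To persist it, add a udev rule, e.g. in /etc/udev/rules.d/60-scheduler.rules:
# ACTION=="add|change", KERNEL=="sdb", ATTR{queue/scheduler}="none"
```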
Finally, SANs are more expensive than DAS, particularly once maintenance costs are included.
In recent multi-tier architectures, the storage arrays often do not provide RAID 10 (the device would have to be reconfigured from scratch to offer it, which is rarely feasible). Recent All-Flash Array (AFA) devices provide at least RAID 5, and usually RAID 6 (double parity). Such double-parity devices can give Postgres latencies of usually 1ms at most, with hundreds of thousands of random read and write IOPS and around 2 GB/sec of sequential read and write throughput. These numbers are sustained continuously, not just during sudden bursts.
Postgres works very well with SAN storage using RAID 6.
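Before committing to an array, it is worth verifying figures like these yourself. Here is a minimal fio sketch, using an 8kB block size to match the Postgres page size; the test directory is an assumption:

```bash
# Random read/write latency and IOPS at the Postgres 8kB page size.
fio --name=pg-randrw --directory=/san/pgdata --size=4G \
    --rw=randrw --rwmixread=70 --bs=8k --direct=1 \
    --ioengine=libaio --iodepth=32 --runtime=60 --time_based

# Sequential write throughput, similar to WAL or bulk-load patterns.
fio --name=pg-seqwrite --directory=/san/pgdata --size=4G \
    --rw=write --bs=1M --direct=1 \
    --ioengine=libaio --iodepth=4 --runtime=60 --time_based
```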
NAS and NFS
Another alternative for storing data externally is Network Attached Storage (NAS). NAS accesses the storage over the network, usually using protocols such as NFS or iSCSI, unlike SANs, which typically use Fibre Channel or similar technologies such as InfiniBand.
Like SAN, NAS has bottlenecks when compared to DAS: since it transfers data over the network, there will be added latency, and the network may also saturate under heavy load. The network is not the only bottleneck: an indirect one is the CPU cycles spent converting block requests to retrieve the data from the NAS.
The official PostgreSQL documentation covers the use of NFS in its section on creating a database cluster, and we are not aware of any NFS issues when it is properly configured, as described there.
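"Properly configured" chiefly means mounting with the hard option, so that NFS errors are retried rather than surfaced to Postgres as I/O failures. A minimal sketch follows; the server name, export path, and NFS version are assumptions:

```bash
# Mount the NFS export with "hard" so failed requests are retried
# indefinitely instead of returning errors to the database.
mount -t nfs -o hard,nfsvers=4.1,noatime \
      nas.example.com:/export/pgdata /var/lib/pgsql

# Confirm the options actually in effect:
mount | grep pgsql
```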
iSCSI
iSCSI (Internet Small Computer Systems Interface) is a block storage protocol that runs over TCP/IP, and thus depends on the network. One of the benefits of iSCSI compared to SAN is that iSCSI storage can be installed in a remote location, since all it depends on is a normal network interface, whereas SAN uses Fibre Channel or other purpose-designed storage network interfaces.
However, all of the concerns we discussed in the NAS and NFS section above are also valid here. The latency of the network will cause slowdowns while fetching data from or writing data to storage. iSCSI can be useful for file storage and similar workloads; however, it is not a good fit for performance-sensitive applications like Postgres.
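For completeness, attaching an iSCSI LUN on Linux typically looks like the sketch below, using the open-iscsi tools; the portal address and target name are assumptions:

```bash
# Discover the targets offered by the storage portal.
iscsiadm -m discovery -t sendtargets -p 192.168.1.50

# Log in to a discovered target (the IQN here is hypothetical).
iscsiadm -m node -T iqn.2024-01.com.example:storage.lun1 \
         -p 192.168.1.50 --login

# The LUN now appears as a regular block device (e.g. /dev/sdc).
lsblk
```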
Conclusion: Which storage system is best for Postgres?
There is no single best choice among these storage options. If the budget is limited, or if you want performance at a relatively reasonable price, DAS would be your choice. SAN users often have other business drivers: using more spindles, storage-level replication, centralized storage units, etc. DAS will beat SAN under light load because of the lower latency involved; under heavy I/O, however, SAN will beat DAS.
Want to learn more? Check out our free Postgres training courses, where you can learn everything from the basics of PostgreSQL to advanced enterprise architecture and performance tuning.