Speeding Bulk Data Loads in Postgres

April 26, 2016

Contributed by Jason Davis 

EnterpriseDB® (EDB™) developed a high-performance tool for loading data in bulk called EDB*Loader that is twice as fast as PostgreSQL COPY and outperforms standard SQL statement for bulk loading of JSONB data. The tool is part of EDB Postgres™ Advanced Server. The recent 9.5 release of EDB Postgres Advanced Server included a new ‘FREEZE’ parameter to increase performance for EDB*Loader even further.

The EDB*Loader command line utility loads data from an input source, typically a file, into one or more tables using a subset of the parameters that are compatible with those offered by Oracle’s SQL*Loader tool.

EDB*Loader offers many performance features, including direct and parallel direct path load to ensure data is loaded into the system in the fastest way possible. A direct path load writes the data directly to the database pages, which is then synchronized to disk. The insert processing associated with a conventional path load is bypassed, resulting in performance improvement. A direct path load is faster than a conventional path load, but requires the removal of most types of constraints and triggers from the table. Primary key, not null constraints, and index constraints are permitted, but triggers, check constraints, and foreign key constraints are not.

Given the restrictions on direct loading, some customers have data that cannot be loaded into the system unless they use the conventional loading method. EDB developed the FREEZE parameter for the conventional loading method to improve performance for these customers.

Setting FREEZE to TRUE indicates that the data should be copied with the rows frozen. A row that is guaranteed to be visible to all current and future transactions is marked as frozen to prevent transaction ID wraparound. This shows a performance gain of almost 50% as there is no need to perform a vacuum.

Continuing to enhance tools like EDB*Loader that ease the tasks of database administrators and increase performance are core to EDB’s mission of ensuring success with enterprise Postgres in mission-critical environments. EDB also develops enhancements for the core PostgreSQL code while working to increase performance of EDB Postgres. 

Jason Davis is Senior Director, Product Management, at EnterpriseDB. 

Share this

Relevant Blogs

EDB Reference Architectures

PostgreSQL and EDB Postgres Advanced Server (EPAS) are incredibly flexible database servers that offer an almost infinite variety of options for deployment. This can be daunting for those that are...
September 18, 2020

More Blogs