The diskquota module allows WarehousePG administrators to limit the amount of disk space used by schemas, roles, or tablespaces in a database.
This topic includes the following sections:
- Installing and Registering the Module (First Use)
- About the diskquota Module
- Understanding How diskquota Monitors Disk Usage
- About the diskquota Functions and Views
- Configuring the diskquota Module
- Using the diskquota Module
- Known Issues and Limitations
- Notes
- Upgrading the Module
- Examples
Installing and Registering the Module (First Use)
The diskquota module is installed when you install WarehousePG.
Before you can use the module, you must perform these steps:
Create the
diskquotadatabase. Thediskquotamodule uses this database to store the list of databases where the module is enabled.$ createdb diskquota;
Add the
diskquotashared library to the WarehousePGshared_preload_librariesserver configuration parameter and restart WarehousePG. Be sure to retain the previous setting of the configuration parameter. For example:$ gpconfig -s shared_preload_libraries Values on all segments are consistent GUC : shared_preload_libraries Coordinator value: auto_explain Segment value: auto_explain $ gpconfig -c shared_preload_libraries -v 'auto_explain,diskquota-2.2' $ gpstop -ar
Register the
diskquotaextension in each database in which you want to enforce disk usage quotas. You can registerdiskquotain up to ten databases.$ psql -d testdb -c "CREATE EXTENSION diskquota"
Run the
diskquota.init_table_size_table()UDF in the database. In a database with many files, this can take some time. Thediskquotamodule cannot be used until the initialization is complete.=# SELECT diskquota.init_table_size_table();
Note You must run the
diskquota.init_table_size_table()UDF fordiskquotato work.
About the diskquota Module
The disk usage for a table includes the table data, indexes, toast tables, and free space map. For append-optimized tables, the calculation includes the visibility map and index, and the block directory table.
The diskquota module allows a WarehousePG administrator to limit the amount of disk space used by tables in schemas or owned by roles in up to 50 databases. The administrator can also use the module to limit the amount of disk space used by schemas and roles on a per-tablespace basis, as well as to limit the disk space used per WarehousePG segment for a tablespace.
Note A role-based disk quota cannot be set for the WarehousePG cluster owner (the user that creates the WarehousePG cluster).
You can set the following quotas with the diskquota module:
- A schema disk quota sets a limit on the disk space that can used by all tables in a database that reside in a specific schema. The disk usage of a schema is defined as the total of disk usage on all segments for all tables in the schema.
- A role disk quota sets a limit on the disk space that can be used by all tables in a database that are owned by a specific role. The disk usage for a role is defined as the total of disk usage on all segments for all tables the role owns. Although a role is a cluster-level database object, the disk usage for roles is calculated separately for each database.
- A schema tablespace disk quota sets a limit on the disk space that can used by all tables in a database that reside in a specific schema and tablespace.
- A role tablespace disk quota sets a limit on the disk space that can used by all tables in a database that are owned by a specific role and reside in a specific tablespace.
- A per-segment tablespace disk quota sets a limit on the disk space that can be used by a Greeplum Database segment when a tablespace quota is set for a schema or role.
Understanding How diskquota Monitors Disk Usage
A single diskquota launcher process runs on the active WarehousePG coordinator node. The diskquota launcher process creates and launches a diskquota worker process on the coordinator for each diskquota-enabled database. A worker process is responsible for monitoring the disk usage of tablespaces, schemas, and roles in the target database, and communicates with the WarehousePG segments to obtain the sizes of active tables. The worker process also performs quota enforcement, placing tablespaces, schemas, or roles on a denylist when they reach their quota.
When a query plan for a data-adding query is generated, and the tablespace, schema, or role into which data would be loaded is on the denylist, diskquota cancels the query before it starts executing, and reports an error message that the quota has been exceeded.
A query that does not add data, such as a simple SELECT query, is always allowed to run, even when the tablespace, role, or schema is on the denylist.
Diskquota can enforce both soft limits and hard limits for disk usage:
By default,
diskquotaalways enforces soft limits.diskquotachecks quotas before a query runs. If quotas are not exceeded when a query is initiated,diskquotaallows the query to run, even if it were to eventually cause a quota to be exceeded.When hard limit enforcement of disk usage is enabled,
diskquotaalso monitors disk usage during query execution. If a query exceeds a disk quota during execution,diskquotaterminates the query.Administrators can enable enforcement of a disk usage hard limit by setting the
diskquota.hard_limitserver configuration parameter as described in Activating/Deactivating Hard Limit Disk Usage Enforcement.
There is some delay after a quota has been reached before the schema or role is added to the denylist. Other queries could add more data during the delay. The delay occurs because diskquota processes that calculate the disk space used by each table run periodically with a pause between executions (two seconds by default). The delay also occurs when disk usage falls beneath a quota, due to operations such as DROP, TRUNCATE, or VACUUM FULL that remove data. Administrators can change the amount of time between disk space checks by setting the diskquota.naptime server configuration parameter as described in Setting the Delay Between Disk Usage Updates.
Diskquota can operate in both static and dynamic modes:
When the number of databases in which the
diskquotaextension is registered is less than or equal to the maximum number ofdiskquotaworker processes,diskquotaoperates in static mode; it assigns a background worker (bgworker) process to monitor each database, and the bgworker process exits only when thediskquotaextension is dropped from the database.When the number of databases in which the
diskquotaextension is registered is greater than the maximum number ofdiskquotaworker processes,diskquotaoperates in dynamic mode. In dynamic mode, for every monitored database everydiskquota.naptimeseconds,diskquotacreates a bgworker process to collect disk usage information for the database, and then stops the bgworker process immediately after data collection completes. In this mode,diskquotadynamically starts and stops bgworker processes as needed for all monitored databases.Administrators can change the maximum number of worker processes configured for
diskquotaby setting thediskquota.max_workersserver configuration parameter as described in Specifying the Maximum Number of Active diskquota Worker Processes.
If a query is unable to run because the tablespace, schema, or role has been denylisted, an administrator can increase the exceeded quota to allow the query to run. The module provides views that you can use to find the tablespaces, schemas, or roles that have exceeded their limits.
About the diskquota Functions and Views
The diskquota module provides user-defined functions (UDFs) and views that you can use to manage and monitor disk space usage in your WarehousePG deployment.
The functions and views provided by the diskquota module are available in the WarehousePG schema named diskquota.
Note You may be required to prepend the schema name (
diskquota.) to any UDF or view that you access.
User-defined functions provided by the module include:
| Function Signature | Description |
|---|---|
| void init_table_size_table() | Sizes the existing tables in the current database. |
| void set_role_quota( role_name text, quota text ) | Sets a disk quota for a specific role in the current database. > Note A role-based disk quota cannot be set for the WarehousePG cluster owner. |
| void set_role_tablespace_quota( role_name text, tablespace_name text, quota text ) | Sets a disk quota for a specific role and tablespace combination in the current database. > Note A role-based disk quota cannot be set for the WarehousePG cluster owner. |
| void set_schema_quota( schema_name text, quota text ) | Sets a disk quota for a specific schema in the current database. |
| void set_schema_tablespace_quota( schema_name text, tablespace_name text, quota text ) | Sets a disk quota for a specific schema and tablespace combination in the current database. |
| void set_per_segment_quota( tablespace_name text, ratio float4 ) | Sets a per-segment disk quota for a tablespace in the current database. |
| void pause() | Instructs the module to continue to count disk usage for the current database, but pause and cease emitting an error when the limit is exceeded. |
| void resume() | Instructs the module to resume emitting an error when the disk usage limit is exceeded in the current database. |
| status() RETURNS table | Displays the diskquota binary and schema versions and the status of soft and hard limit disk usage enforcement in the current database. |
Views available in the diskquota module include:
| View Name | Description |
|---|---|
| show_fast_database_size_view | Displays the disk space usage in the current database. |
| show_fast_role_quota_view | Lists active quotas for roles in the current database. |
| show_fast_role_tablespace_quota_view | List active quotas for roles per tablespace in the current database. |
| show_fast_schema_quota_view | Lists active quotas for schemas in the current database. |
| show_fast_schema_tablespace_quota_view | Lists active quotas for schemas per tablespace in the current database. |
| show_segment_ratio_quota_view | Displays the per-segment disk quota ratio for any per-segment tablespace quotas set in the current database. |
Configuring the diskquota Module
diskquota exposes server configuration parameters that allow you to control certain module functionality:
- diskquota.naptime - Controls how frequently (in seconds) that
diskquotarecalculates the table sizes. - diskquota.max_active_tables - Identifies the maximum number of relations (including tables, indexes, etc.) that the
diskquotamodule can monitor at the same time. - diskquota.hard_limit - Activates or deactivates the hard limit enforcement of disk usage.
- diskquota.max_workers - Specifies the maximum number of diskquota worker processes that may be running at any one time.
- diskquota.max_table_segments - Specifies the maximum number of table segments in the cluster.
You use the gpconfig command to set these parameters in the same way that you would set any WarehousePG server configuration parameter.
Setting the Delay Between Disk Usage Updates
The diskquota.naptime server configuration parameter specifies how frequently (in seconds) diskquota recalculates the table sizes. The smaller the naptime value, the less delay in detecting changes in disk usage. This example sets the naptime to ten seconds and restarts WarehousePG:
$ gpconfig -c diskquota.naptime -v 10 $ gpstop -ar
About Shared Memory and the Maximum Number of Relations
The diskquota module uses shared memory to save the denylist and to save the active table list.
The denylist shared memory can hold up to one million database objects that exceed the quota limit. If the denylist shared memory fills, data may be loaded into some schemas or roles after they have reached their quota limit.
Active table shared memory holds up to one million of active tables by default. Active tables are tables that may have changed sizes since diskquota last recalculated the table sizes. diskquota hook functions are called when the storage manager on each WarehousePG segment creates, extends, or truncates a table file. The hook functions store the identity of the file in shared memory so that its file size can be recalculated the next time the table size data is refreshed.
The diskquota.max_active_tables server configuration parameter identifies the maximum number of relations (including tables, indexes, etc.) that the diskquota module can monitor at the same time. The default value is 300 * 1024. This value should be sufficient for most WarehousePG installations. Should you change the value of this configuration parameter, you must restart the WarehousePG server.
Activating/Deactivating Hard Limit Disk Usage Enforcement
When you enable enforcement of a hard limit of disk usage, diskquota checks the quota during query execution. If at any point a currently running query exceeds a quota limit, diskquota terminates the query.
By default, hard limit disk usage enforcement is deactivated for all databases. To activate hard limit enforcement for all databases, set the diskquota.hard_limit server configuration parameter to 'on', and then reload the WarehousePG configuration:
$ gpconfig -c diskquota.hard_limit -v 'on' $ gpstop -u
Run the following query to view the hard limit enforcement setting:
SELECT * from diskquota.status();
Specifying the Maximum Number of Active diskquota Worker Processes
The diskquota.max_workers server configuration parameter specifies the maximum number of diskquota worker processes (not including the diskquota launcher process) that may be running at any one time. The default number of maximum worker processes is 10, and the maximum value that you can specify is 20.
You must set this parameter at WarehousePG server start time.
Note Setting
diskquota.max_workersto a value that is larger thanmax_worker_processeshas no effect;diskquotaworkers are taken from the pool of worker processes established by that WarehousePG server configuration parameter setting.
Specifying the Maximum Number of Table Segments (Shards)
A WarehousePG table (including a partitioned table’s child tables) is distributed to all segments as a shard. diskquota counts each table shard as a table segment. The diskquota.max_table_segments server configuration parameter identifies the maximum number of table segments in the WarehousePG cluster, which in turn can gate the maximum number of tables that diskquota can monitor.
The runtime value of diskquota.max_table_segments equals <max_number_tables> ⌈ {<number_segments> + 1} ⁄ {100} ⌉ 100. The default value is 10 * 1024 * 1024.
Using the diskquota Module
You can perform the following tasks with the diskquota module:
- View the diskquota Status
- Pause and Resume Disk Quota Exceeded Notifications
- Set a Schema or Role Disk Quota
- Set a Tablespace Disk Quota for a Schema or Role
- Set a Per-Segment Tablespace Disk Quota
- Display Disk Quotas and Disk Usage
- Temporarily Deactivate Disk Quota Monitoring
Viewing the diskquota Status
To view the diskquota module and schema version numbers, and the state of soft and hard limit enforcement in the current database, invoke the status() command:
SELECT diskquota.status();
name | status
------------------------+---------
soft limits | on
hard limits | on
current binary version | 2.0.1
current schema version | 2.0 Pausing and Resuming Disk Quota Exceeded Notifications
If you do not care to be notified of disk quota exceeded events for a period of time, you can pause and resume error notification in the current database as shown below:
SELECT diskquota.pause(); -- perform table operations where you do not care to be notified -- when a disk quota exceeded SELECT diskquota.resume();
Note The pause operation does not persist through a WarehousePG cluster restart; you must invoke
diskquota.pause()again when the cluster is back up and running.
Setting a Schema or Role Disk Quota
Use the diskquota.set_schema_quota() and diskquota.set_role_quota() user-defined functions in a database to set, update, or delete disk quota limits for schemas and roles in the database. The functions take two arguments: the schema or role name, and the quota to set. You can specify the quota in units of MB, GB, TB, or PB; for example, '2TB'.
The following example sets a 250GB quota for the acct schema:
SELECT diskquota.set_schema_quota('acct', '250GB');This example sets a 500MB disk quota for the nickd role:
SELECT diskquota.set_role_quota('nickd', '500MB');To change a quota, invoke the diskquota.set_schema_quota() or diskquota.set_role_quota() function again with the new quota value.
To remove a schema or role quota, set the quota value to '-1' and invoke the function.
Setting a Tablespace Disk Quota
Use the diskquota.set_schema_tablespace_quota() and diskquota.set_role_tablespace_quota() user-defined functions in a database to set, update, or delete per-tablespace disk quota limits for schemas and roles in the current database. The functions take three arguments: the schema or role name, the tablespace name, and the quota to set. You can specify the quota in units of MB, GB, TB, or PB; for example, '2TB'.
The following example sets a 50GB disk quota for the tablespace named tspaced1 and the acct schema:
SELECT diskquota.set_schema_tablespace_quota('acct', 'tspaced1', '250GB');This example sets a 500MB disk quota for the tspaced2 tablespace and the nickd role:
SELECT diskquota.set_role_tablespace_quota('nickd', 'tspaced2', '500MB');To change a quota, invoke the diskquota.set_schema_tablespace_quota() or diskquota.set_role_tablespace_quota() function again with the new quota value.
To remove a schema or role tablespace quota, set the quota value to '-1' and invoke the function.
Setting a Per-Segment Tablespace Disk Quota
When an administrator sets a tablespace quota for a schema or a role, they may also choose to define a per-segment disk quota for the tablespace. Setting a per-segment quota limits the amount of disk space on a single WarehousePG segment that a single tablespace may consume, and may help prevent a segment's disk from filling due to data skew.
You can use the diskquota.set_per_segment_quota() function to set, update, or delete a per-segment tablespace disk quota limit. The function takes two arguments: the tablespace name and a ratio. The ratio identifies how much more of the disk quota a single segment can use than the average segment quota. A ratio that you specify must be greater than zero.
You can calculate the average segment quota as follows: