Visualizing hardware performance
Maintain peak system health and identify resource saturation by monitoring hardware and database telemetry across the WarehousePG (WHPG) cluster. Use the System Metrics panel on the left sidebar to diagnose performance degradation, evaluate host health, and perform long-term capacity planning.
Identifying and resolving real-time hardware bottlenecks
Detect immediate hardware saturation that could be impacting query response times with the System Metrics tab.
- Check the CPU Usage % Over Time graph to identify whether specific nodes are hitting 100% utilization. If one node is consistently higher than the others, investigate the data distribution for skew that is overloading a single segment host.
- Compare the Available Memory and Cached Memory metrics. If available memory is low and cached memory is also shrinking, the OS is under memory pressure and might be swapping. Reduce concurrent workloads or increase memory allocation to prevent significant performance degradation.
- Review the Disk I/O and Network Traffic graphs. High disk read rates during unexpected times often indicate inefficient queries that are forcing full table scans instead of using indexes.
- Observe the 1m, 5m, and 15m load averages. If the 15-minute load consistently exceeds the number of available CPU cores, the host is overloaded: more processes are waiting to run than the CPUs can serve. Reschedule non-critical tasks to shorten the OS-level run queue.
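The load-average check in the last item can be reproduced outside the console. The following is a minimal sketch, assuming you run it directly on a segment host with a Unix-like OS. The function names and the threshold of 1.0 load per core are illustrative assumptions, not part of WHPG.

```python
import os


def load_per_core(load_avg: float, cores: int) -> float:
    """Normalize a load average by the number of CPU cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return load_avg / cores


def is_overloaded(load_15m: float, cores: int, threshold: float = 1.0) -> bool:
    """Return True when the 15-minute load exceeds `threshold` per core.

    The default threshold of 1.0 is an assumption that mirrors the rule of
    thumb "load should not exceed the number of cores".
    """
    return load_per_core(load_15m, cores) > threshold


def check_local_host() -> str:
    """Read the local 1m, 5m, and 15m load averages (Unix only) and summarize."""
    _one, _five, fifteen = os.getloadavg()
    cores = os.cpu_count() or 1
    state = "overloaded" if is_overloaded(fifteen, cores) else "ok"
    return f"15m load {fifteen:.2f} on {cores} cores: {state}"
```

Calling `check_local_host()` on each host gives a quick cross-check against the load averages shown in the System Metrics tab.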
Analyzing historical utilization and capacity trends
Identify long-term patterns and prepare for cluster expansion by using the Historical Trends tab.
- Compare average and peak usage for CPU and memory over the last 30 days. If your peak usage is steadily climbing toward your total capacity, plan for node expansion to maintain stability.
- Review the Per-Host Statistics table. Use the standard deviation (Std Dev) metric to spot nodes that behave differently from the rest of the cluster. Such outliers often indicate failing hardware or localized configuration issues that require manual intervention.
- Use the Combined I/O Activity graph to determine if spikes in network traffic coincide with disk writes. This pattern usually points to massive data redistributions or heavy background maintenance tasks that must be scheduled during off-peak hours.
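The outlier check against the Per-Host Statistics table can be sketched as a simple z-score test. This example assumes you have exported one average value per host (for instance, average CPU usage over the period). The host names, the sample values, and the cutoff of 1.5 standard deviations are hypothetical, and the console may compute Std Dev differently.

```python
from statistics import mean, pstdev


def find_outlier_hosts(usage_by_host: dict[str, float],
                       z_threshold: float = 1.5) -> list[str]:
    """Return hosts whose value deviates from the cluster mean by more than
    `z_threshold` population standard deviations."""
    values = list(usage_by_host.values())
    if len(values) < 2:
        return []  # Not enough hosts to compare.
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # All hosts behave identically.
    return sorted(
        host for host, value in usage_by_host.items()
        if abs(value - mu) / sigma > z_threshold
    )


# Hypothetical 30-day average CPU usage (%) per segment host.
sample = {"sdw1": 40.0, "sdw2": 42.0, "sdw3": 41.0, "sdw4": 39.0, "sdw5": 95.0}
outliers = find_outlier_hosts(sample)  # ["sdw5"]
```

With a small cluster, a single extreme host also inflates the standard deviation, so a lower cutoff such as 1.5 is more practical than the textbook 2 or 3.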
Correlating database activity with physical resource load
Use the Database Metrics tab to bridge the gap between SQL execution and physical resource consumption.
- Review the status bar for Idle in Txn and Blocked sessions. Terminate sessions that remain Idle in Txn, as they prevent the database from cleaning up old data and lead to table bloat and wasted disk space.
- Check the Cache Hit % for each database. If this number drops significantly below 90%, it means your data working set is too large for the current memory allocation. Increase memory or optimize your queries to reduce reliance on slow disk reads.
- Compare Database Sizes and Query Activity Trend. If a small database generates a disproportionately high number of temporary files, add indexes or increase the memory available for sorting operations.
- Monitor the ratio of Rollbacks and Deadlocks in the Database Statistics table. A sudden spike in rollbacks often indicates application-level errors or network instability, and requires coordination with your development team.
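To make the 90% cache-hit guideline concrete, the sketch below computes a hit percentage from block counters. It assumes the conventional PostgreSQL formula based on the `blks_hit` and `blks_read` columns of the `pg_stat_database` view. How the console derives its Cache Hit % is not stated here, so treat this as an approximation. The helper names and sample numbers are illustrative.

```python
from typing import Optional


def cache_hit_percent(blks_hit: int, blks_read: int) -> Optional[float]:
    """Percentage of block requests served from the buffer cache.

    Returns None when there has been no block activity at all.
    """
    total = blks_hit + blks_read
    if total == 0:
        return None
    return 100.0 * blks_hit / total


def databases_below(stats: dict[str, tuple[int, int]],
                    threshold: float = 90.0) -> list[str]:
    """Return databases whose hit percentage is below `threshold`.

    `stats` maps a database name to (blks_hit, blks_read).
    """
    flagged = []
    for name, (hit, read) in stats.items():
        pct = cache_hit_percent(hit, read)
        if pct is not None and pct < threshold:
            flagged.append(name)
    return sorted(flagged)


# Hypothetical counters for two databases.
flagged = databases_below({"sales": (900, 100), "staging": (600, 400)})
```

Here `sales` sits exactly at 90% and is not flagged, while `staging` at 60% is.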
Responding to hardware-driven storage alerts
If disk utilization metrics or historical trends indicate that storage is running low, perform these steps to restore headroom:
- Navigate to the Data analysis panel to identify which specific tables are consuming the most space. Focus on those with the highest growth rates.
- Investigate if tables or indexes have accumulated excessive bloat and run VACUUM against these tables to reclaim space.
- Move older, less frequently accessed data to cold storage or an archival schema to free up primary disk space.
- Consult your administrator to discuss hardware expansion or volume resizing if the current data growth is consistent and can't be mitigated by archiving or cleaning.
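When deciding whether archiving is enough or expansion is needed, a rough runway estimate helps. The following sketch assumes linear growth, which is a simplification, and uses hypothetical capacity and growth figures that you would replace with values read from the historical trends.

```python
import math
from typing import Optional


def days_until_full(capacity_gb: float, used_gb: float,
                    daily_growth_gb: float) -> Optional[int]:
    """Estimate whole days of storage headroom under linear growth.

    Returns None when usage is flat or shrinking, and 0 when the
    volume is already full.
    """
    if daily_growth_gb <= 0:
        return None
    free_gb = capacity_gb - used_gb
    if free_gb <= 0:
        return 0
    return math.floor(free_gb / daily_growth_gb)


# Hypothetical volume: 1000 GB total, 700 GB used, growing 10 GB per day.
runway = days_until_full(1000, 700, 10)  # 30
```

If the estimate is shorter than the lead time for procuring hardware, raise the expansion request before attempting further cleanup.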