System Configuration Best Practices

This document provides performance-oriented guidelines and recommendations for configuring SynxDB.

Configure the time zone

SynxDB selects a time zone to use from a set of PostgreSQL time zones stored internally. These time zones are derived from the IANA (Internet Assigned Numbers Authority) time zone database. When the IANA database is updated, SynxDB also updates its built-in time zone list accordingly.

SynxDB determines which time zone to use by matching PostgreSQL time zones with the user-specified time zone. If the user does not specify one, it matches the host system’s time zone. For example, when choosing a default time zone, SynxDB uses the host system’s time zone files and a specific algorithm to find the best-matching PostgreSQL time zone. If the host system time zone contains leap second information, SynxDB might not be able to precisely match the corresponding PostgreSQL time zone. In this case, SynxDB calculates a best-match entry based on the host system information.

The best practice is to configure SynxDB and the host system to use a clear, supported time zone. This ensures that the Coordinator and all Segment instances use the same time zone, and it avoids recalculating the “best match” at every cluster restart when the time zone files have changed (for example, because IANA updated them since the previous restart).

You can use the gpconfig tool to view and set the time zone for SynxDB. For example, the following commands display the current time zone and set it to US/Pacific:

# gpconfig -s TimeZone
# gpconfig -c TimeZone -v 'US/Pacific'

After changing the time zone, you must restart SynxDB for the change to take effect. Use the gpstop -ra command to restart the database.

To query the time zones supported by SynxDB, check the pg_timezone_names system view.
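
For example, a query like the following, run through psql against any database, lists a few of the supported time zone names; the exact output depends on your installation:

# psql -d postgres -c "SELECT name, abbrev, utc_offset FROM pg_timezone_names ORDER BY name LIMIT 10;"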

File system

Use the XFS file system for the SynxDB data directory. For mount options, see the “Configure your system” section.

Port configuration

For more details, see Prepare to deploy – Port.

Set ip_local_port_range appropriately to avoid conflicts with the ports used by SynxDB. For example, set the following range in the /etc/sysctl.conf file:

net.ipv4.ip_local_port_range = 10000 65535
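
After editing /etc/sysctl.conf, reload the settings so that the new range takes effect without a reboot:

# sysctl -p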

Then set the base ports for SynxDB as follows:

PORT_BASE = 6000
MIRROR_PORT_BASE = 7000

For more details, see Prepare to deploy – Set system parameters.

I/O configuration

For devices that store the data directory, set the read-ahead value to 16384 (measured in 512-byte sectors, that is, 8 MB). The following command sets this value for the /dev/sdb device:

/sbin/blockdev --setra 16384 /dev/sdb

Use the following command to view the current read-ahead size for /dev/sdb:

# /sbin/blockdev --getra /dev/sdb
16384
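
Note that a value set with blockdev --setra does not persist across reboots. A minimal sketch, assuming /dev/sdb holds the data directory, is to re-apply the setting from a startup script such as /etc/rc.d/rc.local (or an equivalent mechanism on your OS):

# Excerpt from /etc/rc.d/rc.local (the file must be executable)
/sbin/blockdev --setra 16384 /dev/sdb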

It is recommended to use the deadline I/O scheduler for all devices that host the data directory. The following command lists the schedulers available for /dev/sdb; the active scheduler appears in brackets:

# cat /sys/block/sdb/queue/scheduler
noop anticipatory [deadline] cfq
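
As a sketch, the following commands select deadline for /dev/sdb for the current boot and, on systems that use grubby, add it to the kernel boot parameters so that it persists; the exact mechanism depends on your OS version:

# echo deadline > /sys/block/sdb/queue/scheduler
# grubby --update-kernel=ALL --args="elevator=deadline"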

Additionally, increase the operating system limits for file descriptors and processes by adding the following entries to /etc/security/limits.conf:

* soft nofile 524288
* hard nofile 524288
* soft nproc 131072
* hard nproc 131072
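
After logging in again as the SynxDB administrative user, you can verify that the new limits are in effect:

# ulimit -n    # open file descriptors; should report 524288
# ulimit -u    # max user processes; should report 131072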

Operating system memory settings

The Linux kernel parameters vm.overcommit_memory and vm.overcommit_ratio affect the operating system’s memory allocation strategy for processes. For more details, see Prepare to deploy – Set system parameters.

The vm.overcommit_memory parameter defines the kernel’s memory allocation policy. For database applications, the only safe setting is 2.

Note

Background information on memory overcommitment:

  • https://en.wikipedia.org/wiki/Memory_overcommitment

  • https://www.kernel.org/doc/Documentation/vm/overcommit-accounting

vm.overcommit_ratio defines the percentage of physical RAM made available to processes when vm.overcommit_memory is set to 2. On Red Hat systems, the default value is 50.
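
As a sketch, the corresponding entries in /etc/sysctl.conf might look like the following; the vm.overcommit_ratio value shown here is only an illustration, so derive the ratio from your own memory calculations:

vm.overcommit_memory = 2
vm.overcommit_ratio = 95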

Shared memory settings

SynxDB uses shared memory to buffer data pages and to enable communication among postgres processes within the same database instance.

Configure shared buffers

Properly configuring the size of the shared memory buffer (that is, the shared_buffers GUC) is critical to cluster performance.

First, understand what types of database objects reside in the shared buffer:

  • Heap tables: including user-created heap tables, system catalog tables, TOAST tables, and auxiliary tables for append-optimized (AO) tables.

  • Temporary heap tables: Data in temporary tables must be shared across the slices of a query, so it is stored in the shared buffer rather than in the backend process’s local memory. There is one exception: during a join, if the build side is a temporary table, its data resides in backend-local memory buffers.

Note

The above objects (specifically heap tables) are effectively double-buffered in memory, meaning they reside both in the shared memory buffer and in the operating system page cache.

Second, understand which objects do not reside in the shared memory buffer of the Coordinator and Segments:

  • Temporary objects held in backend-local memory

  • Spill files

Third, operations such as joins, aggregations, and VACUUM require temporary process-local memory to manage short-lived objects such as hash tables. An effective memory allocation strategy is critical to prevent any single process from exhausting system memory.

Therefore, setting an appropriate size for the shared memory buffer is crucial to database performance:

  • If set too small, the system might need to read from or write to disk more frequently.

  • If set too large, it might reduce the local memory available to other processes in the system.

When tuning the size of the shared memory buffer, consider the following factors:

  • Typical query plan characteristics: In most cases, query plans rarely cause disk spill files, and process-local memory usage remains relatively small. This guideline also applies to a standby Coordinator.

    • Recommended starting point: Set shared_buffers to a reasonable size rather than trying to cache entire (for example, partitioned) tables. Ensuring that system catalog tables can be cached in Segment memory can significantly improve workload throughput.

  • Use of append-optimized (AO) tables: Workloads that primarily operate on AO tables differ. Because AO table data does not go through the shared buffer, excessively increasing shared_buffers provides no benefit and might even degrade performance by increasing contention for the OS page cache.

    • For large system catalogs: Segment nodes might require larger shared_buffers to improve catalog access performance.

    • Resource manager: Also account for memory allocated to the resource manager.

    • Recommended starting point: For primary and mirror segments running on dedicated hosts, a good starting value for shared_buffers is (0.10 * host_total_memory) / number_of_segments_on_host. You can start from the default of 125 MB and increase gradually until the sum of shared_buffers across all segments reaches 15%–25% of the host’s total memory.

Note

  • Use the pg_buffercache extension to observe shared buffer usage in real time during peak load.

  • AO tables have associated metadata heap tables, created under the pg_aoseg schema and stored on the segments. Keeping these tables in shared memory can improve performance.

  • The above recommendations might need to be adjusted based on memory allocated to components outside of SynxDB (for example, gpfdist). Also balance the memory needs of Coordinator and Segment hosts.
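
As with TimeZone above, you can inspect and change shared_buffers with gpconfig and then restart the cluster; the value below is only an illustration:

# gpconfig -s shared_buffers
# gpconfig -c shared_buffers -v 1GB
# gpstop -ra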

Kernel shmem settings

When configuring shared_buffers, you might also need to adjust the operating system parameters kernel.shmmax and kernel.shmall.

The kernel.shmmax parameter defines the maximum size (in bytes) of a single shared memory segment. Its value must exceed the shared memory required by one SynxDB instance, shmem_size_per_db_node, calculated as:

shmem_size_per_db_node = shared_buffers + other_db_node_shmem

In addition to shared_buffers, each SynxDB instance consumes extra shared memory due to other configuration parameters. We refer to this as other_db_node_shmem.

Under the default SynxDB settings, this additional shared memory overhead is approximately:

  • 111 MB per Segment

  • 79 MB for the Coordinator

The kernel.shmall operating system parameter defines the upper limit on the total amount of shared memory available on the host (in pages). Its value must exceed the result of the following formula:

((num_segments_on_host * shmem_size_per_db_node) + other_app_shared_mem) / os_page_size

Where other_app_shared_mem represents the shared memory consumed by other applications on the host, and os_page_size is typically 4 KB on most platforms.
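
The following bash sketch illustrates the calculation with assumed values (8 segment instances per host, 125 MB of shared_buffers plus the roughly 111 MB per-segment overhead mentioned above, 64 MB for other applications, and a 4 KB page size); substitute the numbers for your own hosts:

# Illustrative calculation of the kernel.shmmax and kernel.shmall lower bounds
num_segments_on_host=8
shmem_size_per_db_node=$(( (125 + 111) * 1024 * 1024 ))   # shared_buffers + other_db_node_shmem, in bytes
other_app_shared_mem=$(( 64 * 1024 * 1024 ))              # assumed shared memory used by other applications
os_page_size=4096                                         # bytes
echo "kernel.shmmax must exceed $shmem_size_per_db_node bytes"
echo "kernel.shmall must exceed $(( (num_segments_on_host * shmem_size_per_db_node + other_app_shared_mem) / os_page_size )) pages"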

If you encounter shared memory allocation errors, try the following:

  • Increase the values of kernel.shmmax or kernel.shmall

  • Decrease the value of shared_buffers

  • Reduce max_connections (fewer connection slots reduce the shared memory that must be reserved)

Regular huge pages

After tuning shared_buffers, consider enabling regular huge pages. Huge pages allow the OS to allocate larger contiguous memory blocks, reducing CPU overhead for page table management.

In SynxDB, regular huge pages apply only to the shared memory buffer. This means the performance gains from huge pages are limited to objects stored in shared memory. Therefore, the benefits are proportional to the size of shared_buffers.

You can configure and activate the number of regular huge pages via the proc filesystem parameter vm.nr_hugepages. Use the following formula to calculate the recommended value of vm.nr_hugepages for SynxDB:

vm.nr_hugepages =
(
  (
    (shmem_size_per_db_node + (block_size - (shmem_size_per_db_node % block_size)))
    / huge_page_size
  ) + 1
) * num_segments_on_host

Where:

  • shmem_size_per_db_node = shared_buffers + other_db_node_shmem

  • block_size is the SynxDB block size (default: 32 KB)

  • huge_page_size is the size of a single huge page (usually 2 MB), which can be found in /proc/meminfo

  • num_segments_on_host is the total number of SynxDB instances on the host, including primary and mirror segments plus the Coordinator or standby Coordinator if one runs there

The logic behind the formula is as follows:

  • (shmem_size_per_db_node % block_size) computes the portion of memory not aligned to block_size.

  • The padding rounds the shared memory size up to a multiple of block_size.

  • Divide the aligned total by huge_page_size to obtain the required number of huge pages.

  • Add 1 to account for remainders or internal overhead pages.

  • Finally, multiply by num_segments_on_host to cover all segment instances on the host.

Examples

Example 1

shmem_size_per_db_node = 125MB  # Converted to bytes
block_size = 32KB  # Converted to bytes
huge_page_size = 2MB  # Converted to bytes
num_segments_on_host = 16
vm.nr_hugepages = 1008

Example 2

shmem_size_per_db_node = 1.6GB  # Converted to bytes; example where shared memory is not strictly aligned to block size
block_size = 32KB  # Converted to bytes
huge_page_size = 2MB  # Converted to bytes
num_segments_on_host = 16
vm.nr_hugepages = 13120
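
The following bash sketch reproduces the arithmetic of Example 1; plug in your own values to compute a recommendation for your hosts:

# Worked example of the vm.nr_hugepages formula (values from Example 1)
shmem_size_per_db_node=$(( 125 * 1024 * 1024 ))   # 125 MB in bytes
block_size=$(( 32 * 1024 ))                       # 32 KB
huge_page_size=$(( 2 * 1024 * 1024 ))             # 2 MB
num_segments_on_host=16
padded=$(( shmem_size_per_db_node + (block_size - shmem_size_per_db_node % block_size) ))
pages_per_instance=$(( padded / huge_page_size + 1 ))
echo "vm.nr_hugepages = $(( pages_per_instance * num_segments_on_host ))"   # prints 1008

The resulting value can then be set through vm.nr_hugepages, for example in /etc/sysctl.conf.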

Number of segments per host

The number of segment instances running on each segment host has a decisive impact on overall system performance. All segment instances share the host’s CPU, memory, and network resources. Overestimating the number of segments a server can host is a common cause of poor performance.

When determining the number of segments, consider the following factors:

  • Number of CPU cores

  • Physical memory capacity of the server

  • Number and bandwidth of network cards

  • Storage capacity mounted on the server

  • Mixed deployment of primary and mirror segments

  • ETL process load running on the host

  • Other non-SynxDB processes running on the host