v4.2.0 Release Notes
Release date: October 2025
Version: v4.2.0
SynxDB v4.2.0 introduces a suite of advancements designed to improve data lakehouse integration, AI-readiness, query performance, storage optimization, and operational observability.
Data lakehouse integration: Expands data lake capabilities with support for reading Apache Iceberg tables directly from Amazon S3 through the Polaris Catalog. It also introduces granular configuration parameters for HDFS and Alibaba OSS access, along with enhanced local file protocol support.
AI-readiness: Features the new SynxDB MCP (Model Context Protocol) service, streamlining the integration of Large Language Models (LLMs) and other AI tools with the database.
Query performance & storage optimization: Implements runtime filter pushdown directly to the Table Access Method (AM), significantly reducing data scanned and accelerating query execution. For storage, column-level LZ4 compression is now available for PAX tables, offering a superior balance of compression ratio and decompression speed.
Observability & reliability: Enhances system management with CBDR (Continuous Backup and Disaster Recovery) for robust, continuous archiving and recovery. This release also includes multiple DBCC (Database Console Command) enhancements for deeper diagnostics and introduces new summary views that aggregate information from multiple `gp_*` system views, simplifying cluster monitoring and management.
New features
Database
| Category | Feature | User documents |
|---|---|---|
| Data federation and lakehouse integration | Supports reading Iceberg tables stored in S3 via Polaris Catalog (currently read-only). | |
| Data federation and lakehouse integration | Adds HDFS/OSS related configuration parameters to optimize connection, routing, and context updates. | |
| Data federation and lakehouse integration | Supports `ON COORDINATOR` for the `file://` protocol and optimizes datalake list operations. | |
| AI-ready | Provides SynxDB MCP service, a secure database interface for LLM applications. | |
| Query performance and data storage optimization | Pushes down runtime filters to the table access method (AM), using PAX min/max statistics to accelerate scans. | |
| Query performance and data storage optimization | Adds LZ4 compression algorithm support for PAX column storage, enhancing the compression algorithm matrix and optimizing read/write performance and storage usage. | |
| Observability & reliability | Adds Summary System Views to provide cluster-wide aggregated views for better observability and capacity assessment. | |
| Observability & reliability | Adds the `gp_resource_group_cgroup_parent` parameter to customize the cgroup root directory. | |
| Observability & reliability | Introduces CBDR for continuous archiving and recovery (PITR/Hot Standby), supporting cross-site incremental replication and read-only services. | |
Interactive manager DBCC
| Feature | User documents |
|---|---|
| Supports Standby Missing and VIP status alerts | |
| Enhanced SQL query monitoring (filtering, plan text, etc.) | |
| Enhanced cluster monitoring (Cluster Metrics time window) | |
| Enhanced table queries (fuzzy table names, sub-partition sizes) | |
| Supports modifying login names | |
New feature details
Data federation and lakehouse integration
Read Iceberg tables on S3 via Polaris Catalog (read-only): Connect to an external Polaris Catalog to directly query Iceberg tables stored in S3 or compatible object storage within SynxDB. Currently supports `SELECT` queries; suitable for read-only analysis and exploration of data lakes. See: Read Iceberg tables from object storage (Polaris Catalog)
HDFS/OSS access related GUC parameters: Adds several new parameters to optimize behaviors such as connections, routing, and context handling. For example:
- `pg_gophermeta.gphdfs_configure_router`: Indicates whether to configure multiple routers.
- `pg_gophermeta.gopher_hash_connect_hdfs_router`: Hashes traffic by Segment ID in a multi-router scenario.
- `pg_gophermeta.gopher_connect_hdfs_disable_getstate`: Controls whether to disable the `getFsStats` RPC.
- `pg_gophermeta.gopher_enable_update_oss_context`: Controls whether to update OSS context information.
For more information, see: Parameter list
`file://` protocol supports `ON COORDINATOR`: Allows loading data from local files on the coordinator node into external tables, facilitating data import and debugging in single-process scenarios.

Optimized datalake list operations: Changes the execution mode from distributed (per segment) to centralized (coordinator node), reducing unnecessary overhead and latency, improving performance, and lowering resource usage.
AI-ready
SynxDB MCP service: Provides a standardized and secure database interface for LLM applications. It includes built-in protection against SQL injection, parameterized queries, connection pooling, and sensitive table protection, making it suitable for intelligent data queries, automated operations, and AI-assisted development.
See: MCP service
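The injection protection mentioned above rests on parameterized queries: values are bound as data rather than spliced into SQL text. The sketch below illustrates that general pattern using Python's stdlib `sqlite3` as a stand-in database; it is not SynxDB MCP code.

```python
import sqlite3

# Generic illustration of parameterized queries (one of the protections
# the MCP service applies); sqlite3 stands in for SynxDB here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice')")

malicious = "alice' OR '1'='1"

# Unsafe: string interpolation lets the input rewrite the query's logic.
unsafe_sql = f"SELECT id FROM users WHERE name = '{malicious}'"
assert conn.execute(unsafe_sql).fetchall() == [(1,)]  # injection succeeded

# Safe: the driver binds the value as data, never as SQL text.
rows = conn.execute(
    "SELECT id FROM users WHERE name = ?", (malicious,)
).fetchall()
assert rows == []  # the literal string matches no user
```

The same binding discipline applies regardless of client language; the placeholder syntax (`?`, `%s`, `$1`) varies by driver.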
Query performance and data storage optimization
Runtime filter pushdown to table access method (AM): When scanning PAX tables, pushed-down runtime filters use column-level `min`/`max` statistics to skip non-matching data files before reading, significantly reducing I/O and the amount of data processed by subsequent operators. The benefit is greatest when the query uses a HashJoin, the outer table is a PAX table, `minmax` statistics are enabled, and the feature switch is on.

PAX supports LZ4 compression: Adds LZ4 column-level compression to the existing zlib/zstd options, balancing compression ratio with read/write performance.
See: PAX table format
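The file skipping described above can be sketched as follows. `DataFile` and `files_to_scan` are hypothetical stand-ins for the PAX internals, modeling how a runtime filter built from a join's inner side prunes files by their column min/max ranges before any I/O happens:

```python
from dataclasses import dataclass

@dataclass
class DataFile:
    # Hypothetical stand-in for a PAX data file carrying column min/max stats.
    name: str
    col_min: int
    col_max: int

def files_to_scan(files, wanted):
    """Keep only files whose [min, max] range can contain a wanted key.

    'wanted' models a runtime filter built from the hash join's build side;
    files whose range cannot overlap it are skipped before being read.
    The check is conservative: a surviving file may still contain no match.
    """
    lo, hi = min(wanted), max(wanted)
    return [f for f in files if f.col_max >= lo and f.col_min <= hi]

files = [
    DataFile("part-0", 1, 100),
    DataFile("part-1", 101, 200),
    DataFile("part-2", 201, 300),
]
# Runtime filter from the join's build side: only keys 150..160 survive.
survivors = files_to_scan(files, wanted={150, 155, 160})
assert [f.name for f in survivors] == ["part-1"]  # two of three files skipped
```

Because the filter is applied inside the table AM, the skipped files never reach the scan's output, which is what shrinks the input to downstream operators.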
Observability and reliability
Adds and enhances multiple summary views based on Segment statistics (for example, the `gp_stat_*_summary` and `gp_statio_*_summary` series). These cover usage and I/O metrics for activities, processes, archiving, databases, DDL operations, `ANALYZE`/`CLUSTER`/`VACUUM`/`COPY` progress, and objects such as tables, indexes, sequences, and system catalogs, facilitating observability analysis and capacity assessment from a cluster-wide perspective. See: System views
Additional cgroup v2 level and custom parent directory: Adds the `gp_resource_group_cgroup_parent` parameter to customize the cgroup root directory name (defaults to `gpdb.service`), adapting to different operating systems and runtime environments. A restart is required for changes to take effect.

CBDR continuous archiving and recovery: Combines full backups with WAL archiving to achieve continuous data protection and Point-in-Time Recovery (PITR), supporting hot standby for read-only queries. It is suitable for scenarios such as cross-site disaster recovery, incremental replication, and read-only traffic offloading.
Enhanced DBCC alert capabilities: Includes built-in event templates for Standby Missing, VIP disconnection, etc., supporting continuous maintenance and alert triggering.
Enhanced DBCC SQL query monitoring: Query History now supports filtering by `submitted_time`, `user`, and `database`. Query Details now supports displaying the Text plan, and a PID column has been added to the list.

Enhanced DBCC cluster monitoring: Cluster Metrics now supports custom time ranges for retrospective analysis.
See: View cluster status
Enhanced DBCC database table queries: Supports fuzzy search for table names and displaying properties like sub-partition sizes.
DBCC login name management: Supports modifying login names, improving user account management capabilities.
Product change information
Metadata
Adds the following system summary views to provide aggregated results for progress and statistics across the cluster:
- `gp_stat_progress_vacuum_summary`: Aggregates the distributed progress from `gp_stat_progress_vacuum`. For replicated tables (`policytype = 'r'`), count-based metrics are normalized by `numsegments` to provide a cluster-wide view of VACUUM progress.
- `gp_stat_progress_analyze_summary`: Aggregates the distributed progress from `gp_stat_progress_analyze`. Includes key metrics such as sampled blocks, extended statistics, and child table processing progress, with segment-wise normalization for replicated tables.
- `gp_stat_progress_cluster_summary`: Aggregates the distributed progress from `gp_stat_progress_cluster`. Includes metrics such as command, phase, number of index rebuilds, heap tuples scanned/written, and block-level statistics, with segment-wise normalization for replicated tables.
- `gp_stat_progress_create_index_summary`: Aggregates the distributed progress from `gp_stat_progress_create_index`. Includes metrics such as locking progress and the number of blocks/tuples/partitions processed, with segment-wise normalization for replicated tables.
Metadata implementation supplement: Adds `system_views_gp_summary.sql` to centralize the definition of the above SynxDB summary views.
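The replicated-table normalization these views apply can be sketched as: per-segment progress rows are summed, and because every segment of a replicated table holds a full copy, count metrics are divided by `numsegments`. The helper below is an illustrative model, not the actual view definition:

```python
# Illustrative model of the segment-wise normalization in the summary views.
# 'heap_blks_scanned' is a real pg_stat_progress_vacuum metric; the row
# shapes and the summarize() helper itself are hypothetical.

def summarize(rows, policytype, numsegments):
    """Aggregate a per-segment count metric into one cluster-wide value."""
    total = sum(r["heap_blks_scanned"] for r in rows)
    if policytype == "r":  # replicated: every segment holds a full copy
        total //= numsegments
    return total

segment_rows = [{"heap_blks_scanned": 500} for _ in range(3)]  # 3 segments

# Distributed table: segments hold disjoint slices, so the sum is the answer.
assert summarize(segment_rows, policytype="d", numsegments=3) == 1500
# Replicated table: dividing by numsegments avoids triple-counting one table.
assert summarize(segment_rows, policytype="r", numsegments=3) == 500
```

Without the division, a replicated table would appear `numsegments` times larger in cluster-wide progress figures than it actually is.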
GUC configuration parameters
- `pax_enable_sparse_filter` is renamed to `pax.enable_sparse_filter`.
- `pax_enable_row_filter` is renamed to `pax.enable_row_filter`.
- `pax_scan_reuse_buffer_size` is renamed to `pax.scan_reuse_buffer_size`.
- `pax_max_tuples_per_group` is renamed to `pax.max_tuples_per_group`.
- `pax_max_tuples_per_file` is renamed to `pax.max_tuples_per_file`.
- `pax_max_size_per_file` is renamed to `pax.max_size_per_file`.
- `pax_enable_toast` is renamed to `pax.enable_toast`.
- `pax_min_size_of_compress_toast` is renamed to `pax.min_size_of_compress_toast`.
- `pax_default_storage_format` is renamed to `pax.default_storage_format`.
- `pax_bloom_filter_work_memory_bytes` is renamed to `pax.bloom_filter_work_memory_bytes`.
- The default value of `pg_gophermeta.gopher_local_capacity_mb` is changed from `10240` to `1024000`.
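Configurations that still set the old `pax_*` names will need updating after an upgrade. A small migration sketch follows; the rename map is taken from the list above, while `migrate_conf` itself is a hypothetical helper, not a shipped tool:

```python
import re

# Old -> new GUC names from the v4.2.0 rename list above.
RENAMES = {
    "pax_enable_sparse_filter": "pax.enable_sparse_filter",
    "pax_enable_row_filter": "pax.enable_row_filter",
    "pax_scan_reuse_buffer_size": "pax.scan_reuse_buffer_size",
    "pax_max_tuples_per_group": "pax.max_tuples_per_group",
    "pax_max_tuples_per_file": "pax.max_tuples_per_file",
    "pax_max_size_per_file": "pax.max_size_per_file",
    "pax_enable_toast": "pax.enable_toast",
    "pax_min_size_of_compress_toast": "pax.min_size_of_compress_toast",
    "pax_default_storage_format": "pax.default_storage_format",
    "pax_bloom_filter_work_memory_bytes": "pax.bloom_filter_work_memory_bytes",
}

def migrate_conf(text: str) -> str:
    """Rewrite old pax_* GUC names in postgresql.conf-style text."""
    for old, new in RENAMES.items():
        # Word boundaries keep one name from matching inside a longer one.
        text = re.sub(rf"\b{re.escape(old)}\b", new, text)
    return text

conf = "pax_enable_toast = on\npax_max_size_per_file = '64MB'\n"
assert migrate_conf(conf) == "pax.enable_toast = on\npax.max_size_per_file = '64MB'\n"
```

Entries already using the new dotted names pass through unchanged, so the helper is safe to run more than once.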
Components
Upgraded MADlib to version 2.1.0.
Updated Gopher version to v4.0.21.
Bug fixes
Query optimizer and executor
- Fixed an issue where the Locus of a Shared Scan could be empty: When `gp_cte_sharing` is enabled, the Locus type and parallelism of the Shared Scan are explicitly set to avoid a `NULL locus` in the query plan.
- Fixed parallel path worker allocation: Moves the assignment of `parallel_workers` before the assertion and uses `pathnode->parallel_workers` in the check, correcting an exception where the number of parallel workers was 0 for partial paths.
- Fixed locus issue for writable CTEs on replicated tables: In set operations (for example, `WITH ... RETURNING` and `EXCEPT`), ensures that the loci are correctly set to `SingleQE` or `Entry` to avoid errors caused by inconsistencies between replicated and partitioned loci.
- Fixed an issue where `make_grouping_rel()` did not preserve `relid` and `cdbpolicy`: Correctly passes these fields from `input_rel` to `grouped_rel`, preventing crashes in extensions that rely on these fields in `create_upper_paths_hook`.
Vectorized executor
- Fixed memory leaks: Ensures consistent calls to `FreeVecExecuteState()` in `ExecEndVecXXX()` for all vectorized operators, ensuring that Arrow plans and related structures are correctly released.
Storage and access methods (Table AM / PAX / DataLake)
- Fixed Table AM sampling logic: When a table access method implements `relation_acquire_sample_rows`, this method is now called directly to get sample rows, ensuring correct logic.
- Fixed PAX-related GUC issues: Corrects the handling of PAX GUC configuration items to ensure settings take effect and maintain system stability.
- Fixed a segmentation fault when reading Iceberg tables: Allocates a buffer using `palloc` when encountering an empty buffer while reading Iceberg data, preventing a crash.
- Fixed liboss2 curlopt timeout (set to 180s) to resolve 403 error codes.
Resource groups and cgroup management
- Fixed unresponsiveness of `ALTER RESOURCE GROUP ... SET IO_LIMIT '-1'`: Cleans up `io.max` and synchronously updates `pg_resgroupcapability` to ensure parameter updates take effect immediately.
- Fixed cgroup directory deletion logic: Deletes only the leaf directories of the Greenplum cgroup in `group-v2` mode, avoiding cascading issues from failed deletions in `group-v1` mode.
- Fixed a double-free issue in the IO Limit callback: Correctly frees and resets the old `io_limit` pointer in `alterResgroupCallback` to avoid potential double-free problems.
- Fixed hardcoded cgroup root directory name in `gpcheckresgroupv2impl`: Adds `get_cgroup_parent()` to dynamically read the `gp_resource_group_cgroup_parent` parameter, replacing hardcoded paths and improving error messages.
- Fixed instability from concurrent directory creation for resource group IO Limit: When multiple Segments on the same host create directories concurrently, "already exists" errors are now caught and ignored, improving stability.
- Improved the robustness of IO Limit behavior:
  - Added a function to clean up `io.max`, ensuring state consistency when changing `IO_LIMIT`.
  - Added a check for IO limit associations when deleting a tablespace to prevent accidental deletion.
  - Downgraded some `parseio` errors to `WARNING` during `InitResGroup`/`AlterResourceGroup` so the cluster can start smoothly in exceptional scenarios.
System views
- Fixed adaptation issues in system views after merging upstream code: Ensures that relevant view functions and queries work correctly.
- Fixed naming and structure of `pg_stat_all_tables`/`pg_stat_all_indexes`: Restores the original definitions (single-Segment statistics), adds cross-Segment aggregate views `gp_stat_all_tables`/`gp_stat_all_indexes`, and introduces summary versions `gp_stat_all_tables_summary`/`gp_stat_all_indexes_summary`. Also adds `pg_stat_user_tables_summary`/`pg_stat_user_indexes_summary` and improves regression tests.
Processes and concurrency (Gang/Writer/Interconnect)
- Fixed occasional unresponsiveness due to "writer proc entry not found": Introduces a configurable retry duration `find_writer_proc_retry_time` for forking gangs in asynchronous mode, allowing longer startup waits to accommodate complex environments and reduce jitter.
Compilation and build
- Fixed `--disable-orca` build failure: Removes the conditional compilation wrapper around the `OptimizerOptions` structure, ensuring successful compilation in configurations without ORCA.