SynxDB Documentation

Table of Contents

  • Product Overview
    • Feature Overview
      • Efficient queries in different scenarios
      • Polymorphic data storage
      • Multi-layer data security
      • Data loading
      • Multi-layer fault tolerance
      • Rich data analysis support
      • Flexible workload management
      • Advanced diagnostics and maintainability
      • Multiple compatibility
    • Product Architecture
      • Coordinator node
      • Segment node
      • Interconnect
      • MVCC
      • Kernel-level reliability
    • User Scenarios
    • Comparison with Greenplum Features
      • General features
      • Performance-related features
      • Security-related features
    • Release Notes
      • v4.5.0 Release Notes
        • New features
          • Database
        • New feature details
          • Query processing and optimization
          • Security
        • Product change information
          • GUC configuration parameters
        • Bug fixes and improvements
          • Query optimizer and executor
          • Storage and access methods
          • Processes and concurrency
          • Security
          • Tools and utilities
      • v4.4.0 Release Notes
        • New features
          • Database
          • Interactive manager DBCC
        • New feature details
          • Query processing and optimization
          • Database maintenance
          • Data federation and lakehouse integration
          • Storage
          • Observability and reliability
          • Tools and utilities
        • Product change information
          • GUC configuration parameters
          • Components
        • Bug fixes
          • Query optimizer and executor
          • Storage and access methods
          • Processes and concurrency
          • Security
          • Tools and utilities
          • Observability
      • v4.3.0 Release Notes
        • New features
          • Database Lightning
          • Interactive manager DBCC
        • New feature details
          • Data federation and lakehouse integration
          • Search and AI-ready
          • Query processsing and optimization
          • Storage
          • Observability and reliability
        • Product change information
          • GUC configuration parameters
          • Components
        • Bug fixes
          • Query optimizer and executor
          • Storage and access methods
          • Processes and concurrency
          • Security and TDE
          • System commands and utilities
          • DBCC
          • Compilation and build
      • v4.2.0 Release Notes
        • New features
          • Database
          • Interactive manager DBCC
        • New feature details
          • Data federation and lakehouse integration
          • AI-ready
          • Query performance and data storage optimization
          • Observability and reliability
        • Product change information
          • Metadata
          • GUC configuration parameters
          • Components
        • Bug fixes
          • Query optimizer and executor
          • Vectorized executor
          • Storage and access methods (Table AM / PAX / DataLake)
          • Resource groups and cgroup management
          • System views
          • Processes and concurrency (Gang/Writer/Interconnect)
          • Compilation and build
      • v4.1.0 Release Notes
        • New features
          • Database
          • Interactive manager DBCC
        • Product change information
          • Configuration parameters
          • Installation and deployment
          • Data lake and external integration
          • Removed features
        • Bug fixes
      • v4.0.0 Release Notes
        • Availability
        • New features
          • Database
          • Interactive manager DBCC
        • Upgrade path
        • Product change information
          • SQL syntax
          • Configuration parameters (GUCs)
          • Extensions and plugins
          • Removed features
        • Bug fixes
    • TPC-DS Benchmark Report
      • Test environment
        • Hardware
        • Software
      • Test program
      • Test steps
      • Test conclusion
        • Data loading performance
        • Performance testing of 99 queries
        • Throughput testing in TPC-DS scenario
    • Product Release Roadmap
      • 2024
      • 2025
        • Q1: Existing Greenplum replacement
        • Q2: SynxDB v4 release
        • Q3: SynxDB Elastic (Planned GA)
      • 2026 (Planned)
        • SynxDB v5
        • SynxDB Elastic
  • Deployment Guides
    • Deploy On-Premises
      • Software and Hardware Configuration
        • Hardware requirements
          • On-Premise servers (physical or virtual)
          • Storage
        • Data exchange network
        • Software requirements
          • Supported OS
          • SSH configurations
      • Prepare to Deploy On-Premises
        • Install system dependencies
        • Plan the deployment architecture
        • Modify server settings
          • Change hostname
          • Add gpadmin admin user
          • Disable SELinux and firewall software
          • Modify network mapping
          • Set system parameters
      • Deploy on Mulitiple Nodes
        • Step 1: Prepare server nodes
        • Step 2. Install the RPM package
        • Step 3. Configure mutual trust between nodes
        • Step 4. Initialize SynxDB
        • Step 5. Log into the Database
      • Deploy on Single Node
        • How to deploy
          • Step 1. Prepare to deploy
          • Step 2: Install SynxDB via RPM package
          • Step 3: Deploy SynxDB with a single computing node
        • Common issues
          • How to check the deployment mode of a cluster
          • Where is the data directory
        • How it works
        • User-behavior changes
      • Install Extensions and Plugins
        • Built-in plugins and extensions
        • Components requiring separate installation
  • Create and Prepare Databases
    • Create and Manage Databases
      • Templates and default databases
      • Create a database
        • Clone a database
        • Assign a different owner when creating a database
      • List databases
      • Alter a database
      • Drop a database
    • Start and Stop Databases
      • Start SynxDB
      • Restart SynxDB
      • Reload configuration file changes
      • Start the coordinator in maintenance mode
      • Stop SynxDB
      • Stop client processes
    • Connect to Databases
      • Connection parameters
      • Supported client applications
        • Client applications
        • Connect using psql
      • Connect using application interfaces
      • Common connection problems
  • Load Data
    • Data Loading Overview
      • Data loading process
      • Loading tools and scenarios
    • Load Data from Local Files
      • Load Data Using COPY
        • Load from a file
        • Load from STDIN
        • Load data using \copy in psql
        • Input format
      • Load Data Using gpfdist
        • Step 1. Install gpfdist
        • Step 2. Start and stop gpfdist
        • Step 3. Use gpfdist with external tables to load data
          • Example 1 - Run single gpfdist instance on a single-NIC machine
          • Example 2 — Run multiple gpfdist instances
          • Example 3 — Single gpfdist instance with error logging
          • Example 4 - Create a writable external table with gpfdist
        • About gpfdist
          • About gpfdist and external tables
          • About gpfdist setup and performance
      • Load Data Using the file:// Protocol
        • Usage examples
          • Load files from the coordinator host
      • Load Data Using gpload
        • To use gpload
    • Load External Data Using Foreign Table
      • Use foreign table
        • Create foreign table using the LIKE clause
      • Query a foreign table
      • Write data to a foreign table with ordered insertion
    • Load Data from Web Services
      • Command-based external web tables
      • URL-based external web tables
    • Load Data Using PXF
      • Install and configure PXF
        • Prerequisites
        • Step 1: Install the PXF package
        • Step 2: Set Up Your Environment
        • Step 3: Run PXF
      • Next steps
    • Load Data from Kafka Using Kafka FDW
      • Basic usage
      • Supported data formats
      • Query
      • Message producer
      • Data import
        • Scheduled import
    • Load Data from Object Storage and HDFS
      • Install the extension
      • Instructions
        • Load data from object storage
          • Load Iceberg table data from S3 (without an external metadata service)
          • Read Iceberg tables on S3 via Polaris Catalog
        • Load data from HDFS
          • Load HDFS data without authentication
          • Load HDFS data with Kerberos authentication
      • HDFS and OSS-related configuration parameters
    • Load Data from Hive Data Warehouse
      • Supported Hive file formats
      • Usage limitations
      • Steps
        • Step 1. Create a configuration file on database cluster
          • Configure Hive cluster information
          • Configure HDFS cluster information
        • Step 2. Create foreign data wrapper and Hive Connector plugin
        • Step 3. Create server and user mapping
        • Step 4. Sync Hive objects to the database cluster
          • Sync a Hive table
          • Sync a Hive database
      • Examples of syncing tables
        • Sync a Hive text table
        • Sync a Hive ORC table
        • Sync a Hive ORC partitioned table
        • Sync a sample Hive database
        • Sync tables in Iceberg and Hudi formats
          • Load Iceberg tables
          • Load Hudi tables
      • Data type mapping
      • Known issues
        • Solution
    • Load Data from MySQL Server Using MySQL_FDW
      • Prerequisites
      • Use MySQL FDW
        • CREATE SERVER options
        • CREATE USER MAPPING options
        • CREATE FOREIGN TABLE options
        • IMPORT FOREIGN SCHEMA options
        • TRUNCATE support
        • Functions
      • Usage Examples
        • Installing the Extension
        • Creating a Foreign Server
        • Granting Permissions to Use Foreign Server
        • Creating a User Mapping
        • Creating a Foreign Table
        • Importing a MySQL Database as a SynxDB Schema
    • Connect to a Remote SynxDB
      • Use cases
      • How to use
        • Prerequisites
        • Create a foreign table
        • Read data in parallel
        • Write data in parallel
        • Connection management
      • Important notes
    • Custom Multi-Character Delimiters for Reading and Writing External Tables
      • Compilation and installation
      • Usage example
        • Read external tables
        • Write to external tables
    • Load Data Using Apache NiFi
      • How it works
      • Key benefits of using NiFi with SynxDB
      • Example Use Case: Ingesting Log Files
  • Operate with Data
    • Operate with Database Objects
      • Basic Query Syntax
      • Create and Manage Tables
        • Create tables
          • Choose column data types
          • Set table and column constraints
      • Insert, Update and Delete Row Data
        • Insert rows
          • Insert rows into partitioned tables
          • Insert rows in Append-Optimized tables
        • Update existing rows
        • Delete rows
        • Truncate table
        • Vacuum database
      • Create and Manage Views
        • Create views
        • Drop views
        • Best practices when creating views
      • Create and Manage Materialized Views
        • Create materialized views
        • Refresh or deactivate materialized views
        • Drop materialized views
      • Create and Manage indexes
        • Index types
          • Bitmap indexes
        • Manage indexes
          • Cluster a table by index
          • Create indexes
          • Rebuild all indexes on a table
          • Drop indexes
        • Index-only scan and covering index
          • What is an index-only scan
          • Requirements
          • Additional note: MVCC visibility checks
          • Purpose of covering indexes
          • Dynamic index-only scan
        • Backward index scan
        • Check index usage
      • BRIN Indexes
        • Maintain BRIN indexes
      • Work with Transactions
        • Transaction isolation levels
          • Read uncommitted and read committed
      • Transaction Concurrency Control
        • Multi-version concurrency control
        • Lock modes
        • Lock optimization for SELECT … FOR UPDATE
        • Global deadlock detector
          • Global deadlock detector management of concurrent UPDATE and DELETE operations
      • Choose Table Storage Model
        • Heap and Append-optimized Table Storage Models
          • Heap storage
          • Append-optimized storage
          • To create a table with specified storage options
          • Choose row or column-oriented storage
        • PAX Storage Format
          • Applicable scenarios
          • Usage
          • Support for TOAST
          • Support for clustering
          • Bloom filter support
          • Sparse filtering
          • View data change records on PAX tables in WAL logs
          • Inspect PAX table internal structure
          • Limitations for PAX tables
          • PAX-related SQL options
          • PAX-related system parameters
          • Best practices
        • UnionStore Storage Format
          • Use cases
          • Prerequisites
          • Install UnionStore
          • Usage
          • Limitations
      • Cross-Cluster Federated Query
        • User scenarios
        • Usages
          • Prerequisites
          • Step 1. Create a foreign data wrapper
          • Step 2. Create a user mapping
          • Step 3. Create a foreign table and add shards
          • Step 4. Perform Cross-Cluster Federated Query
      • Use Tags to Manage Database Objects
        • What is a tag?
        • Features of tags
        • Usage scenarios
        • Use tags
          • Query existing tag information
          • Create tags
          • Delete tags
          • Modify tags
          • Assign tags to objects
          • Tag comments
        • System tables related to tags
          • pg_tag table
          • pg_tag_description table
          • Handle global and non-global objects
        • Common errors and tips
    • SQL Queries
      • Join Queries
        • Join types
          • LEFT OUTER JOIN
          • RIGHT OUTER JOIN
          • FULL OUTER JOIN
          • CROSS JOIN
          • NATURAL JOIN
        • Join conditions
          • ON clause
          • USING clause
        • LATERAL
        • Example
          • INNER JOIN example
          • LEFT OUTER JOIN example
          • RIGHT OUTER JOIN example
          • FULL OUTER JOIN example
          • CROSS JOIN example
    • Advanced Analytics
      • Use pgvector for Vector Similarity Search
        • Quick start
        • Store data
        • Query data
        • Index data
          • HNSW index
          • IVFFlat index
          • Use filters in indexes
          • Half-precision vectors
          • Half-precision index
          • Binary vectors
        • Hybrid search
        • pgvector performance
          • Exact search
          • Approximate search
      • Vectorization Query Computing
        • Enable vectorization
        • Usage
          • Enable or disable vectorization queries
          • Set vectorization batch size
          • Set hash join spill memory budget
          • Verify whether a query is vectorized
          • Features supported by vectorization
          • Threaded execution on single node
        • Performance evaluation
          • TPC-H
          • TPC-DS
      • Use MADlib for Machine Learning and Deep Learning
        • Add MADlib functions to the database
        • Uninstall MADlib from the database
        • Usage examples
          • Check the MADlib version
          • Scenario 1: Perform linear regression on a database table
          • Scenario 2: Use association rules
          • Scenario 3: Perform Naive Bayes classification
      • Directory Tables
        • Usage
          • Create a directory table
          • Create index on a directory table
          • View the field information of directory table
          • Upload file into directory table
          • Export files to local
          • Query and use the files managed by directory table
          • Add transaction lock to directory table
          • Delete the file managed by directory table
          • Delete a directory table
          • Back up and restore directory tables
      • Use RoaringBitMap Bitmap Operations
        • Usage
          • Load the plugin
          • Create tables
          • Create a bitmap
          • Bitmap union operations
          • Bitmap union and intersection aggregation operations
          • Statistical cardinality
          • Convert bitmap to SETOF integer
          • Conversion between bytea and roaringbitmap data types
    • Auto-Execute SQL commands
      • Syntax description
        • CREATE TASK
        • ALTER TASK
        • DROP TASK
      • View Create Task metadata
      • Parameter tuning description
      • Usage examples
    • Use Oracle Compatibility SQL via Orafce
      • Install and remove Orafce
      • SynxDB considerations
        • SynxDB implementation differences
      • Using Orafce
  • Optimize Performance
    • Optimize Query Performance
      • Query Processing Overview
        • Query planning and dispatch
        • Query plans
        • Parallel query execution
        • Interpret the EXPLAIN ANALYZE output for hash aggregation
      • Query Performance Overview
        • Dynamic partition elimination
        • Memory optimization
        • Numeric aggregation optimization
      • Use GPORCA Optimizer
        • GPORCA overview
          • Enable or disable GPORCA
          • Determine which query optimizer is used
        • GPORCA Features and Enhancements
          • Enhancements for partitioned table queries
          • Subquery optimization
          • Common table expression (CTE) optimization
          • DML operation optimization
          • Other optimization capabilities
        • GPORCA Performance Tuning
          • Quick reference
          • Accelerate queries with parallel execution
          • Reduce I/O with index optimizations
          • Improve join performance
          • Handle data skew in window functions
          • Optimize top-N-per-group queries with WindowAgg Top-K pushdown
          • Enable vectorized execution for partitioned tables
      • Update Statistics
        • Checking whether statistics are up to date
        • Generate statistics selectively
        • Skip unmodified relations
        • Improve statistics quality
        • When to run ANALYZE
        • Configure automatic statistics collection
      • Query Plan Hints
        • Quick example
        • Cardinality hints
        • Table access hints
        • Join type hints
        • Join order hints
        • Supported scope and limitations
        • Best practices for using query plan hints
      • Use Unique Indexes on AO Tables
        • How it works
        • How to add a unique index to an AO table
        • Usage examples
      • Use Automatic Materialized Views for Query Optimization
        • Usage scenarios
        • Implementation
        • Comparison with dynamic tables
        • Restrictions
        • Usage examples
        • Use materialized views to query external tables
        • Aggregate query support
          • Aggregate query support
          • HAVING clause processing
          • LIMIT and ORDER BY Support
          • Notes for aggregate query support
      • Use Incremental Materialized Views
        • Use cases
        • Usage example
        • Query performance comparison with regular views
          • Example of using a regular view
          • Example of using an incremental materialized view
          • TPC-H test result comparison
        • Usage restrictions and notes
      • Execute Queries in Parallel
        • Use cases
        • How to use
          • Parallel execution with GPORCA
          • Parallel index scan with GPORCA
          • Parallel index only scan with GPORCA
          • Parallel execution with Postgres query optimizer
          • Query heap tables in parallel
          • Query AO/AOCO tables in parallel
          • Query partitioned tables in parallel
          • Execute UNION operations in parallel
          • Execute DISTINCT operations in parallel
        • Parameter descriptions
        • Frequently asked questions
      • Use Index Scan on AO Tables
        • Use cases
        • Usage example
      • Push Down Aggregation Operations
        • Usage example
        • Use cases
          • Scenario one
          • Scenario two
        • Not recommended scenarios
          • Not applicable scenario one
          • Not applicable scenario two
        • Usage restrictions
          • Restriction one
          • Restriction two
      • Optimize HashJoin Query Performance
        • Use RuntimeFilter in the Postgres optimizer
        • Push down filters to dynamic scans in GPORCA
        • Push down runtime filters to Table Access Method (AM)
      • Just-in-Time Compilation (JIT)
        • What is JIT compilation
        • User scenarios
        • Principles of JIT compilation
          • JIT accelerated operations
          • In-line compilation (Inlining)
          • Optimization
        • How to use JIT compilation
          • Prerequisites
          • Configuration
          • Usage examples
        • Notes and limitations
        • Related reference information
    • Create Indexes Concurrently
      • CREATE INDEX CONCURRENTLY
        • Use cases
        • Usage
        • Usage restrictions
        • Common issue handling
      • REINDEX CONCURRENTLY
        • Use cases
        • How it works
        • Usage
        • Usage restrictions
        • Common issue handling
    • Create AO/AOCO Tables in Parallel and Refresh Materialized Views
      • Usage examples
        • Create AO/AOCO tables in parallel
        • Refresh materialized views in parallel
      • Performance improvement demonstration
    • Manage Resources Using Resource Groups
      • Role and component resource groups
      • Resource group attributes and limits
        • Transaction concurrency limit
        • Bypass limits and unassign from resource groups
        • CPU limits
          • Assign CPU resources by core
          • Assign CPU resources by percentage
        • Memory limits
        • Disk I/O limits
      • Configure and use resource groups
        • Prerequisites
          • Configure cgroup v1
          • Configure cgroup v2
      • Enable resource groups
      • Create resource groups
      • Configure automatic query termination based on memory usage
      • Assign a resource group to a role
      • Monitor resource group status
        • View resource group limits
        • View resource group query status
        • View resource group memory usage per host
        • View the resource group assigned to a role
        • View resource group disk I/O usage per host
        • View a resource group’s running and pending queries
        • Cancel a running or queued transaction in a resource group
      • Move a query to a different resource group
      • Frequently asked questions
    • Use Dynamic Tables to Speed Up Queries and Auto-Refresh Data
      • Use cases
      • Comparison with materialized views
      • Usage
        • Create a dynamic table
        • Refresh a dynamic table
        • View schedule information
        • Drop a dynamic table
        • View distribution key
      • Examples
        • Example 1: Accelerate external table queries in lake-house architecture
        • Example 2: Create an empty dynamic table
      • Notes
  • AI & ML (SynxML)
    • Deploy SynxML Platform
      • Steps to deploy SynxML Platform
        • Preparation
        • Step 1: Install the RPM packages
        • Step 2: Configure SynxML Platform
        • Step 3: Start the SynxML Platform service
        • Step 4: Access the SynxML Platform console in your browser
      • Manage SynxML Platform services
      • Additional notes
    • Use the SynxML Platform
      • Core features of SynxML Platform
      • Use cases for SynxML Platform
      • Use the SynxML Platform console
        • Log into the SynxML console
        • Train a model
          • Create a training task
        • View training tasks
        • Data exploration and insights
        • Model factory
          • Import a model
          • Deploy a model
          • View deployed services
        • Use notebook-based modeling
          • Create a notebook
          • Manage and operate notebooks
    • Use AIFun
      • Key features
      • Supported providers
      • Prerequisites
      • Install the extension
      • Get started
        • Configure your provider
        • Basic usage
          • Text generation and chat
          • Text analysis and processing
          • Vision and multimodal
          • Document parsing
        • Work with tables
        • Work with directory tables
      • Manage providers
      • Security model
      • List of functions
  • Manage System
    • View Monitoring Data Using the Web Console (DBCC)
      • Architecture and Principle of the Monitor Console
        • Core architecture
          • Architecture diagram
          • Main components
        • Workflows and principles
          • Agent registration (Push)
          • Metric collection (Pull)
          • Command execution (Proxy)
          • Alert (Push)
      • Install Monitoring Console
        • Prerequisites
        • Step 1: (Optional) Configure before installation
          • Customize ports
          • Configure coordinator auto-failover
          • Configure scan table task
          • Disable multi-language support
          • Modify default administrator credentials
          • Adjust monitoring sensitivity (Prometheus)
          • Adjust Agent data collection frequency
        • Step 2: Install DBCC components
          • Install Server
          • Install Agent
        • Next steps
        • Appendix: Daily maintenance and reference
          • Maintenance commands
          • Configuration file path reference
          • Default port reference
          • Troubleshooting
      • View Cluster Information
        • Steps
      • Manage Clusters
        • Access the cluster management page
        • Modify cluster name
        • Enable maintenance mode
        • Delete cluster
      • Cluster Status and Metrics
        • Access the pages
        • View the overall status and data of the cluster
          • Cluster CPU usage status
          • Cluster memory usage status
          • Cluster disk I/O rate
          • Network I/O rate
        • View the status and data of nodes and hosts
          • Cluster metrics
          • Host metrics
        • Bind virtual IP
      • View Database Object Information
        • View table objects
        • View partitioned table details
      • View SQL Monitoring Information
        • Access the page
        • View SQL execution status
          • View historical SQL information
          • View in-progress SQL
          • View SQL Details
          • Cancel SQL execution
        • View session status
      • View Storage Information
        • Steps
      • Configure Database
        • Access the page
        • Modify database parameters
        • Modify database authentication
          • View authentication rules
          • Add an authentication rule
          • Modify or delete a rule
          • Reorder rules
        • Manage historical versions
      • View Recommendations
        • Access the page
        • View bloat ratio
        • View skew rate
      • View and Create Alert Rules
        • Access the alert page
        • View existing alert rules
        • Create an alert rule
          • Select alert type
          • Set alert rules
          • Set notification policy
          • Save the configuration
      • View and Create Contact Groups
        • Access the contact group page
        • View existing contact groups
        • Create a contact group
    • Backup and Restore
      • Backup and Restore Overview
        • Parallel backup with gpbackup and gprestore
          • Command-line flags for gpbackup and gprestore
        • Non-parallel backup with pg_dump
        • Backup and recovery with CBDR at the WAL level
      • Perform Full Backup and Restore
        • Back up the full database
        • Restore the full database
        • Filter the contents of a backup or restore
          • Filter by schema
          • Filter by table
          • Filter with gprestore
          • Filter by leaf partition
        • Check report files
        • Configure email notifications
          • gpbackup and gprestore email file format
          • Email YAML file sections
          • Email examples
      • Perform Incremental Backup and Restore
        • About incremental backup sets
        • Use incremental backups
          • Example using incremental backup sets
          • Create an incremental backup with gpbackup
          • Restore from an incremental backup with gprestore
        • Incremental backup notes
      • CBDR
        • Full backup and restore procedure
          • Backup process
          • Restore process
        • Incremental backup and restore procedure
        • Continuous archiving recovery (PITR) and hot standby procedure
          • Steps for backing up primary cluster
          • Steps for restoring recovery cluster and setting up hot standby
        • Configuration file reference
        • Command usage
          • Configure commands
          • Cluster backup
          • View backup list
          • Restore command
          • Delete command
          • Continuous recovery and restore point commands
    • Expand and Shrink a Cluster
      • Shrink a cluster using gpshrink
      • Expand a cluster using gpexpand
    • Upgrade SynxDB
      • Compatibility
      • Upgrade methods
      • Quick guide
      • Script testing and validation
    • Upgrade Using gpupgrade
      • Before you begin
      • Supported versions
      • Configure gpupgrade
      • Perform the upgrade
      • Revert the upgrade
      • Other operations
      • Known limitations
    • Configure Security and Permission
      • Configure Password Policies
        • How it works
        • Set password policies using SQL syntax
          • CREATE PROFILE
          • ALTER PROFILE
          • DROP PROFILE
          • CREATE USER … PROFILE
          • ALTER USER … PROFILE
          • CREATE USER … ENABLE/DISABLE PROFILE
          • ALTER USER … ENABLE/DISABLE PROFILE
          • CREATE USER … ACCOUNT LOCK/UNLOCK
          • ALTER USER … ACCOUNT LOCK/UNLOCK
        • View password policy information in system tables
        • Default password policy
        • Usage examples
          • Create a password policy
          • Scenario 1: Set the maximum number of failed login attempts and the password lock time
        • Scenario 2: Set the number of historical password reuses
        • Scenario 3: Use the settings of the default profile
        • Scenario 4: Superuser locks or unlocks user accounts
        • Scenario 5: Enable profile for regular users
      • Check Password Security
        • Usage scenarios
        • How to use
          • Enable strict password policy
        • Limitations
      • Encrypt Data with pgcrypto
        • Notes on OpenSSL-related algorithm support
        • General hashing functions
          • digest()
          • hmac()
        • Password hashing functions
          • crypt()
          • gen_salt()
        • PGP encryption functions
          • pgp_sym_encrypt()
          • pgp_sym_decrypt()
          • pgp_pub_encrypt()
          • pgp_pub_decrypt()
          • pgp_key_id()
          • armor(), dearmor()
          • pgp_armor_headers
          • Options for PGP functions
          • Generate PGP keys with GnuPG
          • Limitations of the PGP code
        • Raw encryption functions
        • Random-data functions
          • gen_random_bytes()
          • gen_random_uuid()
        • Notes on random number generation
      • Anonymize Data Using Anon
        • How it works
        • Install Anon
        • User guide
          • Dynamic masking
          • Masking functions
      • Use TDE to Encrypt Data
        • Prerequisites
        • How it works
          • Basic concepts
          • Encryption algorithms
          • Implementation of transparent data encryption
        • How to use
        • How to verify
          • Verification process
        • Performance evaluation
          • Test environment
          • Test data
      • Log Audit with pgAudit
        • Compile and install
        • Enable pgAudit
        • pgAudit settings
          • pgaudit.log_class
          • pgaudit.log_catalog
          • pgaudit.log_client
          • pgaudit.log_level
          • pgaudit.log_parameter
          • pgaudit.log_relation
          • pgaudit.log_rows
          • pgaudit.log_statement
          • pgaudit.log_statement_once
          • pgaudit.role
        • Session audit log
          • Configuration
          • Example
        • Object audit log
          • Settings
          • Example
        • Format
        • Log visualization and retrieval
      • Manage Roles and Privileges in SynxDB
        • Create new roles (users)
          • Alter role attributes
        • Role membership
        • Manage object privileges
        • Security best practices for roles and privileges
        • Data confidentiality disclaimer
        • Encrypt data
    • Configure Database System
      • Coordinator and local parameters
      • Set configuration parameters
        • Set a local configuration parameter
        • Set a coordinator configuration parameter
          • Set parameters at the system level
          • Set parameters at the database level
          • Set parameters at the row level
          • Set parameters in a session level
      • View server configuration parameter settings
        • View parameters using the pg_settings view
  • Component Guides
    • Using ZomboDB to integrate with Elasticsearch
      • How it works
        • Create index and load data process
        • Query data process
      • Create the ZomboDB extension
      • Usage instructions
        • Create and use indexes
          • Create indexes
          • Modify indexes
          • Drop indexes
          • Syntax options
          • Usage examples
        • ZomboDB query syntax
          • Boolean queries
          • Field-specific queries
          • Value list queries
          • Proximity search
          • Elasticsearch JSON queries
        • Query DSL
        • ZomboDB query statements
          • Direct JSON
          • SQL Builder API
        • Aggregation functions
          • Arbitrary JSON aggregations
          • Single-value aggregations
          • Multi-row/column aggregations
        • Sorting and highlighting
        • SQL functions
      • ES _cat API
      • Vacuum support
    • Manage partitioned tables with pg_partman
      • Enable the pg_partman extension
      • Usage example
      • Main features
    • pg_search Full Text Search
      • Enable the pg_search extension
      • How to use
        • Use aggregate functions
      • Considerations and limitations
  • Reference Guides
    • SQL Statements
      • ABORT
        • Synopsis
        • Description
        • Parameters
        • Notes
        • Examples
        • Compatibility
      • ALTER DATABASE
        • Synopsis
        • Description
        • Parameters
        • Notes
        • Examples
        • Compatibility
      • ALTER RULE
        • Synopsis
        • Description
        • Parameters
        • Examples
        • Compatibility
      • ALTER TABLESPACE
        • Synopsis
        • Description
        • Parameters
        • Examples
        • Compatibility
      • CREATE DATABASE
        • Synopsis
        • Description
        • Parameters
        • Notes
        • Examples
        • Compatibility
      • CREATE INDEX
        • Synopsis
        • Description
        • Parameters
        • Index storage parameters
        • Notes
        • Examples
        • Compatibility
      • CREATE TABLESPACE
        • Synopsis
        • Description
        • Parameters
        • Notes
        • Examples
        • Compatibility
      • DROP DATABASE
        • Synopsis
        • Description
        • Parameters
        • Notes
        • Examples
        • Compatibility
      • DROP EXTENSION
        • Synopsis
        • Description
        • Parameters
        • Examples
        • Compatibility
      • DROP EXTERNAL TABLE
        • Synopsis
        • Description
        • Parameters
        • Examples
        • Compatibility
      • DROP INDEX
        • Synopsis
        • Description
        • Parameters
        • Examples
        • Compatibility
      • DROP MATERIALIZED VIEW
        • Synopsis
        • Description
        • Parameters
        • Examples
        • Compatibility
      • DROP ROLE
        • Synopsis
        • Description
        • Parameters
        • Examples
        • Compatibility
      • DROP RULE
        • Synopsis
        • Description
        • Parameters
        • Examples
        • Compatibility
      • DROP SCHEMA
        • Synopsis
        • Description
        • Parameters
        • Notes
        • Examples
        • Compatibility
      • DROP TABLE
        • Synopsis
        • Description
        • Parameters
        • Examples
        • Compatibility
      • DROP TABLESPACE
        • Synopsis
        • Description
        • Parameters
        • Notes
        • Examples
        • Compatibility
      • DROP TYPE
        • Synopsis
        • Description
        • Parameters
        • Examples
        • Compatibility
      • DROP USER
        • Synopsis
        • Description
        • Compatibility
        • See also
      • DROP VIEW
        • Synopsis
        • Description
        • Parameters
        • Examples
        • Compatibility
      • END
        • Synopsis
        • Description
        • Parameters
        • Notes
        • Examples
        • Compatibility
      • ROLLBACK TO SAVEPOINT
        • Synopsis
        • Description
        • Parameters
        • Notes
        • Examples
        • Compatibility
      • ROLLBACK
        • Synopsis
        • Description
        • Parameters
        • Notes
        • Examples
        • Compatibility
      • SAVEPOINT
        • Synopsis
        • Description
        • Parameters
        • Notes
        • Examples
        • Compatibility
      • SET ROLE
        • Synopsis
        • Description
        • Parameters
        • Notes
        • Examples
        • Compatibility
      • SET TRANSACTION
        • Synopsis
        • Description
        • Notes
        • Examples
        • Compatibility
      • START TRANSACTION
        • Synopsis
        • Description
        • Parameters
        • Compatibility
      • TRUNCATE
        • Synopsis
        • Description
        • Parameters
        • Notes
        • Examples
        • Compatibility
    • System Utilities
      • gpdemo
        • How to use
          • Deploy with default settings
          • Deploy with customized settings
        • Command-line options
    • System Catalogs
      • System Views
        • gp_stat_progress_create_index
        • gp_stat_progress_create_index_summary
        • gp_stat_activity
        • gp_stat_all_indexes
        • gp_stat_all_indexes_summary
        • gp_stat_all_tables
        • gp_stat_all_tables_summary
        • gp_stat_archiver
        • gp_stat_archiver_summary
        • gp_stat_bgwriter
        • gp_stat_bgwriter_summary
        • gp_stat_database
        • gp_stat_database_summary
        • gp_stat_gssapi
        • gp_stat_operations
        • gp_stat_progress_analyze
        • gp_stat_progress_analyze_summary
        • gp_stat_progress_basebackup
        • gp_stat_progress_cluster
        • gp_stat_progress_cluster_summary
        • gp_stat_progress_copy
        • gp_stat_progress_vacuum_summary
        • gp_stat_slru_summary
        • gp_stat_sys_indexes_summary
        • gp_stat_sys_tables_summary
        • gp_stat_user_functions_summary
        • gp_stat_user_indexes
        • gp_stat_user_indexes_summary
        • gp_stat_wal_summary
        • gp_stat_xact_all_tables
        • gp_stat_xact_all_tables_summary
        • gp_stat_xact_sys_tables_summary
        • gp_stat_xact_user_functions_summary
        • gp_stat_xact_user_tables_summary
        • gp_statio_all_indexes
        • gp_statio_all_indexes_summary
        • gp_statio_all_sequences
        • gp_statio_all_sequences_summary
        • gp_statio_all_tables_summary
        • gp_statio_sys_indexes_summary
        • gp_statio_sys_sequences_summary
        • gp_statio_sys_tables_summary
        • gp_statio_user_indexes_summary
        • gp_statio_user_sequences_summary
        • gp_statio_user_tables
        • gp_stat_user_tables_summary
      • The gp_toolkit Administrative Schema
        • About the extension
        • Upgrade the extension
        • Check for tables that need routine maintenance
          • gp_bloat_diag
          • gp_stats_missing
        • Check for locks
          • gp_locks_on_relation
          • gp_locks_on_resqueue
        • Check append-optimized tables
          • __gp_aovisimap_compaction_info(oid)
          • __gp_aoseg(regclass)
          • __gp_aoseg_history(regclass)
          • __gp_aocsseg(regclass)
          • __gp_aocsseg_history(regclass)
          • __gp_aovisimap(regclass)
          • __gp_aovisimap_hidden_info(regclass)
          • __gp_aovisimap_entry(regclass)
          • __gp_aoblkdir(regclass)
          • get_column_size(oid)
          • gp_column_size
          • gp_column_size_summary
        • View server log files
          • gp_log_command_timings
          • gp_log_database
          • gp_log_coordinator_concise
          • gp_log_system
        • Check server configuration files
          • gp_param_setting(‘parameter_name’)
          • gp_param_settings_seg_value_diffs
        • Check for failed segments
          • gp_pgdatabase_invalid
        • Check resource group activity and status
          • gp_resgroup_config
          • gp_resgroup_role
          • gp_resgroup_status
          • gp_resgroup_status_per_host
          • gp_resgroup_status_per_segment
        • Check resource queue activity and status
          • gp_resq_activity
          • gp_resq_activity_by_queue
          • gp_resq_priority_statement
          • gp_resq_role
          • gp_resqueue_status
        • Check query disk spill space usage
          • gp_workfile_entries
          • gp_workfile_usage_per_query
          • gp_workfile_usage_per_segment
        • View users and groups (roles)
          • gp_roles_assigned
        • Check database object sizes and disk space
          • gp_size_of_all_table_indexes
          • gp_size_of_database
          • gp_size_of_index
          • gp_size_of_schema_disk
          • gp_size_of_table_and_indexes_disk
          • gp_size_of_table_and_indexes_licensing
          • gp_size_of_table_disk
          • gp_size_of_table_uncompressed
          • gp_disk_free
        • Check for missing and orphaned data files
          • gp_check_orphaned_files
          • gp_check_missing_files
          • gp_check_missing_files_ext
        • Move orphaned data files
        • Check for uneven data distribution
          • gp_skew_coefficients
          • gp_skew_idle_fractions
        • Maintain partitions
          • The gp_partitions view
          • User-defined functions for partition maintenance
    • Configuration Parameters
      • analyze_only_modified_relations
      • autovacuum_freeze_max_age
      • autovacuum_vacuum_cost_delay
      • autovacuum_vacuum_scale_factor
      • autovacuum_vacuum_threshold
      • checkpoint_timeout
      • datalake.enable_get_block_location
      • enable_lock_optimization
      • gp_appendonly_compaction_segfile_limit
      • gp_appendonly_insert_files
      • gp_autostats_lock_wait
      • gp_command_count
      • gp_dynamic_partition_pruning
      • gp_enable_runtime_filter_pushdown
      • gp_enable_statement_trigger
      • gp_max_partition_level
      • gp_resource_group_cgroup_parent
      • gp_resource_manager
      • gp_role
      • gp_session_id
      • gp_use_streaming_hashagg
      • hash_mem_multiplier
      • jit
      • jit_above_cost
      • jit_debugging_support
      • jit_dump_bitcode
      • jit_expressions
      • jit_inline_above_cost
      • jit_optimize_above_cost
      • jit_profiling_support
      • jit_provider
      • jit_tuple_deforming
      • krb_server_keyfile
      • log_checkpoints
      • max_connections
      • max_parallel_workers_per_gather
      • max_replication_slots
      • min_parallel_table_scan_size
      • optimizer_array_constraints
      • optimizer_array_expansion_threshold
      • optimizer_cost_model
      • optimizer_cost_threshold
      • optimizer_cte_inlining_bound
      • optimizer_damping_factor_filter
      • optimizer_damping_factor_groupby
      • optimizer_damping_factor_join
      • optimizer_discard_redistribute_hashjoin
      • optimizer_dpe_stats
      • optimizer_enable_derive_stats_all_groups
      • optimizer_enable_dynamicbitmapscan
      • optimizer_enable_dynamicindexonlyscan
      • optimizer_enable_dynamicindexscan
      • optimizer_enable_foreign_table
      • optimizer_enable_indexonlyscan
      • optimizer_enable_indexscan
      • optimizer_enable_orderedagg
      • optimizer_enable_parallel_append
      • optimizer_enable_parallel_groupagg
      • optimizer_enable_parallel_hashagg
      • optimizer_enable_parallel_part_selector
      • optimizer_parallel_union
      • optimizer_enable_push_join_below_union_all
      • optimizer_enable_query_parameter
      • optimizer_enable_right_outer_join
      • optimizer_force_partition_topk
      • optimizer_force_split_window_function
      • optimizer_force_three_stage_scalar_dqa
      • optimizer_nestloop_factor
      • optimizer_penalize_broadcast_threshold
      • optimizer_push_group_by_below_setop_threshold
      • optimizer_replicated_table_insert
      • optimizer_skew_factor
      • optimizer_sort_factor
      • optimizer_trace_fallback
      • optimizer_use_gpdb_allocators
      • optimizer_xform_bind_threshold
      • parallel_setup_cost
      • passwordcheck.strict_policy
      • pax.enable_iouring
      • pax.enable_prefetch
      • pax.enable_rle_batch_encoding
      • pg_gophermeta.gphdfs_configure_router
      • pg_gophermeta.gopher_connect_hdfs_disable_getstate
      • pg_gophermeta.gopher_enable_update_oss_context
      • pg_gophermeta.gopher_hash_connect_hdfs_router
      • pg_gophermeta.gopher_local_capacity_mb
      • pg_gophermeta.register_gophermeta
      • superuser_reserved_connections
      • track_io_timing
      • vector.hashjoin_spill_memory_mb
      • vector.winagg_spill_work_mem
      • wal_compression
      • wal_keep_size
      • work_mem
      • writable_external_table_bufsize
  • Tutorials
    • System Configuration Best Practices
      • Configure the time zone
      • File system
      • Port configuration
      • I/O configuration
      • Operating system memory settings
      • Shared memory settings
        • Configure shared buffers
      • Kernel shmem settings
      • Regular huge pages
        • Examples
      • Number of segments per host
  • Developer Guides
    • Develop Database Extensions Using PGRX
      • Requirements for development environment
        • Basic software environment
        • PostgreSQL dependencies
      • Quick start for PGRX
        • Set up and install PGRX
        • Create an extension
        • Install and use the extension
      • PGRX type mapping
        • Custom type conversions
        • Type mapping details
      • PGRX core features
        • Complete management for development environment
        • Automatic mode generation
        • Security first
        • UDF supports
        • Simple custom types
        • Server programming interface (SPI)
        • Advanced features
      • Considerations and best practices for PGRX
      • Debugging and development tips
      • Learning resources for PGRX
    • Procedural Languages and Run-time
      • Use PL/Container
        • About the PL/Container language extension
          • PL/Container architecture
        • Configure and use PL/Container
          • Prerequisites
          • Step 1: Install Docker
          • Step 2: Enable PL/Container
          • Step 3: Install and configure Docker images
          • Step 4: Test the installation
        • Write PL/Container functions
          • Function definition
          • Function examples
          • About PL/Python functions
          • Function limitations
        • Manage PL/Container
          • Manage configurations and containers in a session
        • Advanced topics
          • Resource management
          • Logging
          • Use CUDA for GPU acceleration
          • Configure remote PL/Container
        • Notes
    • SynxDB MCP Service
      • Core features
      • Use cases
      • Installation
        • Prerequisites
        • Installation steps
      • Usage guide
        • Quick start: Connect to a demo cluster
          • Step 1: Configure database connection permissions
          • Step 2: Create an environment configuration file
          • Step 3: Start the MCP service
          • Step 4: Configure the LLM client
        • Integrate with an LLM client
          • Claude Desktop
          • Cursor
          • Windsurf
          • VS Code (with the Cline extension)
      • Important notes and limitations
      • Reference guide
        • Configuration methods
        • API reference
          • Resources
          • Tools
          • Hints
        • Troubleshooting
          • Common issues
          • Debug mode
SynxDB Documentation
  • Component Guides
  • pg_search Full Text Search

pg_search Full Text Search

pg_search is an extension for SynxDB that provides powerful full-text search capabilities for database tables using the BM25 algorithm. It is built on Tantivy, a search engine library implemented in Rust, and integrates with SynxDB via the pgrx framework.

The main application scenarios and benefits of pg_search include:

  • High-performance full-text search: Provides efficient search capabilities for text data in the database, surpassing traditional LIKE queries.

  • Seamless integration with the database: Runs as an extension and integrates seamlessly with existing databases and table structures without requiring data migration.

  • Rich query functionality: Supports complex boolean logic queries, phrase queries, and aggregate function pushdown, which can accelerate data analysis and report generation.

Enable the pg_search extension

Follow these steps to enable the pg_search extension:

  1. Modify the database configuration: Add pg_search to shared_preload_libraries. This is a mandatory step; otherwise, creating an index will cause the database connection to crash or hang.

    gpconfig -c shared_preload_libraries -v pg_search
    

    After modifying the configuration, restart the database cluster.

  2. Enable the pg_search extension: Connect to the database and execute the following SQL command to enable the pg_search extension:

    CREATE EXTENSION pg_search;
    

How to use

The following example demonstrates how to use pg_search with a product information table.

  1. Create a sample table and insert some data:

    CREATE TABLE products (
        id SERIAL PRIMARY KEY,
        description TEXT,
        rating INTEGER,
        category TEXT,
        price NUMERIC,
        in_stock BOOLEAN
    );
    
    INSERT INTO products (description, rating, category, price, in_stock) VALUES
        ('Laptop with fast processor', 5, 'Electronics', 999.99, true),
        ('Gaming laptop with RGB', 5, 'Electronics', 1299.99, true),
        ('Toy laptop for kids', 3, 'Toys', 499.99, false),
        ('Wireless keyboard and mouse', 4, 'Electronics', 79.99, true),
        ('Mechanical keyboard RGB', 5, 'Electronics', 149.99, true),
        ('Running shoes for athletes', 5, 'Sports', 89.99, true),
        ('Winter jacket warm', 4, 'Clothing', 129.99, true),
        ('Summer jacket light', 3, 'Clothing', 59.99, true);
    
  2. Create a full-text search index: Use the USING bm25 clause to create a pg_search index for the table.

    CREATE INDEX products_idx ON products 
    USING bm25 (id, description, rating, category, price)
    WITH (
        key_field='id',
        text_fields='{"description": {}, "category": {"fast": true}}',
        numeric_fields='{"rating": {"fast": true}, "price": {"fast": true}}'
    );
    

    Parameters in the WITH clause:

    • key_field: Specifies the primary key field of the table.

    • text_fields: Defines the text fields to be indexed for full-text search.

      • "fast": true: Marks the field as a “fast field,” which allows direct access to field values in scenarios like aggregate functions, resulting in higher performance.

    • numeric_fields: Defines the numeric fields to be indexed. These can also be configured as “fast fields.”

  3. Execute queries. Use the @@@ operator to perform full-text searches. The syntax is similar to PostgreSQL’s LIKE operator but offers better performance.

    • Basic query. Query for products where the description field contains “laptop”:

      SELECT * FROM products WHERE description @@@ 'laptop';
      
    • Use boolean logic. Query for products where description contains “laptop” or “keyboard”, and the rating is greater than or equal to 4:

      SELECT * FROM products 
      WHERE ((description @@@ 'laptop') OR (description @@@ 'keyboard')) 
      AND (rating >= 4);
      

Use aggregate functions

pg_search supports pushing down aggregate functions (COUNT, SUM, AVG, MIN, MAX) to the index level for execution, which significantly improves query performance. This feature is only effective for columns configured as “fast fields.”

Example: Count the number of matching products and calculate their total price:

-- Query the count and sum of price for products where description contains 'laptop'
EXPLAIN (FORMAT TEXT, COSTS OFF, TIMING OFF, VERBOSE)
SELECT COUNT(*), SUM(price) FROM products WHERE description @@@ 'laptop';

The EXPLAIN output shows that the query plan uses a Custom Scan (ParadeDB Scan), indicating that the aggregation operation has been successfully pushed down.

Considerations and limitations

  • shared_preload_libraries configuration: For SynxDB, you must add pg_search to shared_preload_libraries and restart the database. This is a prerequisite for the extension to work correctly.

  • Cloud database services: On managed cloud database services (such as Amazon RDS), you may not be able to install pg_search unless the provider explicitly supports the extension.

  • Aggregate function limitations:

    • Aggregate pushdown only works for fields marked as "fast": true in the index.

    • DISTINCT aggregates are not supported. Queries containing DISTINCT will fall back to PostgreSQL’s native execution plan and will not benefit from the performance advantages of index acceleration.

  • Background process: pg_search relies on a background worker process to handle index writes. If this process does not start correctly, it may lead to database connection issues.

  • Query limitations: pg_search currently only supports single-table queries and does not support multi-table joins.

  • Partitioned tables: pg_search does not support partitioned tables.

Next Previous

© 2026, Synx Data Labs. All rights reserved.

SynxDB Documentation

On this page
  • pg_search Full Text Search
    • Enable the pg_search extension
    • How to use
      • Use aggregate functions
    • Considerations and limitations